  --Some Performance Aspects-

by P. Jansen, M. Marx, W. E. Nagel, M. Vaeßen, M. Romberg, R. Zimmermann (Forschungszentrum Jülich GmbH, Interner Bericht)
http://www.kfa-juelich.de/zam/docs/printable/ib/ib-93/ib-9310.ps

Abstract:

Today, most Cray multiprocessor systems are still used within a multiprogramming environment. In such environments, two main issues determine whether or not production codes should exploit parallelism. Firstly, in terms of turnaround time, the parallel program should run faster than the single-tasked version; secondly, the costs, i.e. CPU time from the user's point of view as well as system throughput from the computer center's point of view, should remain reasonably constant. This becomes even more important if parallelism is introduced automatically by calling optimized library routines provided by the vendor. The Cray Scientific Library (libsci) is such an example: its routines have to be highly efficient because these kernels are often used in user codes as basic building blocks for more complex algorithms. Based on libsci 7.0, the field-test version libsci 8.0, and the revised libsci 8.005, this report describes in detail the performance values obtained for some BLAS algorithms. As our results show, for the first two libsci versions, significant overhead (up to several hundred per cent) was observed in many cases, even for large problem sizes. This was all the more critical because many algorithms provided by third-party libraries (e.g. NAG and IMSL) rely on libsci BLAS kernels. Under the UNICOS Rel. 8.0 operating system, the default value for the number of CPUs waiting in parallel was decreased from eight to four. This change and some further optimizations in libsci 8.005 have mostly solved the problems, and this libsci release is now the production version.
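
The two criteria named in the abstract (shorter turnaround time and roughly constant CPU cost) can be illustrated with a minimal sketch. The function names, the exact definition of "overhead" as relative extra CPU time, and all timing values below are assumptions made for illustration; they are not taken from the report.

```python
def speedup(wall_single: float, wall_parallel: float) -> float:
    """Turnaround-time gain: single-tasked wall-clock time / parallel wall-clock time."""
    return wall_single / wall_parallel


def cpu_overhead_percent(cpu_single: float, cpu_parallel: float) -> float:
    """Extra CPU time charged to the parallel run, relative to the single-tasked run.

    Assumed definition: 100 * (cpu_parallel - cpu_single) / cpu_single.
    """
    return 100.0 * (cpu_parallel - cpu_single) / cpu_single


# Hypothetical timings in seconds (not measurements from the report).
wall_single, wall_parallel = 10.0, 4.0   # turnaround improves
cpu_single, cpu_parallel = 10.0, 30.0    # but accumulated CPU time triples

print(speedup(wall_single, wall_parallel))            # 2.5
print(cpu_overhead_percent(cpu_single, cpu_parallel))  # 200.0
```

In this made-up case the job finishes 2.5 times sooner, yet is charged 200 per cent more CPU time, which is the kind of trade-off the abstract describes as problematic in a multiprogramming environment.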
