Performance Comparisons of Basic OpenMP Constructs
OpenMP has become the de facto standard for shared-memory parallel programming. Its directive-based nature allows incremental and portable development of parallel applications for a wide range of platforms. OpenMP is easy to use because many details are hidden from the end user. As a result, factors such as the runtime system, compiler optimizations, and other implementation-specific issues can have a significant impact on the performance of an OpenMP application. The performance of an OpenMP construct often varies widely across operating platforms, and even across compilers on the same machine. This makes a comparative study of the low-level performance of individual OpenMP constructs important. In this paper, we present an enhanced set of microbenchmarks for OpenMP, derived from the EPCC benchmarks and based on the SKaMPI benchmarking framework. We describe the evaluation methodology, followed by details of some of the constructs and how their performance is measured. Results from experiments conducted on the IBM SP3 and the Sun SunFire systems are presented for each construct.
Keywords: Pseudo Code, Work Routine, Single Directive, Runtime System, Benchmark Suite
- 1. D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, D. Dagum, R. A. Fatoohi, P. O. Frederickson, T. A. Lasinski, R. S. Schreiber, H. D. Simon, V. Venkatakrishnan, and S. K. Weeratunga. The NAS Parallel Benchmarks. The International Journal of Supercomputer Applications, 5(3):63–73, Fall 1991.
- 2. OpenMP Architecture Review Board. OpenMP Fortran Application Program Interface, Version 1.1, November 1999.
- 3. J. M. Bull. Measuring Synchronisation and Scheduling Overheads in OpenMP. In EWOMP '99, Lund, September 1999.
- 4. Bronis R. de Supinski and John May. Benchmarking Pthreads Performance. In Proceedings of the 1999 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA '99), June 1999.
- 6. R. Hockney and M. Berry. Public international benchmarks for parallel computers report. Technical report, Parkbench Committee, 1994.
- 7. R. W. Hockney and V. S. Getov. Low-level benchmarking: Performance profiles. In Proc. Euromicro Workshop on PDP, IEEE CS Press, Jan. 1998.
- 8. R. Reussner, P. Sanders, L. Prechelt, and M. Mueller. SKaMPI: A detailed accurate MPI benchmark. In Springer Lecture Notes in Computer Science, volume 1497, pages 52–59, 1998.
- 9. R. H. Reussner. Skalib: Skampi as a library-technical reference manual. Technical report, Department of Informatics, University of Karlsruhe, Germany, 1999.