Abstract
We present the design and implementation of UPMLIB, a runtime system that provides transparent facilities for dynamically tuning the memory performance of OpenMP programs on scalable shared-memory multiprocessors with hardware cache coherence. UPMLIB integrates information from the compiler and the operating system to implement algorithms that perform accurate and timely page migrations. The algorithms and their associated mechanisms correlate memory reference information with the semantics of parallel programs and with scheduling events that, at runtime, break the association between threads and the data for which those threads have memory affinity. Our experimental evidence shows that UPMLIB makes OpenMP programs immune to the page placement strategy of the operating system, thus obviating the need to introduce data placement directives in OpenMP. Furthermore, UPMLIB provides solid throughput improvements in multiprogrammed execution environments.
This work was supported by the E.C. through the TMR Contract No. ERBFMGECT-950062 and in part through the IV Framework (ESPRIT Programme, Project No. 21907, NANOS), the Greek Secretariat of Research and Technology (Contract No. E.D.-99-566), and the Spanish Ministry of Education through projects No. TIC98-511 and TIC97-1445CE. The experiments were conducted with resources provided by the European Center for Parallelism of Barcelona (CEPBA).
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nikolopoulos, D.S., Papatheodorou, T.S., Polychronopoulos, C.D., Labarta, J., Ayguadé, E. (2000). UPMLIB: A Runtime System for Tuning the Memory Performance of OpenMP Programs on Scalable Shared-Memory Multiprocessors. In: Dwarkadas, S. (eds) Languages, Compilers, and Run-Time Systems for Scalable Computers. LCR 2000. Lecture Notes in Computer Science, vol 1915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40889-4_7
DOI: https://doi.org/10.1007/3-540-40889-4_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41185-7
Online ISBN: 978-3-540-40889-5
eBook Packages: Springer Book Archive