Implementing an OpenMP Execution Environment on InfiniBand Clusters

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4315)

Abstract

Cluster systems connected by fast interconnection networks have been applied successfully in various research fields for the parallel execution of large applications. Alongside MPI, the conventional programming model, OpenMP is increasingly used to parallelize sequential codes. Because of its simple programming interface and its semantics, which are similar to those of traditional programming languages, OpenMP is especially appropriate for non-professional users.

To exploit scalable parallel computation, we have built a PC cluster using InfiniBand, a high-performance, de facto standard interconnection technology. To provide users with a simple parallel programming model, we have implemented an OpenMP execution environment on top of this cluster. Because shared data require a global memory abstraction, we first built a software distributed shared memory (DSM) system that implements a variant of the Home-based Lazy Release Consistency protocol. We then modified an existing OpenMP source-to-source compiler to map shared data onto this DSM and to handle issues related to process/thread activities and task distribution. Experimental results from a set of different OpenMP applications show a speedup of up to 5.22 on systems with 6 processor nodes.
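The best reported speedup can be put in perspective by converting it to parallel efficiency, that is, speedup divided by the number of nodes. The following is a minimal sketch of that arithmetic only. The function name `parallel_efficiency` is hypothetical, and the calculation assumes the speedup is measured against a single-node run, which the abstract does not state.

```python
def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Return parallel efficiency, defined here as speedup / node count.

    Assumption: the speedup is relative to a run on one node.
    """
    if nodes <= 0:
        raise ValueError("nodes must be a positive integer")
    return speedup / nodes


# Figures taken from the abstract: best speedup 5.22 on 6 processor nodes.
reported_speedup = 5.22
node_count = 6

efficiency = parallel_efficiency(reported_speedup, node_count)
print(f"Parallel efficiency: {efficiency:.2f}")  # prints "Parallel efficiency: 0.87"
```

Under that assumption, the best case corresponds to roughly 87% efficiency. The abstract gives only this upper bound, so nothing can be inferred here about the other applications in the test set.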

Editor information

Matthias S. Mueller, Barbara M. Chapman, Bronis R. de Supinski, Allen D. Malony, Michael Voss

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tao, J., Karl, W., Trinitis, C. (2008). Implementing an OpenMP Execution Environment on InfiniBand Clusters. In: Mueller, M.S., Chapman, B.M., de Supinski, B.R., Malony, A.D., Voss, M. (eds) OpenMP Shared Memory Parallel Programming. IWOMP 2005. Lecture Notes in Computer Science, vol 4315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68555-5_6

  • DOI: https://doi.org/10.1007/978-3-540-68555-5_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68554-8

  • Online ISBN: 978-3-540-68555-5

  • eBook Packages: Computer Science, Computer Science (R0)