
From MPI to OpenSHMEM: Porting LAMMPS

  • Chunyan Tang
  • Aurelien Bouteiller (corresponding author)
  • Thomas Herault
  • Manjunath Gorentla Venkata
  • George Bosilca
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9397)

Abstract

This work details the opportunities and challenges of porting a petascale, MPI-based application, LAMMPS, to OpenSHMEM. We investigate the major programming challenges stemming from the differences in communication semantics, address space organization, and synchronization operations between the two programming models. This work provides several approaches to solve those challenges for representative communication patterns in LAMMPS, e.g., by considering group synchronization, tracking of peer buffer status, and direct transfer of scattered data without intermediate packing. The performance of LAMMPS is evaluated on the Titan HPC system at ORNL. The OpenSHMEM implementations are compared with the MPI versions in terms of both strong and weak scaling. The results show that OpenSHMEM provides rich enough semantics to implement scalable scientific applications. In addition, the experiments demonstrate that OpenSHMEM can compete with, and often improve on, the optimized MPI implementation.
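
As a rough illustration of the semantic shift discussed above, the sketch below (ours, not taken from the LAMMPS port) shows the put-with-notification idiom that replaces a matched MPI send/receive pair: a one-sided transfer into a symmetric buffer, followed by a flag update that the target waits on, i.e., the kind of peer buffer status tracking mentioned in the abstract. The buffer size N, the ring-neighbor choice, and the single-flag protocol are illustrative assumptions.

    /* Minimal sketch, not from the paper: an OpenSHMEM one-sided put plus a
     * notification flag the target waits on, in place of MPI_Send/MPI_Recv.
     * N and the ring neighbor are illustrative assumptions. */
    #include <shmem.h>

    #define N 1024

    int main(void) {
        shmem_init();
        int me    = shmem_my_pe();
        int right = (me + 1) % shmem_n_pes();   /* illustrative ring neighbor */

        /* Remotely accessible (symmetric) objects replace MPI receive buffers. */
        double *recv_buf = shmem_malloc(N * sizeof(double));
        static int ready = 0;                   /* notification flag on the target */
        double send_buf[N];
        for (int i = 0; i < N; i++) send_buf[i] = me;

        shmem_barrier_all();                    /* symmetric allocation done everywhere */

        /* Write directly into the neighbor's buffer ...                     */
        shmem_putmem(recv_buf, send_buf, N * sizeof(double), right);
        shmem_fence();                          /* order the payload before the flag */
        shmem_int_p(&ready, 1, right);          /* ... then raise its ready flag     */

        /* The target learns of the arrival by watching its own flag,
         * instead of posting a matching receive. */
        shmem_int_wait_until(&ready, SHMEM_CMP_EQ, 1);

        shmem_barrier_all();
        shmem_free(recv_buf);
        shmem_finalize();
        return 0;
    }

The shmem_fence call orders the two puts issued to the same target, so observing the flag implies the payload has already been delivered; this is the ordering guarantee that makes a plain flag usable as a buffer-status signal.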

Keywords

Message Passing Interface · Address Space · Collective Operation · Strong Scaling · Synchronization Operation

Notes

Acknowledgements

This material is based upon work supported by the U.S. Department of Energy, under contract #DE-AC05-00OR22725, through UT-Battelle subcontract #4000123323. The work at Oak Ridge National Laboratory (ORNL) is supported by the United States Department of Defense and used the resources of the Extreme Scale Systems Center located at ORNL.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Chunyan Tang¹
  • Aurelien Bouteiller¹ (corresponding author)
  • Thomas Herault¹
  • Manjunath Gorentla Venkata²
  • George Bosilca¹

  1. Innovative Computing Laboratory, University of Tennessee, Knoxville, USA
  2. Oak Ridge National Laboratory, Oak Ridge, USA
