
On the Road to DiPOSH: Adventures in High-Performance OpenSHMEM

  • Camille Coti
  • Allen D. Malony
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12043)

Abstract

Future HPC programming systems must address the challenge of how to integrate shared and distributed memory parallelism. The growing number of server cores argues in favor of shared memory multithreading at the node level, but makes interfacing with distributed communication libraries more problematic. Alternatively, implementing rich message passing libraries to run across cores can be cumbersome and inefficient. This paper describes an attempt to address the challenge with OpenSHMEM, whose lean API makes for high-performance shared memory operation and whose communication semantics map directly to fast networking hardware. DiPOSH is our initial attempt to implement OpenSHMEM with these objectives. Starting with our node-level POSH design, we leveraged MPI one-sided support to obtain initial internode functionality. The paper reports on our progress. To our pleasant surprise, we discovered a natural and compatible integration of OpenSHMEM and MPI, in contrast to what is found in MPI+X hybrids today.
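
To make the approach concrete, the following is a minimal, hypothetical sketch (in C, not the authors' DiPOSH code) of how an OpenSHMEM-style put could be layered on MPI-3 one-sided communication, which is the route the abstract describes for internode functionality. The names symmetric_heap, heap_win, sketch_init, sketch_putmem and sketch_finalize are illustrative assumptions; a real implementation manages the symmetric heap, remote addressing and synchronization far more carefully.

/*
 * Sketch: an OpenSHMEM-style put layered on MPI-3 one-sided communication.
 * Assumptions: a statically allocated "symmetric heap" with identical layout
 * on every PE, exposed through a single MPI window.
 */
#include <mpi.h>
#include <stdio.h>
#include <stddef.h>

#define HEAP_SIZE (1 << 20)

static char    symmetric_heap[HEAP_SIZE];  /* same layout on every PE */
static MPI_Win heap_win;                   /* window exposing the heap */

/* Expose the symmetric heap through an MPI window and open a
 * passive-target access epoch for the whole run. */
static void sketch_init(void)
{
    MPI_Win_create(symmetric_heap, HEAP_SIZE, 1,
                   MPI_INFO_NULL, MPI_COMM_WORLD, &heap_win);
    MPI_Win_lock_all(0, heap_win);
}

/* shmem_putmem-like operation: copy len bytes into dest on PE pe.
 * dest is a symmetric address, so its offset into the heap is the same
 * on the local and the remote PE. */
static void sketch_putmem(void *dest, const void *src, size_t len, int pe)
{
    MPI_Aint offset = (char *)dest - symmetric_heap;
    MPI_Put(src, (int)len, MPI_BYTE, pe, offset, (int)len, MPI_BYTE, heap_win);
    MPI_Win_flush(pe, heap_win);  /* block until remotely complete */
}

static void sketch_finalize(void)
{
    MPI_Win_unlock_all(heap_win);
    MPI_Win_free(&heap_win);
}

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    sketch_init();

    /* PE 0 writes a message into PE 1's symmetric heap. */
    if (rank == 0 && size > 1) {
        const char msg[] = "hello from PE 0";
        sketch_putmem(symmetric_heap, msg, sizeof msg, 1);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    if (rank == 1) {
        MPI_Win_sync(heap_win);  /* make the remote put visible locally */
        printf("PE 1 received: %s\n", symmetric_heap);
    }

    sketch_finalize();
    MPI_Finalize();
    return 0;
}

The passive-target lock_all epoch plus MPI_Win_flush mirrors OpenSHMEM's one-sided completion model: the initiating PE alone drives the transfer and decides when it is remotely complete, without involving the target in matching or progress.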

Keywords

OpenSHMEM · Distributed run-time system · One-sided communication

Notes

Acknowledgment

Experiments presented in this paper were carried out using the Grid’5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr).

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. LIPN, CNRS UMR 7030, Université Paris 13, Sorbonne Paris Cité, Villetaneuse, France
  2. University of Oregon, Eugene, USA
