Designing a High Performance OpenSHMEM Implementation Using Universal Common Communication Substrate as a Communication Middleware

  • Pavel Shamis
  • Manjunath Gorentla Venkata
  • Stephen Poole
  • Aaron Welch
  • Tony Curtis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8356)


Abstract

OpenSHMEM is an effort to standardize the well-known SHMEM parallel programming library. The project, led by Oak Ridge National Laboratory (ORNL) and the University of Houston (UH), aims to produce an open-source and portable SHMEM API. In this paper, we optimize the current OpenSHMEM reference implementation, which is based on GASNet, to achieve higher performance. To do so, we redesigned an important component of the OpenSHMEM implementation, the network layer, to leverage the Universal Common Communication Substrate (UCCS), a low-level communication library designed for implementing parallel programming models. In particular, UCCS provides an interface and semantics, such as native atomic operations and remote memory operations, that better support PGAS programming models, including OpenSHMEM. Using microbenchmarks, we evaluate this new OpenSHMEM implementation on various network metrics, including the latency of point-to-point and collective operations. Furthermore, we compare the performance of our OpenSHMEM implementation with the state-of-the-art SGI SHMEM. Our results show that the atomic operations of our OpenSHMEM implementation outperform SGI's SHMEM implementation by 3%. Its RMA operations outperform both SGI's SHMEM and the original OpenSHMEM reference implementation by as much as 18% and 12% for gets, respectively, and by as much as 83% and 53% for puts.
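To illustrate the two operation classes the abstract benchmarks, remote memory access (put/get) and atomics, the following is a minimal sketch of an OpenSHMEM 1.0-era program. This is not code from the paper; the symmetric variables and neighbor pattern are illustrative assumptions, and it requires an OpenSHMEM implementation (e.g. compiled with `oshcc` and launched with `oshrun`) to run:

```c
/* Illustrative sketch (not from the paper): one RMA put and one
 * fetch-and-add atomic, the operation classes measured in the paper.
 * OpenSHMEM 1.0-era API. Build/run (assumed wrappers):
 *   oshcc sketch.c -o sketch && oshrun -np 2 ./sketch               */
#include <stdio.h>
#include <shmem.h>

long counter = 0;   /* symmetric: same address on every PE */
long data = -1;     /* symmetric target of the remote put  */

int main(void) {
    start_pes(0);                     /* initialize the OpenSHMEM runtime */
    int me   = _my_pe();
    int npes = _num_pes();

    /* RMA: each PE writes its rank into `data` on its right neighbor. */
    long src = me;
    shmem_long_put(&data, &src, 1, (me + 1) % npes);

    /* Atomic: each PE increments `counter` on PE 0 with fetch-and-add. */
    (void)shmem_long_fadd(&counter, 1, 0);

    shmem_barrier_all();              /* synchronize and complete the ops */
    if (me == 0)
        printf("counter = %ld (expected %d)\n", counter, npes);
    return 0;
}
```

The put and the fetch-and-add here are exactly the kinds of one-sided operations that, in the paper's design, are mapped onto UCCS's native remote memory and atomic primitives rather than emulated over an active-message layer.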





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Pavel Shamis (1)
  • Manjunath Gorentla Venkata (1)
  • Stephen Poole (1)
  • Aaron Welch (2)
  • Tony Curtis (2)

  1. Extreme Scale Systems Center (ESSC), Oak Ridge National Laboratory (ORNL), USA
  2. Computer Science Department, University of Houston (UH), USA
