
OpenSHMEM I/O Extensions for Fine-Grained Access to Persistent Memory Storage

  • Conference paper
Driving Scientific and Engineering Discoveries Through the Convergence of HPC, Big Data and AI (SMC 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1315)


Abstract

Application workflows use files to communicate between stages of data processing and analysis kernel executions. In large-scale, high-performance distributed systems, file-based communication significantly penalizes performance by introducing overheads such as metadata access, contention for file locks, and the slow speed of spinning disks. Using files as system-wide persistent storage also hinders fine-grained access to data when files are stored on block devices handled through the I/O software stack. To address speed and granularity, we employ persistent memory (PMEM) devices, which provide DRAM-like speeds and byte-granular access combined with persistent storage capabilities. To address file and I/O software stack overheads, we deploy an Arm-based Mellanox BlueField SmartNIC with attached NVDIMM-N modules. Both the SmartNIC and PMEM introduce API design and system software integration challenges, which we address with the design and implementation of a client-server software architecture whose client API extends the OpenSHMEM library. We benchmark the implementation using a workflow of OpenSHMEM kernel invocations over a persistent data set. Compared to the same workflow using a network file I/O client, our solution shows no degradation of performance as the number of clients increases. We accelerate the startup and shutdown phases of each kernel by reducing the time to move file data in and out of OpenSHMEM process memory to the speed of one-sided memory access. We also support the creation of many small files with minimal overhead. OpenSHMEM workflows can leverage these changes to create more, shorter-lived kernels with lower penalty. This API can replace file I/O with code that looks and behaves like other OpenSHMEM remote memory accesses.
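
The abstract describes the API only at a high level. The following is a minimal, hypothetical sketch of the access pattern it implies: a kernel "opens" a persistent data set held by a PMEM-backed server process and moves it in and out of local memory with ordinary one-sided OpenSHMEM calls instead of file reads and writes. The names shmemio_open, shmemio_close, and pmem_obj_t are invented for illustration and are not the paper's interface; only shmem_init, shmem_malloc, shmem_getmem, shmem_putmem, shmem_quiet, shmem_free, and shmem_finalize are standard OpenSHMEM routines. Here the extension is mocked with a symmetric-heap allocation and PE 0 standing in for the storage server.

    /* Hypothetical sketch: the persistent-object API below is invented for
     * illustration and mocked on top of standard OpenSHMEM; it is NOT the
     * paper's actual interface. */
    #include <shmem.h>
    #include <stdint.h>
    #include <stdlib.h>

    typedef struct {
        uint8_t *addr;     /* symmetric address of the persistent object */
        int      server_pe;/* PE acting as the PMEM storage server */
        size_t   size;     /* object size in bytes */
    } pmem_obj_t;

    /* Placeholder for the extension: look up or create a named persistent
     * object on the storage server.  Mocked with a collective shmem_malloc
     * and PE 0 as the server; the name is unused in this mock. */
    static pmem_obj_t shmemio_open(const char *name, size_t size) {
        (void)name;
        pmem_obj_t obj = { shmem_malloc(size), 0, size };
        return obj;
    }

    static void shmemio_close(pmem_obj_t obj) { shmem_free(obj.addr); }

    int main(void) {
        shmem_init();

        /* "Open" the persistent data set instead of opening a file. */
        pmem_obj_t obj = shmemio_open("workflow/stage1_output", 1 << 20);

        uint8_t *buf = malloc(obj.size);   /* private working buffer */

        /* Startup phase: pull the persistent data with a one-sided get,
         * replacing read() over the file I/O stack. */
        shmem_getmem(buf, obj.addr, obj.size, obj.server_pe);

        /* ... kernel computes on buf ... */

        /* Shutdown phase: push results back with a one-sided put,
         * replacing write(). */
        shmem_putmem(obj.addr, buf, obj.size, obj.server_pe);
        shmem_quiet();                     /* ensure remote completion */

        free(buf);
        shmemio_close(obj);
        shmem_finalize();
        return 0;
    }

In this sketch the startup and shutdown phases of a kernel reduce to single one-sided transfers, which is the behavior the abstract credits for avoiding file-system metadata and locking overheads.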



Acknowledgements

The authors would like to thank the United States Department of Defense and Los Alamos National Laboratory for their continued support of this project. In addition, we would like to thank Gilad Shainer and Wang Wong from Mellanox Technologies for providing us with the BlueField development platform and enabling NVDIMM support in its BIOS. We thank Luis E. Peña and Curtis Dunham from Arm for their reviews of the paper.

Author information

Corresponding author

Correspondence to Megan Grodowitz.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Grodowitz, M., Shamis, P., Poole, S. (2020). OpenSHMEM I/O Extensions for Fine-Grained Access to Persistent Memory Storage. In: Nichols, J., Verastegui, B., Maccabe, A., Hernandez, O., Parete-Koon, S., Ahearn, T. (eds) Driving Scientific and Engineering Discoveries Through the Convergence of HPC, Big Data and AI. SMC 2020. Communications in Computer and Information Science, vol 1315. Springer, Cham. https://doi.org/10.1007/978-3-030-63393-6_21

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-63393-6_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63392-9

  • Online ISBN: 978-3-030-63393-6

  • eBook Packages: Computer Science, Computer Science (R0)
