Abstract
Non-volatile, byte-addressable memory technology with performance close to main memory has the potential to revolutionise computing systems in the near future. Such memory technology makes possible extremely large memory regions (e.g. >3 TB per server), very high performance I/O, and new ways of storing and sharing data for applications and workflows. This paper proposes hardware and system software architectures designed to exploit such memory for High Performance Computing (HPC) and High Performance Data Analytics (HPDA) systems, describes how applications could benefit from such hardware, and presents initial performance results from a system with Intel Optane DC Persistent Memory.
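To illustrate what byte-addressable persistence means for application code, the following is a minimal sketch using PMDK's libpmem library: it maps a file from a persistent-memory filesystem directly into the address space, writes to it with ordinary CPU stores, and flushes the writes to the persistence domain. The path /mnt/pmem/example and the use of libpmem here are illustrative assumptions, not the specific configuration evaluated in the paper.

/* Minimal sketch of byte-addressable persistence with PMDK's libpmem.
 * Assumes a DAX-mounted persistent-memory filesystem at /mnt/pmem.
 * Build with: cc example.c -lpmem */
#include <libpmem.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    size_t mapped_len;
    int is_pmem;

    /* Create and map a 4 KiB file; on persistent memory this yields a
     * direct load/store mapping with no page cache in between. */
    char *addr = pmem_map_file("/mnt/pmem/example", 4096,
                               PMEM_FILE_CREATE, 0666,
                               &mapped_len, &is_pmem);
    if (addr == NULL) {
        perror("pmem_map_file");
        return 1;
    }

    /* An ordinary store: the data becomes durable once flushed. */
    strcpy(addr, "hello, persistent memory");

    /* Flush CPU caches for the written range; fall back to msync if
     * the mapping turned out not to be real persistent memory. */
    if (is_pmem)
        pmem_persist(addr, mapped_len);
    else
        pmem_msync(addr, mapped_len);

    pmem_unmap(addr, mapped_len);
    return 0;
}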
Acknowledgements
The NEXTGenIO project and the work presented in this paper were funded by the European Union’s Horizon 2020 Research and Innovation programme under Grant Agreement no. 671951. All the NEXTGenIO Consortium members (EPCC, Allinea, Arm, ECMWF, Barcelona Supercomputing Centre, Fujitsu Technology Solutions, Intel Deutschland, Arctur and Technische Universität Dresden) contributed to the design of the architectures.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Jackson, A., Weiland, M., Parsons, M., Homölle, B. (2019). An Architecture for High Performance Computing and Data Systems Using Byte-Addressable Persistent Memory. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol. 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34355-2
Online ISBN: 978-3-030-34356-9