Abstract
Among the broad variety of challenges that arise from workloads in a converged HPC and Cloud infrastructure, data movement is of paramount importance, especially on upcoming exascale systems featuring multiple tiers of memory and storage. While the focus has, for years, been primarily on optimizing computations, the importance of improving data handling on such architectures is now well understood. As optimization techniques can be applied at different stages (operating system, run-time system, programming environment, and so on), a middleware providing uniform and consistent data awareness becomes necessary. In this paper, we introduce a novel memory- and data-aware middleware called Maestro, designed for data orchestration.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Haine, C. et al. (2021). A Middleware Supporting Data Movement in Complex and Software-Defined Storage and Memory Architectures. In: Jagode, H., Anzt, H., Ltaief, H., Luszczek, P. (eds) High Performance Computing. ISC High Performance 2021. Lecture Notes in Computer Science, vol 12761. Springer, Cham. https://doi.org/10.1007/978-3-030-90539-2_23
DOI: https://doi.org/10.1007/978-3-030-90539-2_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90538-5
Online ISBN: 978-3-030-90539-2
eBook Packages: Computer Science (R0)