An I/O Analysis of HPC Workloads on CephFS and Lustre

Chiusole, Alberto; Cozzini, Stefano; van der Ster, Daniel; Lamanna, Massimo; Giuliani, Graziano

doi:10.1007/978-3-030-34356-9_24

Conference paper
First Online: 03 December 2019

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11887)

Abstract

In this contribution we compare the Input/Output (I/O) performance of a High-Performance Computing (HPC) application on two different File Systems, CephFS and Lustre; our goal is to assess whether CephFS can be considered a valid choice for intensive HPC applications. We perform our analysis using a real HPC workload, namely RegCM, a climate simulation application, and IOR, a synthetic benchmark application, to simulate several I/O patterns using different parallel I/O libraries (MPI-IO, HDF5, PnetCDF). We compare the write performance of the two I/O approaches that RegCM implements: the so-called spokesperson, or serial, approach and a truly parallel one. The small difference registered between the serial and the parallel I/O approaches motivates us to explore in detail how the software stack interacts with the underlying File Systems. For this reason, we use IOR and MPI-IO hints related to Collective Buffering and Data Sieving to analyze several I/O patterns on the two File Systems.

Finally, we investigate Lazy I/O, a unique feature of CephFS, which disables the file coherency locks introduced by the File System; this allows Ceph to buffer writes and to fully exploit its parallel and distributed architecture. Two clusters were set up for these benchmarks, one at CNR-IOM and a second one at Pawsey Supercomputing Centre; we performed similar tests on both installations, and we recorded a fourfold I/O performance improvement with Lazy I/O enabled.

The preliminary results collected so far are quite promising; further actions and possible new I/O optimizations are presented and discussed.
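
As an illustration of how MPI-IO hints for Collective Buffering and Data Sieving might be handed to IOR, the following minimal sketch builds environment variables of the form `IOR_HINT__<layer>__<hint>` described in the IOR FAQ. It is not the configuration used in the paper: the helper names `romio_write_hints` and `ior_hint_env` are hypothetical, and the hint names `romio_cb_write` and `romio_ds_write` with the values `enable`, `disable` and `automatic` are taken from the ROMIO documentation rather than from the abstract.

```python
# Minimal sketch, not the authors' setup: helper names are hypothetical.
# Assumption: ROMIO accepts these three values for its cb/ds toggle hints.
ROMIO_TOGGLES = ("enable", "disable", "automatic")


def romio_write_hints(collective_buffering, data_sieving):
    """Return ROMIO write hints for Collective Buffering and Data Sieving."""
    for value in (collective_buffering, data_sieving):
        if value not in ROMIO_TOGGLES:
            raise ValueError(f"unsupported hint value: {value!r}")
    return {
        "romio_cb_write": collective_buffering,  # Collective Buffering on writes
        "romio_ds_write": data_sieving,          # Data Sieving on writes
    }


def ior_hint_env(layer, hints):
    """Map {hint: value} to IOR_HINT__<layer>__<hint> environment variables."""
    return {f"IOR_HINT__{layer}__{name}": str(value)
            for name, value in hints.items()}


# Example: force Collective Buffering on and Data Sieving off for writes.
env = ior_hint_env("MPI", romio_write_hints("enable", "disable"))
for key in sorted(env):
    print(f"export {key}={env[key]}")
```

Printing the variables as `export` lines keeps the sketch independent of any MPI installation; in practice they would be set in the job script before launching IOR, and each combination of values would be benchmarked separately on each File System.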

Notes

1. https://github.com/ictp-esp/RegCM/releases/tag/4.7.3.4.
2. https://www.mcs.anl.gov/projects/romio/2014/06/12/romio-and-intel-mpi/.
3. Refer to “How do I use hints?” in the FAQ of the project: https://ior.readthedocs.io/en/latest/userDoc/faq.html.

Acknowledgments

We thank Luca Cervigni, Pawsey Supercomputing Centre, for the opportunity to collaborate and extract Lazy I/O timings from their Ceph test cluster.

We thank Pablo Llopis, CERN, for the valuable HPC support provided.

Author information

Authors and Affiliations

eXact Lab s.r.l., via Beirut 2, 34151, Trieste, Italy
Alberto Chiusole & Stefano Cozzini
CNR-IOM c/o SISSA, Via Bonomea 265, 34136, Trieste, Italy
Stefano Cozzini
CERN, Geneva 23, Switzerland
Daniel van der Ster & Massimo Lamanna
ICTP, Strada Costiera 11, 34151, Trieste, Italy
Graziano Giuliani

Corresponding author

Correspondence to Stefano Cozzini.

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, UK
Michèle Weiland
Helmholtz-Zentrum Dresden-Rossendorf, Dresden, Sachsen, Germany
Guido Juckeland
Swiss National Supercomputing Centre, Lugano, Ticino, Switzerland
Sadaf Alam
University of Tennessee at Knoxville, Knoxville, TN, USA
Heike Jagode

About this paper

Cite this paper

Chiusole, A., Cozzini, S., van der Ster, D., Lamanna, M., Giuliani, G. (2019). An I/O Analysis of HPC Workloads on CephFS and Lustre. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_24

DOI: https://doi.org/10.1007/978-3-030-34356-9_24
Published: 03 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34355-2
Online ISBN: 978-3-030-34356-9
eBook Packages: Computer Science, Computer Science (R0)