Skip to main content

Advertisement

SpringerLink
Log in
Menu
Find a journal Publish with us
Search
Cart
Book cover

European Conference on Parallel Processing

Euro-Par 2011: Euro-Par 2011: Parallel Processing Workshops pp 312–321Cite as

  1. Home
  2. Euro-Par 2011: Parallel Processing Workshops
  3. Conference paper
Can Checkpoint/Restart Mechanisms Benefit from Hierarchical Data Staging?

Can Checkpoint/Restart Mechanisms Benefit from Hierarchical Data Staging?

  • Raghunath Rajachandrasekar30,
  • Xiangyong Ouyang30,
  • Xavier Besseron30,
  • Vilobh Meshram30 &
  • …
  • Dhabaleswar K. Panda30 
  • Conference paper
  • 1101 Accesses

  • 2 Citations

Part of the Lecture Notes in Computer Science book series (LNTCS,volume 7156)

Abstract

Given the ever-increasing size of supercomputers, fault resilience and the ability to tolerate faults have become more of a necessity than an option. Checkpoint-Restart protocols have been widely adopted as a practical solution to provide reliability. However, traditional checkpointing mechanisms suffer from heavy I/O bottleneck while dumping process snapshots to a shared filesystem. In this context, we study the benefits of data staging, using a proposed hierarchical and modular data staging framework which reduces the burden of checkpointing on client nodes without penalizing them in terms of performance. During a checkpointing operation in this framework, the compute nodes transmit their process snapshots to a set of dedicated staging I/O servers through a high-throughput RDMA-based data pipeline. Unlike the conventional checkpointing mechanisms that block an application until the checkpoint data has been written to a shared filesystem, we allow the application to resume its execution immediately after the snapshots have been pipelined to the staging I/O servers, while data is simultaneously being moved from these servers to a backend shared filesystem. This framework eases the bottleneck caused by simultaneous writes from multiple clients to the underlying storage subsystem. The staging framework considered in this study is able to reduce the time penalty an application pays to save a checkpoint by 8.3 times.

Keywords

  • checkpoint-restart
  • data staging
  • aggregation
  • RDMA

This research is supported in part by U.S. Department of Energy grants #DE-FC02-06ER25749 and #DE-FC02-06ER25755; National Science Foundation grants #CCF-0621484, #CCF-0833169, #CCF-0916302, #OCI-0926691 and #CCF-0937842; grant from Wright Center for Innovation #WCI04-010-OSU-0.

Download conference paper PDF

References

  1. Filesystem in userspace, http://fuse.sourceforge.net

  2. IOzone filesystem benchmark, http://www.iozone.org

  3. Top 500 supercomputers, http://www.top500.org

  4. Abbasi, H., Wolf, M., Eisenhauer, G., Klasky, S., Schwan, K., Zheng, F.: Datastager: Scalable data staging services for petascale applications. In: HPDC (2009)

    Google Scholar 

  5. Bent, J., Gibson, G., Grider, G., McClelland, B., Nowoczynski, P., Nunez, J., Polte, M., Wingate, M.: PLFS: a checkpoint filesystem for parallel applications. In: SC (2009)

    Google Scholar 

  6. Buntinas, D., Coti, C., Herault, T., Lemarinier, P., Pilard, L., Rezmerita, A., Rodriguez, E., Cappello, F.: Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant mpi protocols. Future Generation Computer Systems (2008)

    Google Scholar 

  7. Cappello, F., Geist, A., Gropp, B., Kale, L., Kramer, B., Snir, M.: Toward exascale resilience. IJHPCA (2009)

    Google Scholar 

  8. Gao, Q., Yu, W., Huang, W., Panda, D.K.: Application-transparent checkpoint/restart for mpi programs over infiniband. In: ICPP (2006)

    Google Scholar 

  9. Gupta, R., Beckman, P., Park, B.H., Lusk, E., Hargrove, P., Geist, A., Panda, D.K., Lumsdaine, A., Dongarra, J.: Cifts: A coordinated infrastructure for fault-tolerant systems. In: ICPP (2009)

    Google Scholar 

  10. Hargrove, P.H., Duell, J.C.: Berkeley Lab Checkpoint/Restart (BLCR) for Linux Clusters. In: SciDAC (2006)

    Google Scholar 

  11. Hursey, J., Lumsdaine, A.: A composable runtime recovery policy framework supporting resilient hpc applications. Tech. rep., University of Tennessee (2010)

    Google Scholar 

  12. Hursey, J., Squyres, J.M., Mattox, T.I., Lumsdaine, A.: The design and implementation of checkpoint/restart process fault tolerance for Open MPI. In: IPDPS (2007)

    Google Scholar 

  13. InfiniBand Trade Association: The InfiniBand Architecture, http://www.infinibandta.org

  14. Isaila, F., Garcia Blas, J., Carretero, J., Latham, R., Ross, R.: Design and evaluation of multiple-level data staging for blue gene systems. TPDS (2011)

    Google Scholar 

  15. Moody, A., Bronevetsky, G., Mohror, K., de Supinski, B.R.: Design, modeling, and evaluation of a scalable multi-level checkpointing system. In: SC (2010)

    Google Scholar 

  16. Ouyang, X., Gopalakrishnan, K., Gangadharappa, T., Panda, D.K.: Fast Checkpointing by Write Aggregation with Dynamic Buffer and Interleaving on Multicore Architecture. HiPC (2009)

    Google Scholar 

  17. Ouyang, X., Rajachandrasekhar, R., Besseron, X., Wang, H., Huang, J., Panda, D.K.: CRFS: A lightweight user-level filesystem for generic checkpoint/restart. In: ICPP (2011) (to appear)

    Google Scholar 

  18. Plank, J.S., Chen, Y., Li, K., Beck, M., Kingsley, G.: Memory exclusion: Optimizing the performance of checkpointing systems. In: Software: Practice and Experience (1999)

    Google Scholar 

  19. Schroeder, B., Gibson, G.A.: Understanding failures in petascale computers. Journal of Physics: Conference Series (2007)

    Google Scholar 

Download references

Author information

Authors and Affiliations

  1. Network-Based Computing Laboratory, Department of Computer Science and Engineering, The Ohio State University, USA

    Raghunath Rajachandrasekar, Xiangyong Ouyang, Xavier Besseron, Vilobh Meshram & Dhabaleswar K. Panda

Authors
  1. Raghunath Rajachandrasekar
    View author publications

    You can also search for this author in PubMed Google Scholar

  2. Xiangyong Ouyang
    View author publications

    You can also search for this author in PubMed Google Scholar

  3. Xavier Besseron
    View author publications

    You can also search for this author in PubMed Google Scholar

  4. Vilobh Meshram
    View author publications

    You can also search for this author in PubMed Google Scholar

  5. Dhabaleswar K. Panda
    View author publications

    You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

  1. Scilytics, Koellnerhofgasse 3/15A, 1010, Vienna, Austria

    Michael Alexander

  2. ICAR-CNR, Via P. Castellino, 111, 80131, Napoli, Italy

    Pasqua D’Ambra

  3. University of Amsterdam, 1090, Amsterdam, Netherlands

    Adam Belloum

  4. Innovative Computing Laboratory, The University of Tennessee, US

    George Bosilca

  5. Department of Experimental Medicine and Clinic, University Magna Græcia, 88100, Catanzaro, Italy

    Mario Cannataro

  6. Computer Science Department, University of Pisa, Italy

    Marco Danelutto

  7. Second University of Naples, Italy

    Beniamino Di Martino

  8. TUMünchen,, Boltzmannstr. 3, ,, 85748, Garching, Germany

    Michael Gerndt

  9. Equipe Runtime, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France

    Emmanuel Jeannot & Raymond Namyst & 

  10. Equipe HIEPACS, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France

    Jean Roman

  11. Computer Science and Mathematics Division, Oak Ridge National Laboratory, 37831-6164, Oak Ridge, TN, USA

    Stephen L. Scott

  12. Department of Scientific Computing, University of Vienna, Nordbergstr. 15/3C, 1090, Vienna, Austria

    Jesper Larsson Traff

  13. Computer Science and Mathematics Division, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA

    Geoffroy Vallée

  14. Technische Universität München, Germany

    Josef Weidendorfer

Rights and permissions

Reprints and Permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rajachandrasekar, R., Ouyang, X., Besseron, X., Meshram, V., Panda, D.K. (2012). Can Checkpoint/Restart Mechanisms Benefit from Hierarchical Data Staging?. In: Alexander, M., et al. Euro-Par 2011: Parallel Processing Workshops. Euro-Par 2011. Lecture Notes in Computer Science, vol 7156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29740-3_35

Download citation

  • .RIS
  • .ENW
  • .BIB
  • DOI: https://doi.org/10.1007/978-3-642-29740-3_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29739-7

  • Online ISBN: 978-3-642-29740-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Share this paper

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Search

Navigation

  • Find a journal
  • Publish with us

Discover content

  • Journals A-Z
  • Books A-Z

Publish with us

  • Publish your research
  • Open access publishing

Products and services

  • Our products
  • Librarians
  • Societies
  • Partners and advertisers

Our imprints

  • Springer
  • Nature Portfolio
  • BMC
  • Palgrave Macmillan
  • Apress
  • Your US state privacy rights
  • Accessibility statement
  • Terms and conditions
  • Privacy policy
  • Help and support

167.114.118.210

Not affiliated

Springer Nature

© 2023 Springer Nature