The Journal of Supercomputing, Volume 74, Issue 11, pp 6236–6257

Zeroing memory deallocator to reduce checkpoint sizes in virtualized HPC environments

  • Ramy Gad
  • Simon Pickartz
  • Tim Süß
  • Lars Nagel
  • Stefan Lankes
  • Antonello Monti
  • André Brinkmann


Virtualization has become an indispensable tool in data centers and cloud environments for flexibly assigning virtual machines (VMs) to resources. It is also becoming increasingly attractive for high-performance computing (HPC), mainly because the strong isolation of VMs enables: (1) the sharing of cluster nodes and optimization of the system’s overall utilization; (2) load balancing by means of migrations, thanks to reduced residual dependencies; and (3) the creation of system-level checkpoints, which increase fault tolerance in an application-transparent way. On the downside, the additional virtualization layer conceals information that is only available at the process level. This information directly influences the checkpoint size, which should be kept as small as possible. In this paper, we propose a novel technique for reducing checkpoint size in virtualized environments. We exploit the fact that the hypervisor detects zero pages and omits them when capturing a checkpoint, and that compression can be applied to reduce the checkpoint size further. We therefore fill freed memory regions with zeros, which supports both the zero-page detection and the compression. We evaluate our approach using HPC applications as examples. The results show a reduction in checkpoint size of up to 9% when compression is disabled in the hypervisor and of up to 49% with compression enabled. Furthermore, memory zeroing reduces VM migration time by up to 10% when compression is disabled and by up to 60% when compression is enabled.
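
The effect described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' implementation: it models guest memory as a byte array, assumes a page size of 4096 bytes, and uses zlib as a stand-in for whatever compression the hypervisor applies. The names `checkpoint_size` and `free_region` are hypothetical helpers introduced only for this illustration.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes


def checkpoint_size(memory, compress=False):
    """Return the bytes a checkpoint would need if all-zero pages are omitted."""
    zero_page = bytes(PAGE_SIZE)
    total = 0
    for offset in range(0, len(memory), PAGE_SIZE):
        page = memory[offset:offset + PAGE_SIZE]
        if page == zero_page:
            continue  # zero page: detected and left out of the checkpoint
        # zlib is only a stand-in for the hypervisor's compression
        total += len(zlib.compress(page)) if compress else len(page)
    return total


def free_region(memory, start, length, zero):
    """Hypothetical deallocator: optionally overwrite the freed range with zeros."""
    if zero:
        memory[start:start + length] = bytes(length)
    # without zeroing, the stale contents simply stay in memory


if __name__ == "__main__":
    # 64 pages of random (incompressible) data stand in for application memory
    base = os.urandom(64 * PAGE_SIZE)
    for zero in (False, True):
        mem = bytearray(base)
        # free 16.5 pages: 16 whole pages plus half of the following page
        free_region(mem, 8 * PAGE_SIZE, 16 * PAGE_SIZE + PAGE_SIZE // 2, zero)
        print("zeroing" if zero else "no zeroing",
              checkpoint_size(mem),
              checkpoint_size(mem, compress=True))
```

In this sketch, fully freed pages vanish from the checkpoint only when they are zeroed, while a partially freed page still has to be stored but compresses better because half of it is zeros. This mirrors why the abstract reports larger gains with compression enabled; the concrete numbers printed here depend on the simulated data and are not comparable to the paper's measurements.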


Keywords: Virtualization, Checkpoint/restart, Migration, HPC



This research and development was supported by the Federal Ministry of Education and Research (BMBF) under Grant 01IH13004 (Project FAST) and Grant 01IH16010B (Project Envelope).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Zentrum für Datenverarbeitung, Johannes Gutenberg-Universität Mainz, Mainz, Germany
  2. Institute for Automation of Complex Power Systems, E.ON Energy Research Center, RWTH Aachen, Aachen, Germany
  3. Department of Computer Science, Loughborough University, Loughborough, UK
