Shared Memory Pipelined Parareal

  • Daniel Ruprecht
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10417)


For the parallel-in-time integration method Parareal, pipelining can be used to hide some of the cost of the serial correction step and improve its efficiency. The paper introduces a basic OpenMP implementation of pipelined Parareal and compares it to a standard MPI-based variant. Both versions yield almost identical runtimes, but, depending on the compiler, the OpenMP variant consumes about 7% less energy and has a significantly smaller memory footprint. However, its higher implementation complexity might make it difficult to use in legacy codes and in combination with spatial parallelisation.
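For readers unfamiliar with the method, the basic Parareal iteration that the abstract refers to can be sketched as below. This is a minimal serial emulation, not the paper's OpenMP or MPI code: the test problem y' = lam * y, the explicit Euler propagators, and every name and parameter value (euler, parareal, serial_fine, fine_steps) are assumptions chosen for illustration. The comments mark which part is independent across time slices and which part is the serial correction step whose cost pipelining aims to hide. The sketch does not implement pipelining or threading itself.

```python
# Illustrative sketch only: serial emulation of the Parareal iteration
# for the scalar test problem y' = lam * y. All names are hypothetical.

def euler(y, lam, dt, steps):
    """Advance y' = lam * y by `steps` explicit Euler steps of size dt."""
    for _ in range(steps):
        y = y + dt * lam * y
    return y


def parareal(y0, lam, t_end, n_slices, n_iter, fine_steps=100):
    """Return the Parareal approximation at all time-slice boundaries."""
    dT = t_end / n_slices

    def coarse(y):
        # Cheap propagator: one Euler step per time slice.
        return euler(y, lam, dT, 1)

    def fine(y):
        # Expensive propagator: many Euler steps per time slice.
        return euler(y, lam, dT / fine_steps, fine_steps)

    # Initial guess from one serial sweep of the coarse propagator.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n]))

    for _ in range(n_iter):
        # These propagations use only the previous iterate, so they are
        # independent across slices (the part that would run in parallel).
        f_old = [fine(u[n]) for n in range(n_slices)]
        g_old = [coarse(u[n]) for n in range(n_slices)]

        # Serial correction sweep: slice n+1 needs the updated value of slice n.
        new = [y0]
        for n in range(n_slices):
            new.append(coarse(new[n]) + f_old[n] - g_old[n])
        u = new
    return u


def serial_fine(y0, lam, t_end, n_slices, fine_steps=100):
    """Reference: run the fine propagator serially over all slices."""
    dT = t_end / n_slices
    u = [y0]
    for n in range(n_slices):
        u.append(euler(u[n], lam, dT / fine_steps, fine_steps))
    return u
```

With as many iterations as time slices, the iteration reproduces the serial fine solution up to round-off, which gives a simple sanity check for any implementation. In practice far fewer iterations are used, since speedup requires the iteration count to be much smaller than the number of slices.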


Keywords: Parareal, Parallel-in-time integration, Pipelining, OpenMP


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Mechanical Engineering, Leeds, UK
  2. Institute of Computational Science, Università della Svizzera italiana, Lugano, Switzerland
