Shared Memory Pipelined Parareal

  • Daniel Ruprecht
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10417)

Abstract

For the parallel-in-time integration method Parareal, pipelining can be used to hide some of the cost of the serial correction step and improve its efficiency. The paper introduces a basic OpenMP implementation of pipelined Parareal and compares it to a standard MPI-based variant. Both versions yield almost identical runtimes, but, depending on the compiler, the OpenMP variant consumes about 7% less energy and has a significantly smaller memory footprint. However, its higher implementation complexity might make it difficult to use in legacy codes and in combination with spatial parallelisation.
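To illustrate the idea the abstract describes, the following is a minimal sketch (not the author's implementation, which is written in Fortran) of how pipelined Parareal can be expressed with OpenMP task dependences in C. The test problem u' = lambda*u, the implicit-Euler coarse and fine propagators, and all parameters (N time slices, K iterations, step counts) are illustrative assumptions; only the Parareal update formula and the data flow that enables pipelining are taken from the method itself.

/* Hypothetical sketch of pipelined Parareal with OpenMP tasks.
 * Solves u' = lambda*u on [0,T] with N time slices and K iterations.
 * The depend clauses encode the Parareal data flow, so fine propagations
 * belonging to iteration k-1 can overlap with the serial correction sweep
 * of iteration k: this overlap is the pipelining effect. */
#include <math.h>
#include <stdio.h>

#define N 8   /* number of time slices */
#define K 4   /* number of Parareal iterations */

static const double lambda = -1.0, T = 1.0;

/* Coarse propagator: one implicit Euler step over a slice of length dt. */
static double G(double u, double dt) { return u / (1.0 - lambda * dt); }

/* Fine propagator: many implicit Euler sub-steps over the same slice. */
static double F(double u, double dt) {
    const int m = 100;
    for (int i = 0; i < m; ++i) u = u / (1.0 - lambda * dt / m);
    return u;
}

int main(void) {
    double u[K + 1][N + 1];  /* u[k][n]: solution at t_n after iteration k */
    double f[K + 1][N];      /* f[k][n]: F applied to u[k][n]              */
    const double dt = T / N;

    #pragma omp parallel
    #pragma omp single
    {
        /* Iteration 0: serial coarse prediction. */
        u[0][0] = 1.0;
        for (int n = 0; n < N; ++n) u[0][n + 1] = G(u[0][n], dt);

        for (int k = 1; k <= K; ++k) {
            u[k][0] = 1.0;
            for (int n = 0; n < N; ++n) {
                /* Fine propagation of the previous iterate: independent
                 * across slices, so it runs as soon as u[k-1][n] exists
                 * and pipelines behind the corrections. */
                #pragma omp task depend(in: u[k-1][n]) depend(out: f[k-1][n])
                f[k - 1][n] = F(u[k - 1][n], dt);

                /* Serial correction: needs the new value at t_n and the
                 * fine result from iteration k-1 on this slice. */
                #pragma omp task depend(in: u[k][n], u[k-1][n], f[k-1][n]) \
                                 depend(out: u[k][n+1])
                u[k][n + 1] = G(u[k][n], dt) + f[k - 1][n] - G(u[k - 1][n], dt);
            }
        }
        #pragma omp taskwait
        printf("Parareal: %.12f  exact: %.12f\n", u[K][N], exp(lambda * T));
    }
    return 0;
}

Compiled with, for example, gcc -fopenmp parareal.c -lm, the runtime scheduler resolves the declared dependences, so the correction tasks of a later iteration start on early time slices while fine propagations of the previous iteration are still running on later slices. This task-based formulation is one possible shared-memory realisation; the paper's own OpenMP variant and its MPI counterpart (one process per time slice) may organise the pipeline differently.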

Keywords

Parareal · Parallel-in-time integration · Pipelining · OpenMP

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Mechanical Engineering, Leeds, UK
  2. Institute of Computational Science, Università della Svizzera italiana, Lugano, Switzerland
