
Run-Time Exploitation of Application Dynamism for Energy-Efficient Exascale Computing

  • Per Gunnar Kjeldsberg
  • Robert Schöne
  • Michael Gerndt
  • Lubomir Riha
  • Venkatesh Kannan
  • Kai Diethelm
  • Marie-Christine Sawley
  • Jan Zapletal
  • Andreas Gocht
  • Nico Reissmann
  • Ondrej Vysocky
  • Madhura Kumaraswamy
  • Wolfgang E. Nagel
Chapter

Abstract

As in the embedded systems domain, energy efficiency has recently become one of the main design criteria in high performance computing. The European Union Horizon 2020 project READEX (Run-time Exploitation of Application Dynamism for Energy-efficient eXascale computing) has developed a tool-aided auto-tuning methodology inspired by system scenario based design. Applying concepts similar to those presented in earlier chapters of this book, the methodology exploits the dynamic behavior of HPC applications to improve energy efficiency and performance. Driven by a consortium of European experts from academia, HPC resource providers, and industry, the READEX project has developed the first generic framework of its kind for split design-time and run-time tuning while targeting heterogeneous systems at the Exascale level. For a real-life boundary element application, energy savings of more than 30% are demonstrated.

Keywords

Application dynamism, High performance computing, Exascale, Methodology, Auto-tuning, Design-time, Run-time, Instrumentation, Tuning model, Control plugin, Partial differential equations

Notes

Acknowledgements

The research leading to these results has received funding from the European Union’s Horizon 2020 Programme under grant agreement number 671657.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Per Gunnar Kjeldsberg
    • 1
  • Robert Schöne
    • 2
  • Michael Gerndt
    • 3
  • Lubomir Riha
    • 4
  • Venkatesh Kannan
    • 5
  • Kai Diethelm
    • 6
    • 7
  • Marie-Christine Sawley
    • 8
  • Jan Zapletal
    • 4
  • Andreas Gocht
    • 2
  • Nico Reissmann
    • 1
  • Ondrej Vysocky
    • 4
  • Madhura Kumaraswamy
    • 3
  • Wolfgang E. Nagel
    • 2
  1. Norwegian University of Science and Technology (NTNU), Trondheim, Norway
  2. Technische Universität Dresden, Dresden, Germany
  3. Technische Universität München, München, Germany
  4. VSB - Technical University of Ostrava, Ostrava, Czech Republic
  5. Irish Centre for High-End Computing, Galway, Ireland
  6. Gesellschaft für numerische Simulation, Braunschweig, Germany
  7. University of Applied Sciences Würzburg-Schweinfurt, Schweinfurt, Germany
  8. Intel ExaScale Labs, Paris, France
