Performance Prediction of Multigrid-Solver Configurations

  • Alexander Grebhahn
  • Norbert Siegmund
  • Harald Köstler
  • Sven Apel
Conference paper
Part of the Lecture Notes in Computational Science and Engineering book series (LNCSE, volume 113)


Geometric multigrid solvers are among the most efficient methods for solving partial differential equations. To optimize performance, developers have to select an appropriate combination of algorithms for the hardware and problem at hand. Since manually configuring a multigrid solver is tedious and does not scale to a large number of hardware platforms, we have been developing a code generator that automatically generates a multigrid-solver configuration tailored to a given problem. However, identifying a performance-optimal solver configuration is typically a non-trivial task, because developers can choose from a large number of configuration options. As a solution, we present a machine-learning approach that allows developers to predict the performance of solver configurations by quantifying the influence of individual configuration options and of the interactions between them. As our preliminary results on three configurable multigrid solvers were encouraging, we focus on a larger, non-trivial case study in this work. Furthermore, we discuss and demonstrate how to integrate domain knowledge into our machine-learning approach to improve accuracy and scalability, and we explore how the learned performance models can help developers and domain experts understand their system.
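The idea of predicting performance from the influence of individual options and their interactions can be illustrated with a minimal sketch. It is not the authors' learning procedure: it assumes two hypothetical numeric options (`pre` and `post`, the numbers of pre- and post-smoothing steps), one pairwise interaction term, made-up synthetic timings in place of real measurements, and an ordinary least-squares fit.

```python
from itertools import product

def features(config):
    """Model terms: constant, each option alone, and one pairwise interaction."""
    pre, post = config["pre"], config["post"]
    return [1.0, float(pre), float(post), float(pre * post)]

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit(configs, times):
    """Least-squares fit via the normal equations (X^T X) w = X^T y."""
    rows = [features(c) for c in configs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, times)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, config):
    """Predicted time: weighted sum of option and interaction terms."""
    return sum(w * f for w, f in zip(weights, features(config)))

def synthetic_time(config):
    """Hypothetical ground truth standing in for measured solver run times."""
    pre, post = config["pre"], config["post"]
    return 2.0 + 0.5 * pre + 0.8 * post + 0.1 * pre * post

# Train on a small grid of configurations, then predict an unseen one.
training = [{"pre": p, "post": q} for p, q in product(range(4), repeat=2)]
weights = fit(training, [synthetic_time(c) for c in training])
unseen = {"pre": 5, "post": 6}
predicted = predict(weights, unseen)
```

Each fitted weight can be read as the influence of one term on run time, which is the kind of interpretation the abstract has in mind when it speaks of helping developers understand their system. With real measurements the fit would not be exact, and choosing which interaction terms to include becomes the main difficulty.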


Keywords (added by machine, not by the authors): Integrating Domain Knowledge, Multigrid Solver, Configuration Options, Post-smoothing Steps, Median Error Rate



We thank the Jülich Supercomputing Center for providing access to the supercomputer JuQueen. This work is supported by the German Research Foundation (DFG), as part of the Priority Program 1648 “Software for Exascale Computing”, under the contracts RU 422/15-1 and AP 206/7-1. Sven Apel’s work is also supported by the DFG under the contracts AP 206/4-1 and AP 206/6-1.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Alexander Grebhahn (1), corresponding author
  • Norbert Siegmund (1)
  • Harald Köstler (2)
  • Sven Apel (1)

  1. University of Passau, Passau, Germany
  2. Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
