Following the Blind Seer – Creating Better Performance Models Using Less Information

  • Patrick Reisert
  • Alexandru Calotoiu
  • Sergei Shudler (corresponding author)
  • Felix Wolf
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10417)


Offering insights into the behavior of applications at larger scale, performance models are useful for finding performance bugs and tuning the system. Extra-P, a tool for automated performance modeling, uses statistical methods to automatically generate, from a small number of performance measurements, models that can predict performance where no measurements are available. However, the current version requires the manual pre-configuration of a search space, which may turn out to be unsuitable for the problem at hand. Furthermore, noise in the data often leads to models that indicate worse behavior than is actually present. In this paper, we propose a new model-generation algorithm that solves both of these problems: the search space is built and refined automatically on demand, and a scale-independent error metric determines both when to stop the refinement and whether a model faithfully reflects the behavior exhibited by the data. This makes Extra-P easier to use while also allowing it to produce more accurate results. Using data from previous case studies, we show that the mean relative prediction error decreases from 46% to 13%.
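The refinement idea described in the abstract can be illustrated with a toy sketch: candidate models of the general shape c0 + c1 * p^i * log^j(p) (the form underlying Extra-P's performance model normal form) are fitted one by one, and a symmetric mean relative error (SMAPE) decides when a fit is good enough to stop extending the search space. Everything below, including the function names, the exponent ranges, and the 5% tolerance, is an illustrative assumption, not the paper's actual algorithm.

```python
import math

def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in [0, 1]."""
    return sum(abs(a - f) / ((abs(a) + abs(f)) or 1)
               for a, f in zip(actual, predicted)) / len(actual)

def fit(points, i, j):
    """Least-squares fit of y = c0 + c1 * p^i * log2(p)^j."""
    xs = [p**i * math.log2(p)**j for p, _ in points]
    ys = [y for _, y in points]
    n = len(points)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    c1 = (n * sxy - sx * sy) / denom if denom else 0.0
    c0 = (sy - c1 * sx) / n
    return c0, c1

def model_on_demand(points, tol=0.05, max_exp=6):
    """Grow the exponent search space until SMAPE falls below tol."""
    best = None
    for i in range(max_exp + 1):          # polynomial exponent
        for j in range(3):                # logarithmic exponent
            c0, c1 = fit(points, i, j)
            pred = [c0 + c1 * p**i * math.log2(p)**j for p, _ in points]
            err = smape([y for _, y in points], pred)
            if best is None or err < best[0]:
                best = (err, (c0, c1, i, j))
            if err < tol:                 # good enough: stop refining
                return best[1]
    return best[1]

# Example: runtime measurements growing like ~3*p^2 over process counts p
data = [(p, 3 * p**2 + 5) for p in (2, 4, 8, 16, 32)]
c0, c1, i, j = model_on_demand(data)
```

Because the error metric is relative rather than absolute, the same tolerance works whether the measured quantity is microseconds or hours, which is what makes the stopping criterion scale-independent.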


Keywords: Parallel computing · Performance tools · Performance modeling



This work was supported in part by the German Research Foundation (DFG) through the Priority Programme 1648 Software for Exascale Computing (SPPEXA) and the Programme Performance Engineering for Scientific Software. Additional support was provided by the German Federal Ministry of Education and Research (BMBF) under Grant No. 01IH16008, and by the US Department of Energy under Grant No. DE-SC0015524. Finally, we would like to thank the University Computing Center (Hochschulrechenzentrum) of TU Darmstadt for providing us with access to the Lichtenberg Cluster.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Patrick Reisert (1)
  • Alexandru Calotoiu (1)
  • Sergei Shudler (1), corresponding author
  • Felix Wolf (1)

  1. Technische Universität Darmstadt, Darmstadt, Germany
