
On the Accuracy and Usefulness of Analytic Energy Models for Contemporary Multicore Processors

  • Johannes Hofmann
  • Georg Hager
  • Dietmar Fey
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10876)

Abstract

This paper presents refinements to the execution-cache-memory (ECM) performance model and to a previously published power model for multicore processors. Combined, the two models enable very accurate prediction of the performance and energy consumption of contemporary multicore processors as a function of relevant parameters such as the number of active cores and the core and Uncore frequencies. Model validation is performed on Intel Sandy Bridge-EP, Intel Broadwell-EP, and AMD Epyc processors. Production-related variations in chip quality are demonstrated through a statistical analysis of the fit parameters obtained on one hundred Broadwell-EP CPUs of the same model. Insights from the models are used to explain the performance- and energy-related behavior of the processors for scalable as well as saturating (i.e., memory-bound) codes. In the process, we demonstrate the models' capability to identify optimal operating points with respect to highest performance, lowest energy-to-solution, and lowest energy-delay product, and we identify a set of best practices for energy-efficient execution.
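The three optimization targets named in the abstract can be illustrated with a minimal sketch. It uses only the standard definitions: energy-to-solution is average power times runtime, and the energy-delay product is energy times runtime. The operating points below are hypothetical numbers chosen for illustration; they are not measurements or model output from the paper, and the names `OperatingPoint` and `best_operating_points` are assumptions made for this example.

```python
from typing import NamedTuple


class OperatingPoint(NamedTuple):
    """One candidate configuration (hypothetical illustration data)."""
    freq_ghz: float   # core clock frequency
    runtime_s: float  # time to solution
    power_w: float    # average chip power during the run


def energy_to_solution(p: OperatingPoint) -> float:
    """Energy in joules: average power times runtime."""
    return p.power_w * p.runtime_s


def energy_delay_product(p: OperatingPoint) -> float:
    """EDP in joule-seconds: energy times runtime."""
    return energy_to_solution(p) * p.runtime_s


def best_operating_points(points):
    """Pick the best point for each of the three targets."""
    return {
        "highest_performance": min(points, key=lambda p: p.runtime_s),
        "lowest_energy": min(points, key=energy_to_solution),
        "lowest_edp": min(points, key=energy_delay_product),
    }


# Made-up values, NOT taken from the paper.
POINTS = [
    OperatingPoint(freq_ghz=1.2, runtime_s=10.0, power_w=40.0),
    OperatingPoint(freq_ghz=1.8, runtime_s=7.0, power_w=55.0),
    OperatingPoint(freq_ghz=2.4, runtime_s=5.8, power_w=75.0),
    OperatingPoint(freq_ghz=3.0, runtime_s=5.5, power_w=100.0),
]

if __name__ == "__main__":
    for target, point in best_operating_points(POINTS).items():
        print(f"{target}: {point.freq_ghz} GHz")
```

With these made-up values the three targets select three different frequencies, which shows why the choice of metric matters when picking an operating point. In practice the runtime and power entries would come from a performance model and a power model rather than from a hand-written table.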

Keywords

Performance modeling, Power modeling, Energy modeling

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Computer Architecture, University of Erlangen-Nuremberg, Erlangen, Germany
  2. Erlangen Regional Computing Center (RRZE), Erlangen, Germany
