Cluster Computing, Volume 17, Issue 4, pp 1323–1333

Pre-execution power consumption prediction of computational multithreaded workloads

  • Hamid Fadishei
  • Hossein Deldari
  • Mahmoud Naghibzadeh

Abstract

Power management in large-scale computational environments can benefit significantly from predictive models. Such models provide information about the power consumption behavior of workloads before they run. Power consumption depends on the characteristics of both the machine and the workload. However, combinational features such as the cache miss rate, which depend on both, cannot be used because they are unavailable before the workload runs. Pre-execution power modeling therefore requires both machine-independent workload characteristics and workload-independent machine characteristics. This paper tackles the predictive modeling problem with a two-stage modeling framework. In the first stage, a machine learning approach predicts single-threaded workload power consumption at a specific frequency. The second stage analytically scales this output to any intended thread/frequency configuration. Experimental results show that the proposed approach yields highly accurate predictions of workload power consumption, with an average error of 3.7% on six different test platforms.
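The two-stage structure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the learned model, the scaling formulas, or the exact error measure, so everything below is an assumption made for illustration: a linear stand-in for the stage-one learner, a power-law frequency factor with exponent `alpha`, a linear thread factor weighted by a hypothetical `parallel_fraction`, and a mean absolute percentage error as the accuracy measure. All function and parameter names are hypothetical.

```python
def stage1_predict(features, weights, bias):
    """Stage 1 stand-in (assumption): a linear model over machine-independent
    workload features, giving single-threaded power in watts at a base frequency."""
    return bias + sum(w * x for w, x in zip(weights, features))


def stage2_scale(p_single_w, idle_w, base_freq, target_freq,
                 threads, parallel_fraction, alpha=3.0):
    """Stage 2 stand-in (assumption): scale only the dynamic part of the power.

    - frequency factor: (target_freq / base_freq) ** alpha, a common
      approximation when voltage is scaled together with frequency
    - thread factor: 1 + parallel_fraction * (threads - 1), a simple
      illustrative choice, not the paper's formula
    """
    dynamic_w = p_single_w - idle_w
    freq_factor = (target_freq / base_freq) ** alpha
    thread_factor = 1.0 + parallel_fraction * (threads - 1)
    return idle_w + dynamic_w * freq_factor * thread_factor


def mean_abs_pct_error(predicted, measured):
    """Average of |predicted - measured| / measured, expressed in percent."""
    errors = [abs(p - m) / m for p, m in zip(predicted, measured)]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Made-up numbers, purely to show how the pieces connect.
    p1 = stage1_predict(features=[0.4, 0.1], weights=[50.0, 20.0], bias=30.0)
    p = stage2_scale(p1, idle_w=30.0, base_freq=2.0, target_freq=1.0,
                     threads=4, parallel_fraction=0.5)
    print(f"single-threaded: {p1:.3f} W, scaled: {p:.3f} W")
```

The split mirrors the abstract's argument: only stage one needs training data, while stage two is a closed-form adjustment that can be re-evaluated for any thread/frequency configuration without rerunning the workload.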

Keywords

Power-aware computing; Computational power modeling; Abstract workload modeling; Simultaneous multithreading

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Hamid Fadishei (1)
  • Hossein Deldari (1)
  • Mahmoud Naghibzadeh (1)
  1. Ferdowsi University of Mashhad, Mashhad, Iran
