Multistep temperature prediction for proactive thermal management on chip multiprocessors

Abstract

In chip multiprocessors, predicting thermal hot spots is vital for making proactive thermal management decisions that mitigate their impact on application performance. Predicting core temperature over a range of future time steps is challenging because the workload changes dynamically at runtime. This article presents an adaptive multistep temperature prediction (MSTP) algorithm that iteratively generates core temperature predictions for a range of future time steps from the initial temperature information and an application workload profile obtained offline. MSTP accounts for the autocorrelation of temperature and addresses the nonlinear heteroscedasticity across observations. We evaluate MSTP on a real hardware platform and compare it with a model-based approach across a wide range of parallel benchmarks. Experimental results show that the proposed algorithm achieves more than 96% prediction accuracy and minimizes the heteroscedastic residual variance. When a thermal hot spot is predicted, we show that performance throttling is alleviated by a process migration mechanism with minimal migration overhead.
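
As a rough illustration of what iterative multistep prediction means in general, the sketch below rolls a simple first-order autoregressive model forward one step at a time, feeding each predicted temperature back in as the input to the next step. This is not the MSTP algorithm itself, whose details are not given in the abstract: the linear model form, the coefficients `a`, `b`, and `c`, the per-step workload values, and the function names are all assumptions made for illustration. The sketch ignores heteroscedasticity and runtime adaptation.

```python
from typing import List, Optional, Sequence


def multistep_predict(
    t0: float, workload: Sequence[float], a: float, b: float, c: float
) -> List[float]:
    """Iteratively predict temperatures for len(workload) future steps.

    Assumed (hypothetical) model: T[k+1] = a * T[k] + b * u[k] + c, where
    T[k] is the core temperature and u[k] is a workload value taken from an
    offline profile. The term a * T[k] captures the autocorrelation of
    temperature; b and c are illustrative fitted constants.
    """
    predictions: List[float] = []
    temperature = t0
    for u in workload:
        # Each prediction becomes the starting point of the next step.
        temperature = a * temperature + b * u + c
        predictions.append(temperature)
    return predictions


def first_hot_step(predictions: Sequence[float], threshold: float) -> Optional[int]:
    """Return the 1-based step at which the predicted temperature first
    reaches the threshold, or None if it never does."""
    for step, temperature in enumerate(predictions, start=1):
        if temperature >= threshold:
            return step
    return None


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the article.
    forecast = multistep_predict(60.0, [1.0, 0.0, 1.0], a=0.5, b=20.0, c=20.0)
    print(forecast, first_hot_step(forecast, threshold=69.0))
```

In a proactive scheme, a non-`None` result from `first_hot_step` would be the point at which a management action such as process migration could be triggered before the hot spot actually occurs.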

Author information

Corresponding author

Correspondence to N. Revathi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Revathi, N., Sumathi, G. Multistep temperature prediction for proactive thermal management on chip multiprocessors. J Supercomput 77, 8967–8994 (2021). https://doi.org/10.1007/s11227-020-03611-5

Keywords

  • Multistep prediction
  • Proactive thermal management
  • Process migration
  • Chip multiprocessors