Abstract
In chip multiprocessors, predicting thermal hot spots is vital for making proactive thermal management decisions that mitigate their impact on application performance. Predicting core temperature over a range of future time steps is challenging because the workload changes dynamically at runtime. This article presents an adaptive multistep temperature prediction (MSTP) algorithm that iteratively generates core temperature predictions for a range of future time steps from the initial temperature information and an application workload profile obtained offline. MSTP accounts for the autocorrelation of temperature and addresses the nonlinear heteroscedasticity across observations. We evaluate the proposed MSTP on a real hardware platform and compare it to a model-based approach for a wide range of parallel benchmarks. Experimental results show that the proposed algorithm achieves more than 96% prediction accuracy and minimizes the heteroscedastic residual variance. When a thermal hot spot is predicted, we show that performance throttling is alleviated using a process migration mechanism with minimal migration overhead.
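The idea of iterating from an initial temperature over an offline workload profile can be illustrated with a generic recursive forecast. The sketch below is not the MSTP algorithm itself: it assumes a simple first-order autoregressive update, T[k+1] = a*T[k] + b*u[k] + c, and the coefficients `a`, `b`, `c`, the function names, and the threshold check are all hypothetical choices made for illustration. It does not model heteroscedasticity.

```python
from typing import List, Optional, Sequence


def multistep_forecast(t0: float, workload: Sequence[float],
                       a: float, b: float, c: float) -> List[float]:
    """Recursively forecast one temperature per future time step.

    Assumed (illustrative) model: T[k+1] = a * T[k] + b * u[k] + c,
    where u[k] is the workload value profiled offline for step k.
    Each prediction is fed back as the input to the next step.
    """
    predictions = []
    temperature = t0
    for u in workload:
        temperature = a * temperature + b * u + c
        predictions.append(temperature)
    return predictions


def first_hot_step(predictions: Sequence[float],
                   threshold: float) -> Optional[int]:
    """Return the 1-based step at which the forecast first reaches the
    threshold, or None if it never does within the forecast horizon."""
    for step, temperature in enumerate(predictions, start=1):
        if temperature >= threshold:
            return step
    return None


if __name__ == "__main__":
    # Hypothetical coefficients and workload profile, chosen for illustration.
    forecast = multistep_forecast(50.0, [1.0, 1.0], a=0.5, b=10.0, c=20.0)
    print(forecast)
    print(first_hot_step(forecast, threshold=57.0))
```

A proactive manager could use the step returned by `first_hot_step` as the lead time available before a migration decision has to be made. In practice the coefficients would have to be fitted to measured temperature traces.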
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Revathi, N., Sumathi, G. Multistep temperature prediction for proactive thermal management on chip multiprocessors. J Supercomput 77, 8967–8994 (2021). https://doi.org/10.1007/s11227-020-03611-5
DOI: https://doi.org/10.1007/s11227-020-03611-5