Abstract
Degradation of lifetime reliability caused by stress migration is an important aging concern in modern embedded devices built around high-performance multi-core processors. Because of area constraints, these processing cores are manufactured with a high level of integration. When such densely integrated processors operate continuously at elevated temperatures, wear-out accelerates. Since stress migration depends strongly on temperature, accurate and computationally efficient thermal estimation is needed to take real-time control action and ensure that the chip lifetime meets its requirements. This work consists of two parts. In the first part, we propose a thermal estimation model based on the K-Nearest Neighbor machine learning algorithm for the most significant energy-consuming elements of a processor core. In the second part, we propose an aging-aware scheduler that performs workload assignment and processor adaptation, based on aging estimates obtained with the developed thermal model, to improve processor lifetime. Experimental results indicate that the proposed system is accurate, computationally efficient, and suitable for real-time implementation.
About this article
Cite this article
Kumar, P.J., Mini, M.G. Machine learning based workload balancing scheme for minimizing stress migration induced aging in multicore processors. Int. j. inf. tecnol. 15, 399–410 (2023). https://doi.org/10.1007/s41870-022-01105-6
DOI: https://doi.org/10.1007/s41870-022-01105-6