Machine learning based workload balancing scheme for minimizing stress migration induced aging in multicore processors

  • Original Research
International Journal of Information Technology

Abstract

Stress migration-induced aging is an important lifetime reliability concern in modern embedded devices built around high-performance multi-core processors. Because of area constraints, these processing cores are manufactured with a high level of integration. When such densely integrated processors operate continuously at elevated temperatures, wear-out accelerates. Since stress migration depends strongly on temperature, accurate and computationally efficient thermal estimation is needed so that real-time control actions can keep the chip lifetime within requirements. This work has two parts. In the first part, we propose a thermal estimation model based on the K-Nearest Neighbor machine learning algorithm for the significant energy-consuming elements of a processor core. In the second part, we propose an aging-aware scheduler that performs workload assignment and processor adaptation based on aging estimates derived from the developed thermal model, with the aim of improving processor lifetime. Experimental results indicate that the proposed system is accurate, efficient, and suitable for real-time implementation.
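
The two-part idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the feature vector (utilization, power in watts), the training samples, the value of K, the Euclidean distance, and the aging model are all hypothetical. The aging estimate uses a commonly cited form of the stress migration lifetime model, MTTF ∝ |T0 − T|^(−n) · exp(Ea / kT), with illustrative parameter values.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def knn_predict(samples, query, k=3):
    """Estimate temperature as the mean target of the k nearest samples.

    samples: list of (feature_tuple, temperature_kelvin) pairs (hypothetical data).
    """
    ranked = sorted(samples, key=lambda s: math.dist(s[0], query))
    nearest = ranked[:k]
    return sum(temp for _, temp in nearest) / len(nearest)

def sm_relative_mttf(temp_k, t_stress_free_k=500.0, n=2.5, ea_ev=0.9):
    """Relative stress-migration MTTF: |T0 - T|^-n * exp(Ea / (k*T)).

    T0, n, and Ea are illustrative assumptions, not values from the paper.
    """
    return (abs(t_stress_free_k - temp_k) ** (-n)
            * math.exp(ea_ev / (BOLTZMANN_EV * temp_k)))

def pick_core(core_features, samples, k=3):
    """Return the index of the core with the highest estimated relative MTTF."""
    temps = [knn_predict(samples, f, k) for f in core_features]
    return max(range(len(temps)), key=lambda i: sm_relative_mttf(temps[i]))

# Hypothetical training data: (utilization, power in W) -> temperature in K.
SAMPLES = [
    ((0.1, 0.5), 320.0), ((0.2, 0.8), 325.0),
    ((0.5, 1.5), 340.0), ((0.6, 1.8), 345.0),
    ((0.9, 2.8), 365.0), ((1.0, 3.0), 370.0),
]

# Current (hypothetical) state of three cores; the next task goes to the
# core whose estimated temperature implies the slowest aging.
CORES = [(0.95, 2.9), (0.15, 0.6), (0.55, 1.6)]
chosen = pick_core(CORES, SAMPLES, k=2)
```

With these made-up numbers the scheduler selects the second core (index 1), whose estimated temperature is the lowest. A real-time implementation would also need to account for task deadlines and for the processor adaptation step mentioned in the abstract, which this sketch omits.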

Author information

Corresponding author

Correspondence to P. Jagadeesh Kumar.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kumar, P.J., Mini, M.G. Machine learning based workload balancing scheme for minimizing stress migration induced aging in multicore processors. Int. j. inf. tecnol. 15, 399–410 (2023). https://doi.org/10.1007/s41870-022-01105-6


  • DOI: https://doi.org/10.1007/s41870-022-01105-6
