Deep neural network learning for power limited heterogeneous system with workload classification

Abstract

Heterogeneous systems, which provide diverse computational capabilities, have opened a new pathway in multicore processor design. The variety of applications and their ever-increasing performance demands have driven a paradigm shift toward heterogeneous systems. We use a deep neural network (DNN) based model to maximize performance under power constraints in heterogeneous systems. The dynamic power management technique is implemented in three stages. In the first stage, core statistics and workload characteristics are collected for DNN training at a later stage; this stage also dynamically estimates the workload change for the current epoch using a heuristic algorithm based on dynamic voltage frequency scaling (DVFS). In the second stage, the DNN is trained on the collected data points. The third stage uses the trained DNN model to identify suitable voltage-frequency values that maximize performance under power capping. The power manager controls power consumption at both the per-core and per-chip levels. Our DNN prediction model is trained to address both core types (Large and Small), thus improving the accuracy of the model. Simulations indicate that the proposed approach can achieve an overall 10.63% reduction in power consumption and a 5.5–6.8% improvement in power savings compared with existing approaches. In addition, the proposed DNN model-based approach maintains power capping with 95.81% accuracy at a performance degradation of only 5.38% for a quad-core architecture.
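To make the third stage concrete, the following is a minimal sketch of choosing per-core voltage-frequency values under both per-core and per-chip power caps. It is not the authors' implementation. The trained DNN is replaced by a simple stand-in based on the textbook dynamic power relation (power proportional to C·V²·f), and the operating points, capacitance values, utilization inputs, and the names `VF_LEVELS`, `CAPACITANCE`, `predict_power`, and `select_vf` are all assumptions introduced for illustration. Total frequency is used as a crude performance proxy.

```python
from itertools import product

# Hypothetical V-F operating points as (volts, GHz); not taken from the paper.
VF_LEVELS = [(0.8, 1.0), (0.9, 1.5), (1.0, 2.0), (1.1, 2.5)]

# Hypothetical effective switched capacitance per core type (arbitrary units).
CAPACITANCE = {"large": 2.0, "small": 1.0}


def predict_power(core_type, volt, freq_ghz, utilization):
    """Stand-in for the trained DNN predictor: C * V^2 * f * utilization."""
    return CAPACITANCE[core_type] * volt ** 2 * freq_ghz * utilization


def select_vf(cores, chip_cap, core_cap):
    """Pick one V-F level per core by exhaustive search.

    cores is a list of (core_type, utilization) tuples. The result is the
    combination with the highest total frequency whose predicted power
    respects both caps, or None if no combination is feasible.
    """
    best, best_perf = None, -1.0
    for combo in product(VF_LEVELS, repeat=len(cores)):
        powers = [
            predict_power(ctype, volt, freq, util)
            for (ctype, util), (volt, freq) in zip(cores, combo)
        ]
        # Enforce the per-core cap and the per-chip cap.
        if any(p > core_cap for p in powers) or sum(powers) > chip_cap:
            continue
        perf = sum(freq for _, freq in combo)  # crude performance proxy
        if perf > best_perf:
            best, best_perf = combo, perf
    return best


if __name__ == "__main__":
    quad = [("large", 0.9), ("large", 0.6), ("small", 0.8), ("small", 0.4)]
    print(select_vf(quad, chip_cap=8.0, core_cap=4.0))
```

Exhaustive search is only practical for a handful of cores and levels (4 levels on 4 cores gives 256 combinations); it is used here to keep the sketch short, not because the paper prescribes it.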

Author information

Corresponding author

Correspondence to S. Indu.

About this article

Cite this article

Gupta, M., Bhargava, L. & Indu, S. Deep neural network learning for power limited heterogeneous system with workload classification. Computing (2021). https://doi.org/10.1007/s00607-021-01018-5

Keywords

  • Heterogeneous architecture
  • Deep neural network
  • Power-performance tradeoff
  • Dynamic voltage frequency scaling
  • Workload decomposition

Mathematics Subject Classification

  • 68U20
  • 68T05
  • 68M20
  • 62M45