
Orchestrated Co-scheduling, Resource Partitioning, and Power Capping on CPU-GPU Heterogeneous Systems via Machine Learning


Part of the Lecture Notes in Computer Science book series (LNCS, volume 13642)

Abstract

CPU-GPU heterogeneous architectures are now commonly used in a wide variety of computing systems, from mobile devices to supercomputers. Maximizing throughput for multi-programmed workloads on such systems is indispensable, as a single program typically cannot fully exploit all available resources. At the same time, power consumption is a key issue and often requires optimizing the power allocations to the CPU and GPU while enforcing a total power constraint, in particular when the power/thermal requirements are strict. The result is a system-wide optimization problem with several knobs. In particular, we focus on (1) co-scheduling decisions, i.e., selecting programs to co-locate in a space-sharing manner; (2) resource partitioning on both CPUs and GPUs; and (3) power capping on both CPUs and GPUs. We solve this problem using machine-learning-based predictive performance modeling to coordinately optimize the above knob setups. Our experimental results on a real system show that our approach achieves up to a 67% speedup compared to time-sharing-based scheduling with a naive power capping that evenly distributes the power budget across components.
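The coordinated optimization described above includes a pairing step: given per-pair performance predictions from a model, select co-run pairs that minimize total predicted slowdown. The following is a hypothetical Python sketch, not the authors' implementation: the job names and slowdown values are invented stand-ins for model predictions, and a brute-force search replaces a proper minimum-weight perfect-matching algorithm [13].

```python
from itertools import count

# Hypothetical predicted combined slowdown for co-locating two jobs under a
# shared power cap; a real system would query a trained regression model.
predicted_slowdown = {
    frozenset({"A", "B"}): 1.9,
    frozenset({"A", "C"}): 1.2,
    frozenset({"A", "D"}): 1.6,
    frozenset({"B", "C"}): 1.7,
    frozenset({"B", "D"}): 1.1,
    frozenset({"C", "D"}): 1.8,
}

def best_pairing(jobs):
    """Brute-force minimum-weight perfect matching over an even-length job list."""
    best, best_cost = None, float("inf")

    def match(remaining, chosen, cost):
        nonlocal best, best_cost
        if not remaining:
            if cost < best_cost:
                best, best_cost = list(chosen), cost
            return
        first, *rest = remaining
        # Pair the first remaining job with each candidate partner in turn.
        for i, partner in enumerate(rest):
            pair = frozenset({first, partner})
            match(rest[:i] + rest[i + 1:], chosen + [pair],
                  cost + predicted_slowdown[pair])

    match(jobs, [], 0.0)
    return best, best_cost

pairs, cost = best_pairing(["A", "B", "C", "D"])
# Pairing {A,C} with {B,D} gives the minimum total predicted slowdown
# 1.2 + 1.1 = 2.3 under the stub predictions above.
```

Brute force is exponential in the number of jobs, which is acceptable for small co-scheduling windows; larger systems would use a polynomial-time matching algorithm such as the one in [13].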

Keywords

  • Co-scheduling
  • Resource partitioning
  • Power capping
  • CPU-GPU heterogeneous systems
  • Machine learning


Notes

  1. In case no profile is available for a job, which we do not cover in this paper, we can exclude it from the co-scheduling candidates at the first stage in the diagram and execute it exclusively, without power capping, while obtaining its profile for future reference.

  2. One GPC must be disabled when using MIG; other partitioning options such as 1 GPC/6 GPCs or 2 GPCs/5 GPCs are not supported. We first create one GI with 7 GPCs and then create CIs consisting of 3 GPCs/4 GPCs inside it [4, 21].
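The GI/CI setup described in the second note can be performed with `nvidia-smi` on a MIG-capable GPU such as the A100. This is a hedged sketch, not the authors' exact procedure: the profile names and instance IDs below are typical A100 values and must be confirmed against the listing commands on the actual driver version.

```shell
# Enable MIG mode on GPU 0 (requires root; a GPU reset may be needed).
sudo nvidia-smi -i 0 -mig 1

# List available GPU instance (GI) profiles, then create the single
# 7-GPC GI (on an A100 this is the 7g.40gb profile).
sudo nvidia-smi mig -lgip
sudo nvidia-smi mig -cgi 7g.40gb

# List compute instance (CI) profiles inside that GI, then split it into
# a 3-GPC and a 4-GPC CI (profile names vary; check the -lcip output,
# and substitute the GI ID reported by the creation step for "-gi 1").
sudo nvidia-smi mig -lcip
sudo nvidia-smi mig -gi 1 -cci 3c.7g.40gb,4c.7g.40gb
```

This mirrors the 3 GPCs/4 GPCs split in the note; CPU-side resource partitioning would be handled separately, e.g., via core pinning or cgroups.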

References

  1. Top 500 list (2022). https://www.top500.org/lists. Accessed 24 July 2022

  2. perf: Linux profiling with performance counters (2022). https://perf.wiki.kernel.org/index.php/Main_Page. Accessed 24 July 2022

  3. Arima, E., Hanawa, T., Trinitis, C., Schulz, M.: Footprint-aware power capping for hybrid memory based systems. In: Sadayappan, P., Chamberlain, B.L., Juckeland, G., Ltaief, H. (eds.) ISC High Performance 2020. LNCS, vol. 12151, pp. 347–369. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50743-5_18


  4. Arima, E., et al.: Optimizing hardware resource partitioning and job allocations on modern GPUs under power caps. In: ICPPW (2022)


  5. Azimi, R., et al.: PowerCoord: a coordinated power capping controller for multi-CPU/GPU servers. In: IGSC, pp. 1–9 (2018)


  6. Bailey, P.E., et al.: Adaptive configuration selection for power-constrained heterogeneous systems. In: ICPP, pp. 371–380 (2014)


  7. Barnes, B.J., et al.: A regression-based approach to scalability prediction. In: ICS, pp. 368–377 (2008)


  8. Barrera, S., et al.: Modeling and optimizing NUMA effects and prefetching with machine learning. In: ICS, no. 34 (2020)


  9. Bhadauria, M., et al.: An approach to resource-aware co-scheduling for CMPs. In: ICS, pp. 189–199 (2010)


  10. Cao, T., et al.: Demand-aware power management for power-constrained HPC systems. In: CCGrid, pp. 21–31 (2016)


  11. Che, S., et al.: Rodinia: a benchmark suite for heterogeneous computing. In: IISWC, pp. 44–54 (2009)


  12. Cochran, R., et al.: Pack & cap: adaptive DVFS and thread packing under power caps. In: MICRO, pp. 175–185 (2011)


  13. Cook, W., et al.: Computing minimum-weight perfect matchings. INFORMS Journal on Computing, vol. 11 (1999)


  14. Dennard, R.H., et al.: Design of ion-implanted MOSFET’s with very small physical dimensions. IEEE J. Solid-State Circ. 9(5), 256–268 (1974)


  15. Eeckhout, L.: Heterogeneity in response to the power wall. IEEE Micro 35(04), 2–3 (2015)


  16. Greathouse, J.L., et al.: Machine learning for performance and power modeling of heterogeneous systems. In: ICCAD, pp. 1–6 (2018)


  17. İpek, E., et al.: Efficiently exploring architectural design spaces via predictive modeling. In: ASPLOS, pp. 195–206 (2006)


  18. Lee, B.C., et al.: Accurate and efficient regression modeling for microarchitectural performance and power prediction. In: ASPLOS, pp. 185–194 (2006)


  19. Nagasaka, H., et al.: Statistical power modeling of GPU kernels using performance counters. In: International Conference on Green Computing, pp. 115–122 (2010)


  20. NVIDIA: Nsight compute (2022). https://developer.nvidia.com/nsight-compute. Accessed 24 July 2022

  21. NVIDIA: NVIDIA multi-instance GPU (2022). https://www.nvidia.com/en-us/technologies/multi-instance-gpu/. Accessed 24 July 2022

  22. Patki, T., et al.: Exploring hardware overprovisioning in power-constrained, high performance computing. In: ICS, pp. 173–182 (2013)


  23. Sarood, O., et al.: Maximizing throughput of overprovisioned HPC data centers under a strict power budget. In: SC, pp. 807–818 (2014)


  24. Sasaki, H., et al.: Coordinated power-performance optimization in manycores. In: PACT, pp. 51–61 (2013)


  25. Zhu, Q., et al.: Co-run scheduling with power cap on integrated CPU-GPU systems. In: IPDPS, pp. 967–977 (2017)


  26. Zhuravlev, S., et al.: Addressing shared resource contention in multicore processors via scheduling. In: ASPLOS, pp. 129–142 (2010)



Acknowledgements

This work has received funding under the European Commission’s EuroHPC and H2020 programmes under grant agreement no. 956560 and was supported by the NVIDIA Academic Hardware Grant Program.

Author information

Corresponding author

Correspondence to Eishi Arima.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Saba, I., Arima, E., Liu, D., Schulz, M. (2022). Orchestrated Co-scheduling, Resource Partitioning, and Power Capping on CPU-GPU Heterogeneous Systems via Machine Learning. In: Schulz, M., Trinitis, C., Papadopoulou, N., Pionteck, T. (eds) Architecture of Computing Systems. ARCS 2022. Lecture Notes in Computer Science, vol 13642. Springer, Cham. https://doi.org/10.1007/978-3-031-21867-5_4

  • DOI: https://doi.org/10.1007/978-3-031-21867-5_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21866-8

  • Online ISBN: 978-3-031-21867-5

  • eBook Packages: Computer Science, Computer Science (R0)