Evolutionary Based Solutions for Green Computing, pp. 161–185
Thermal Management in Many Core Systems
Abstract
High power densities and operating temperatures in multi-processor systems cause a number of undesirable effects, including performance degradation, high operational and cooling costs, and reliability deterioration leading to system failures. Many-core systems bring exciting opportunities in design and system management owing to their ample hardware parallelism, while introducing novel challenges due to their complexity and the highly variable workloads expected to run on them. Efficient thermal monitoring and management, thermally-aware architecture design, and multi-level parameter optimization can alleviate some of these thermal effects while maintaining the desired performance and energy levels. In particular, these techniques aid in the evolution of green computing systems. This chapter provides a qualitative discussion of thermal management techniques for many-core systems. We address the following questions in detail: What are the specific design challenges in monitoring the temperature of large-scale systems? How can we exploit multi-level optimizations at runtime in response to the dynamic behavior of the processor’s workload? How do emerging workloads affect the thermal distribution on many-core systems?
Keywords
Thermal Sensor, Thermal Management, Sensor Placement, Core System, Sequential Probability Ratio Test