Many-Core Architecture for NTC: Energy Efficiency from the Ground Up

Abstract

The high energy efficiency of near-threshold computing (NTC) enables multicore architectures with unprecedented levels of integration, such as chips that include 1000 sizable cores and substantial memory on the die. However, constructing such a chip requires fundamentally rethinking the whole compute stack from the ground up for energy efficiency. First, we need techniques that minimize and tolerate process variation. It is also important to devise highly efficient voltage regulation, so that each region of the chip can operate at its most efficient voltage and frequency point. At the architecture level, we want simple cores organized in a hierarchy of clusters. We also need techniques to reduce the leakage power of on-chip memories, as well as dynamic voltage guard-band reduction in variation-afflicted on-chip networks. It is also crucial to minimize data movement, which is a major source of energy waste; the techniques proposed include automatically managing the data in the cache hierarchy, processing in near-memory compute engines, and efficient fine-grained synchronization. Finally, we need core-assignment algorithms that are both effective and simple to implement. This chapter describes these issues.

Author information

Correspondence to Josep Torrellas.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Torrellas, J. (2016). Many-Core Architecture for NTC: Energy Efficiency from the Ground Up. In: Hübner, M., Silvano, C. (eds) Near Threshold Computing. Springer, Cham. https://doi.org/10.1007/978-3-319-23389-5_2

  • DOI: https://doi.org/10.1007/978-3-319-23389-5_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23388-8

  • Online ISBN: 978-3-319-23389-5

  • eBook Packages: Engineering (R0)
