Abstract
With Moore’s law supplying billions of transistors and uniprocessor architectures delivering diminishing performance returns, multicore chips are emerging as the prevailing architecture in both general-purpose and application-specific markets. As core counts increase, the need for a scalable on-chip communication fabric that can deliver high bandwidth grows in importance, and recent multicore chips are consequently interconnected with sophisticated on-chip networks. In this chapter, we first present a tutorial on the fundamentals of on-chip network architecture, including network interfaces, topologies, routing, flow control, and router microarchitectures. Next, we detail case studies of two recent on-chip network prototypes: the UT-Austin TRIPS operand network and the Intel TeraFLOPS on-chip network. This organization is intended to lay the foundations of on-chip networks so that readers can appreciate the design choices faced in the two case studies. Finally, the chapter concludes with an outline of the challenges facing research into on-chip network architectures.
Acknowledgements
Dr. Peh wishes to thank her entire Princeton research group, as well as the students who have taken the ELE580 graduate course on interconnection networks, as those research and teaching experiences helped significantly in the writing of this chapter. Her research has been kindly supported by the National Science Foundation, Intel Corporation, and the MARCO Gigascale Systems Research Center. Dr. Keckler thanks the entire TRIPS team, in particular Doug Burger, Paul Gratz, Heather Hanson, Robert McDonald, and Karthikeyan Sankaralingam, for their contributions to the design and implementation of the TRIPS operand network. The TRIPS project was supported by the Defense Advanced Research Projects Agency under contract F33615-01-C-4106 and by NSF CISE Research Infrastructure grant EIA-0303609. Dr. Vangal thanks the entire TeraFLOPS processor design team at Circuit Research Laboratories, Intel Corporation, for flawless execution of the design.
Copyright information
© 2009 Springer-Verlag US
About this chapter
Cite this chapter
Peh, LS., Keckler, S.W., Vangal, S. (2009). On-Chip Networks for Multicore Systems. In: Keckler, S., Olukotun, K., Hofstee, H. (eds) Multicore Processors and Systems. Integrated Circuits and Systems. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-0263-4_2
DOI: https://doi.org/10.1007/978-1-4419-0263-4_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-0262-7
Online ISBN: 978-1-4419-0263-4
eBook Packages: Computer Science, Computer Science (R0)