Fast Network-on-Chip Design

Mandal, Ayan; Khatri, Sunil P.; Mahapatra, Rabi N.

doi:10.1007/978-1-4614-9405-8_3

Abstract

In the previous chapter, we showed how resonant clocking can be used as a high-speed, low-power, stable on-chip clock generation and distribution scheme. In this chapter, we use such a clock to design a high-speed, source-synchronous, ring-based NoC architecture. In Sect. 3.1, we introduce our NoC design, which comprises extremely fast, intersecting source-synchronous data rings. These rings traverse the CMP in both the horizontal and vertical directions, providing complete connectivity to all the PEs in the CMP. In our approach, the interconnection network operates in a separate clock domain that runs significantly faster than the PE clocks, which helps us achieve inter-processor communication with minimal latency. We perform architectural simulations of the ring-based NoC in Sect. 3.2. We propose a deadlock-free routing protocol for the source-synchronous ring-based NoC that uses link ordering and virtual-channel-based buffered flow control. Architectural results obtained on synthetic and real traffic demonstrate that the source-synchronous ring-based NoC has significantly lower latency and a higher maximum sustained injection rate than a state-of-the-art mesh-based NoC. Next, in Sect. 3.3, we propose a modified source-synchronous design in which the PEs extract a low-jitter clock directly from the high-speed ring clock by division, and hence are synchronous with the NoC. This is feasible because of the extremely good jitter characteristics of the SWO-based clock generation and distribution scheme of Sect. 2.2. Using this modified design, we propose a class of source-synchronous NoCs organized in an H-tree topology, which consume less logic and wiring area than a state-of-the-art mesh. Architectural simulations on synthetic and real traffic show that our H-tree-based NoC designs provide significantly lower latency and sustain a higher injection rate than a state-of-the-art mesh.
Using the modified source-synchronous design proposed in Sect. 3.3, we also evaluate two more floorplan-friendly NoC topologies in Sect. 3.4. These two topologies consume significantly less logic and wiring area than a state-of-the-art mesh. Architectural simulations on synthetic and real traffic show that they provide significantly lower latency while achieving the same or a better maximum sustained injection rate than a state-of-the-art mesh.

Author information

Authors and Affiliations

Computer Science and Engineering, Texas A&M University, College Station, USA
Ayan Mandal
Electrical and Computer Engineering, Texas A&M University, College Station, USA
Sunil P. Khatri
Computer Science and Engineering, Texas A&M University, College Station, USA
Rabi N. Mahapatra

Corresponding author

Correspondence to Ayan Mandal.

About this chapter

Cite this chapter

Mandal, A., Khatri, S., Mahapatra, R. (2014). Fast Network-on-Chip Design. In: Source-Synchronous Networks-On-Chip. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9405-8_3

DOI: https://doi.org/10.1007/978-1-4614-9405-8_3
Published: 14 November 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-9404-1
Online ISBN: 978-1-4614-9405-8
eBook Packages: Engineering, Engineering (R0)
