Distributed Computing, Volume 1, Issue 4, pp 187–196

The torus routing chip

  • William J. Dally
  • Charles L. Seitz
Original Articles


The torus routing chip (TRC) is a self-timed chip that performs deadlock-free cut-through routing in k-ary n-cube multiprocessor interconnection networks using a new method of deadlock avoidance called virtual channels. A prototype TRC with byte-wide self-timed communication channels achieved on first silicon a throughput of 64 Mbits/s in each dimension, about an order of magnitude better performance than the communication networks used by machines such as the Caltech Cosmic Cube or Intel iPSC. The latency of the cut-through routing of only 150 ns per routing step largely eliminates message locality considerations in the concurrent programs for such machines. The design and testing of the TRC as a self-timed chip was no more difficult than it would have been for a synchronous chip.
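The idea behind virtual channels can be illustrated with a small sketch. It is not the TRC's actual logic, which the abstract does not describe. It assumes that each dimension is a unidirectional ring traversed in the +1 direction, that dimensions are corrected in the order 0, 1, ..., n-1, and that every physical link carries two virtual channels. The names `route`, `channel_rank` and `check_all_pairs` are hypothetical. A message uses virtual channel 1 while it still has to cross the wraparound link in the current dimension and virtual channel 0 afterwards. Under these assumptions every route visits channels in strictly increasing rank, so no cycle of channel dependencies (and hence no deadlock) can form.

```python
from itertools import product


def route(src, dst, k):
    """List the hops (dim, node, vc) a message takes from src to dst.

    Illustrative assumptions, not taken from the TRC design:
    - each dimension is a unidirectional ring of k nodes, traversed in +1 order;
    - dimensions are corrected in the order 0, 1, ..., n-1;
    - each link has two virtual channels: 1 before the wraparound, 0 after it.
    """
    cur = list(src)
    hops = []
    for dim in range(len(src)):
        while cur[dim] != dst[dim]:
            # Still above the destination: the wraparound link lies ahead.
            vc = 1 if cur[dim] > dst[dim] else 0
            hops.append((dim, tuple(cur), vc))
            cur[dim] = (cur[dim] + 1) % k
    return hops


def channel_rank(hop, k):
    """Rank a channel so that every legal route climbs in rank."""
    dim, node, vc = hop
    within = node[dim] if vc == 1 else k + node[dim]
    return dim * 2 * k + within


def is_strictly_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))


def check_all_pairs(k, n):
    """Check that every source/destination pair yields an increasing route."""
    nodes = list(product(range(k), repeat=n))
    return all(
        is_strictly_increasing([channel_rank(h, k) for h in route(s, d, k)])
        for s in nodes
        for d in nodes
    )


if __name__ == "__main__":
    # On a 4-node ring, 3 -> 1 wraps once: vc 1 on the wrap link, then vc 0.
    print(route((3,), (1,), 4))   # [(0, (3,), 1), (0, (0,), 0)]
    print(check_all_pairs(4, 2))  # True
```

With a single channel per link, the ring 0 → 1 → ... → k-1 → 0 would form a dependency cycle. Splitting each link in two is what breaks that cycle in this sketch.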

Key words

VLSI, Interconnection networks, Communication networks, Concurrent computation, Parallel processing, Deadlock-free routing, Self-timed logic, Asynchronous logic, Message-passing multiprocessors





Copyright information

© Springer-Verlag 1986

Authors and Affiliations

  • William J. Dally (1)
  • Charles L. Seitz (1)
  1. Department of Computer Science, California Institute of Technology, Pasadena, USA
