, Volume 1, Issue 4, pp 187-196

The torus routing chip

Rent the article at a discount

Rent now

* Final gross prices may vary according to local VAT.

Get Access

Abstract

The torus routing chip (TRC) is a selftimed chip that performs deadlock-freecut-through routing ink-aryn-cube multiprocessor interconnection networks using a new method of deadlock avoidance calledvirtual channels. A prototype TRC with byte wide self-timed communication channels achieved on first silicon a throughput of 64 Mbits/s in each dimension, about an order of magnitude better performance than the communication networks used by machines such as the Caltech Cosmic Cube or Intel iPSC. The latency of the cut-through routing of only 150 ns per routing step largely eliminates message locality considerations in the concurrent programs for such machines. The design and testing of the TRC as a self-timed chip was no more difficult than it would have been for a synchronous chip.

Bill Dally received his B. S. degree in Electrical Engineering from the Virginia Polytechnic Institute in 1980 and his M.S. degree in Electrical Engineering from Stanford University in 1981. From 1980 to 1982 he worked at Bell Telephone Laboratories, where he contributed to the design of the BELLMAC-32 microprocessor. From 1982 to 1983 he worked as a consultant in the area of digital systems design. Since 1983 he has been a graduate student in Computer Science at Caltech, and is expected to complete his Ph.D. studies in the spring 1986. His current research interests include computer architecture, computer aided design, VLSI, design, and concurrent systems.
Chuck Seitz earned B.S., M.S., and Ph.D. degrees from M.I.T. Before joining the Computer Science faculty at Caltech in 1977, he worked as a member of the technical staff of the Evans & Sutherland Computer Corporation from 1969 to 1971, as an Assistant Professor of Computer Science at the University of Utah from 1970 to 1972, and as a consultant to Burroughs Corporation from 1971 to 1978. He is currently a Professor of Computer Science at Caltech, where his research and teaching activities are in the areas of VLSI architecture and design, concurrent computation, and self-timed systems.
The research described in this paper was sponsored in part by the Defense Advanced Research Projects Agency, ARPA Order number 3771, and monitored by the Office, of Naval Research under contract number N 00014-79-C-0597, in part by Intel Corporation, and in part by an AT & T Ph.D. fellowship