The Journal of Supercomputing, Volume 10, Issue 3, pp 285–302

NuMesh: An architecture optimized for scheduled communication

  • David Shoemaker
  • Frank Honoré
  • Chris Metcalf
  • Steve Ward


The NuMesh system defines a high-speed communication substrate optimized for off-line routing. By determining possible communication paths at compile time, highly efficient hardware and software constructs can be exploited to yield superior network performance. These communication paths can be independently tuned to give more heavily used paths greater bandwidth. Although communication paths are scheduled, data need not be sent during every scheduled cycle. Flow-control protocols allow for empty communication cycles as well as for data backing up in the network. Limited gate delays between NuMesh registers, together with single-cycle message transfers, allow for a high clock frequency and low network latency. A highly pipelined architecture for this communication is presented, and a mechanism for efficient flow-controlled communication is discussed. A unique communication protocol is presented and shown to provide single-cycle transfers between nodes. An overview of the necessary compiler support is also provided. Preliminary results and a description of the current hardware and software status are given.
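The idea of scheduled but flow-controlled communication can be illustrated with a toy model. The sketch below is not the NuMesh hardware or its compiler; it is a minimal Python simulation of one link under stated assumptions: a cyclic slot table fixed ahead of time, one word moved per cycle, and a transfer that happens only when the sender has data and the receiver has buffer space. The names `build_schedule`, `run`, `pending`, and `capacity` are hypothetical and chosen for illustration only.

```python
from collections import deque


def build_schedule(slots_per_path):
    """Build a cyclic slot table offline.

    slots_per_path maps a path name to its slot count per period; a path
    with more slots gets proportionally more bandwidth on the link.
    """
    schedule = []
    remaining = dict(slots_per_path)
    # Interleave paths round-robin so a path's slots are spread over the period.
    while any(remaining.values()):
        for path, count in remaining.items():
            if count > 0:
                schedule.append(path)
                remaining[path] = count - 1
    return schedule


def run(schedule, pending, capacity, cycles):
    """Simulate `cycles` clock ticks on a single link.

    pending:  path -> deque of words waiting at the sender
    capacity: path -> free receiver buffer slots (assumed never drained here)
    Returns (delivered, idle): words delivered per path, and the number of
    scheduled cycles that carried no data.
    """
    delivered = {path: [] for path in pending}
    free = dict(capacity)
    idle = 0
    for cycle in range(cycles):
        path = schedule[cycle % len(schedule)]
        if pending[path] and free[path] > 0:
            # Sender has a word and receiver has room: one word moves this cycle.
            delivered[path].append(pending[path].popleft())
            free[path] -= 1
        else:
            # Scheduled slot goes unused: no data, or the receiver is full.
            idle += 1
    return delivered, idle


if __name__ == "__main__":
    table = build_schedule({"A": 3, "B": 1})
    words = {"A": deque([1, 2, 3, 4]), "B": deque([9, 8])}
    print(table)
    print(run(table, words, {"A": 8, "B": 1}, cycles=8))
```

In this model path `A` owns three of every four slots, so it drains faster than `B`; a slot with nothing to send, or with a full receiver, simply passes as an empty cycle rather than stalling the schedule.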


Scalable network, off-line routing, parallel processing





Copyright information

© Kluwer Academic Publishers 1996

Authors and Affiliations

  • David Shoemaker (1)
  • Frank Honoré (1)
  • Chris Metcalf (1)
  • Steve Ward (1)
  1. MIT Laboratory for Computer Science, Cambridge, USA
