A Fine-Grain, Message-Passing Processing Node

  • William J. Dally


This paper describes a processing node for a fine-grain message-passing concurrent computer. The node consists of a processor, a communication unit, and a memory. To reduce the overhead of message passing and task switching to 5 µs, the node incorporates a send instruction, a fast communication system, hardware message buffering and dispatch, and a general translation mechanism. These mechanisms work together to implement a fine-grain programming system.


Keywords: Destination Node, Clock Cycle, Processing Node, Concurrent Programming, Message Delivery





Copyright information

© Plenum Press, New York 1988

Authors and Affiliations

  • William J. Dally, Massachusetts Institute of Technology, Cambridge, USA
