A Fine-Grain, Message-Passing Processing Node

  • William J. Dally


This paper describes a processing node for a fine-grain message-passing concurrent computer. The node consists of a processor, a communication unit, and a memory. To reduce the overhead of message passing and task switching to 5µs, the node incorporates a send instruction, a fast communication system, hardware message buffering and dispatch, and a general translation mechanism. These mechanisms work together to implement a fine-grain programming system.
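As a rough illustration of how these mechanisms might fit together, the following is a toy software model, not the hardware design described in the paper. All names (`Node`, `send`, `step`, `xlate`, the `ADD` opcode) are hypothetical. A deque stands in for hardware message buffering, a handler table for dispatch, and a dictionary for the translation mechanism.

```python
from collections import deque


class Node:
    """Toy model of one processing node (all names are assumptions)."""

    def __init__(self, node_id, network):
        self.node_id = node_id
        self.network = network  # node id -> Node; stands in for the communication system
        self.queue = deque()    # stands in for hardware message buffering
        self.handlers = {}      # message opcode -> handler; used for dispatch
        self.xlate = {}         # global name -> local object; stands in for translation
        network[node_id] = self

    def send(self, dest, opcode, name, arg):
        # Models a send instruction: place a message in the destination's buffer.
        self.network[dest].queue.append((opcode, name, arg))

    def step(self):
        # Dispatch one buffered message; return False when the buffer is empty.
        if not self.queue:
            return False
        opcode, name, arg = self.queue.popleft()
        target = self.xlate[name]  # translate the name to local state
        self.handlers[opcode](self, target, arg)
        return True


def handle_add(node, target, arg):
    # Example handler: accumulate into a local counter object.
    target["value"] += arg


network = {}
a, b = Node(0, network), Node(1, network)
b.handlers["ADD"] = handle_add
b.xlate["counter"] = {"value": 0}

a.send(1, "ADD", "counter", 3)
a.send(1, "ADD", "counter", 4)
while b.step():
    pass
total = b.xlate["counter"]["value"]
```

In this sketch each arriving message starts a short task on the receiving node, which is the fine-grain style the abstract refers to; the model says nothing about timing or the 5µs overhead figure.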







Copyright information

© Plenum Press, New York 1988

Authors and Affiliations

  • William J. Dally
    • 1
  1. Massachusetts Institute of Technology, Cambridge, USA
