The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor

Chapter in: Scalable Shared Memory Multiprocessors

Abstract

The Alewife multiprocessor project focuses on the architecture and design of a large-scale parallel machine. The machine uses a low-dimension direct interconnection network to provide scalable communication bandwidth while allowing the exploitation of locality. Despite its distributed-memory architecture, Alewife allows efficient shared-memory programming through a multilayered approach to locality management. A new scalable cache coherence scheme called LimitLESS directories allows the use of caches for reducing communication latency and network bandwidth requirements. Alewife also employs run-time and compile-time methods for partitioning and placement of data and processes to enhance communication locality. While the above methods attempt to minimize communication latency, remote communication with distant processors cannot be completely avoided. Alewife’s processor, Sparcle, is designed to tolerate these latencies by rapidly switching between threads of computation. This paper describes the Alewife architecture and concentrates on the novel hardware features of the machine, including LimitLESS directories and the rapid context-switching processor.

Copyright information

© 1992 Springer Science+Business Media New York

About this chapter

Cite this chapter

Agarwal, A. et al. (1992). The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor. In: Dubois, M., Thakkar, S. (eds) Scalable Shared Memory Multiprocessors. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-3604-8_13

  • DOI: https://doi.org/10.1007/978-1-4615-3604-8_13

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-6601-0

  • Online ISBN: 978-1-4615-3604-8

  • eBook Packages: Springer Book Archive
