Experiments with the Fresh Breeze tree-based memory model

  • Special Issue Paper
  • Computer Science - Research and Development

Abstract

The Fresh Breeze memory model and system architecture are proposed as an approach to achieving significant improvements in massively parallel computation by supporting fine-grain management of memory and processing resources and by providing a global shared name space for all processors and computation tasks. Memory management and task scheduling are implemented in hardware, eliminating nearly all operating system execution cycles for data access, task scheduling, and security. In particular, the Fresh Breeze memory model represents all data objects as trees of fixed-size memory chunks, which eliminates data consistency issues and simplifies memory management. Low-cost reference-count garbage collection supports modular programming in type-safe programming languages.
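
To make the tree-of-chunks representation concrete, the sketch below (our illustration, not code from the paper) stores a one-dimensional array as a two-level tree of fixed-size chunks and reclaims chunks with reference counts. The 16-slot chunk size, the class names, and the in-place writes are assumptions chosen for brevity; in the Fresh Breeze model chunks are filled once and then treated as read-only, which keeps the trees acyclic and is what makes simple reference counting sufficient.

    // Illustrative sketch only: an array held as a two-level tree of fixed-size chunks,
    // reclaimed by reference counting. Chunk size and layout are assumed, not the paper's.
    final class Chunk {
        static final int SLOTS = 16;                 // assumed number of slots per chunk
        final Object[] slot = new Object[SLOTS];     // a slot holds a value or a child Chunk
        int refCount = 1;                            // handles currently referring to this chunk

        void retain() { refCount++; }

        void release() {
            if (--refCount == 0) {
                // Chunk trees are acyclic, so recursive release terminates and
                // the chunk can then be returned to a free pool.
                for (Object s : slot)
                    if (s instanceof Chunk) ((Chunk) s).release();
            }
        }
    }

    final class ChunkArray {
        private final Chunk root = new Chunk();      // root of the two-level tree

        // Store value at index i: root slot -> leaf chunk -> element slot.
        void set(int i, Object value) {
            int hi = i / Chunk.SLOTS, lo = i % Chunk.SLOTS;
            Chunk leaf = (Chunk) root.slot[hi];
            if (leaf == null) root.slot[hi] = leaf = new Chunk();
            leaf.slot[lo] = value;
        }

        Object get(int i) {
            Chunk leaf = (Chunk) root.slot[i / Chunk.SLOTS];
            return leaf == null ? null : leaf.slot[i % Chunk.SLOTS];
        }

        public static void main(String[] args) {
            ChunkArray a = new ChunkArray();
            for (int i = 0; i < 100; i++) a.set(i, i * i);
            System.out.println(a.get(42));           // prints 1764
        }
    }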

The main contributions of this paper are: (1) a program execution model for massively parallel computing, presented as the Fresh Breeze application programming interface (API), which comprises a radical memory model and a scheme for expressing concurrency; (2) an experimental implementation of the API through simulation using the FAST simulator of the IBM Cyclops 64 many-core chip; and (3) simulation results demonstrating that (a) fine-grain, hardware-implemented resource management mechanisms can support massive parallelism and high processor utilization through the latency-hiding properties of multi-tasking, and that (b) the hardware-implemented work stealing scheme incorporated in our simulation can effectively distribute tasks over the processors of a many-core parallel computer.
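
As a rough software analogue of the work stealing scheme that the simulation implements in hardware (the thread-based workers, queue types, and parameters below are our assumptions, not the simulated mechanism), each worker keeps a double-ended queue of tasks, takes work from its own end of that queue, and, when it runs dry, steals from the opposite end of a randomly chosen victim's queue:

    // Illustrative software sketch of work stealing; the paper realizes the scheme in hardware.
    import java.util.concurrent.ConcurrentLinkedDeque;
    import java.util.concurrent.ThreadLocalRandom;

    final class WorkStealingDemo {
        static final int WORKERS = 4;
        @SuppressWarnings("unchecked")
        static final ConcurrentLinkedDeque<Runnable>[] deques =
                new ConcurrentLinkedDeque[WORKERS];  // one task deque per worker

        public static void main(String[] args) throws InterruptedException {
            for (int w = 0; w < WORKERS; w++) deques[w] = new ConcurrentLinkedDeque<>();
            // Seed worker 0 with all tasks; stealing should spread them across workers.
            for (int t = 0; t < 32; t++) {
                final int id = t;
                deques[0].addLast(() ->
                        System.out.println("task " + id + " ran on " + Thread.currentThread().getName()));
            }
            Thread[] workers = new Thread[WORKERS];
            for (int w = 0; w < WORKERS; w++) {
                final int self = w;
                workers[w] = new Thread(() -> {
                    for (int misses = 0; misses < 1000; ) {
                        Runnable task = deques[self].pollLast();            // local work, LIFO end
                        if (task == null) {
                            int victim = ThreadLocalRandom.current().nextInt(WORKERS);
                            task = deques[victim].pollFirst();              // steal from the far end
                        }
                        if (task != null) { task.run(); misses = 0; } else misses++;
                    }
                }, "worker-" + w);
                workers[w].start();
            }
            for (Thread t : workers) t.join();
        }
    }

In the paper's simulation the corresponding decisions are made by hardware task-scheduling mechanisms rather than by threads polling software queues.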

Author information

Correspondence to Jack B. Dennis.

About this article

Cite this article

Dennis, J.B., Gao, G.R. & Meng, X.X. Experiments with the Fresh Breeze tree-based memory model. Comput Sci Res Dev 26, 325–337 (2011). https://doi.org/10.1007/s00450-011-0165-1
