Coping with very high latencies in petaflop computer systems

Ryan, Sean; Amaral, José N.; Gao, Guang; Ruiz, Zachary; Marquez, Andres; Theobald, Kevin

doi:10.1007/BFb0094912

II System Architecture
Conference paper
First Online: 19 October 2006

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1615)

Abstract

The very long and highly variable latencies in the deep memory hierarchy of a petaflop-scale architecture design, such as the Hybrid Technology Multi-Threaded Architecture (HTMT) [13], present a new challenge to its programming and execution model. One way to cope with such high and variable latencies is to expose the different memory regions of the machine directly and explicitly to the program execution model, allowing better management of communication. In this paper we describe the novel percolation model that lies at the heart of the HTMT program execution model [13]. The percolation model combines multithreading with dynamic prefetching of coarse-grain contexts. Past prefetching techniques have concentrated on moving blocks of data within the memory hierarchy. Rather than moving only contiguous blocks of data, the thread percolation approach manages contexts that include data, program instructions, and control states.

The main contributions of this paper include the specification of the HTMT runtime execution model based on the concept of percolation, and a discussion of the role of the compiler in a machine that exposes the memory hierarchy to the programming model.
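The idea of percolating whole contexts, rather than bare data blocks, can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the HTMT runtime: the names `Context`, `ToyPercolator`, `near_capacity`, and `demo` are hypothetical, the memory hierarchy is reduced to two levels, and "program instructions" are stood in for by a Python function.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Context:
    """Hypothetical coarse-grain context: data, code, and control state together."""
    name: str
    data: list                                  # operands the thread will read
    code: Callable                              # stand-in for program instructions
    state: dict = field(default_factory=dict)   # control state, e.g. a result slot


class ToyPercolator:
    """Assumed two-level hierarchy: slow 'far' memory and a small 'near' buffer."""

    def __init__(self, near_capacity):
        self.far = deque()                # contexts waiting in slow memory
        self.near = deque()               # contexts staged next to the processor
        self.near_capacity = near_capacity
        self.log = []                     # records staging and execution order

    def submit(self, ctx):
        self.far.append(ctx)

    def percolate(self):
        """Move whole contexts into near memory while there is room."""
        while self.far and len(self.near) < self.near_capacity:
            ctx = self.far.popleft()
            self.near.append(ctx)
            self.log.append(("stage", ctx.name))

    def run(self):
        """Execute only staged contexts; refill the near buffer as slots free up."""
        results = {}
        self.percolate()
        while self.near:
            ctx = self.near.popleft()
            ctx.state["result"] = ctx.code(ctx.data)
            results[ctx.name] = ctx.state["result"]
            self.log.append(("run", ctx.name))
            self.percolate()
        return results


def demo():
    """Submit three contexts to a percolator whose near buffer holds two."""
    p = ToyPercolator(near_capacity=2)
    p.submit(Context("sum", [1, 2, 3], sum))
    p.submit(Context("max", [4, 9, 2], max))
    p.submit(Context("len", [7, 7], len))
    return p.run(), p.log
```

In this sketch a context runs only after all of its parts are resident in the near buffer, so the executing code never waits on far memory. A real system would overlap staging with execution; the toy performs them one after another for clarity.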

References

Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques (PACT '96), Boston, Massachusetts, October 20–23, 1996. IEEE Computer Society Press.
ACM SIGARCH and IEEE Computer Society. Proceedings of the 20th Annual International Symposium on Computer Architecture, San Diego, California, May 17–19, 1993. Computer Architecture News 21(2), May 1993.
ACM SIGARCH and IEEE Computer Society. Proceedings of the 22nd Annual International Symposium on Computer Architecture, Santa Margherita Ligure, Italy, June 22–24, 1995. Computer Architecture News, 23(2), May 1995.
Anant Agarwal, Ricardo Bianchini, David Chaiken, Kirk L. Johnson, David Kranz, John Kubiatowicz, Beng-Hong Lim, Kenneth Mackenzie, and Donald Yeung. The MIT Alewife machine: Architecture and performance. In Proceedings of the 22nd Annual International Symposium on Computer Architecture [3], Santa Margherita Ligure, Italy, June 22–24, 1995, pages 2–13. Computer Architecture News, 23(2), May 1995.
Gail Alverson, Robert Alverson, David Callahan, Brian Koblenz, Allan Porterfield, and Burton Smith. Exploiting heterogeneous parallelism on a multithreaded multiprocessor. Presented at the Workshop on Multithreaded Computers, held at Supercomputing '91, Albuquerque, New Mexico, November 1991.
José Nelson Amaral, Guang R. Gao, Phillip Merkey, Thomas Sterling, Zachary Ruiz, and Sean Ryan. An HTMT performance prediction case study: Implementing Cannon's dense matrix multiply algorithm. Technical report, University of Delaware, 1999.
Karen Bergman and Coke Reed. Hybrid technology multithreaded architecture program design and development of the data vortex network. Technical report, Princeton University, 1998. Technical Note 2.0.
Derek Chiou, Boon S. Ang, Robert Greiner, Arvind, James C. Hoe, Michael J. Beckerle, James E. Hicks, and Andy Boughton. StarT-NG: Delivering seamless parallel computing. In Seif Haridi, Khayri Ali, and Peter Magnusson, editors, Proceedings of the First International EURO-PAR Conference, number 966 in Lecture Notes in Computer Science, pages 101–116, Stockholm, Sweden, August 29–31, 1995. Springer-Verlag.
David E. Culler, Seth C. Goldstein, Klaus E. Schauser, and Thorsten von Eicken. TAM—a compiler controlled threaded abstract machine. Journal of Parallel and Distributed Computing, 18:347–370, July 1993.
David E. Culler, Anurag Sah, Klaus Erik Schauser, Thorsten von Eicken, and John Wawrzynek. Fine-grain parallelism with minimal hardware support: A compiler-controlled threaded abstract machine. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 164–175, Santa Clara, California, April 8–11, 1991. ACM SIGARCH, SIGPLAN, SIGOPS, and the IEEE Computer Society. Computer Architecture News, 19(2), April 1991; Operating Systems Review, 25, April 1991; SIGPLAN Notices, 26(4), April 1991.
Mikhail Dorojevets, Paul Bunyk, Dmitri Zinoviev, and Konstantin Likharev. Petaflops RSFQ system design. In Applied Superconductivity Conference, September 1998.
Marco Fillo, Stephen W. Keckler, William J. Dally, Nicholas P. Carter, Andrew Chang, Yevgeny Gurevich, and Whay S. Lee. The M-Machine multicomputer. In Proceedings of the 28th Annual International Symposium on Microarchitecture, pages 146–156, Ann Arbor, Michigan, November 29–December 1, 1995. IEEE-CS TC-MICRO and ACM SIGMICRO.
Guang R. Gao, Kevin B. Theobald, Andrés Márquez, and Thomas Sterling. The HTMT program execution model. CAPSL Technical Memo 09, Department of Electrical and Computer Engineering, University of Delaware, Newark, Delaware, July 1997. In ftp://ftp.capsl.udel.edu/pub/doc/memos.
Laurie J. Hendren, Xinan Tang, Yingchun Zhu, Guang R. Gao, Xun Xue, Haiying Cai, and Pierre Ouellet. Compiling C for the EARTH multithreaded architecture. In Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques (PACT '96) [1], pages 12–23.
HTMT. Hybrid technology multi-threaded architectures. http://htmt.caltech.edu, 1998.
Herbert H. J. Hum, Olivier Maquelin, Kevin B. Theobald, Xinmin Tian, Guang R. Gao, and Laurie J. Hendren. A study of the EARTH-MANNA multithreaded system. International Journal of Parallel Programming, 24(4):319–347, August 1996.
Robert A. Iannucci. A dataflow/von Neumann hybrid architecture. Technical Report MIT/LCS/TR-418, MIT Laboratory for Computer Science, Cambridge, Massachusetts, July 1988. PhD thesis, May 1988.
Robert A. Iannucci, Guang R. Gao, Robert H. Halstead, Jr., and Burton Smith, editors. Multithreaded Computer Architecture: A Summary of the State of the Art. Kluwer Academic Publishers, Norwell, Massachusetts, 1994. Book contains papers presented at the Workshop on Multithreaded Computers, held in conjunction with Supercomputing '91 in Albuquerque, New Mexico, November 1991.
Yuetsu Kodama, Hirohumi Sakane, Mitsuhisa Sato, Hayato Yamana, Shuichi Sakai, and Yoshinori Yamaguchi. The EM-X parallel computer: Architecture and basic performance. In Proceedings of the 22nd Annual International Symposium on Computer Architecture [3], Santa Margherita Ligure, Italy, June 22–24, 1995, pages 14–23. Computer Architecture News, 23(2), May 1995.
Peter M. Kogge, Jay B. Brockman, Thomas Sterling, and Guang Gao. Processing-in-memory: Chips to petaflops. Technical report, International Symposium on Computer Architecture, Denver, Co., June 1997.
Andrés Márquez, Kevin B. Theobald, Xinan Tang, and Guang R. Gao. A superstrand architecture. CAPSL Technical Memo 14, Department of Electrical and Computer Engineering, University of Delaware, Newark, Delaware, December 1997. In ftp://ftp.capsl.udel.edu/pub/doc/memos.
Andrés Márquez, Kevin B. Theobald, Xinan Tang, Thomas Sterling, and Guang R. Gao. A superstrand architecture and its compilation. CAPSL Technical Memo 18, Department of Electrical and Computer Engineering, University of Delaware, Newark, Delaware, March 1998.
R. S. Nikhil and Arvind. Id: a language with implicit parallelism. In J. Feo, editor, A Comparative Study of Parallel Programming Languages: The Salishan Problems. Elsevier Science Publishers, February 1990.
Michael D. Noakes, Deborah A. Wallach, and William J. Dally. The J-Machine multicomputer: An architectural evaluation. In Proceedings of the 20th Annual International Symposium on Computer Architecture [2], San Diego, California, May 17–19, 1993, pages 224–235. Computer Architecture News, 21(2), May 1993.
Kazuaki Okamoto, Shuichi Sakai, Hiroshi Matsuoka, Takashi Yokota, and Hideo Hirono. Multithread execution mechanisms on RICA-1 for massively parallel computation. In Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques (PACT '96) [1], pages 116–121.
Demetri Psaltis and Geoffrey W. Burr. Holographic data storage. Computer, 31(2):52–60, February 1998.
Klaus E. Schauser, David E. Culler, and Seth C. Goldstein. Separation constraint partitioning—A new algorithm for partitioning non-strict programs into sequential threads. In Conference Record of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 259–271, San Francisco, California, January 22–25, 1995.
Klaus Erik Schauser, David E. Culler, and Thorsten von Eicken. Compiler-controlled multithreading for lenient parallel languages. Report No. UCB/CSD 91/640, Computer Science Division, University of California at Berkeley, 1991.
Ellen Spertus, Seth Copen Goldstein, Klaus Erik Schauser, Thorsten von Eicken, David E. Culler, and William J. Dally. Evaluation of mechanisms for fine-grained parallel programs in the J-Machine and the CM-5. In Proceedings of the 20th Annual International Symposium on Computer Architecture [2], San Diego, California, May 17–19, 1993, pages 302–313. Computer Architecture News, 21(2), May 1993.
Xinan Tang, Jian Wang, Kevin B. Theobald, and Guang R. Gao. Thread partitioning and scheduling based on cost model. In Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 272–281, Newport, Rhode Island, June 22–25, 1997. SIGACT/SIGARCH and EATCS.
Kevin B. Theobald, José Nelson Amaral, Gerd Heber, Olivier Maquelin, Xinan Tang, and Guang R. Gao. Overview of the Threaded-C language. CAPSL Technical Memo 19, Department of Electrical and Computer Engineering, University of Delaware, Newark, Delaware, March 1998. In ftp://ftp.capsl.udel.edu/pub/doc/memos.
L. Wittie, D. Zinoviev, G. Sazaklis, and K. Likharev. CNET: Design of an RSFQ Switching network for petaflops-scale computing. IEEE Trans. on Appl. Supercond., June 1999. In press.

Author information

Authors and Affiliations

Computer Architecture and Parallel Systems Laboratory, University of Delaware, Newark, DE, USA
Sean Ryan, José N. Amaral, Guang Gao, Zachary Ruiz, Andres Marquez & Kevin Theobald

Editor information

Constantine Polychronopoulos, Kazuki Joe, Akira Fukuda, Shinji Tomita

Cite this paper

Ryan, S., Amaral, J.N., Gao, G., Ruiz, Z., Marquez, A., Theobald, K. (1999). Coping with very high latencies in petaflop computer systems. In: Polychronopoulos, C., Joe, K., Fukuda, A., Tomita, S. (eds) High Performance Computing. ISHPC 1999. Lecture Notes in Computer Science, vol 1615. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0094912

DOI: https://doi.org/10.1007/BFb0094912
Published: 19 October 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65969-3
Online ISBN: 978-3-540-48821-7
eBook Packages: Springer Book Archive
