High-performance computing (HPC) architectures are expected to change dramatically in the next decade as power and cooling constraints limit increases in microprocessor clock speeds. Consequently, computer companies are dramatically increasing on-chip parallelism to improve performance. The traditional doubling of clock speeds every 18-24 months is being replaced by a doubling of cores or other parallelism mechanisms. During the next decade, the amount of parallelism on a single microprocessor will rival the number of nodes in the early massively parallel supercomputers built in the 1980s. Applications and algorithms will need to change and adapt as node architectures evolve. In particular, they will need to manage locality to achieve performance. A key element of the strategy moving forward is the co-design of applications, architectures, and programming environments. There is an unprecedented opportunity for application and algorithm developers to influence the direction of future architectures so that they meet DOE mission needs. This article describes the technology challenges on the road to exascale, their underlying causes, and their effect on the future of HPC system design.


Keywords: Exascale, HPC, co-design





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • John Shalf (1)
  • Sudip Dosanjh (2)
  • John Morrison (3)
  1. NERSC Division, Lawrence Berkeley National Laboratory, Berkeley, USA
  2. Sandia National Laboratories, USA
  3. Los Alamos National Laboratory, Los Alamos, USA
