
Quantifying Architectural Requirements of Contemporary Extreme-Scale Scientific Applications

  • Conference paper
High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation (PMBS 2013)

Abstract

As detailed in recent reports, HPC architectures will continue to change over the next decade in an effort to improve energy efficiency, reliability, and performance. At this time of significant disruption, it is critically important to understand specific application requirements, so that these architectural changes can include features that satisfy the requirements of contemporary extreme-scale scientific applications. To address this need, we have developed a methodology supported by a toolkit that allows us to investigate detailed computation, memory, and communication behaviors of applications at varying levels of resolution. Using this methodology, we performed a broad-based, detailed characterization of 12 contemporary scalable scientific applications and benchmarks. Our analysis reveals numerous behaviors that sometimes contradict conventional wisdom about scientific applications. For example, the results reveal that only one of our applications executes more floating-point instructions than other types of instructions. In another example, we found that communication topologies are very regular, even for applications that, at first glance, should be highly irregular. These observations emphasize the necessity of measurement-driven analysis of real applications, and help prioritize features that should be included in future architectures.
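The instruction-mix finding above (only one application was floating-point dominated) can be illustrated with a small sketch. This is a hypothetical example, not the paper's toolkit: the application names and counts below are invented, and the per-category totals stand in for the kind of data a dynamic-instrumentation tool would collect.

```python
# Hypothetical per-application instruction counts (illustrative numbers only;
# a real study would collect these with a dynamic-instrumentation toolkit).
counts = {
    "app_a": {"fp": 4.1e9, "int": 6.0e9, "load_store": 7.2e9, "branch": 1.5e9},
    "app_b": {"fp": 9.5e9, "int": 3.0e9, "load_store": 4.0e9, "branch": 0.8e9},
}

def fp_dominates(mix):
    """True if floating-point instructions outnumber every other category."""
    return all(mix["fp"] > v for k, v in mix.items() if k != "fp")

for app, mix in counts.items():
    total = sum(mix.values())
    frac = mix["fp"] / total
    print(f"{app}: FP fraction {frac:.1%}, FP-dominant: {fp_dominates(mix)}")
# → app_a: FP fraction 21.8%, FP-dominant: False
# → app_b: FP fraction 54.9%, FP-dominant: True
```

Classifying by category rather than by raw FLOP rate is what surfaces the paper's observation: an application can perform substantial floating-point work while loads, stores, and integer operations still make up the majority of its instruction stream.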

Support for this work was provided by U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research. The work was performed at the Oak Ridge National Laboratory, which is managed by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 to the U.S. Government. Accordingly, the U.S. Government retains a non-exclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes.





Corresponding author

Correspondence to Jeffrey S. Vetter.


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Vetter, J.S. et al. (2014). Quantifying Architectural Requirements of Contemporary Extreme-Scale Scientific Applications. In: Jarvis, S., Wright, S., Hammond, S. (eds) High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation. PMBS 2013. Lecture Notes in Computer Science, vol 8551. Springer, Cham. https://doi.org/10.1007/978-3-319-10214-6_1


  • DOI: https://doi.org/10.1007/978-3-319-10214-6_1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10213-9

  • Online ISBN: 978-3-319-10214-6

  • eBook Packages: Computer Science (R0)
