End-to-End Modeling and Simulation of High-Performance Computing Systems

  • Cyriel Minkenberg
  • Wolfgang Denzel
  • German Rodriguez
  • Robert Birke

Abstract

Designing large-scale High-Performance Computing (HPC) systems, including architecture design space exploration and performance prediction, is a daunting task that can benefit enormously from discrete event simulation techniques, as the interactions between the various components of such a system generally render analytic approaches intractable. The work described in this chapter specifically deals with end-to-end, full-system simulation, as opposed to simulation of individual components or nodes. The tools described here can be used in the design phase of a new HPC system to optimize system design for a given set of workloads, or to create performance forecasts for new workloads on existing systems.

We have taken a network-centric approach, as the scale of current high-end HPC systems is in the range of hundreds of thousands of processing cores, so that the impact of the communication among so many cores will be a key factor in determining overall system performance. To this end, we developed an Omnest-based simulation environment that enables studying the impact of an HPC machine’s communication subsystem on the overall system’s performance for specific workloads.

Full-system simulation at an abstraction level that still maintains a reasonably high level of detail is infeasible without resorting to parallel simulation; the main limiting factors are simulation run time and memory footprint. Parallel Discrete Event Simulation (PDES) techniques make it possible to exploit modern parallel computers to perform such simulations at large scale.
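As a rough illustration of the discrete event simulation idea referred to above, the following is a minimal sequential sketch, not the authors' Omnest-based environment. It models messages crossing a single link to one receiver; the function `simulate` and its parameters (`send_interval`, `link_latency`, `service_time`) are hypothetical names chosen for this example. A parallel simulator would partition such an event loop across several processes and synchronize their clocks, which this sketch does not attempt.

```python
import heapq


def simulate(num_messages, send_interval, link_latency, service_time):
    """Sequential discrete event loop (illustrative assumption, not the chapter's model).

    Messages are sent at fixed intervals, cross a link with a fixed latency,
    and are served one at a time by a single receiver.
    Returns a dict mapping message id to its completion time.
    """
    events = []  # min-heap of (timestamp, sequence, kind, message_id)
    seq = 0      # tie-breaker so simultaneous events keep insertion order
    for i in range(num_messages):
        heapq.heappush(events, (i * send_interval, seq, "send", i))
        seq += 1

    receiver_free_at = 0.0
    completion = {}
    while events:
        # Always process the event with the smallest timestamp next.
        now, _, kind, msg = heapq.heappop(events)
        if kind == "send":
            heapq.heappush(events, (now + link_latency, seq, "arrive", msg))
            seq += 1
        elif kind == "arrive":
            # The receiver serves messages in arrival order, one at a time.
            start = max(now, receiver_free_at)
            receiver_free_at = start + service_time
            heapq.heappush(events, (receiver_free_at, seq, "done", msg))
            seq += 1
        else:  # "done"
            completion[msg] = now
    return completion


if __name__ == "__main__":
    print(simulate(3, send_interval=1.0, link_latency=2.0, service_time=1.5))
```

Even in this toy setting the receiver becomes the bottleneck once messages arrive faster than they can be served, which is the kind of component interaction that a simulator captures event by event.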

Keywords

Interconnection Network, Message Passing Interface, Discrete Event Simulation, Virtual Channel, Deadlock Prevention
These keywords were added by machine and not by the authors.

Copyright information

© Springer Berlin Heidelberg 2012

Authors and Affiliations

  • Cyriel Minkenberg (1)
  • Wolfgang Denzel (1)
  • German Rodriguez (1)
  • Robert Birke (1)

  1. IBM Research – Zurich, Rüschlikon, Switzerland
