Performance analysis and prediction for distributed homogeneous clusters

  • Heinz Kredel
  • Hans Günther Kruse
  • Sabine Richling
  • Erich Strohmaier
Special Issue Paper


We present a new performance model, based on the roofline concept, for the analysis and performance prediction of distributed computing clusters. The background for our performance modeling is the 28 km InfiniBand interconnection between two bwGRiD clusters, each consisting of 140 compute nodes, in day-to-day production use. We use the model to compare the MPI performance of intra-cluster communication with that of inter-cluster communication. We compare the new modeling results to our earlier stochastic model (Richling et al. in Proc. of 3PGCIC-2010. IEEE, New York, 2010), with which we estimated the bandwidth required to double the performance of an application (LinPack as the simplest example). We derive bounds for the size of regions in a cluster and for the scaling of the maximal speed-up in the region-region-interconnected network.


Keywords: Performance model; Performance prediction; Inter-cluster communication; Roofline model



We thank our colleagues Rolf Bogus, Hermann Lauer and Steffen Hau as well as the bwGRiD team for their help in constructing and operating the hardware and in optimizing the connection, which together form the basis of this paper. One of the authors, H.G. Kruse, thanks LBNL/Berkeley for its hospitality during a research visit; the inspiring and stimulating atmosphere there helped this work take shape. We also thank the referees for their insightful suggestions for improving the paper.

bwGRiD is a member of the German D-Grid initiative and is funded by the Ministry of Education and Research and the Ministry for Science, Research and Arts Baden-Württemberg.


  1. bwGRiD (2007–2010) Member of the German D-Grid initiative, funded by the Ministry of Education and Research and the Ministry for Science, Research and Arts Baden-Württemberg, Universities of Baden-Württemberg. Accessed May 2010
  2. Kredel H, Kruse H-G, Richling S (2010) Zur Leistung von verteilten, homogenen Clustern. Pik 2:166–171
  3. Richling S, Hau S, Kredel H, Kruse H-G (2010) Operating two InfiniBand Grid clusters over 28 km distance. In: Proc of 3PGCIC-2010. IEEE, New York
  4. Richling S, Hau S, Kredel H, Kruse H-G (2011) Operating two InfiniBand Grid clusters over 28 km distance. Int J Grid Util Comput 2(4):303–312
  5. Richling S, Hau S, Kredel H, Kruse H-G (2011) A long-distance InfiniBand interconnection between two clusters in production use. In: Proc supercomputing, November 12–18, 2011. IEEE, New York
  6. Merz M, Krietemeyer M (eds) (2006) IPACS integrated performance analysis of computer systems: benchmarks for distributed computer systems. Logos Verlag, Berlin
  7. Obsidian: high performance network. Accessed May 2010
  8. LinPack and HPL: Linear Algebra Package and High Performance LinPack. Accessed Jan 2012
  9. Schlegel U, Grobe K, Southwell D 100 Gbit/s DWDM InfiniBand transport over up to 40 km. Accessed May 2010
  10. Plaat A et al. (2001) Sensitivity of parallel applications to large differences in bandwidth and latency in two-layer interconnects. Future Gener Comput Syst 17:769–782
  11. Yu W, Rao NSV, Vetter JS (2008) Experimental analysis of InfiniBand transport services on WAN. In: International conference on networking, architecture, and storage, pp 233–240
  12. Carter S, Minich M, Rao N (2007) Experimental evaluation of InfiniBand transport over local- and wide-area networks. In: Proc 2007 spring simulation multiconference (SpringSim’07), pp 419–426
  13. Kredel H, Kruse H-G, Ott I (2011) Lastverhalten und Systemkonfiguration von Web-Applikationsservern. Praxis d. Informationsverarbeitung und Kommunikation (PIK), vol 3. De Gruyter Saur, München, pp 215–223
  14. Kredel H, Kruse H-G, Ott I (2011) Performance analysis and performance modeling of web-applications. In: Proc 3PGCIC-2011. IEEE, New York, pp 115–122
  15. Williams S, Waterman A, Patterson D (2009) Roofline: an insightful visual performance model for multicore architectures. Commun ACM 52(4):65–76
  16. Hill MD, Marty MR (2008) Amdahl’s law in the multicore era. Computer 41(7):33–38
  17. Popa A (2012) What is the speed of light in a fiber optic cable? Accessed Jan 2012

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Heinz Kredel (1)
  • Hans Günther Kruse (1)
  • Sabine Richling (2)
  • Erich Strohmaier (3)
  1. IT-Center, University of Mannheim, Mannheim, Germany
  2. IT-Center, University of Heidelberg, Heidelberg, Germany
  3. Future Technology Group, Lawrence Berkeley National Laboratory, Berkeley, USA
