SONAR: Automated Communication Characterization for HPC Applications

  • Steffen Lammel
  • Felix Zahn
  • Holger Fröning
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9945)


Future computing systems will need to operate within hard power and energy constraints; this is particularly true for Exascale-class systems. These constraints are hard for technical, economical, and ecological reasons, so such systems have to operate within given power and energy budgets. We therefore anticipate the need for modeling tools that help to predict power and energy consumption. In particular, such modeling tools would allow for detailed explorations of design alternatives. While processing and memory already receive a large amount of interest from the research community, power modeling of scalable interconnection networks has been rather neglected. However, analyses show that the network contributes about 20% of the overall power consumption of HPC systems, and considering the increasing energy efficiency of other components, this fraction is likely to grow. While models for processing and memory typically rely on performance counters to model power and energy, we observe that the distributed nature of networks leads to significantly more complex metrics. Selecting the right set of abstract metrics, which serve as input for such a prediction, is crucial for prediction performance.

In this work we introduce our tool, the Simple Offline Network Analyzer (SONAR), which derives complex metrics from communication traces of HPC applications. We explain the motivation behind this concept, its implementation, and how the tool supports the easy integration of new metrics. We also show exemplary explorations using an initial set of metrics for a representative range of HPC applications, including contemporary as well as emerging Exascale workloads. In particular, we use SONAR to characterize the communication of applications in terms of verbosity and network utilization, as we believe both to be important metrics for power prediction.
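The kind of trace-based characterization described above can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration, not SONAR's actual implementation: it assumes a simplified trace of send events and takes verbosity to be bytes injected per second of runtime, and utilization to be that rate relative to a nominal link bandwidth (illustrative definitions only).

```python
from dataclasses import dataclass

@dataclass
class SendEvent:
    # Hypothetical trace record: sending rank, timestamp (s), message size (bytes)
    rank: int
    time: float
    size: int

def characterize(events, runtime_s, link_bw_bytes_per_s):
    """Derive simple communication metrics from a list of send events.

    Verbosity here is total bytes sent per second of runtime; utilization
    is that injection rate relative to the nominal link bandwidth.
    """
    total_bytes = sum(e.size for e in events)
    msg_count = len(events)
    avg_msg = total_bytes / msg_count if msg_count else 0.0
    verbosity = total_bytes / runtime_s            # bytes/s injected into the network
    utilization = verbosity / link_bw_bytes_per_s  # fraction of nominal bandwidth
    return {
        "messages": msg_count,
        "total_bytes": total_bytes,
        "avg_msg_bytes": avg_msg,
        "verbosity_Bps": verbosity,
        "utilization": utilization,
    }

# Example: three sends over a 2 s run on a 10 GB/s link
trace = [
    SendEvent(0, 0.1, 1 << 20),
    SendEvent(1, 0.5, 1 << 20),
    SendEvent(0, 1.2, 2 << 20),
]
metrics = characterize(trace, runtime_s=2.0, link_bw_bytes_per_s=10e9)
```

An offline analyzer of this shape reads a complete trace after the run finishes, which keeps the instrumentation overhead on the application side minimal, at the cost of trace storage.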


Keywords: Exascale · HPC · MPI · Communication characterization · Automated tooling



We thank the anonymous reviewers for their constructive and detailed reviews. We would also like to express our thanks to Alexander Matz for his support. Furthermore, we want to thank Pedro J. Garcia and Jesus Escudero-Sahuquillo for insightful technical discussions.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Computer Engineering Group, Ruprecht-Karls University of Heidelberg, Heidelberg, Germany
