Abstract
Over the last decade, InfiniBand (IB) has become an increasingly popular interconnect for deploying modern supercomputing systems. As supercomputing systems grow in scale, the impact of the IB network topology on the performance of high performance computing (HPC) applications also increases. Depending on the kind of network (fat tree, torus, or mesh), the number of network hops involved in a data transfer varies. No tool currently exists that allows users of such large-scale clusters to analyze and visualize the communication pattern of HPC applications in a network topology-aware manner. In this paper, we take up this challenge and design INTAP-MPI, a scalable, low-overhead InfiniBand Network Topology-Aware Performance Analysis Tool for MPI. INTAP-MPI allows users to analyze and visualize the communication pattern of HPC applications on any IB network (fat tree, torus, or mesh). We integrate INTAP-MPI into the MVAPICH2 MPI library, allowing users of HPC clusters to use it seamlessly for analyzing their applications. Our experimental analysis shows that INTAP-MPI is able to profile and visualize the communication pattern of applications with very low memory and performance overhead at scale.
This research is supported in part by U.S. Department of Energy grant #DE-FC02-06ER25755; National Science Foundation grants #CCF-0916302, #OCI-0926691 and #CCF-0937842.
Keywords
- Network Topology
- Message Passing Interface
- Communication Pattern
- High Performance Computing
- Collective Operation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Subramoni, H., Vienne, J., Panda, D.K. (2013). A Scalable InfiniBand Network Topology-Aware Performance Analysis Tool for MPI. In: Caragiannis, I., et al. Euro-Par 2012: Parallel Processing Workshops. Euro-Par 2012. Lecture Notes in Computer Science, vol 7640. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36949-0_49
DOI: https://doi.org/10.1007/978-3-642-36949-0_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36948-3
Online ISBN: 978-3-642-36949-0
eBook Packages: Computer Science, Computer Science (R0)