Detecting Changes in Communication Properties of Parallel Programs by InfiniBand Traffic Analysis

  • Conference paper
Parallel Computational Technologies (PCT 2021)

Abstract

Modern computational systems and parallel applications are often highly complex, which makes their interaction difficult to analyze. A wide variety of profiling tools can help find bottlenecks and inefficient parts of programs running on high-performance clusters, but those tools introduce additional overhead. This overhead can be partially avoided by using analysis methods that work at the network layer. In this article, we describe the development of a new tool for visually exploring and analyzing the behavior of different MPI parallel programs. The tool is based on an existing method of collecting traffic data from the InfiniBand network on the Lomonosov supercomputer. The implementation constructs communication matrices of MPI processes, displays various parts of the application timeline through these matrices, and plots communication graphs and message distribution graphs built on several parameters of InfiniBand packets. The resulting visual representation of the traffic of parallel applications may enable the analysis of such applications without inspecting their code directly, as demonstrated by examining a few NPB tests.
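The idea of a communication matrix restricted to one part of the application timeline can be illustrated with a minimal sketch. It is not the tool's implementation: the record layout `(timestamp, source rank, destination rank, payload bytes)`, the sample data in `PACKETS`, and the function name `communication_matrix` are all assumptions made for illustration, and the sketch assumes that captured InfiniBand packets have already been mapped to MPI ranks.

```python
import numpy as np

# Hypothetical packet records: (timestamp_s, src_rank, dst_rank, payload_bytes).
# This layout and these values are invented for illustration only.
PACKETS = [
    (0.10, 0, 1, 4096),
    (0.20, 1, 0, 4096),
    (0.35, 0, 2, 1024),
    (1.10, 2, 3, 2048),
    (1.40, 3, 2, 2048),
    (1.90, 0, 1, 512),
]


def communication_matrix(packets, n_procs, t_start=None, t_end=None):
    """Sum the bytes sent from each source rank (row) to each
    destination rank (column) within the half-open window [t_start, t_end).
    A bound of None leaves that side of the window open."""
    matrix = np.zeros((n_procs, n_procs), dtype=np.int64)
    for ts, src, dst, size in packets:
        if t_start is not None and ts < t_start:
            continue
        if t_end is not None and ts >= t_end:
            continue
        matrix[src, dst] += size
    return matrix


if __name__ == "__main__":
    # Whole run, then only the first second of the timeline.
    print(communication_matrix(PACKETS, n_procs=4))
    print(communication_matrix(PACKETS, n_procs=4, t_start=0.0, t_end=1.0))
```

Comparing matrices built for consecutive windows is one simple way to see where the communication pattern of a run changes over time.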

Notes

  1. Mellanox Technologies, Mellanox Integrated Switch Management Solution: http://www.mellanox.com/products/management-software/fabricit.

  2. Wireshark: https://www.wireshark.org/.

  3. Python Dash library: https://dash.plotly.com/.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Domracheva, D., Stefanov, K. (2021). Detecting Changes in Communication Properties of Parallel Programs by InfiniBand Traffic Analysis. In: Sokolinsky, L., Zymbler, M. (eds) Parallel Computational Technologies. PCT 2021. Communications in Computer and Information Science, vol 1437. Springer, Cham. https://doi.org/10.1007/978-3-030-81691-9_3

  • DOI: https://doi.org/10.1007/978-3-030-81691-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-81690-2

  • Online ISBN: 978-3-030-81691-9

  • eBook Packages: Computer Science, Computer Science (R0)
