European Conference on Parallel Processing

Euro-Par 2011: Parallel Processing Workshops, pp 166–177

INAM - A Scalable InfiniBand Network Analysis and Monitoring Tool


  • N. Dandapanthula,
  • H. Subramoni,
  • J. Vienne,
  • K. Kandalla,
  • S. Sur,
  • Dhabaleswar K. Panda &
  • Ron Brightwell

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 7156)

Abstract

As InfiniBand (IB) clusters grow in size and scale, predicting the behavior of the IB network in terms of link usage and performance becomes an increasingly challenging task. No open source tool currently allows users to dynamically analyze and visualize the communication pattern and link usage in the IB network. In this context, we design and develop a scalable InfiniBand Network Analysis and Monitoring tool, INAM. INAM monitors IB clusters in real time and queries the subnet management entities in the IB network to gather the performance counters specified by the IB standard. We provide an easy-to-use web-based interface to visualize performance counters and subnet management attributes of a cluster on demand. INAM is also capable of capturing the communication characteristics of a subset of links in the network. Our experimental results show that INAM is able to accurately visualize the link utilization as well as the communication pattern of target applications.
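
As a rough illustration of how link utilization could be derived from the performance counters mentioned above, the following sketch compares two samples of a port's transmit-data counter. It is not INAM's implementation: the names `CounterSample` and `link_utilization` are hypothetical, the caller is assumed to supply the link's effective data rate, and the only IB-specific assumption is that the PortXmitData counter is reported in units of four octets.

```python
from dataclasses import dataclass

# Assumption: PortXmitData is reported in units of 4 octets (32-bit words).
OCTETS_PER_COUNTER_UNIT = 4
BITS_PER_OCTET = 8


@dataclass(frozen=True)
class CounterSample:
    """One reading of a port's transmit-data counter (hypothetical type)."""
    timestamp_s: float  # time the counter was read, in seconds
    xmit_data: int      # raw counter value, in 4-octet units


def link_utilization(prev: CounterSample, curr: CounterSample,
                     link_rate_gbps: float) -> float:
    """Return the fraction of the link's data rate used between two samples.

    link_rate_gbps is the effective data rate of the link, supplied by the
    caller; this sketch does not look it up from the fabric.
    """
    elapsed_s = curr.timestamp_s - prev.timestamp_s
    if elapsed_s <= 0:
        raise ValueError("samples must be in increasing time order")
    if link_rate_gbps <= 0:
        raise ValueError("link rate must be positive")

    delta_units = curr.xmit_data - prev.xmit_data
    if delta_units < 0:
        # The counter was reset between samples; the interval is unusable.
        raise ValueError("counter went backwards between samples")

    bits_sent = delta_units * OCTETS_PER_COUNTER_UNIT * BITS_PER_OCTET
    observed_bps = bits_sent / elapsed_s
    return observed_bps / (link_rate_gbps * 1e9)


if __name__ == "__main__":
    before = CounterSample(timestamp_s=0.0, xmit_data=0)
    after = CounterSample(timestamp_s=1.0, xmit_data=500_000_000)
    print(f"utilization: {link_utilization(before, after, 32.0):.2f}")
```

A monitoring loop would call such a function once per polling interval for each port of interest. Treating a backwards-moving counter as an error, rather than guessing at wraparound, keeps the sketch independent of whether a given counter saturates or is reset by another tool.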

Keywords

  • Leaf Node
  • Communication Pattern
  • Link Utilization
  • Performance Counter
  • Link Usage

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This research is supported in part by Sandia Laboratories grant #1024384; U.S. Department of Energy grants #DE-FC02-06ER25749 and #DE-FC02-06ER25755 and contract #DE-AC02-06CH11357; National Science Foundation grants #CCF-0621484, #CCF-0702675, #CCF-0833169, #CCF-0916302 and #OCI-0926691; a grant from the Wright Center for Innovation #WCI04-010-OSU-0; grants from Intel, Mellanox, Cisco, QLogic, and Sun Microsystems; and equipment donations from Intel, Mellanox, AMD, Obsidian, Advanced Clustering, Appro, QLogic, and Sun Microsystems.



Author information

Authors and Affiliations

  1. Department of Computer Science and Engineering, The Ohio State University, USA

    N. Dandapanthula, H. Subramoni, J. Vienne, K. Kandalla, S. Sur & Dhabaleswar K. Panda

  2. Sandia National Laboratories, Albuquerque, NM, 87123, USA

    Ron Brightwell

Editor information

Editors and Affiliations

  1. Scilytics, Koellnerhofgasse 3/15A, 1010, Vienna, Austria

    Michael Alexander

  2. ICAR-CNR, Via P. Castellino, 111, 80131, Napoli, Italy

    Pasqua D’Ambra

  3. University of Amsterdam, 1090, Amsterdam, Netherlands

    Adam Belloum

  4. Innovative Computing Laboratory, The University of Tennessee, US

    George Bosilca

  5. Department of Experimental Medicine and Clinic, University Magna Græcia, 88100, Catanzaro, Italy

    Mario Cannataro

  6. Computer Science Department, University of Pisa, Italy

    Marco Danelutto

  7. Second University of Naples, Italy

    Beniamino Di Martino

  8. TU München, Boltzmannstr. 3, 85748, Garching, Germany

    Michael Gerndt

  9. Equipe Runtime, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France

    Emmanuel Jeannot & Raymond Namyst

  10. Equipe HIEPACS, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France

    Jean Roman

  11. Computer Science and Mathematics Division, Oak Ridge National Laboratory, 37831-6164, Oak Ridge, TN, USA

    Stephen L. Scott

  12. Department of Scientific Computing, University of Vienna, Nordbergstr. 15/3C, 1090, Vienna, Austria

    Jesper Larsson Traff

  13. Computer Science and Mathematics Division, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA

    Geoffroy Vallée

  14. Technische Universität München, Germany

    Josef Weidendorfer


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dandapanthula, N. et al. (2012). INAM - A Scalable InfiniBand Network Analysis and Monitoring Tool. In: Alexander, M., et al. Euro-Par 2011: Parallel Processing Workshops. Euro-Par 2011. Lecture Notes in Computer Science, vol 7156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29740-3_20

  • DOI: https://doi.org/10.1007/978-3-642-29740-3_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29739-7

  • Online ISBN: 978-3-642-29740-3

  • eBook Packages: Computer Science, Computer Science (R0)
