INAM - A Scalable InfiniBand Network Analysis and Monitoring Tool

  • N. Dandapanthula
  • H. Subramoni
  • J. Vienne
  • K. Kandalla
  • S. Sur
  • Dhabaleswar K. Panda
  • Ron Brightwell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7156)

Abstract

As InfiniBand (IB) clusters grow in size and scale, predicting the behavior of the IB network in terms of link usage and performance becomes increasingly challenging. No open-source tool currently allows users to dynamically analyze and visualize the communication pattern and link usage in the IB network. In this context, we design and develop INAM, a scalable InfiniBand Network Analysis and Monitoring tool. INAM monitors IB clusters in real time and queries the subnet management entities in the IB network to gather the performance counters specified by the IB standard. We provide an easy-to-use web-based interface to visualize the performance counters and subnet management attributes of a cluster on demand. INAM can also capture the communication characteristics of a subset of links in the network. Our experimental results show that INAM accurately visualizes the link utilization as well as the communication pattern of target applications.

Keywords

Leaf Node, Communication Pattern, Link Utilization, Performance Counter, Link Usage
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • N. Dandapanthula (1)
  • H. Subramoni (1)
  • J. Vienne (1)
  • K. Kandalla (1)
  • S. Sur (1)
  • Dhabaleswar K. Panda (1)
  • Ron Brightwell (2)

  1. Department of Computer Science and Engineering, The Ohio State University, USA
  2. Sandia National Laboratories, Albuquerque, USA
