International Journal of Parallel Programming, Volume 35, Issue 3, pp 207–232

Dynamic Binary Instrumentation and Data Aggregation on Large Scale Systems

  • Gregory L. Lee
  • Martin Schulz
  • Dong H. Ahn
  • Andrew Bernat
  • Bronis R. de Supinski
  • Steven Y. Ko
  • Barry Rountree
Special Issue on High-End Computing

Dynamic binary instrumentation for performance analysis on large scale architectures such as the IBM Blue Gene/L system (BG/L) poses unique challenges. The unprecedented scale of these systems and their often limited OS support require new mechanisms to organize binary instrumentation, to interact with the target application, and to collect the resulting data.

We describe the design and current status of a new implementation of the Dynamic Probe Class Library (DPCL) API for large scale systems. DPCL provides an easy-to-use layer for dynamic instrumentation of parallel MPI applications, based on the DynInst dynamic instrumentation library for sequential platforms. Our work includes modifying DynInst to control instrumentation from remote I/O nodes and porting DPCL’s communication for performance data collection to use MRNet, a tree-based overlay network (TBON) that supports scalable multicast and data reduction. We describe extensions to the DPCL API that support instrumentation of task subsets and aggregation of collected performance data.
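The data reduction idea behind a tree-based overlay network can be illustrated with a minimal sketch. This is not MRNet's or DPCL's API: the `Node` class, `build_tree`, `reduce_up`, and `max_fan_in` are hypothetical names introduced only for illustration, and the tree is simulated in a single process. The point it shows is that each internal node combines the values of a bounded number of children, so no single process has to receive one message per task.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One process in a hypothetical overlay tree (not an MRNet class)."""
    value: int = 0  # local sample; only meaningful at leaves
    children: List["Node"] = field(default_factory=list)


def build_tree(leaf_values: List[int], fanout: int) -> Node:
    """Group nodes under parents, level by level, until one root remains."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not leaf_values:
        raise ValueError("need at least one leaf")
    level = [Node(value=v) for v in leaf_values]
    while len(level) > 1:
        level = [Node(children=level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0]


def reduce_up(node: Node, combine: Callable[[int, int], int]) -> int:
    """Combine values bottom-up; each internal node forwards one result."""
    if not node.children:
        return node.value
    partials = [reduce_up(child, combine) for child in node.children]
    result = partials[0]
    for partial in partials[1:]:
        result = combine(result, partial)
    return result


def max_fan_in(node: Node) -> int:
    """Largest number of children any single node has to listen to."""
    if not node.children:
        return 0
    return max(len(node.children), *(max_fan_in(c) for c in node.children))


if __name__ == "__main__":
    # Ten simulated tasks, each reporting one counter value.
    root = build_tree(list(range(1, 11)), fanout=4)
    print("sum:", reduce_up(root, lambda a, b: a + b))
    print("max:", reduce_up(root, max))
    print("max fan-in:", max_fan_in(root))
```

With a flat collection scheme the front end would receive one message per task; in the sketch, the fan-in at any node is bounded by the chosen fanout regardless of the number of leaves.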

Keywords

Massively parallel architectures; binary instrumentation; scalable data collection; performance analysis tools

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Gregory L. Lee (1)
  • Martin Schulz (1)
  • Dong H. Ahn (1)
  • Andrew Bernat (2)
  • Bronis R. de Supinski (1)
  • Steven Y. Ko (3)
  • Barry Rountree (4)

  1. Lawrence Livermore National Laboratory, Livermore, USA
  2. University of Wisconsin, Madison, USA
  3. University of Illinois, Urbana-Champaign, USA
  4. University of Georgia, Athens, USA
