A Community Databank for Performance Tracefiles
Tracefiles provide a convenient record of the behavior of HPC programs, but are not generally archived because of their storage requirements. This has hindered the developers of performance analysis tools, who must create their own tracefile collections in order to test tool functionality and usability. This paper describes a shared databank where members of the HPC community can deposit tracefiles for use in studying the performance characteristics of HPC platforms as well as in tool development activities. We describe how the Tracefile Testbed was designed and implemented to facilitate flexible searching and retrieval of tracefiles. A Web-based interface provides a convenient mechanism for browsing and downloading collections of tracefiles and tracefile segments based on a variety of characteristics. The paper discusses the key implementation challenges.