Speedy: An integrated performance extrapolation tool for pC++ Programs

  • Bernd W. Mohr
  • Allen D. Malony
  • Kesavan Shanmugam
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 977)


Performance extrapolation is the process of evaluating the performance of a parallel program in a target execution environment using performance information obtained for the same program in a different environment. Performance extrapolation techniques are suited for rapid performance tuning of parallel programs, particularly when the target environment is unavailable. This paper describes one such technique that was developed for data-parallel C++ programs written in the pC++ language. In pC++, the programmer can distribute a collection of objects across processors and have methods invoked on those objects execute in parallel. Using performance extrapolation in the development of pC++ applications allows tuning decisions to be made in advance of detailed execution measurements. The pC++ language system includes Τ, an integrated environment for analyzing and tuning the performance of pC++ programs. This paper presents speedy, a new addition to Τ that predicts the performance of pC++ programs on parallel machines using extrapolation techniques. Speedy uses the existing instrumentation support of Τ to capture high-level event traces of an n-thread pC++ program run on a uniprocessor machine, then applies trace-driven simulation to predict the performance of the program on a target n-processor machine. We describe how speedy works and how it is integrated into Τ. We also show how speedy can be used to evaluate a pC++ program for a given target environment.
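The general idea of trace-driven extrapolation can be illustrated with a minimal sketch. This is not speedy's implementation or its simulation model: the `Event` type, the two event kinds, and the `speed_ratio` and `barrier_cost` parameters are hypothetical names chosen for illustration. The sketch assumes each thread's trace records compute intervals measured on the tracing host, separated by global barriers, and that each thread runs on its own processor in the target environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str              # "compute" or "barrier" (hypothetical event kinds)
    duration: float = 0.0  # compute time measured on the tracing host


def extrapolate(traces, speed_ratio=1.0, barrier_cost=0.0):
    """Predict run time when every traced thread gets its own processor.

    traces       -- one list of Events per thread
    speed_ratio  -- assumed target/host processor speed ratio
    barrier_cost -- assumed fixed cost of one barrier on the target
    """
    for trace in traces:
        for event in trace:
            if event.kind not in ("compute", "barrier"):
                raise ValueError(f"unknown event kind: {event.kind!r}")
    barrier_counts = {sum(e.kind == "barrier" for e in t) for t in traces}
    if len(barrier_counts) != 1:
        raise ValueError("threads must pass the same number of barriers")

    clocks = [0.0] * len(traces)    # simulated time per target processor
    positions = [0] * len(traces)   # next unread event per thread
    while True:
        at_barrier = False
        for tid, trace in enumerate(traces):
            i = positions[tid]
            # Replay compute events, rescaled to the target processor speed.
            while i < len(trace) and trace[i].kind == "compute":
                clocks[tid] += trace[i].duration / speed_ratio
                i += 1
            if i < len(trace):      # stopped at a barrier event
                at_barrier = True
                i += 1
            positions[tid] = i
        if not at_barrier:
            return max(clocks)      # all traces consumed
        # A barrier releases every thread once the slowest one arrives.
        release = max(clocks) + barrier_cost
        clocks = [release] * len(traces)


traces = [
    [Event("compute", 4.0), Event("barrier"), Event("compute", 1.0)],
    [Event("compute", 2.0), Event("barrier"), Event("compute", 3.0)],
]
predicted = extrapolate(traces, speed_ratio=2.0, barrier_cost=0.5)
```

Changing `speed_ratio` or `barrier_cost` and replaying the same traces shows how one uniprocessor measurement can be reused to explore several hypothetical target machines.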


performance prediction, extrapolation, object-parallel programming, trace-driven simulation, performance debugging tools, modeling





Copyright information

© Springer-Verlag Berlin Heidelberg 1995

Authors and Affiliations

  • Bernd W. Mohr (1)
  • Allen D. Malony (1)
  • Kesavan Shanmugam (2)
  1. Department of Computer and Information Science, University of Oregon, Eugene, USA
  2. Convex Computer Corp., Richardson, USA
