Performance Issues in Parallel Processing Systems

  • Luiz A. DeRose
  • Mario Pantano
  • Daniel A. Reed
  • Jeffrey S. Vetter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1769)


Simply put, the goal of performance analysis is to provide the data and insights required to optimize the execution behavior of application or system components. Using such data and insights, application and system developers can choose to optimize software and execution environments along many axes, including execution time, memory requirements, and resource use. Given the diversity of performance optimization goals and the wide range of possible problems, a complete performance analysis toolkit necessarily includes a broad range of techniques. These range from mechanisms for simple code timings to multi-level hardware/software measurement and correlation across networks, system software, runtime libraries, compile-time code transformations, and adaptive execution.
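As a hedged illustration of the simplest end of that spectrum, the sketch below wraps a code fragment with a wall-clock timer using Python's `time.perf_counter`; the `timed` decorator and the sample `dot` function are illustrative assumptions, not constructs from the chapter.

```python
import time


def timed(fn):
    """Wrap a function so each call records its wall-clock duration."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        # Store the measurement on the wrapper so a caller can inspect it later.
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    return wrapper


@timed
def dot(a, b):
    """A small code fragment whose execution time we want to measure."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    n = 100_000
    value = dot(range(n), range(n))
    print(f"dot product of {n} elements: {value}")
    print(f"elapsed: {dot.last_elapsed:.6f} s")
```

Such simple timings only attribute cost at function granularity; the multi-level measurement techniques surveyed in this chapter correlate far finer-grained hardware and software events.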







Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Luiz A. DeRose¹
  • Mario Pantano¹
  • Daniel A. Reed¹
  • Jeffrey S. Vetter¹

  1. Department of Computer Science, University of Illinois, Urbana, USA
