Abstract

One disappointing contrast in parallel systems is the gap between their peak performance and the actual performance of parallel applications. As parallel systems grow in size and complexity, this gap becomes increasingly important, which justifies the search for metrics, techniques, and tools that let users understand the sources of performance degradation. Understanding performance is important not only for improving the efficiency of applications but also for guiding enhancements to parallel architectures and parallel programming environments.

Keywords

Performance Prediction; System Size; Parallel Algorithm; Parallel Program; Parallel System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media New York 1999

Authors and Affiliations

  • Xingfu Wu (1, 2)
  1. Department of Computer Science, Louisiana State University, USA
  2. State Key Laboratory for Novel Software Technology at Nanjing University, China
