The Journal of Supercomputing, Volume 62, Issue 3, pp 1609–1634

A framework for efficient performance prediction of distributed applications in heterogeneous systems

  • Bogdan Florin Cornea
  • Julien Bourgeois

Abstract

Predicting the performance of distributed applications is a constant challenge for researchers, and the difficulty increases when heterogeneous systems are involved. Research conducted so far is limited by application type, programming language, or target system; the employed models become too complex and the prediction cost rises significantly. We propose dPerf, a new performance prediction tool. In dPerf, we extended existing methods from the Rose and SimGrid frameworks, and proposed and implemented new methods, so that dPerf performs (i) static code analysis and (ii) trace-based simulation. Based on these two phases, dPerf predicts the performance of C, C++, and Fortran applications communicating over MPI or P2PSAP. Neither of the underlying frameworks was developed explicitly for performance prediction, which makes dPerf a novel tool. The accuracy of dPerf is validated on a sequential Laplace code and a parallel NAS benchmark. dPerf yields accurate results at a low prediction cost and with a high gain.
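
The two-phase approach described in the abstract can be illustrated with a minimal sketch: per-block execution costs (as a block benchmarking step might produce) are combined with a trace of compute and communication events, and communication is charged with a simple latency/bandwidth model. All names, timings, and network parameters below are invented for illustration; this is not dPerf's actual implementation.

```python
# Illustrative sketch of trace-based performance prediction.
# Assumed (not from the paper): block names, timings, network parameters.

LATENCY = 1e-4        # seconds per message (assumed)
BANDWIDTH = 1e8       # bytes per second (assumed)

# Benchmarked cost of each basic block, in seconds per execution.
block_cost = {"init": 0.002, "stencil": 0.0005, "reduce": 0.0001}

# A trace of events: ("compute", block, count) or ("send", nbytes).
trace = [
    ("compute", "init", 1),
    ("compute", "stencil", 10_000),
    ("send", 8 * 1024),
    ("compute", "reduce", 100),
]

def predict(trace):
    """Sum block costs weighted by execution counts, plus a
    latency + size/bandwidth charge for each message."""
    total = 0.0
    for event in trace:
        if event[0] == "compute":
            _, block, count = event
            total += block_cost[block] * count
        else:  # "send"
            _, nbytes = event
            total += LATENCY + nbytes / BANDWIDTH
    return total

print(f"predicted runtime: {predict(trace):.4f} s")
# → predicted runtime: 5.0122 s
```

In an actual tool, the block costs would come from benchmarking on the target machine and the communication model from a calibrated network simulator rather than the two constants assumed here.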


Keywords

Performance prediction · Distributed applications · Automatic static analysis · Block benchmarking · Trace-based simulation · dPerf



Acknowledgements

This work is funded by the French National Agency for Research under the ANR-07-CIS7-011-01 contract [2].


References

  1. Adve VS, Bagrodia R, Browne JC, Deelman E, Dube A, Houstis EN, Rice JR, Sakellariou R, Sundaram-Stukel DJ, Teller PJ, Vernon MK (2000) POEMS: End-to-end performance design of large parallel adaptive computational systems. IEEE Trans Softw Eng 26:1027–1048
  2. ANR CIP project web page.
  3. Badia RM, Escalé F, Gabriel E, Gimenez J, Keller R, Labarta J, Müller MS (2004) Performance prediction in a grid environment. In: Grid computing. Lecture notes in computer science, vol 2970. Springer, Berlin/Heidelberg, pp 257–264
  4. Bailey DH, Barszcz E, Barton JT, Browning DS, Carter RL, Dagum L, Fatoohi RA, Frederickson PO, Lasinski TA, Schreiber RS, Simon HD, Venkatakrishnan V, Weeratunga SK (1991) The NAS parallel benchmarks—summary and preliminary results. In: SC'91: proceedings of the 1991 ACM/IEEE conference on supercomputing. ACM Press, New York, pp 158–165
  5. Bourgeois J, Spies F (2000) Performance prediction of an NAS benchmark program with ChronosMix environment. In: Euro-Par'00: the 6th international Euro-Par conference on parallel processing. Springer, Berlin, pp 208–216
  6. Casanova H, Legrand A, Quinson M (2008) SimGrid: a generic framework for large-scale distributed experiments. In: UKSIM'08: proceedings of the 10th int conference on computer modeling and simulation. IEEE Computer Society, Los Alamitos, pp 126–131
  7. Cornea BF, Bourgeois J (2011) Performance prediction of distributed applications using block benchmarking methods. In: PDP'11: 19th int Euromicro conf on parallel, distributed and network-based processing. IEEE Computer Society, Los Alamitos
  8.
  9. Culler D, Karp R, Patterson D, Sahay A, Schauser KE, Santos E, Subramonian R, von Eicken T (1993) LogP: towards a realistic model of parallel computation. ACM Press, New York, pp 1–12
  10. El Baz D, Nguyen TT (2010) A self-adaptive communication protocol with application to high performance peer to peer distributed computing. In: PDP'10: proceedings of the 18th Euromicro conference on parallel, distributed and network-based processing. IEEE Computer Society, Los Alamitos, pp 327–333
  11. Ernst-Desmulier JB, Bourgeois J, Spies F, Verbeke J (2005) Adding new features in a peer-to-peer distributed computing framework. In: PDP'05: proceedings of the 13th Euromicro conference on parallel, distributed and network-based processing. IEEE Computer Society, Los Alamitos, pp 34–41
  12. Ernst-Desmulier JB, Bourgeois J, Spies F (2008) P2PPerf: a framework for simulating and optimizing peer-to-peer distributed computing applications. Concurr Comput 20(6):693–712
  13. Fahringer T (1996) On estimating the useful work distribution of parallel programs under the P3T: a static performance estimator. Concurr Pract Exp 8:28–32
  14. Fahringer T, Zima HP (1993) A static parameter based performance prediction tool for parallel programs. In: ICS'93: proceedings of the 7th international conference on supercomputing. ACM Press, New York, pp 207–219
  15. Finney SA (2001) Real-time data collection in Linux: a case study. Behav Res Methods Instrum Comput 33:167–173
  16. Laplace transform instrumented with dPerf; simple block benchmarking method.
  17.
  18. Li J, Shi F, Deng N, Zuo Q (2009) Performance prediction based on hierarchy parallel features captured in multi-processing system. In: HPDC'09: proc of the 18th ACM int symposium on high performance distributed computing. ACM Press, New York, pp 63–64
  19. Livadas PE, Croll S (1994) System dependence graphs based on parse trees and their use in software maintenance. Inf Sci 76(3–4):197–232
  20. Marin G (2007) Application insight through performance modeling. In: IPCCC'07: proceedings of the performance, computing, and comm. conf. IEEE Computer Society, Los Alamitos
  21. Marin G, Mellor-Crummey J (2004) Cross-architecture performance predictions for scientific applications using parameterized models. In: SIGMETRICS'04/Performance'04: proceedings of the joint international conference on measurement and modeling of computer systems. ACM Press, New York, pp 2–13
  22.
  23. Nguyen TT, El Baz D, Spiteri P, Jourjon G, Chau M (2010) High performance peer-to-peer distributed computing with application to obstacle problem. In: IPDPSW'10: IEEE international symposium on parallel distributed processing, workshops and PhD forum, pp 1–8
  24. Noeth M, Marathe J, Mueller F, Schulz M, de Supinski B (2006) Scalable compression and replay of communication traces in massively parallel environments. In: SC'06: proceedings of the 2006 ACM/IEEE conference on supercomputing. ACM Press, New York, p 144
  25. PAPI project website.
  26.
  27. Perfmon project webpage.
  28. Pettersson M (2012) Perfctr project webpage.
  29. Prakash S, Bagrodia RL (1998) MPI-SIM: using parallel simulation to evaluate MPI programs. In: WSC'98: proceedings of the 30th conference on winter simulation. IEEE Computer Society Press, Los Alamitos, pp 467–474
  30. Rose LD, Poxon H (2009) A paradigm change: from performance monitoring to performance analysis. In: SBAC-PAD, pp 119–126
  31. Saavedra RH, Smith AJ (1996) Analysis of benchmark characteristics and benchmark performance prediction. ACM Trans Comput Syst 14(4):344–384
  32. Schordan M, Quinlan D (2003) A source-to-source architecture for user-defined optimizations. In: Modular programming languages. Lecture notes in computer science, vol 2789. Springer, Berlin/Heidelberg, pp 214–223
  33. Skinner D, Kramer W (2005) Understanding the causes of performance variability in HPC workloads. In: IEEE workload characterization symposium, pp 137–149
  34. Snavely A, Wolter N, Carrington L (2001) Modeling application performance by convolving machine signatures with application profiles. In: WWC'01: IEEE international workshop on workload characterization. IEEE Computer Society, Los Alamitos, pp 149–156
  35. Sundaram-Stukel D, Vernon MK (1999) Predictive analysis of a wavefront application using LogGP. In: 7th ACM SIGPLAN symposium on principles and practice of parallel programming, vol 34(8). ACM Press, New York, pp 141–150
  36. The message passing interface standard.
  37. van Gemund AJC (2003) Symbolic performance modeling of parallel systems. IEEE Trans Parallel Distrib Syst 14(2):154–165
  38. Zaparanuks D, Jovic M, Hauswirth M (2009) Accuracy of performance counter measurements. In: ISPASS'09: IEEE international symposium on performance analysis of systems and software, pp 23–32
  39. Zhai J, Chen W, Zheng W (2010) Phantom: predicting performance of parallel applications on large-scale parallel machines using a single node. In: PPoPP'10: proceedings of the 15th ACM SIGPLAN symposium on principles and practice of parallel programming. ACM Press, New York, pp 305–314

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. UFC/FEMTO-ST Institute, UMR CNRS 6174, Besançon, France
