Exact Dependence Analysis for Increased Communication Overlap

  • Simone Pellegrini
  • Torsten Hoefler
  • Thomas Fahringer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7490)

Abstract

MPI programs are often challenged to scale up to several million cores. In doing so, the programmer tunes every aspect of the application code. However, for large applications this is often not practical, and expensive tracing tools and post-mortem analysis are employed to guide the tuning effort by finding hot spots and performance bottlenecks. In this paper, we revive the use of compiler analysis techniques to automatically unveil opportunities for communication/computation overlap, using the results of exact data dependence analysis provided by the polyhedral model. We apply our technique to a 5-point stencil code, showing performance improvements of up to 28% using 512 cores.
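
To make the idea of overlap opportunities in a 5-point stencil concrete, the following is a minimal sketch and not the paper's method: it enumerates the iteration space by brute force instead of using the polyhedral model. It assumes a local block surrounded by a one-cell halo that neighbouring ranks fill by message passing; the function name `split_iterations` and the block size are hypothetical choices made for illustration.

```python
# Offsets read by a 5-point stencil at point (i, j): centre plus four neighbours.
STENCIL = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def split_iterations(rows, cols):
    """Partition the iteration space of a local rows x cols block.

    Assumption: the block is surrounded by a one-cell halo filled by
    messages from neighbouring ranks. An iteration depends on the
    communication exactly when at least one of its reads falls in the halo.
    """
    independent, dependent = [], []
    for i in range(rows):
        for j in range(cols):
            reads_halo = any(
                not (0 <= i + di < rows and 0 <= j + dj < cols)
                for di, dj in STENCIL
            )
            (dependent if reads_halo else independent).append((i, j))
    return independent, dependent


independent, dependent = split_iterations(6, 6)
print(len(independent), len(dependent))  # prints "16 20"
```

Under these assumptions, the 16 interior iterations read no halo cell, so they could in principle run while a nonblocking halo exchange is still in flight; the 20 boundary iterations would have to wait for it to complete.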

Keywords

Message passing, Compiler analysis, Data dependence analysis, Polyhedral model

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Simone Pellegrini (1)
  • Torsten Hoefler (2, 3)
  • Thomas Fahringer (1)
  1. Institute of Informatics, University of Innsbruck, Austria
  2. University of Illinois at Urbana-Champaign, USA
  3. Department of Computer Science, ETH Zurich, Switzerland
