ASSIST: An FDO Source-to-Source Transformation Tool for HPC Applications

  • Youenn Lebras
  • Andres S. Charif Rubial
  • Romain Dolbeau
  • William Jalby (email author)
Conference paper

Abstract

The complexity and diversity of computer architectures have evolved dramatically over the last decade, which makes it impossible to manually optimize codes for all of these architectures. In addition, compilers must remain conservative in their optimization choices because they rely on a static cost model. One way to guide them is to use feedback data obtained by profiling a given application on a representative training dataset (FDO/PGO). Based on that knowledge, it becomes possible to add specific compiler directives and/or flags to enhance performance. Moreover, automatic transformations that simplify portions of the application (e.g. specialization) can be applied. In this paper we present ASSIST, a directive-oriented source-to-source manipulation tool that aims to provide such assistance. The tool is integrated into the MAQAO toolset and takes advantage of all the static and dynamic profiling data produced by the other MAQAO tools. It also features a set of code transformations triggered by directives. Combining the two leads to an autotuning process that helps users keep their code as generic as possible while still benefiting from performance gains derived from feedback data or user knowledge. We demonstrate how to build a tool similar to a compiler's PGO mode and compare our first results to the Intel compiler PGO mode.
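As an illustration of the specialization idea mentioned above, the following is a minimal sketch and not ASSIST's actual implementation. It assumes a profiling run has recorded the values taken by one parameter (here a hypothetical `stride`), and it keeps the generic kernel while dispatching to a simplified variant for the most frequent value. All function names are hypothetical.

```python
from collections import Counter


def saxpy_generic(a, x, y, stride):
    """Generic kernel: the stride is only known at run time."""
    return [a * x[i] + y[i] for i in range(0, len(x), stride)]


def saxpy_stride1(a, x, y):
    """Specialized variant: the stride is fixed to 1, so indexing disappears."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def hottest_value(observed):
    """Return the most frequent value seen during the (assumed) profiling run."""
    value, _count = Counter(observed).most_common(1)[0]
    return value


def make_dispatcher(observed_strides):
    """Build a kernel that uses the specialized path only for the hot value."""
    hot = hottest_value(observed_strides)

    def dispatch(a, x, y, stride):
        # Only stride == 1 has a specialized variant in this sketch.
        if hot == 1 and stride == 1:
            return saxpy_stride1(a, x, y)
        return saxpy_generic(a, x, y, stride)

    return dispatch


# Hypothetical profile: the stride was 1 in three of four recorded calls.
kernel = make_dispatcher([1, 1, 1, 4])
result = kernel(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0], 1)
```

The generic version stays in the source, which matches the stated goal of keeping user code generic; only the dispatch decision depends on the feedback data.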

Notes

Acknowledgements

We would like to thank Gabriel Staffelbach (CERFACS) for providing our laboratory with the AVBP application, as well as Ghislain Lartigue and Vincent Moureau (CORIA) for providing us with YALES2. This work has been carried out by the Li-PaRAD laboratory, PeXL, and the Exascale Computing Research laboratory, with the support of CEA, Intel, and UVSQ. Intel granted us dedicated access to a Skylake SP machine on which the experiments were run. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of CEA, Intel, or UVSQ.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Youenn Lebras (1, 3)
  • Andres S. Charif Rubial (1, 2, 3)
  • Romain Dolbeau (4)
  • William Jalby (1, 3), email author

  1. UVSQ/Exascale Computing Research, Versailles, France
  2. PeXL, Versailles, France
  3. Exascale Research Computing, Versailles, France
  4. Atos, Bezons, France
