A Mixed Method of Parallel Software Auto-Tuning Using Statistical Modeling and Machine Learning

  • Anatoliy Doroshenko
  • Pavlo Ivanenko
  • Oleksandr Novak
  • Olena Yatsenko
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1007)


A mixed method that combines formal and auto-tuning approaches to maximize the efficiency of parallel programs (in terms of execution time) is proposed. The formal approach is based on algorithmic algebra and on tools for the automated design and synthesis of programs from high-level algorithm specifications (schemes). Parallel software auto-tuning is a method of adjusting some structural parameters of a program to a target hardware platform in order to speed up computation as much as possible. Previously, we developed a framework intended to automate the generation of an auto-tuner from program source code. However, auto-tuning of complex and nontrivial parallel systems is usually time-consuming, because a huge number of parameter value combinations of the initial parallel program must be evaluated empirically in the target environment. In this paper, we extend our approach with statistical modeling and neural network algorithms that significantly reduce the space of possible parameter combinations. The improvement consists in automatically training a neural network model on the results of “traditional” tuning cycles and then replacing some auto-tuner calls with an evaluation from the statistical model. In particular, the method allows knowledge about the influence of parameters on program performance to be transferred between “similar” (in terms of hardware architecture) computing environments for the same application: the idea is to reuse a model trained on data from a similar environment. The use of the method is illustrated by an example of tuning a parallel sorting program that combines several sorting methods.
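The idea of replacing some empirical tuning runs with model evaluations can be illustrated with a minimal Python sketch. It is not the implementation described in the paper: the search space (an insertion-sort cutoff and a thread count), the synthetic `measure` function that stands in for a real timed run, and all function names are assumptions made for illustration, and a simple k-nearest-neighbour regressor is swapped in for the neural network to keep the example self-contained.

```python
import random

# Hypothetical search space: insertion-sort cutoff and worker-thread count.
THRESHOLDS = list(range(16, 257, 16))
THREADS = [1, 2, 4, 8, 16]
SPACE = [(th, t) for th in THRESHOLDS for t in THREADS]


def measure(cfg):
    """Stand-in for one empirical tuning run (synthetic cost, not real timing)."""
    threshold, threads = cfg
    return abs(threshold - 96) / 32 + 8 / threads + 0.1 * threads


def knn_predict(measured, cfg, k=3):
    """Surrogate model: mean cost of the k measured configurations nearest to cfg."""
    def dist(a, b):
        # Scale both axes to comparable ranges before comparing.
        return ((a[0] - b[0]) / 256) ** 2 + ((a[1] - b[1]) / 16) ** 2
    nearest = sorted(measured, key=lambda item: dist(item[0], cfg))[:k]
    return sum(cost for _, cost in nearest) / len(nearest)


def tune(sample_size=20, verify_top=10, seed=0):
    """Measure a random sample, rank the rest with the surrogate, verify the top few."""
    rng = random.Random(seed)
    sampled = rng.sample(SPACE, sample_size)
    measured = [(cfg, measure(cfg)) for cfg in sampled]
    sampled_set = set(sampled)
    rest = [cfg for cfg in SPACE if cfg not in sampled_set]
    # Model evaluations replace empirical runs for the unmeasured configurations.
    ranked = sorted(rest, key=lambda cfg: knn_predict(measured, cfg))
    # Only the most promising candidates are actually executed.
    measured += [(cfg, measure(cfg)) for cfg in ranked[:verify_top]]
    best_cfg, best_cost = min(measured, key=lambda item: item[1])
    return best_cfg, best_cost, len(measured)


if __name__ == "__main__":
    cfg, cost, runs = tune()
    print(f"best configuration {cfg}, cost {cost:.2f}, {runs} of {len(SPACE)} runs")
```

With the default arguments the sketch performs 30 empirical runs instead of the 80 an exhaustive search would need. In a setting closer to the one in the abstract, the list of measured samples could come from a similar hardware environment rather than from the target machine.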


Algorithmic algebra, Automated program design, Auto-tuning, Machine learning, Neural network, Parallel computation, Statistical modeling



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Anatoliy Doroshenko (1)
  • Pavlo Ivanenko (1)
  • Oleksandr Novak (1)
  • Olena Yatsenko (1), corresponding author
  1. Institute of Software Systems of National Academy of Sciences of Ukraine, Kyiv, Ukraine
