Multi-Objective Auto-Tuning with Insieme: Optimization and Trade-Off Analysis for Time, Energy and Resource Usage

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8632)


The increasing complexity of modern multi- and many-core hardware design makes performance tuning of parallel applications a difficult task. In the past, auto-tuners have been successfully applied to minimize execution time. However, besides execution time, additional optimization goals have recently arisen, such as energy consumption or computing costs. Therefore, more sophisticated methods capable of identifying and exploiting the trade-offs among these goals are required. In this work we present and discuss results of applying a multi-objective search-based auto-tuner to optimize for three conflicting criteria: execution time, energy consumption, and resource usage. We examine a method, called RS-GDE3, to tune HPC codes using the Insieme parallelizing and optimizing compiler. Our results demonstrate that RS-GDE3 offers solutions of superior quality to those provided by a hierarchical and a random search, at a fraction of the required time (5%) or energy (8%). A comparison to a state-of-the-art multi-objective optimizer (NSGA-II) shows that RS-GDE3 computes solutions of higher quality. Finally, based on the trade-off solutions found by RS-GDE3, we provide a detailed analysis and several hints on how to improve the design of multi-objective auto-tuners and code optimization.


Keywords: Execution Time · Resource Usage · Random Search · Target Platform · Tile Size
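The trade-off analysis described in the abstract rests on Pareto dominance over the three stated objectives (execution time, energy consumption, resource usage), all of which are minimized. The following minimal sketch illustrates that concept only; it is not the authors' RS-GDE3 implementation, and the configuration tuples are hypothetical measurements invented for illustration.

```python
def dominates(a, b):
    """True if point a Pareto-dominates point b: a is no worse in every
    objective and strictly better in at least one (all minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated (time, energy, resources) points."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical tuning configurations:
# (execution time [s], energy [J], resource usage [cores])
configs = [(10.0, 500.0, 8), (12.0, 400.0, 8), (12.0, 520.0, 8), (15.0, 700.0, 4)]
print(pareto_front(configs))  # (12.0, 520.0, 8) is dominated by (10.0, 500.0, 8)
```

A multi-objective auto-tuner returns such a front rather than a single optimum, leaving the final time/energy/resource trade-off choice to the user.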



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Institute of Computer Science, University of Innsbruck, Austria
