Automatic detection of power bottlenecks in parallel scientific applications

  • María Barreda
  • Sandra Catalán
  • Manuel F. Dolz
  • Rafael Mayo
  • Enrique S. Quintana-Ortí
Special Issue Paper


In this paper we present an extension of the pmlib framework for power-performance analysis that enables rapid, automatic detection of power sinks during the execution of concurrent scientific workloads. The extension is implemented as a multithreaded Python module that offers high reliability and flexibility, and yields an overall inspection process with low overhead. Additionally, we investigate the advantages and drawbacks of the RAPL (Running Average Power Limit) power model, introduced in the Intel Xeon “Sandy Bridge” CPU, compared with a data acquisition system from National Instruments.
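As a rough illustration of how RAPL-style measurements become power figures, the sketch below converts two readings of a cumulative energy counter into an average power value. It is not the pmlib implementation: the function name, its parameters, and the microjoule unit (as exposed, for example, by the Linux powercap interface) are assumptions made for this example, and it assumes the counter wrapped at most once between the two readings.

```python
def average_power_watts(e_start_uj, e_end_uj, elapsed_s, max_range_uj):
    """Average power in watts between two cumulative energy readings.

    Hypothetical helper for illustration only.
    e_start_uj, e_end_uj: counter values in microjoules.
    elapsed_s: wall-clock time between the two readings, in seconds.
    max_range_uj: value at which the counter wraps back to zero.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    delta_uj = e_end_uj - e_start_uj
    if delta_uj < 0:
        # The counter overflowed once between the readings (assumption).
        delta_uj += max_range_uj
    # Microjoules to joules, then joules per second gives watts.
    return delta_uj / 1e6 / elapsed_s
```

Sampling the counter at a fixed interval and applying this conversion to consecutive readings gives a power trace that can be compared against one captured by an external measurement device.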


Power efficiency; High performance computing; Profiling and tracing power consumption; Scientific applications


Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • María Barreda (1)
  • Sandra Catalán (1)
  • Manuel F. Dolz (2)
  • Rafael Mayo (1)
  • Enrique S. Quintana-Ortí (1)

  1. Depto. de Ingeniería y Ciencia de Computadores, Univ. Jaume I, Castellón, Spain
  2. Department of Informatics, University of Hamburg, Hamburg, Germany
