Automatic detection of power bottlenecks in parallel scientific applications
In this paper we present an extension of the pmlib framework for power-performance analysis that permits rapid, automatic detection of power sinks during the execution of concurrent scientific workloads. The extension takes the form of a multithreaded Python module that offers high reliability and flexibility while keeping the overhead of the overall inspection process low. Additionally, we investigate the advantages and drawbacks of the RAPL power model, introduced with the Intel Xeon “Sandy-Bridge” CPU, versus a data acquisition system from National Instruments.
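As a rough illustration of the kind of analysis described above, the following Python sketch derives per-interval power from cumulative energy-counter readings (the form in which RAPL reports energy) and flags intervals where power stays high while the application is marked idle. It is not the pmlib module: the function names, the idle flags, the threshold, and the definition of a "power sink" used here are all assumptions made for the example.

```python
from typing import List, Tuple

Reading = Tuple[float, int]            # (timestamp in s, cumulative energy in microjoules)
Interval = Tuple[float, float, float]  # (start s, end s, average power in W)


def interval_power(readings: List[Reading], max_range_uj: int) -> List[Interval]:
    """Convert cumulative energy readings into average power per interval.

    max_range_uj is the value at which the counter wraps around
    (hypothetical parameter; on real hardware it must be queried).
    """
    powers = []
    for (t0, e0), (t1, e1) in zip(readings, readings[1:]):
        delta_uj = e1 - e0
        if delta_uj < 0:
            # The counter wrapped between the two samples.
            delta_uj += max_range_uj
        powers.append((t0, t1, delta_uj / 1e6 / (t1 - t0)))
    return powers


def find_power_sinks(powers: List[Interval], idle: List[bool],
                     threshold_w: float) -> List[Interval]:
    """Return intervals where the application is idle but power exceeds threshold_w.

    This definition of a power sink is an assumption for illustration only.
    """
    return [p for p, is_idle in zip(powers, idle) if is_idle and p[2] > threshold_w]


# Made-up readings; the last sample shows a counter wrap-around.
readings = [(0.0, 0), (1.0, 50_000_000), (2.0, 60_000_000), (3.0, 5_000_000)]
powers = interval_power(readings, max_range_uj=100_000_000)
sinks = find_power_sinks(powers, idle=[False, True, True], threshold_w=30.0)
print(sinks)
```

In a real setup the readings would come from the measurement source (RAPL counters or an external data acquisition system) and the idle flags from an application trace; both are supplied by hand here.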
Computer Science - Research and Development
Volume 29, Issue 3-4 , pp 221-229
- Springer Berlin Heidelberg
- Power efficiency
- High performance computing
- Profiling and tracing power consumption
- Scientific applications