Abstract
In modern computing, the performance of parallel programs is bound by a dynamic execution context that includes inherent program behavior, resource requirements, co-scheduled programs sharing system resources, hardware failures, and input data. Beyond this dynamic context, optimization goals are increasingly multi-objective and themselves dynamic, such as minimizing execution time while maximizing energy efficiency. Efficiently mapping parallel threads onto hardware cores is crucial to achieving these goals. This paper proposes a novel approach to judiciously map parallel programs to hardware under dynamic contexts and goals. It uses a simple yet novel technique: it collects a set of mapping policies and uses them to determine the number of threads that is optimal for a specific context. It then binds threads to cores for increased affinity. In addition, the approach determines the optimal DVFS levels for these cores to achieve higher energy efficiency. In extensive evaluation against state-of-the-art techniques, this scheme outperforms them by 1.08x to 1.21x, and by 1.39x over the OpenMP default.
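The core idea in the abstract, choosing among a set of mapping policies the thread count that best serves a multi-objective goal, can be illustrated with a minimal sketch. This is not the paper's implementation; the policy set, the measurements, and the weighted cost function below are all illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's method): pick the thread
# count, from candidate mapping policies, that minimizes a weighted
# combination of normalized execution time and energy.

def select_mapping(policies, measure, w_time=0.5, w_energy=0.5):
    """Return the policy (thread count) with the lowest weighted cost.

    `measure(p)` yields an assumed (seconds, joules) pair for policy p;
    both terms are normalized by their maxima so the weights are comparable.
    """
    samples = {p: measure(p) for p in policies}
    t_max = max(t for t, _ in samples.values())
    e_max = max(e for _, e in samples.values())

    def cost(p):
        t, e = samples[p]
        return w_time * (t / t_max) + w_energy * (e / e_max)

    return min(policies, key=cost)

# Hypothetical profile: thread count -> (execution time, energy consumed).
fake_profile = {2: (9.0, 40.0), 4: (5.0, 45.0), 8: (3.5, 70.0)}
best = select_mapping(list(fake_profile), fake_profile.get)
print(best)  # 4: more threads than 2 saves time, 8 costs too much energy
```

A runtime system in the spirit of the paper would additionally pin the chosen threads to cores and lower the DVFS level of unused cores; those steps are platform-specific and omitted here.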
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-CONF-696003.
© 2017 Springer International Publishing AG
Emani, M.K. (2017). Mapping Medley: Adaptive Parallelism Mapping with Varying Optimization Goals. In: Ding, C., Criswell, J., Wu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2016. Lecture Notes in Computer Science(), vol 10136. Springer, Cham. https://doi.org/10.1007/978-3-319-52709-3_22
Print ISBN: 978-3-319-52708-6
Online ISBN: 978-3-319-52709-3