Abstract
Parallel applications used to run alone, from launch to termination, on partitions of supercomputers: a very static environment for very static applications. The recent shift to multicore architectures for desktop and embedded systems, together with the emergence of cloud computing, raises the problem of how the execution context affects performance. The number of criteria to take into account is significant: architecture, system, workload, dynamic parameters, etc. Finding the best optimization for every context at compile time is clearly out of reach. Dynamic optimization is the natural solution, but it is often costly in execution time and may offset the benefit of the optimization it enables. In this paper, we present a static-dynamic compiler optimization technique that generates loop-based programs with dynamic auto-tuning capabilities at very low overhead. Our strategy introduces switchable scheduling, a family of program transformations that allows switching between optimized versions while always performing useful computation. We present both the technique for generating self-adaptive programs based on switchable scheduling and experimental evidence of their ability to sustain high performance in a dynamic environment.
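The idea of switching between versions while always doing useful work can be pictured with a small sketch. It is an illustration only and not the transformation described in the paper: the iteration space is cut by hand into chunks, any version may process any chunk, and the chunks used to time each version are part of the final result, so no computation is wasted on measurement. All names (`version_nested`, `version_zip`, `run_adaptive`, `chunk`, `period`) are hypothetical, and the sketch assumes a non-empty matrix and `period` at least as large as the number of versions.

```python
import time

def version_nested(a, b, out, lo, hi):
    """Version 0: element-wise sum of rows lo..hi-1 with explicit nested loops."""
    for i in range(lo, hi):
        for j in range(len(a[i])):
            out[i][j] = a[i][j] + b[i][j]

def version_zip(a, b, out, lo, hi):
    """Version 1: same result for rows lo..hi-1, built one row at a time."""
    for i in range(lo, hi):
        out[i] = [x + y for x, y in zip(a[i], b[i])]

def run_adaptive(a, b, versions, chunk=8, period=16):
    """Compute a + b chunk by chunk, switching versions at chunk boundaries.

    At the start of every `period` chunks, each version is timed on one
    chunk of real work; the fastest one then handles the rest of the period.
    Returns the result and the list of version indices used per chunk.
    """
    n = len(a)
    out = [[0] * len(a[0]) for _ in range(n)]
    cost = [float("inf")] * len(versions)  # last measured seconds per row
    best = 0
    trace = []
    lo = 0
    step = 0
    while lo < n:
        hi = min(lo + chunk, n)
        phase = step % period
        sampling = phase < len(versions)
        v = phase if sampling else best
        start = time.perf_counter()
        versions[v](a, b, out, lo, hi)  # useful work, whichever version runs
        elapsed = time.perf_counter() - start
        if sampling:
            cost[v] = elapsed / (hi - lo)
            best = min(range(len(versions)), key=cost.__getitem__)
        trace.append(v)
        lo = hi
        step += 1
    return out, trace
```

Calling `run_adaptive(a, b, [version_nested, version_zip])` on two equally shaped lists of lists returns their element-wise sum together with a trace of which version handled each chunk. Re-sampling at every period is one simple way to react when the execution context changes mid-run; a generated program could use a different policy.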
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Bagnères, L., Bastoul, C. (2014). Switchable Scheduling for Runtime Adaptation of Optimization. In: Silva, F., Dutra, I., Santos Costa, V. (eds) Euro-Par 2014 Parallel Processing. Euro-Par 2014. Lecture Notes in Computer Science, vol 8632. Springer, Cham. https://doi.org/10.1007/978-3-319-09873-9_19
DOI: https://doi.org/10.1007/978-3-319-09873-9_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09872-2
Online ISBN: 978-3-319-09873-9
eBook Packages: Computer Science, Computer Science (R0)