Abstract
Software engineers depend heavily on compiler technology to produce efficient programs. In the HPC field, execution time is currently the most important criterion; to minimize it, users typically apply the common compiler option -O3. This paper extensively tests the other available performance options and concludes that, although older compiler versions could benefit from combinations of compiler flags, modern compilers perform admirably at the commonly used -O3 level.
The paper presents the Universal Learning Machine (ULM) framework, which combines several tools to predict the best flags from data gathered offline. The ULM framework evaluates three hundred kernels extracted from 144 benchmark applications, automatically processing more than ten thousand compiler flag combinations for each kernel. To make the study comprehensive, the experimental setup includes three modern mainstream compilers and four different architectures. For 62% of kernels, the optimal flag is the generic optimization level -O3.
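The offline exploration described above amounts to compiling and timing each kernel under many flag combinations and keeping the fastest. A minimal driver-loop sketch follows; the kernel file, flag pool, and use of `gcc` are illustrative assumptions, not the ULM implementation:

```python
import itertools
import subprocess
import time

# Hypothetical pool of optimization flags to toggle on top of a base level.
FLAG_POOL = ["-funroll-loops", "-ftree-vectorize", "-ffast-math"]

def time_kernel(flags):
    """Compile a kernel with the given flags and return its wall-clock run time."""
    subprocess.run(["gcc", "-O2", *flags, "kernel.c", "-o", "kernel"], check=True)
    start = time.perf_counter()
    subprocess.run(["./kernel"], check=True)
    return time.perf_counter() - start

def explore():
    """Measure every subset of FLAG_POOL and return the fastest combination."""
    best_flags, best_time = None, float("inf")
    for r in range(len(FLAG_POOL) + 1):
        for combo in itertools.combinations(FLAG_POOL, r):
            elapsed = time_kernel(list(combo))
            if elapsed < best_time:
                best_flags, best_time = list(combo), elapsed
    return best_flags, best_time
```

With a pool of n flags this enumerates 2^n subsets per kernel, which is why the paper's exhaustive search over ten thousand combinations per kernel must be done offline.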
For the remaining 38% of kernels, an extension to the ULM framework lets a user instantly obtain the optimal flag combination through a static prediction method. The prediction method examines three well-known machine learning algorithms: Nearest Neighbor, Stochastic Gradient Descent, and Support Vector Machines (SVM). ULM achieved its best results with SVM, reaching a 92% accuracy rate on the considered kernels.
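The static prediction step maps a kernel's features to the flag combination that offline search found best. Of the algorithms the paper compares, Nearest Neighbor is the simplest to sketch in pure Python; the feature vectors (loop depth, memory accesses, arithmetic intensity) and flag labels below are invented placeholders, not ULM's actual feature set:

```python
import math

# Hypothetical training data: static features per kernel, paired with the
# flag combination that offline exhaustive search found fastest for it.
TRAINING = [
    ([2, 10, 0.5], "-O3"),
    ([1, 3, 2.0], "-O2 -funroll-loops"),
    ([4, 25, 0.1], "-O3 -ffast-math"),
]

def predict_flags(features):
    """Return the flag combination of the closest known kernel (1-NN)."""
    best_label, best_dist = None, math.inf
    for known, label in TRAINING:
        dist = math.dist(features, known)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

print(predict_flags([1, 4, 1.9]))  # → -O2 -funroll-loops
```

An SVM, which the paper found most accurate, replaces this distance lookup with a learned decision boundary but keeps the same interface: features in, flag combination out, with no recompilation or timing needed at prediction time.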
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Kashnikov, Y., Beyler, J.C., Jalby, W. (2013). Compiler Optimizations: Machine Learning versus O3. In: Kasahara, H., Kimura, K. (eds) Languages and Compilers for Parallel Computing. LCPC 2012. Lecture Notes in Computer Science, vol 7760. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37658-0_3
Print ISBN: 978-3-642-37657-3
Online ISBN: 978-3-642-37658-0