Abstract
Building effective optimization heuristics is a challenging task that often takes developers several months, if not years, to complete. Predictive modelling has recently emerged as a promising solution, automatically constructing heuristics from training data. However, obtaining this data can take months per platform. This is becoming an ever more critical problem: without a solution, we will be left with out-of-date heuristics that cannot extract the best performance from modern machines.
In this work, we present a low-cost predictive modelling approach for automatic heuristic construction which significantly reduces this training overhead. Typically, in supervised learning, training instances are selected for evaluation at random, regardless of how much useful information they carry. This wastes effort on parts of the space that contribute little to the quality of the produced heuristic. Our approach, by contrast, uses active learning to select and focus only on the most useful training examples.
We demonstrate this technique by automatically constructing a model that determines on which device to execute four parallel programs at differing problem dimensions for a representative CPU–GPU heterogeneous system. Our methodology is remarkably simple yet effective, making it a strong candidate for wide adoption. At high levels of classification accuracy, the average learning speed-up is 3x compared to the state of the art.
Notes
- 1.
In passive learning, training examples are selected without feedback on the quality of the machine-learned heuristic. Usually, this means that all training examples are generated ahead of time and a heuristic is learned once. In active learning, by contrast, training examples are selected iteratively, driven by feedback about the quality of the heuristic.
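The iterative selection loop described above can be sketched with a query-by-committee strategy, one common active learning method. The sketch below is illustrative only and is not the paper's implementation: the 1-nearest-neighbour committee members, the `oracle` labelling function, and the disagreement measure are all simplifying assumptions chosen to keep the example self-contained.

```python
import random

def nn_predict(train, x):
    """1-nearest-neighbour label prediction (a toy committee member).
    `train` is a list of (features, label) pairs."""
    return min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[1]

def disagreement(votes):
    """Simple disagreement measure: number of distinct labels proposed."""
    return len(set(votes))

def active_learn(pool, oracle, budget, committee_size=3, seed=0):
    """Query-by-committee active learning: instead of labelling pool points
    at random (passive learning), iteratively label the point the committee
    disagrees on most, until `budget` labelled examples are collected."""
    rng = random.Random(seed)
    labelled = []
    unlabelled = list(pool)
    # Seed the training set with two randomly chosen labelled examples.
    for _ in range(2):
        x = unlabelled.pop(rng.randrange(len(unlabelled)))
        labelled.append((x, oracle(x)))
    while len(labelled) < budget and unlabelled:
        # Each committee member is trained on a bootstrap resample
        # of the labelled data, so members can disagree.
        committee = [[labelled[rng.randrange(len(labelled))] for _ in labelled]
                     for _ in range(committee_size)]
        # Query the oracle on the point with maximal committee disagreement.
        x = max(unlabelled,
                key=lambda p: disagreement([nn_predict(m, p) for m in committee]))
        unlabelled.remove(x)
        labelled.append((x, oracle(x)))
    return labelled
```

In a device-mapping setting, the "oracle" corresponds to actually running a program configuration on both devices and recording the winner, which is exactly the expensive step that this selection strategy aims to invoke as rarely as possible.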
Acknowledgements
This work was funded under the EPSRC grant, ALEA (EP/H044752/1).
© 2015 Springer International Publishing Switzerland
Cite this paper
Ogilvie, W.F., Petoumenos, P., Wang, Z., Leather, H. (2015). Fast Automatic Heuristic Construction Using Active Learning. In: Brodman, J., Tu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2014. Lecture Notes in Computer Science(), vol 8967. Springer, Cham. https://doi.org/10.1007/978-3-319-17473-0_10
Print ISBN: 978-3-319-17472-3
Online ISBN: 978-3-319-17473-0