
RECIPE: A Grammar-Based Framework for Automatically Evolving Classification Pipelines

  • Alex G. C. de Sá
  • Walter José G. S. Pinto
  • Luiz Otavio V. B. Oliveira
  • Gisele L. Pappa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10196)

Abstract

Automatic Machine Learning is a growing area of machine learning that shares an objective with the area of hyper-heuristics: to automatically recommend optimized pipelines, algorithms or appropriate parameters for specific tasks without much dependency on user knowledge. The background knowledge required to solve the task at hand is embedded into a search mechanism that builds personalized solutions to the task. Following this idea, this paper proposes RECIPE (REsilient ClassifIcation Pipeline Evolution), a framework based on grammar-based genetic programming that builds customized classification pipelines. The framework is flexible enough to receive different grammars and can be easily extended to other machine learning tasks. RECIPE overcomes the drawbacks of previous evolutionary-based frameworks, such as generating invalid individuals, and organizes a large number of suitable data pre-processing and classification methods into a grammar. The f-measure results obtained by RECIPE are compared to those of two state-of-the-art methods, and are shown to be as good as or better than those previously reported in the literature. RECIPE represents a first step towards a complete framework for dealing with different machine learning tasks with the minimum required human intervention.
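As a rough illustration of why deriving individuals from a grammar avoids invalid pipelines, the sketch below expands a toy context-free grammar into lists of pipeline steps. The grammar, the step names, and the helpers `derive` and `is_valid` are assumptions made for this example only; they are not RECIPE's grammar or implementation, and the step names are plain strings rather than calls into any library.

```python
import random

# Toy grammar (hypothetical, NOT the grammar used by RECIPE).
# Keys are non-terminals; each value lists the alternative productions.
GRAMMAR = {
    "<pipeline>": [["<classifier>"], ["<preprocess>", "<classifier>"]],
    "<preprocess>": [["<scaler>"], ["<selector>"], ["<scaler>", "<selector>"]],
    "<scaler>": [["StandardScaler"], ["MinMaxScaler"]],
    "<selector>": [["SelectKBest(k=5)"], ["SelectKBest(k=10)"]],
    "<classifier>": [["NaiveBayes"], ["<tree>"], ["<knn>"]],
    "<tree>": [["DecisionTree(max_depth=3)"], ["DecisionTree(max_depth=10)"]],
    "<knn>": [["KNN(k=1)"], ["KNN(k=5)"]],
}

# Prefixes that identify a classifier step in this toy setting.
CLASSIFIER_PREFIXES = ("NaiveBayes", "DecisionTree", "KNN")


def derive(symbol, rng):
    """Recursively expand `symbol` into a list of terminal pipeline steps."""
    if symbol not in GRAMMAR:  # terminal symbol: a concrete pipeline step
        return [symbol]
    production = rng.choice(GRAMMAR[symbol])
    steps = []
    for child in production:
        steps.extend(derive(child, rng))
    return steps


def is_valid(steps):
    """A pipeline is valid here if it ends with exactly one classifier."""
    if not steps or not steps[-1].startswith(CLASSIFIER_PREFIXES):
        return False
    return not any(s.startswith(CLASSIFIER_PREFIXES) for s in steps[:-1])


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(3):
        print(" -> ".join(derive("<pipeline>", rng)))
```

Because every individual is produced by applying grammar productions, a pipeline with no classifier, or with a classifier before a pre-processing step, cannot be generated in this toy setting. A grammar-based genetic programming system would additionally need crossover and mutation operators that respect the grammar, which this sketch leaves out.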

Keywords

Grammar-based genetic programming, Classification, Automatic Machine Learning

Notes

Acknowledgments

This work was partially supported by the following Brazilian Research Support Agencies: CNPq, CAPES and FAPEMIG.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Alex G. C. de Sá (1)
  • Walter José G. S. Pinto (1)
  • Luiz Otavio V. B. Oliveira (1)
  • Gisele L. Pappa (1)
  1. Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
