Abstract
Classifier ensembles are a popular approach to classification, owing to their typically high predictive accuracy. However, ensembles have two drawbacks. First, they are usually regarded as 'black-box', non-interpretable classification models, mainly because an ensemble typically contains a very large number of classifiers (and each classifier in the ensemble is often a black box in itself). This lack of interpretability is a serious limitation in application domains where a model's predictions should be carefully interpreted by users, such as medicine and law. Second, ensemble methods typically involve many hyper-parameters, and it is difficult for users to select good settings for them. In this work we propose an Evolutionary Algorithm (an Estimation of Distribution Algorithm) that addresses both drawbacks: it optimizes the hyper-parameter settings of a small ensemble of five interpretable classifiers, which allows users to interpret each classifier. In our experiments, the ensembles learned by the proposed Evolutionary Algorithm achieved the same level of predictive accuracy as a well-known Random Forest ensemble, with the added benefit of producing interpretable models (unlike Random Forests).
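The core search mechanism underlying the proposed approach is a PBIL-style Estimation of Distribution Algorithm, which maintains a probability distribution over candidate hyper-parameter settings and iteratively shifts it toward high-fitness samples. The sketch below illustrates that loop in minimal form; the function name `pbil`, the binary encoding, and all parameter values (`pop_size`, `generations`, `lr`) are illustrative assumptions, not the paper's actual configuration, and a toy one-max fitness stands in for ensemble validation accuracy.

```python
import random

def pbil(fitness, n_bits, pop_size=20, generations=50, lr=0.1):
    """Minimal PBIL loop: evolve a probability vector over binary choices."""
    # One independent Bernoulli probability per binary hyper-parameter choice.
    p = [0.5] * n_bits
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population of candidate settings from the current distribution.
        pop = [[1 if random.random() < pi else 0 for pi in p]
               for _ in range(pop_size)]
        # Pick the fittest individual of this generation.
        elite = max(pop, key=fitness)
        if fitness(elite) > best_fit:
            best, best_fit = elite, fitness(elite)
        # PBIL update: nudge the distribution toward the elite individual.
        p = [(1 - lr) * pi + lr * bi for pi, bi in zip(p, elite)]
    return best

# Toy fitness: number of ones (a stand-in for ensemble validation accuracy).
random.seed(0)
result = pbil(fitness=sum, n_bits=10)
print(sum(result))
```

In the paper's setting, each sampled bit string would instead encode one hyper-parameter configuration for the five interpretable base classifiers, and fitness would be measured by the ensemble's validation accuracy.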
Notes
1. Available at https://github.com/henryzord/PBIL.
2. Available at https://sci2s.ugr.es/keel/datasets.php.
3. Available at https://archive.ics.uci.edu/ml/datasets.
Acknowledgment
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Cagnini, H.E.L., Freitas, A.A., Barros, R.C. (2020). An Evolutionary Algorithm for Learning Interpretable Ensembles of Classifiers. In: Cerri, R., Prati, R.C. (eds) Intelligent Systems. BRACIS 2020. Lecture Notes in Computer Science(), vol 12319. Springer, Cham. https://doi.org/10.1007/978-3-030-61377-8_2
Print ISBN: 978-3-030-61376-1
Online ISBN: 978-3-030-61377-8