Abstract
Classifier ensembles are a popular approach to classification, owing to their typically high predictive accuracy. However, ensembles have two drawbacks. First, they are usually regarded as 'black-box', non-interpretable classification models, mainly because an ensemble typically contains a very large number of classifiers (and each classifier in the ensemble is often a black box in itself). This lack of interpretability is a serious limitation in application domains where a model's predictions should be carefully interpreted by users, such as medicine and law. Second, ensemble methods typically involve many hyper-parameters, and it is difficult for users to select good settings for them. In this work we propose an Evolutionary Algorithm (an Estimation of Distribution Algorithm) that addresses both drawbacks: it optimizes the hyper-parameter settings of a small ensemble of five interpretable classifiers, which allows users to interpret each classifier. In our experiments, the ensembles learned by the proposed Evolutionary Algorithm achieved the same level of predictive accuracy as a well-known Random Forest ensemble, with the added benefit of producing interpretable models (unlike Random Forests).
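The core search mechanism underlying the proposed approach is a PBIL-style Estimation of Distribution Algorithm, which maintains a probability distribution over candidate hyper-parameter settings and iteratively shifts it toward high-fitness samples. The sketch below illustrates that loop in minimal form; the function name `pbil`, the binary encoding, and all parameter values (`pop_size`, `generations`, `lr`) are illustrative assumptions, not the paper's actual configuration, and a toy one-max fitness stands in for ensemble validation accuracy.

```python
import random

def pbil(fitness, n_bits, pop_size=20, generations=50, lr=0.1):
    """Minimal PBIL loop: evolve a probability vector over binary choices."""
    # One independent Bernoulli probability per binary hyper-parameter choice.
    p = [0.5] * n_bits
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population of candidate settings from the current distribution.
        pop = [[1 if random.random() < pi else 0 for pi in p]
               for _ in range(pop_size)]
        # Pick the fittest individual of this generation.
        elite = max(pop, key=fitness)
        if fitness(elite) > best_fit:
            best, best_fit = elite, fitness(elite)
        # PBIL update: nudge the distribution toward the elite individual.
        p = [(1 - lr) * pi + lr * bi for pi, bi in zip(p, elite)]
    return best

# Toy fitness: number of ones (a stand-in for ensemble validation accuracy).
random.seed(0)
result = pbil(fitness=sum, n_bits=10)
print(sum(result))
```

In the paper's setting, each sampled bit string would instead encode one hyper-parameter configuration for the five interpretable base classifiers, and fitness would be measured by the ensemble's validation accuracy.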
Notes
1. Available at https://github.com/henryzord/PBIL.
2. Available at https://sci2s.ugr.es/keel/datasets.php.
3. Available at https://archive.ics.uci.edu/ml/datasets.
Acknowledgment
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Cagnini, H.E.L., Freitas, A.A., Barros, R.C. (2020). An Evolutionary Algorithm for Learning Interpretable Ensembles of Classifiers. In: Cerri, R., Prati, R.C. (eds) Intelligent Systems. BRACIS 2020. Lecture Notes in Computer Science(), vol 12319. Springer, Cham. https://doi.org/10.1007/978-3-030-61377-8_2
Print ISBN: 978-3-030-61376-1
Online ISBN: 978-3-030-61377-8