A Framework for Analytical Approaches to Combine Interpretable Models

  • Pedro Strecht
  • João Mendes-Moreira
  • Carlos Soares
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 898)

Abstract

Analytical approaches to combining interpretable models, although presented in different contexts, can be generalized to highlight the components that can be specialized. We propose a framework that structures the combination process, formalizes the problems that can be solved in alternative ways, and evaluates the combined models on their predictive ability to replace the base models without loss of interpretability. The framework is illustrated with a case study and experiments using data from the University of Porto, Portugal. The results show that grouping base models by scientific area, ordering them by number of variables, and intersecting their underlying rules create conditions for the combined models to outperform the base models.
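The group, order, and intersect steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each base model has already been converted into a list of rules, each rule being a box of per-variable intervals plus a class label. The function names, the box representation, and the conflict policy (keep the first model's label when two overlapping rules disagree) are hypothetical choices made only for illustration.

```python
from functools import reduce
from math import inf

# Assumed representation (not from the paper): a rule is (box, label), where
# box maps a variable name to an interval (low, high) meaning low < x <= high.
# A variable absent from the box is unconstrained.

def intersect_rules(rule_a, rule_b):
    """Return the overlap of two rules, or None if their regions are disjoint."""
    box_a, label_a = rule_a
    box_b, label_b = rule_b
    box = {}
    for var in box_a.keys() | box_b.keys():
        lo_a, hi_a = box_a.get(var, (-inf, inf))
        lo_b, hi_b = box_b.get(var, (-inf, inf))
        lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
        if lo >= hi:  # empty interval: the two rules never apply together
            return None
        box[var] = (lo, hi)
    # Placeholder conflict policy (an assumption): on disagreement both
    # branches keep the first model's label.
    return box, label_a

def merge_models(model_a, model_b):
    """Intersect every rule of one model with every rule of the other."""
    merged = []
    for rule_a in model_a:
        for rule_b in model_b:
            overlap = intersect_rules(rule_a, rule_b)
            if overlap is not None:
                merged.append(overlap)
    return merged

def n_variables(model):
    """Number of distinct variables used by a model's rules."""
    return len({var for box, _ in model for var in box})

def combine_group(models):
    """Order models by number of variables, then merge them pairwise."""
    return reduce(merge_models, sorted(models, key=n_variables))

def combine_by_group(models, group_of):
    """Group models with a caller-supplied key, then combine each group."""
    groups = {}
    for model_id, model in models.items():
        groups.setdefault(group_of(model_id), []).append(model)
    return {group: combine_group(ms) for group, ms in groups.items()}

# Toy example with made-up rules for two hypothetical course models.
model_a = [({"grade": (-inf, 10)}, "fail"), ({"grade": (10, inf)}, "pass")]
model_b = [({"age": (-inf, 25)}, "pass"), ({"age": (25, inf)}, "fail")]
combined = combine_by_group(
    {"course_1": model_a, "course_2": model_b},
    group_of=lambda model_id: "engineering",  # hypothetical scientific area
)["engineering"]
```

Representing rules as boxes keeps the combined model a plain rule list, so it stays as readable as the base models. A real conflict-resolution step would need more care than the placeholder used here.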

Keywords

Knowledge generalization · Interpretable models · Prediction of performance · Decision tree merging · C5.0

Notes

Acknowledgments

This work is funded by projects “NORTE-07-0124-FEDER-000059” and “NORTE-07-0124-FEDER-000057”, financed by the North Portugal Regional Operational Programme (ON.2 - O Novo Norte), under the National Strategic Reference Framework (NSRF), through the European Regional Development Fund (ERDF), and by national funds, through the Portuguese funding agency, Fundação para a Ciência e a Tecnologia (FCT).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. INESC TEC/Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
