From Ensemble Methods to Comprehensible Models

  • Conference paper
Discovery Science (DS 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2534)

Abstract

Ensemble methods improve accuracy by combining the predictions of a set of different hypotheses. However, they have two important shortcomings: large amounts of memory are required to store the set of hypotheses and, more importantly, the comprehensibility of a single hypothesis is lost. In this work, we devise a new method to extract a single solution from a hypothesis ensemble without using extra data, based on two main ideas: the selected hypothesis must be semantically similar to the combined solution, and this similarity is evaluated on a randomly generated dataset. We have implemented the method using shared ensembles, which allow for an exponential number of potential base hypotheses. Several experiments show that the new method selects a single hypothesis whose accuracy is reasonably close to that of the combined hypothesis.
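
To make the idea concrete, here is a minimal Python sketch of the selection step described above. It is illustrative only: it assumes numeric attributes sampled uniformly at random and uses the plain agreement rate as the semantic-similarity measure; the paper's actual implementation works over shared ensembles in the SMILES system, and all names here (majority_vote, select_single, ranges, n_random) are hypothetical.

    import random
    from collections import Counter

    def majority_vote(ensemble, x):
        # Combined solution: the most frequent label among the base hypotheses.
        votes = Counter(h(x) for h in ensemble)
        return votes.most_common(1)[0][0]

    def select_single(ensemble, ranges, n_random=1000, seed=0):
        # Return the single base hypothesis that is semantically most similar
        # to the combined solution. Each hypothesis is a callable mapping an
        # instance (a tuple of attribute values) to a class label; ranges gives
        # per-attribute (low, high) bounds for sampling random instances.
        rng = random.Random(seed)
        # 1. Generate a random, unlabelled dataset over the attribute space.
        data = [tuple(rng.uniform(lo, hi) for lo, hi in ranges)
                for _ in range(n_random)]
        # 2. Label it with the ensemble's combined prediction; no extra real
        #    data is needed, matching the "without using extra data" idea.
        target = [majority_vote(ensemble, x) for x in data]
        # 3. Score each base hypothesis by its agreement with those labels
        #    and return the most similar one.
        def agreement(h):
            return sum(h(x) == t for x, t in zip(data, target)) / n_random
        return max(ensemble, key=agreement)

For example, with three threshold classifiers over a single attribute in [0, 1], select_single([h1, h2, h3], [(0.0, 1.0)]) returns whichever of the three best mimics the majority vote on the random sample; the experiments reported in the paper indicate that the accuracy of the selected hypothesis stays reasonably close to that of the full ensemble.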

This work has been partially supported by CICYT under grant TIC2001-2705-C03-01, Generalitat Valenciana under grant GV00-092-14 and Acción Integrada Hispano-Alemana HA2001-0059.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ferri, C., Hernández-Orallo, J., Ramírez-Quintana, M.J. (2002). From Ensemble Methods to Comprehensible Models. In: Lange, S., Satoh, K., Smith, C.H. (eds) Discovery Science. DS 2002. Lecture Notes in Computer Science, vol 2534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36182-0_16

  • DOI: https://doi.org/10.1007/3-540-36182-0_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00188-1

  • Online ISBN: 978-3-540-36182-4
