Experiences Using Decision Trees for Knowledge Discovery

Armengol, Eva; García-Cerdaña, Àngel; Dellunde, Pilar

doi:10.1007/978-3-319-47557-8_11

Part of the book series: Studies in Computational Intelligence (SCI, volume 671)

Abstract

Knowledge discovery is the process of identifying useful patterns in large data sets. Two families of approaches can be used for knowledge discovery: clustering, when the classes of the domain objects are not known; and inductive learning algorithms, when the classes are known and the goal is to construct a domain model that can identify new, unseen objects. Clustering algorithms have also been proposed to analyze the data when the classes are known. However, to our knowledge, inductive learning methods are not used to analyze the available data but only for prediction. What we propose here is a methodology, called FTree, that uses a decision tree to analyze the available data, identifying both patterns and some important aspects of the domain (at least of the part of the domain represented by the data at hand), such as similarity between classes, separability, characterization of classes, and even possible errors in the data.
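
The general idea of reading a decision tree as an analysis tool rather than a predictor can be illustrated with a minimal sketch. This is not the authors' FTree methodology, whose details are not given in the abstract; it is a plain information-gain tree over a hypothetical toy data set, and the names `entropy`, `best_attribute`, `leaves`, and the fruit records are all assumptions made for illustration. Leaves that still mix several classes point to classes the available attributes cannot separate, which may indicate similar classes or possibly mislabeled records.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_attribute(rows, attributes):
    """Return the attribute with the highest information gain, or None if none helps."""
    base = entropy([label for _, label in rows])
    best, best_gain = None, 1e-12  # small threshold: ignore zero-gain splits
    for attr in attributes:
        groups = {}
        for features, label in rows:
            groups.setdefault(features[attr], []).append(label)
        remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
        gain = base - remainder
        if gain > best_gain:
            best, best_gain = attr, gain
    return best

def leaves(rows, attributes, path=()):
    """Grow the tree recursively and yield (path, class counts) for each leaf."""
    attr = best_attribute(rows, attributes)
    if attr is None:  # pure node, or no remaining attribute separates the classes
        yield path, Counter(label for _, label in rows)
        return
    remaining = [a for a in attributes if a != attr]
    for value in sorted({features[attr] for features, _ in rows}):
        subset = [(f, l) for f, l in rows if f[attr] == value]
        yield from leaves(subset, remaining, path + ((attr, value),))

# Hypothetical toy records: (attribute values, known class).
rows = [
    ({"shape": "round", "color": "red"}, "apple"),
    ({"shape": "round", "color": "red"}, "apple"),
    ({"shape": "round", "color": "red"}, "cherry"),
    ({"shape": "round", "color": "orange"}, "orange"),
    ({"shape": "long", "color": "yellow"}, "banana"),
    ({"shape": "long", "color": "yellow"}, "banana"),
]

report = list(leaves(rows, ["shape", "color"]))
# Leaves holding more than one class: candidates for similar classes or data errors.
mixed = [(path, counts) for path, counts in report if len(counts) > 1]

for path, counts in report:
    print(path, dict(counts))
```

On this toy data the only mixed leaf is the one reached through `color = red`, which holds both apples and a cherry: the two recorded attributes do not distinguish those classes, so an analyst would either look for a further attribute or check whether the cherry record is an error.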

Acknowledgements

The authors thank Susana Puig and the Taller Jeroni de Moragas for their helpful comments and suggestions. This research is partially funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 689176 (SYSMICS project), the projects RASO (TIN2015-71799-C2-1-P) and RPREF (CSIC Intramural 201650E044), and the grants 2014-SGR-118 and 2014-SGR-788 from the Generalitat de Catalunya.

Author information

Authors and Affiliations

IIIA, Artificial Intelligence Research Institute CSIC, Spanish National Research Council, Campus UAB, 08193, Bellaterra, Catalonia, Spain
Eva Armengol, Àngel García-Cerdaña & Pilar Dellunde
Departament de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Tànger, 122-140, 08018, Barcelona, Catalonia, Spain
Àngel García-Cerdaña
Philosophy Department, Universitat Autònoma de Barcelona, Campus UAB, 08193, Bellaterra, Catalonia, Spain
Pilar Dellunde

Corresponding author

Correspondence to Eva Armengol.

Editor information

Editors and Affiliations

School of Informatics, University of Skövde, Skövde, Sweden
Vicenç Torra
School of Informatics, University of Skövde, Skövde, Sweden
Anders Dahlbom
Toho Gakuen, Tokyo, Japan
Yasuo Narukawa

About this chapter

Cite this chapter

Armengol, E., García-Cerdaña, À., Dellunde, P. (2017). Experiences Using Decision Trees for Knowledge Discovery. In: Torra, V., Dahlbom, A., Narukawa, Y. (eds) Fuzzy Sets, Rough Sets, Multisets and Clustering. Studies in Computational Intelligence, vol 671. Springer, Cham. https://doi.org/10.1007/978-3-319-47557-8_11

DOI: https://doi.org/10.1007/978-3-319-47557-8_11
Published: 14 January 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47556-1
Online ISBN: 978-3-319-47557-8
eBook Packages: Engineering, Engineering (R0)
