Abstract
Data mining is a useful decision support technique that can be used to discover production rules in data warehouses or corporate data. Much data mining research has focused on applying mining algorithms efficiently to large databases. However, a serious obstacle to their practical application is the long processing time of such algorithms. One of the key challenges today is to integrate data mining methods within the framework of traditional database systems, since such implementations can take advantage of the efficiency provided by SQL engines.
In this paper, we propose an approach for integrating decision trees within a classical database system. In other words, we discover knowledge from relational databases, in the form of production rules, via a procedure that embeds SQL queries. The resulting decision tree is defined by successive, related relational views, where each view corresponds to a given population (node) in the underlying decision tree. We selected the classical Induction Decision Tree (ID3) algorithm to build the decision tree. To check that our implementation of ID3 works properly, we compared the output of our procedure with the output of an existing, validated data mining software package, SIPINA, and found that the two agree. Furthermore, since our approach is tuneable, it can be generalized to any other similar decision tree-based method.
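The idea of defining each tree node as a relational view can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in database engine, and the table name node_root, the columns outlook, humidity, and play, and the toy rows are all hypothetical. It performs a single ID3-style split: class counts come from SQL GROUP BY queries, the attribute with the highest information gain is selected, and one view is created per value of that attribute.

```python
import math
import sqlite3

# Hypothetical toy data; table and column names are illustrative assumptions.
ROWS = [
    ("sunny", "high", "no"),
    ("sunny", "normal", "yes"),
    ("overcast", "high", "yes"),
    ("overcast", "normal", "yes"),
    ("rain", "high", "no"),
    ("rain", "normal", "yes"),
]
PREDICTORS = ["outlook", "humidity"]
TARGET = "play"


def shannon(counts):
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    return -sum(n / total * math.log2(n / total) for n in counts)


def entropy(conn, relation):
    """Entropy of TARGET in a table or view, from one GROUP BY query."""
    counts = [n for (n,) in conn.execute(
        f"SELECT COUNT(*) FROM {relation} GROUP BY {TARGET}")]
    return shannon(counts)


def information_gain(conn, relation, attribute):
    """Entropy of the node minus the weighted entropy of its partitions."""
    rows = conn.execute(
        f"SELECT {attribute}, {TARGET}, COUNT(*) FROM {relation} "
        f"GROUP BY {attribute}, {TARGET}").fetchall()
    total = sum(n for _, _, n in rows)
    by_value = {}
    for value, _, n in rows:
        by_value.setdefault(value, []).append(n)
    remainder = sum(sum(c) / total * shannon(c) for c in by_value.values())
    return entropy(conn, relation) - remainder


def split_node(conn, relation):
    """Pick the best attribute and create one child view per attribute value."""
    best = max(PREDICTORS, key=lambda a: information_gain(conn, relation, a))
    values = [v for (v,) in conn.execute(
        f"SELECT DISTINCT {best} FROM {relation} ORDER BY {best}")]
    children = []
    for value in values:
        view = f"{relation}_{best}_{value}"
        # Values are inlined because SQLite does not allow bound parameters
        # inside a view definition; acceptable here only for trusted toy data.
        conn.execute(
            f"CREATE VIEW {view} AS "
            f"SELECT * FROM {relation} WHERE {best} = '{value}'")
        children.append(view)
    return best, children


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node_root (outlook TEXT, humidity TEXT, play TEXT)")
conn.executemany("INSERT INTO node_root VALUES (?, ?, ?)", ROWS)
best_attribute, child_views = split_node(conn, "node_root")
```

Because each child is itself a view, split_node could be applied again to any child whose entropy is nonzero, which would yield the chain of successive, related views that the abstract describes. Stopping criteria and rule extraction are left out of this sketch.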
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bentayeb, F., Darmont, J. (2002). Decision Tree Modeling with Relational Views. In: Hacid, MS., Raś, Z.W., Zighed, D.A., Kodratoff, Y. (eds) Foundations of Intelligent Systems. ISMIS 2002. Lecture Notes in Computer Science, vol 2366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48050-1_46
DOI: https://doi.org/10.1007/3-540-48050-1_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43785-7
Online ISBN: 978-3-540-48050-1
eBook Packages: Springer Book Archive