Abstract
A major challenge for future data manipulation is to handle efficiently not only very large databases but also the many properties, or generalizations, induced from the data they hold. Popular examples of such properties are association rules, inclusion dependencies, and functional dependencies. We propose to approach this task by specifying and querying inductive databases: databases that contain, in addition to data, intensionally defined generalizations about the data. We formalize this concept and show that, thanks to the closure property of the framework, it can be used throughout the whole data mining process. We show that simple query languages can be defined using standard database terminology, and we demonstrate how the framework can model typical data mining processes. Various tasks can then be performed on these process descriptions, such as optimizing the selection of interesting properties or comparing two processes.
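To make the idea concrete, the following is a minimal sketch of an inductive database, not the formalization given in the paper. It assumes the patterns are association rules over rows of items, that they are evaluated on demand by support and confidence, and it reads closure as "every query returns another inductive database". The names `InductiveDB`, `select_rules`, `select_rows`, and `all_rules` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, FrozenSet, List, Tuple

Row = FrozenSet[str]
Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (antecedent, consequent)


def support(rule: Rule, rows: List[Row]) -> float:
    """Fraction of rows that contain every item of the rule."""
    items = rule[0] | rule[1]
    return sum(items <= r for r in rows) / len(rows) if rows else 0.0


def confidence(rule: Rule, rows: List[Row]) -> float:
    """Rows matching the whole rule divided by rows matching its antecedent."""
    body = sum(rule[0] <= r for r in rows)
    both = sum((rule[0] | rule[1]) <= r for r in rows)
    return both / body if body else 0.0


@dataclass(frozen=True)
class InductiveDB:
    """Hypothetical pair of a data part and a pattern part."""
    rows: Tuple[Row, ...]      # the data
    rules: FrozenSet[Rule]     # the patterns, evaluated against rows on demand

    def select_rules(self, pred: Callable[[Rule, List[Row]], bool]) -> "InductiveDB":
        """Keep the rules satisfying pred; the result is again an InductiveDB."""
        rows = list(self.rows)
        kept = frozenset(r for r in self.rules if pred(r, rows))
        return InductiveDB(self.rows, kept)

    def select_rows(self, pred: Callable[[Row], bool]) -> "InductiveDB":
        """Keep the rows satisfying pred; rules are later evaluated on the new data."""
        return InductiveDB(tuple(r for r in self.rows if pred(r)), self.rules)


def all_rules(items) -> FrozenSet[Rule]:
    """Enumerate rules with a single-item consequent (only viable for tiny item sets)."""
    items = sorted(items)
    out = set()
    for k in range(1, len(items)):
        for lhs in combinations(items, k):
            for rhs in items:
                if rhs not in lhs:
                    out.add((frozenset(lhs), frozenset([rhs])))
    return frozenset(out)


# Toy data: four rows over the items A, B, C.
rows = tuple(frozenset(r) for r in (["A", "B"], ["A", "B", "C"], ["A", "C"], ["B"]))
db = InductiveDB(rows, all_rules({"A", "B", "C"}))

# A query on the pattern part: keep rules that are frequent and reasonably confident.
strong = db.select_rules(
    lambda rule, data: support(rule, data) >= 0.5 and confidence(rule, data) >= 0.6
)

# Because the result is an InductiveDB, queries on data and on patterns can be chained.
chained = db.select_rows(lambda r: "A" in r).select_rules(
    lambda rule, data: confidence(rule, data) == 1.0
)
```

In this sketch the rules are stored only as syntactic objects and their support and confidence are computed when a query asks for them, which is one way to read "intensionally defined". Enumerating all rules up front is a simplification for the toy example and would not scale to real data.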
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Boulicaut, JF., Klemettinen, M., Mannila, H. (1999). Modeling KDD Processes within the Inductive Database Framework. In: Mohania, M., Tjoa, A.M. (eds) DataWarehousing and Knowledge Discovery. DaWaK 1999. Lecture Notes in Computer Science, vol 1676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48298-9_31
DOI: https://doi.org/10.1007/3-540-48298-9_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66458-1
Online ISBN: 978-3-540-48298-7