Efficient Algorithms for Mining Inclusion Dependencies

De Marchi, Fabien; Lopes, Stéphane; Petit, Jean-Marc

doi:10.1007/3-540-45876-X_30

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2287)

Included in the following conference series:

International Conference on Extending Database Technology

Abstract

Foreign keys form one of the most fundamental constraints for relational databases. Since they are not always defined in existing databases, algorithms need to be devised to discover foreign keys. One of the underlying problems is known to be the inclusion dependency (IND) inference problem. In this paper a new data mining algorithm for computing unary INDs is given. From unary INDs, we also propose a levelwise algorithm to discover all remaining INDs, where candidate INDs of size i + 1 are generated from satisfied INDs of size i (i > 0). These algorithms have been implemented and tested against synthetic databases. To our knowledge, this paper is the first to address this data mining problem in a comprehensive manner, from algorithms to experimental results.
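
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe the internals, so the inverted index used for unary INDs, the Apriori-style join and prune, and all names (`unary_inds`, `holds`, `next_level`, `mine_inds`, the toy `tables`) are assumptions made for illustration. Relations are assumed to be dictionaries mapping attribute names to equal-length lists of hashable values.

```python
from itertools import permutations

def unary_inds(tables):
    """Return {((R, A), (S, B))} such that every value of R.A appears in S.B."""
    # Assumed technique: inverted index from value to the columns containing it.
    index = {}
    for rel, cols in tables.items():
        for attr, values in cols.items():
            for v in values:
                index.setdefault(v, set()).add((rel, attr))
    columns = {(rel, attr) for rel, cols in tables.items() for attr in cols}
    # Start from "every column is included in every other column", then prune:
    # a column can only be included in columns that share each of its values.
    rhs = {c: columns - {c} for c in columns}
    for cols_with_value in index.values():
        for c in cols_with_value:
            rhs[c] &= cols_with_value
    return {(lhs, r) for lhs, rs in rhs.items() for r in rs}

def holds(tables, ind):
    """Check an IND (R, X, S, Y) by comparing the two projections as sets."""
    r, xs, s, ys = ind
    def project(rel, attrs):
        return set(zip(*(tables[rel][a] for a in attrs)))
    return project(r, xs) <= project(s, ys)

def next_level(satisfied):
    """Build candidates of size i + 1 from the set of satisfied INDs of size i."""
    candidates = set()
    for (r1, x1, s1, y1), (r2, x2, s2, y2) in permutations(satisfied, 2):
        same_pair = (r1, s1) == (r2, s2)
        same_prefix = x1[:-1] == x2[:-1] and y1[:-1] == y2[:-1]
        if same_pair and same_prefix and x1[-1] < x2[-1] and y1[-1] != y2[-1]:
            xs, ys = x1 + (x2[-1],), y1 + (y2[-1],)
            # Prune: every sub-IND of size i must itself be satisfied.
            subs = ((r1, xs[:k] + xs[k + 1:], s1, ys[:k] + ys[k + 1:])
                    for k in range(len(xs)))
            if all(sub in satisfied for sub in subs):
                candidates.add((r1, xs, s1, ys))
    return candidates

def mine_inds(tables):
    """Levelwise search: unary INDs first, then larger INDs until none remain."""
    level = {(r, (a,), s, (b,)) for (r, a), (s, b) in unary_inds(tables)}
    found = set()
    while level:
        found |= level
        level = {c for c in next_level(level) if holds(tables, c)}
    return found

# Hypothetical toy database.
tables = {
    "orders": {"cust_id": [1, 2, 2], "region": ["eu", "us", "us"]},
    "customers": {"id": [1, 2, 3], "region": ["eu", "us", "eu"]},
}
for ind in sorted(mine_inds(tables)):
    print(ind)
```

In this sketch the left-hand attribute sequence of each IND is kept in sorted order so that every dependency has a single canonical form, and a candidate is tested against the data only if all of its smaller sub-dependencies are already known to hold.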

Author information

Authors and Affiliations

Laboratoire LIMOS, Université Blaise Pascal - Clermont-Ferrand II, 24 avenue des Landais, 63 177 Aubière cedex, CNRS FRE 2239, France
Fabien De Marchi & Jean-Marc Petit
Laboratoire PRISM, 45, avenue des Etats-Unis, 78035 Versailles Cedex, CNRS UMR 8636, France
Stéphane Lopes

Editor information

Editors and Affiliations

Department of Computer Science, Aalborg University, Aalborg
Christian S. Jensen & Simonas Šaltenis
Business and Information Technology Dept., CLRC Rutherford Appleton Laboratory, UK
Keith G. Jeffery
Faculty of Mathematics and Physics, Charles University, Czech Republic
Jaroslav Pokorny
Department of Information Science, University of Milan, Milan
Elisa Bertino
Institute of Information Systems, ETH Zurich, Zurich
Klemens Böhm
Informatik V, RWTH Aachen, Aachen
Matthias Jarke

About this paper

Cite this paper

De Marchi, F., Lopes, S., Petit, JM. (2002). Efficient Algorithms for Mining Inclusion Dependencies. In: Jensen, C.S., et al. Advances in Database Technology — EDBT 2002. EDBT 2002. Lecture Notes in Computer Science, vol 2287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45876-X_30

DOI: https://doi.org/10.1007/3-540-45876-X_30
Published: 14 March 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43324-8
Online ISBN: 978-3-540-45876-0
eBook Packages: Springer Book Archive
