Approximate dependency inference from relations

  • Jyrki Kivinen
  • Heikki Mannila
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 646)

Abstract

The functional dependency inference problem consists of finding a cover for the set dep(r) of functional dependencies that hold in a given relation r. All known algorithms for this task have running times that can, in the worst case, be exponential in the size of the smallest cover of the dependency set. We consider approximate dependency inference. We define various measures for the error of a dependency in a relation. These error measures have the value 0 if the dependency holds and a value close to 1 if the dependency clearly does not hold. Depending on the measure used, all dependencies with error at least ε in r can be detected with high probability by considering only O(1/ε) or O(|r|^{1/2}/ε) random tuples of r. We also show how a machine learning algorithm due to Angluin, Frazier and Pitt can be applied to produce, in output-polynomial time, an approximately correct cover for the set of dependencies holding in r.
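
The idea of an error measure and of testing a dependency on a random sample can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not define its measures, so `removal_error` (the fraction of tuples that would have to be deleted for the dependency to hold) is only one plausible measure that is 0 exactly when the dependency holds. The function names, the representation of a relation as a list of dicts, and the sampling with replacement are likewise hypothetical, and the sample size is left as a parameter because the constants behind the bounds are not given here.

```python
import random
from collections import Counter, defaultdict


def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first rhs value seen for a given lhs value must be matched by all later rows.
        if seen.setdefault(key, val) != val:
            return False
    return True


def removal_error(rows, lhs, rhs):
    """Illustrative error measure (an assumption, not taken from the paper):
    the fraction of rows that must be deleted so that lhs -> rhs holds."""
    if not rows:
        return 0.0
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        groups[key][val] += 1
    # Within each lhs group, keep only the rows carrying the most common rhs value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1.0 - kept / len(rows)


def sample_rejects(rows, lhs, rhs, sample_size, rng):
    """Draw sample_size rows uniformly at random (with replacement) and
    report whether the sample alone already violates lhs -> rhs."""
    sample = [rng.choice(rows) for _ in range(sample_size)]
    return not fd_holds(sample, lhs, rhs)


# A tiny hypothetical relation: dept -> mgr is violated, mgr -> dept holds.
rows = [
    {"dept": "A", "mgr": "x"},
    {"dept": "A", "mgr": "x"},
    {"dept": "A", "mgr": "y"},
    {"dept": "B", "mgr": "z"},
]
error = removal_error(rows, ["dept"], ["mgr"])  # one of four rows must go
rejected = sample_rejects(rows, ["dept"], ["mgr"], 8, random.Random(0))
```

A sample is drawn from the relation itself, so a dependency that holds in r can never be rejected by such a test. The error is one-sided: only dependencies that fail in r can be missed, and a dependency with larger error is more likely to be caught by a sample of a given size.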

References

  1. Hussein Almuallim, Thomas G. Dietterich: Learning with many irrelevant features. In Proc. 9th National Conference on Artificial Intelligence (1991) 547–552
  2. Dana Angluin: Queries and concept learning. Machine Learning 2 (1988) 319–342
  3. Dana Angluin, Michael Frazier, Leonard Pitt: Learning conjunctions of Horn clauses. In Proc. 31st Symposium on Foundations of Computer Science (1990) 186–191
  4. Dana Angluin, Leslie G. Valiant: Fast probabilistic algorithms for Hamiltonian circuits and matching. J. Comp. Syst. Sci. 18 (1979) 155–193
  5. Catriel Beeri, Martin Dowd, Ronald Fagin, Richard Statman: On the structure of Armstrong relations for functional dependencies. J. ACM 31 (1984) 30–46
  6. M. Bouzeghoub, George Gardarin, E. Metais: Database design tools: An expert system approach. In Proc. 11th International Conference on Very Large Data Bases (1985) 82–95
  7. Marco Antonio Casanova, Jose E. Amaral de Sa: Mapping uninterpreted schemes into entity-relationship diagrams: Two applications to conceptual schema design. IBM J. Res. Devel. 28 (1984) 82–94
  8. R. Dechter: Decomposing an n-ary relation into a tree of binary relations. In Proc. 6th Symposium on Principles of Database Systems (1987) 185–189
  9. János Demetrovics, Vu Duc Thi: Keys, antikeys and prime attributes. Annales Univ. Sci. Budapest, Sect. Comp. 8 (1987) 35–52
  10. János Demetrovics, Vu Duc Thi: Some results about functional dependencies. Acta Cybernetica 8 (1988) 273–278
  11. Thomas Eiter, Georg Gottlob: Identifying the minimal transversals of a hypergraph and related problems. Report CD-TR 91/16, Technische Universität Wien (1991)
  12. Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical selectivity estimation through adaptive sampling. In Proc. 1990 International Conference on Management of Data (1990) 1–11
  13. David Maier: The Theory of Relational Databases. Computer Science Press (1983)
  14. Heikki Mannila, Kari-Jouko Räihä: Design by example: An application of Armstrong relations. J. Comp. Syst. Sci. 33 (1986) 126–141
  15. Heikki Mannila, Kari-Jouko Räihä: Dependency inference. In Proc. 13th International Conference on Very Large Data Bases (1987) 155–158
  16. Heikki Mannila, Kari-Jouko Räihä: Algorithms for inferring functional dependencies. Report C-1991-41, University of Helsinki, Department of Computer Science (1991)
  17. Heikki Mannila, Kari-Jouko Räihä: On the complexity of inferring functional dependencies (to appear)
  18. Stuart J. Russell: The Use of Knowledge in Analogy and Induction. Morgan Kaufmann (1989)
  19. Jeffrey C. Schlimmer: Learning determinations and checking databases. In Proc. 1991 AAAI Workshop on Knowledge Discovery in Databases (1991) 64–76
  20. Yehoshua Sagiv, Claude Delobel, David S. Parker, Ronald Fagin: An equivalence between relational database dependencies and a fragment of propositional logic. J. ACM 28 (1981) 435–453
  21. Michael Siegel: Automatic rule derivation for semantic query optimization. Report # 86-013, Boston University, Computer Science Department (1986)
  22. Antonio M. Silva, Michael A. Melkanoff: A method for helping discover the dependencies of a relation. In Advances in Data Base Theory (1981) 115–133
  23. Jeffrey D. Ullman: Principles of Database and Knowledge-Base Systems, volume I. Computer Science Press (1988)
  24. Leslie G. Valiant: A theory of the learnable. Comm. ACM 27 (1984) 1134–1142

Copyright information

© Springer-Verlag 1992

Authors and Affiliations

  • Jyrki Kivinen (1)
  • Heikki Mannila (1)
  1. Department of Computer Science, University of Helsinki, Helsinki, Finland