FastFDs: A Heuristic-Driven, Depth-First Algorithm for Mining Functional Dependencies from Relation Instances (Extended Abstract)

  • Catharine Wyss
  • Chris Giannella
  • Edward Robertson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2114)

Abstract

The problem of discovering functional dependencies (FDs) from an existing relation instance has received considerable attention in the database research community. To date, even the most efficient solutions have exponential complexity in the number of attributes of the instance. We develop an algorithm, FastFDs, for solving this problem based on a depth-first, heuristic-driven (DFHD) search for finding minimal covers of hypergraphs. The technique of reducing the FD discovery problem to the problem of finding minimal covers of hypergraphs was applied previously by Lopes et al. in the algorithm Dep-Miner. Dep-Miner employs a levelwise search for minimal covers, whereas FastFDs uses DFHD search. We report several tests on distinct benchmark relation instances involving Dep-Miner, FastFDs, and Tane. Our experimental results indicate that DFHD search is more efficient than Dep-Miner’s levelwise search or Tane’s partitioning approach for many of these benchmark instances.
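The reduction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' FastFDs implementation: the function names (`difference_sets`, `minimal_covers`, `discover_fds`) are hypothetical, and the ordering heuristic (try first the attribute that covers the most still-uncovered sets, ties broken alphabetically) is an assumption made for illustration, since the abstract does not specify the heuristic. For each right-hand-side attribute, the sets of attributes on which pairs of tuples disagree form a hypergraph, and every minimal cover of that hypergraph is the left-hand side of a minimal FD.

```python
from itertools import combinations


def difference_sets(rows, attrs):
    """Collect, for every pair of rows, the set of attributes on which they disagree."""
    diffs = set()
    for r, s in combinations(rows, 2):
        d = frozenset(a for a in attrs if r[a] != s[a])
        if d:
            diffs.add(d)
    return diffs


def minimal_covers(edges, candidates):
    """Depth-first search for all minimal covers (hitting sets) of `edges`."""
    found = set()

    def is_minimal(cover):
        # Each attribute must be needed: removing it leaves some edge uncovered.
        return all(any(not (e & (cover - {a})) for e in edges) for a in cover)

    def dfs(cover, uncovered, remaining):
        if not uncovered:
            if is_minimal(cover):
                found.add(cover)
            return
        # Assumed heuristic: attributes covering the most uncovered edges come first.
        ordered = sorted(
            (a for a in remaining if any(a in e for e in uncovered)),
            key=lambda a: (-sum(a in e for e in uncovered), a),
        )
        for i, a in enumerate(ordered):
            # Only attributes later in the ordering are tried deeper in this branch.
            dfs(cover | {a}, [e for e in uncovered if a not in e], ordered[i + 1:])

    dfs(frozenset(), list(edges), sorted(candidates))
    return found


def discover_fds(rows, attrs):
    """Return the minimal non-trivial FDs as sorted (lhs_tuple, rhs) pairs."""
    diffs = difference_sets(rows, attrs)
    fds = []
    for rhs in attrs:
        # Hypergraph for this right-hand side: difference sets containing rhs, minus rhs.
        edges = {d - {rhs} for d in diffs if rhs in d}
        for lhs in minimal_covers(edges, set(attrs) - {rhs}):
            fds.append((tuple(sorted(lhs)), rhs))
    return sorted(fds)


if __name__ == "__main__":
    example_rows = [
        {"A": 1, "B": 1, "C": 1},
        {"A": 1, "B": 2, "C": 1},
        {"A": 2, "B": 3, "C": 2},
    ]
    for lhs, rhs in discover_fds(example_rows, ["A", "B", "C"]):
        print(", ".join(lhs), "->", rhs)
```

In this toy instance the first two rows differ only on B, so the hypergraph for B contains an empty edge and no FD with B on the right-hand side is reported. The sketch compares all pairs of rows, which is quadratic in the number of tuples; it is meant only to show the shape of the search, not to reproduce the performance results reported in the paper.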

Keywords

Functional Dependency, Search Tree, Correlation Factor, Minimal Cover, Relation Instance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Agrawal, Rakesh; Mannila, Heikki; Srikant, Ramakrishnan; Toivonen, Hannu and Verkamo, A.I. “Fast Discovery of Association Rules.” Advances in KDD, AAAI Press, Menlo Park, CA, pg. 307–328, 1996.
  2. Demetrovics, J; Katona, G; Miklos, D; Seleznjev, O. and Thalheim, B. “The Average Length of Keys and Functional Dependencies in (Random) Databases.” Lecture Notes in Computer Science, vol. 893, 1995.
  3. Eiter, Thomas and Gottlob, Georg. “Identifying the Minimal Transversals of a Hypergraph and Related Problems.” SIAM Journal on Computing, vol. 24, no. 6, pg. 1278–1304, 1995.
  4. Flach, Peter and Savnik, Iztok. “Database Dependency Discovery: a Machine Learning Approach.” AI Comm., vol. 12, no. 3, pg. 139–160.
  5. Gunopulos, Dimitrios; Khardon, Roni; Mannila, Heikki and Toivonen, Hannu. “Data Mining, Hypergraph Traversals, and Machine Learning (extended abstract).” PODS, pg. 209–216, 1997.
  6. Huhtala, Ykä; Kärkkäinen, Juha; Porkka, Pasi and Toivonen, Hannu. “TANE: An Efficient Algorithm for Discovering Functional and Approximate Dependencies.” The Computer Journal, vol. 42, no. 2, 1999.
  7. Kantola, Martti; Mannila, Heikki; Räihä, Kari-Jouko and Siirtola, Harri. “Discovering Functional and Inclusion Dependencies in Relational Databases.” Journal of Intelligent Systems, vol. 7, pg. 591–607, 1992.
  8. Lopes, Stephane; Petit, Jean-Marc and Lakhal, Lotfi. “Efficient Discovery of Functional Dependencies and Armstrong Relations.” Proceedings of EDBT 2000. Lecture Notes in Computer Science, vol. 1777.
  9. Mannila, Heikki and Räihä, Kari-Jouko. “Dependency Inference (Extended Abstract).” Proceedings of the Very Large Databases Conference (VLDB), Brighton, pg. 155–158, 1987.
  10. Mannila, Heikki and Räihä, Kari-Jouko. “Algorithms for Inferring Functional Dependencies from Relations.” Data & Knowledge Engineering, vol. 12, pg. 83–99, 1994.
  11. Merz, C.J. and Murphy, P.M. UCI Machine Learning databases (1996). http://www.ics.uci.edu/~mlearn/MLRepository.html. Irvine, CA: University of California, Department of Information and Comp. Sci.
  12. The Tane and Tane/mem source code is available on the web at http://www.cs.helsinki.fi/research/fdk/datamining/tane
  13. Wyss, C; Giannella, C and Robertson, E. “FastFDs: A Heuristic-Driven, Depth-First Algorithm for Mining Functional Dependencies from Relation Instances.” Technical Report, Dept. of Comp. Sci., Indiana University, May 2001.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Catharine Wyss (1)
  • Chris Giannella (1)
  • Edward Robertson (1)
  1. Computer Science Department, Indiana University, Bloomington, USA
