Characterizing functional dependencies in formal concept analysis with pattern structures

Article

Abstract

Computing functional dependencies from a relation is an important database topic, with many applications in database management, reverse engineering and query optimization. Whereas it has been deeply investigated in those fields, strong links exist with the mathematical framework of Formal Concept Analysis. Considering the discovery of functional dependencies, it is indeed known that a relation can be expressed as the binary relation of a formal context, whose implications are equivalent to those dependencies. However, this leads to a new data representation that is quadratic in the number of objects w.r.t. the original data. Here, we present an alternative avoiding such a data representation and show how to characterize functional dependencies using the formalism of pattern structures, an extension of classical FCA to handle complex data. We also show how another class of dependencies can be characterized with that framework, namely, degenerated multivalued dependencies. Finally, we discuss and compare the performances of our new approach in a series of experiments on classical benchmark datasets.

Keywords

Association rules Attribute implications Data dependencies Pattern structures Formal concept analysis 

Mathematical Subject Classifications (2010)

MSC2010 06A15 68P15 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley, Reading (1995)MATHGoogle Scholar
  2. 2.
    Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I.: Fast discovery of association rules. In: Advances in Knowledge Discovery and Data Mining, pp. 307–328. AAAI/MIT Press (1996)Google Scholar
  3. 3.
    Babin, M.A., Kuznetsov, S.O.: Computing premises of a minimal cover of functional dependencies is intractable. Discret. Appl. Math. 161(6), 742–749 (2013)MathSciNetCrossRefMATHGoogle Scholar
  4. 4.
    Baixeries, J.: Lattice Characterization of Armstrong and Symmetric Dependencies (PhD Thesis). Universitat Politècnica de Catalunya (2007)Google Scholar
  5. 5.
    Baixeries, J., Balcázar, J.L.: Characterization and armstrong relations for degenerate multivalued dependencies using formal concept analysis. In: Ganter, B., Godin, R. (eds.) ICFCA, Volume 3403 of Lecture Notes in Computer Science, pp. 162–175. Springer (2005Google Scholar
  6. 6.
    Baudinet, M., Chomicki, J., Wolper, P.: Constraint-generating dependencies. J. Comput. Syst. Sci. 59(1), 94–115 (1999)MathSciNetCrossRefMATHGoogle Scholar
  7. 7.
    Beeri, C., Dowd, M., Fagin, R., Statman, R.: On the structure of armstrong relations for functional dependencies. J. ACM 31(1), 30–46 (1984)MathSciNetCrossRefMATHGoogle Scholar
  8. 8.
    Beeri, C., Vardi, M.Y.: Formal systems for tuple and equality generating dependencies. SIAM J. Comput. 13(1), 76–98 (1984)MathSciNetCrossRefMATHGoogle Scholar
  9. 9.
    Belohlávek, R., Vychodil, V.: Data tables with similarity relations: functional dependencies, complete rules and non-redundant bases. In: Lee,M.-L., Tan, K.-L.,Wuwongse, V. (eds.) DASFAA, Volume 3882 of Lecture Notes in Computer Science, pp. 644–658. Springer (2006)Google Scholar
  10. 10.
    Bohannon, P., Fan, W., Geerts, F., Jia, X., Kementsietsidis, A.: Conditional functional dependencies for data cleaning. In: Chirkova, R., Dogac, A., Özsu, M.T., Sellis, T.K. (eds.) ICDE, pp. 746–755. IEEE (2007)Google Scholar
  11. 11.
    Caspard, N., Monjardet, B.: The lattices of closure systems, closure operators, and implicational systems on a finite set: a survey. Discret. Appl. Math. 127(2), 241–269 (2003)MathSciNetCrossRefMATHGoogle Scholar
  12. 12.
    Davey, B.A., Priestley, H.A.: Introduction to Lattices and Order. Cambridge University Press, Cambridge (1990)MATHGoogle Scholar
  13. 13.
    Diallo, T., Novelli, N., Petit, J.-M.: Discovering (frequent) constant conditional functional dependencies. IJDMMM 4(3), 205–223 (2012)CrossRefGoogle Scholar
  14. 14.
    Fan,W.: Dependencies revisited for improving data quality. In: Proceedings of the Twenty-Seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS ’08, pp. 159–170. ACM, New York (2008)Google Scholar
  15. 15.
    Fan, W., Geerts, F., Li, J., Xiong, M.: Discovering conditional functional dependencies. IEEE Trans. Knowl. Data Eng. 23(5), 683–698 (2011)CrossRefGoogle Scholar
  16. 16.
    Ganter, B., Kuznetsov, S.O.: Pattern structures and their projections. In: Delugach, H.S., Stumme, G. (eds.) Conceptual Structures: Broadening the Base, Proceedings of the 9th International Conference on Conceptual Structures (ICCS 2001), LNCS 2120, pp. 129–142. Springer (2001)Google Scholar
  17. 17.
    Ganter, B., Wille, R.: Formal Concept Analysis. Springer, Berlin (1999)CrossRefMATHGoogle Scholar
  18. 18.
    Graetzer, G., Davey, B., Freese, R., Ganter, B., Greferath, M., Jipsen, P., Priestley, H., Rose, H., Schmidt, E., Schmidt, S., Wehrung, F., Wille, R.: General Lattice Theory. Freeman, San Francisco (1971)Google Scholar
  19. 19.
    Guigues, J.-L., Duquenne, V.: Familles minimales d’implications informatives résultant d’un tableau de données binaires. Math. Sci. Hum. 95, 5–18 (1986)MathSciNetGoogle Scholar
  20. 20.
    Huhtala, Y., Kärkkäinen, J., Porkka, P., Toivonen, H.: Tane: an efficient algorithm for discovering functional and approximate dependencies. Comput. J. 42(2), 100–111 (1999)CrossRefMATHGoogle Scholar
  21. 21.
    Kanellakis, P.C.: Elements of relational database theory. In: van Leeuwen, J. (ed.) Handbook of Theoretical Computer Science, vol. B, pp. 1073–1156. MIT Press, Cambridge (1990)Google Scholar
  22. 22.
    Kaytoue, M., Kuznetsov, S.O., Napoli, A.: Revisiting numerical pattern mining with formal concept analysis. In: Walsh, T. (ed.) IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16–22, 2011, pp. 1342–1347. IJCAI/AAAI, Barcelona (2011)Google Scholar
  23. 23.
    Kaytoue, M., Kuznetsov, S.O., Napoli, A., Duplessis, S.: Mining gene expression data with pattern structures in formal concept analysis. Inf. Sci. 181(10), 1989–2001 (2011)MathSciNetCrossRefGoogle Scholar
  24. 24.
    Kuznetsov, S.O.: Machine learning on the basis of formal concept analysis. Autom. Remote Control 62(10), 1543–1564 (2001)MathSciNetCrossRefMATHGoogle Scholar
  25. 25.
    Kuznetsov, S.O.: Galois connections in data analysis: contributions from the soviet era and modern russian research. In: Ganter, B., Stumme, G., Wille, R. (eds.) Formal Concept Analysis, Foundations and Applications. Lecture Notes in Computer Science 3626, pp. 196–225. Springer (2005)Google Scholar
  26. 26.
    Kuznetsov, S.O., Obiedkov, S.A.: Comparing performance of algorithms for generating concept lattices. J. Exp. Theor. Artif. Intell. 14(2–3), 189–216 (2002)CrossRefMATHGoogle Scholar
  27. 27.
    Lopes, S., Petit, J.-M., Lakhal, L.: Functional and approximate dependency mining: database and fca points of view. J. Exp. Theor. Artif. Intell. 14(2–3), 93–114 (2002)CrossRefMATHGoogle Scholar
  28. 28.
    Maier, D.: The Theory of Relational Databases. Computer Science, Rockville (1983)MATHGoogle Scholar
  29. 29.
    Medina, R., Nourine, L.: A unified hierarchy for functional dependencies, conditional functional dependencies and association rules. In: Ferré, S., Rudolph, S. (eds.) ICFCA, Volume 5548 of Lecture Notes in Computer Science, pp. 98–113. Springer (2009)Google Scholar
  30. 30.
    Medina, R., Nourine, L.: Conditional functional dependencies: an fca point of view. In: Kwuida, L., Sertkaya, B. (eds.) ICFCA, Volume 5986 of Lecture Notes in Computer Science, pp. 161–176. Springer (2010)Google Scholar
  31. 31.
    Nedjar, S., Pesci, F., Lakhal, L., Cicchetti, R.: The agree concept lattice for multidimensional database analysis. In: Valtchev, P., Jäschke, R. (eds.) ICFCA, Volume 6628 of Lecture Notes in Computer Science, pp. 219–234. Springer (2011)Google Scholar
  32. 32.
    Ramakrishnan, R., Gehrke, J.: Database Management Systems, 2nd edn. Osborne/McGraw-Hill, Berkeley (2000)Google Scholar
  33. 33.
    Sagiv, Y., Delobel, C., Parker, D.S. Jr., Fagin, R.: An equivalence between relational database dependencies and a fragment of propositional logic. J. ACM 28(3), 435–453 (1981)MathSciNetCrossRefMATHGoogle Scholar
  34. 34.
    Simovici, D.A., Cristofor, D., Cristofor, L.: Impurity measures in databases. Acta Inf. 38(5), 307–324 (2002)MathSciNetCrossRefMATHGoogle Scholar
  35. 35.
    Simovici, D.A., Tenney, R.L.: Relational Database Systems, 1st edn. Academic, Orlando (1995)Google Scholar
  36. 36.
    Song, S., Chen, L.: Differential dependencies: reasoning and discovery. ACM Trans. Database Syst. 36(3), 16:1–16:41 (2011)CrossRefGoogle Scholar
  37. 37.
    Song, S., Chen, L.: Efficient discovery of similarity constraints for matching dependencies. Data Knowl. Eng. 87(0), 146–166 (2013). doi:10.1016/j.datak.2013.06.003. http://www.sciencedirect.com/science/article/pii/S0169023X13000700 MathSciNetCrossRefGoogle Scholar
  38. 38.
    Song, S., Chen, L., Yu, P.S.: Comparable dependencies over heterogeneous data. VLDB J. 22(2), 253–274 (2013)CrossRefGoogle Scholar
  39. 39.
    Ullman, J.: Principles of Database Systems and Knowledge-Based Systems, vol. 1–2. Computer Science, Rockville (1989)Google Scholar
  40. 40.
    Uno, T., Kiyomi, M., Arimura, H.: Lcm ver. 2: efficient mining algorithms for frequent/closed/maximal itemsets. In: Bayardo, R.J. Jr., Goethals, B., Zaki, M.J. (eds.) FIMI, Volume 126 of CEUR Workshop Proceedings. CEUR-WS.org (2004)Google Scholar
  41. 41.
    Valtchev, P., Missaoui, R., Godin, R.: Formal concept analysis for knowledge discovery and data mining: The new challenges. In: Eklund, P.W. (ed.) ICFCA, Volume 2961 of Lecture Notes in Computer Science, pp. 352–371. Springer (2004)Google Scholar
  42. 42.
    Wille, R.: Why can concept lattices support knowledge discovery in databases? J. Exp. Theor. Artif. Intell. 14(2–3), 81–92 (2002)CrossRefMATHGoogle Scholar
  43. 43.
    Wyss, C., Giannella, C., Robertson, E.L.: Fastfds: a heuristic-driven, depth-first algorithm for mining functional dependencies from relation instances—extended abstract. In: Proceedings of the Third International Conference on Data Warehousing and Knowledge Discovery, DaWaK ’01, pp. 101–110. Springer, London (2001)Google Scholar

Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. 1.Departament de Llenguatges i Sistemes InformàticsUniversitat Politècnica de CatalunyaBarcelonaSpain
  2. 2.Université de Lyon, CNRS, INSA-Lyon, LIRIS, UMR5205LyonFrance
  3. 3.LORIA (CNRS – Inria Nancy Grand-Est – Université de Lorraine)Vandœuvre-lès-NancyFrance

Personalised recommendations