Abstract
Practical database applications give the impression that sets of constraints are rather small and that large sets are unusual and caused by bad design decisions. Theoretical investigations show, however, that minimal constraint sets are potentially very large: their size can be exponential in the number of attributes. This gap between belief and theory leads to theoretical results not being accepted. The beliefs, however, relate to average cases.
The theory known so far has considered worst-case complexity. This paper aims at developing a theory of average-case complexity. Several statistical models and the asymptotics of the corresponding probabilities are investigated for random databases. We show that exponential complexity of independent key sets and independent sets of functional dependencies is rather unusual. For almost all minimal keys, the length is determined mainly by the size of the relation. The number of minimal keys of any other length is exponentially small compared with the number of minimal keys of this derived length. Further, if a key is valid in a relation, then it is probably a minimal key. The same results hold for functional dependencies.
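The notions of key and minimal key used above can be made concrete with a small experiment. The following is only an illustrative sketch, not the statistical models studied in the paper: it assumes a relation is a list of distinct tuples whose values are drawn uniformly from a domain of fixed size, and it finds minimal keys by brute force. The function names (is_key, minimal_keys, random_relation) and all parameter values are hypothetical choices made for this illustration.

```python
import itertools
import random
from collections import Counter


def is_key(rows, attrs):
    """Return True if no two rows agree on every attribute index in attrs."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in attrs)
        if projection in seen:
            return False
        seen.add(projection)
    return True


def minimal_keys(rows, n_attrs):
    """Enumerate all minimal keys by brute force (exponential in n_attrs).

    Candidates are visited in order of increasing size, so a candidate that
    is a key and contains no previously found key must itself be minimal.
    """
    found = []
    for size in range(n_attrs + 1):
        for attrs in itertools.combinations(range(n_attrs), size):
            if any(set(k) <= set(attrs) for k in found):
                continue  # contains a smaller key, hence not minimal
            if is_key(rows, attrs):
                found.append(attrs)
    return found


def random_relation(n_rows, n_attrs, domain_size, rng):
    """Draw n_rows distinct tuples uniformly at random (assumed model)."""
    if domain_size ** n_attrs < n_rows:
        raise ValueError("not enough distinct tuples for the requested size")
    rows = set()
    while len(rows) < n_rows:
        rows.add(tuple(rng.randrange(domain_size) for _ in range(n_attrs)))
    return sorted(rows)


# Usage: tabulate how many minimal keys of each length one random relation has.
rng = random.Random(0)
rows = random_relation(n_rows=30, n_attrs=8, domain_size=2, rng=rng)
keys = minimal_keys(rows, 8)
print(Counter(len(k) for k in keys))
```

Repeating the experiment for different relation sizes gives an informal view of how the lengths of minimal keys vary with the number of rows. The brute-force search is only feasible for a handful of attributes.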
Supported by the Hungarian National Foundation of Scientific Research, Grant No. 2575.
Supported by the German Natural Science Research Council, contract BB-II-B1-3141-211(94).
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Demetrovics, J., Katona, G.O.H., Miklos, D., Seleznjev, O., Thalheim, B. (1995). The average length of keys and functional dependencies in (random) databases. In: Gottlob, G., Vardi, M.Y. (eds) Database Theory — ICDT '95. ICDT 1995. Lecture Notes in Computer Science, vol 893. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58907-4_21
DOI: https://doi.org/10.1007/3-540-58907-4_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58907-5
Online ISBN: 978-3-540-49136-1
eBook Packages: Springer Book Archive