Abstract
We present a dynamic model of a database subjected to a sequence of queries and updates, which allows us to study how the sizes of relations evolve. While the problem of estimating the sizes of derived relations at a given time (the “static” case) has been the subject of several studies, to the best of our knowledge the evolution of relation sizes under queries and updates (the “dynamic” case) has not been studied so far. We treat the size of a relation as a random variable and study its probability distribution when the database is subjected to a sequence of insertions, deletions and queries. We show that it behaves asymptotically as a Gaussian process whose expectation and covariance are proportional to time. This approach also allows us to analyze the maximum size of the derived relation.
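As an informal illustration of the kind of quantity studied here, the following sketch simulates a relation under random insertions and deletions and records, after each update, the size of the relation and the size of its projection on one attribute (the number of distinct attribute values present). This is not the model analyzed in the paper: the uniform choice of attribute values, the uniform choice of the tuple to delete, the fixed insertion probability `p_insert`, and the function name `simulate_projection_size` are all assumptions made for the example.

```python
import random
from collections import Counter


def simulate_projection_size(steps, domain_size, p_insert=0.6, seed=0):
    """Return a list of (relation_size, projection_size) after each update.

    Assumed toy model (not taken from the paper): each update is an
    insertion with probability p_insert, otherwise a deletion of a
    uniformly chosen tuple; inserted attribute values are uniform over
    range(domain_size).
    """
    rng = random.Random(seed)
    tuples = []         # attribute value carried by each stored tuple
    counts = Counter()  # multiplicity of each attribute value
    history = []
    for _ in range(steps):
        if not tuples or rng.random() < p_insert:
            # Insertion: a new tuple with a random attribute value.
            value = rng.randrange(domain_size)
            tuples.append(value)
            counts[value] += 1
        else:
            # Deletion: swap a random tuple to the end and pop it.
            i = rng.randrange(len(tuples))
            tuples[i], tuples[-1] = tuples[-1], tuples[i]
            value = tuples.pop()
            counts[value] -= 1
            if counts[value] == 0:
                del counts[value]
        # The projection size is the number of distinct values present.
        history.append((len(tuples), len(counts)))
    return history


if __name__ == "__main__":
    run = simulate_projection_size(steps=10_000, domain_size=500)
    print("final relation size:", run[-1][0])
    print("final projection size:", run[-1][1])
    print("maximum projection size:", max(p for _, p in run))
```

Repeating the simulation with different seeds and plotting the projection size against the number of updates gives an empirical view of the fluctuations, and of the maximum, whose asymptotic behaviour the paper characterizes analytically.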
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gardy, D., Louchard, G. (1995). Dynamic analysis of the sizes of relations. In: Mayr, E.W., Puech, C. (eds) STACS 95. STACS 1995. Lecture Notes in Computer Science, vol 900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-59042-0_94
DOI: https://doi.org/10.1007/3-540-59042-0_94
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-59042-2
Online ISBN: 978-3-540-49175-0
eBook Packages: Springer Book Archive