Dynamic analysis of the sizes of relations

  • Conference paper
STACS 95 (STACS 1995)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 900)

Abstract

We present a dynamic model of a database subjected to a sequence of queries and updates, which allows us to study how the sizes of relations evolve. While the problem of estimating the sizes of derived relations at a given time (the “static” case) has been the subject of several studies, to the best of our knowledge the evolution of relation sizes under queries and updates (the “dynamic” case) has not been studied so far. We treat the size of a relation as a random variable and study its probability distribution when the database is subjected to a sequence of insertions, deletions and queries. We show that it behaves asymptotically as a Gaussian process whose expectation and covariance are proportional to time. This approach also allows us to analyze the maximum size of the derived relation.
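As a rough illustration of the claim that expectation and variance grow in proportion to time, the following sketch simulates a deliberately simplified toy model. It is not the authors' model: it assumes each operation is independently an insertion, a deletion or a query with fixed probabilities, and that the relation starts large enough for every deletion to succeed. The names `simulate_size_change` and `theoretical_moments` and the probabilities 0.4 and 0.3 are hypothetical choices made for this example.

```python
import random
import statistics


def simulate_size_change(n_ops, p_insert, p_delete, rng):
    """Net change in relation size after n_ops operations (toy model).

    Assumption: operations are independent, and the relation never
    becomes empty, so a deletion always removes exactly one tuple.
    """
    change = 0
    for _ in range(n_ops):
        u = rng.random()
        if u < p_insert:
            change += 1  # insertion adds one tuple
        elif u < p_insert + p_delete:
            change -= 1  # deletion removes one tuple
        # otherwise the operation is a query and the size is unchanged
    return change


def theoretical_moments(n_ops, p_insert, p_delete):
    """Mean and variance of the net change; both are linear in n_ops."""
    drift = p_insert - p_delete
    var_step = p_insert + p_delete - drift ** 2
    return n_ops * drift, n_ops * var_step


rng = random.Random(1995)  # fixed seed so the run is reproducible
p_insert, p_delete = 0.4, 0.3  # hypothetical operation probabilities

for n_ops in (100, 400):
    samples = [
        simulate_size_change(n_ops, p_insert, p_delete, rng)
        for _ in range(2000)
    ]
    mean, var = theoretical_moments(n_ops, p_insert, p_delete)
    print(
        f"n={n_ops}: empirical mean {statistics.mean(samples):.2f} "
        f"(theory {mean:.2f}), empirical variance "
        f"{statistics.variance(samples):.2f} (theory {var:.2f})"
    )
```

In this toy setting the net change is a sum of independent steps, so the central limit theorem gives the Gaussian behaviour directly. The paper's result concerns the sizes of derived relations, where such a direct argument is not available.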

Editor information

Ernst W. Mayr, Claude Puech

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gardy, D., Louchard, G. (1995). Dynamic analysis of the sizes of relations. In: Mayr, E.W., Puech, C. (eds) STACS 95. STACS 1995. Lecture Notes in Computer Science, vol 900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-59042-0_94

  • DOI: https://doi.org/10.1007/3-540-59042-0_94

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-59042-2

  • Online ISBN: 978-3-540-49175-0

  • eBook Packages: Springer Book Archive
