On estimating the size of projections

  • Jeffrey F. Naughton
  • S. Seshadri
Optimization
Part of the Lecture Notes in Computer Science book series (LNCS, volume 470)

Abstract

We present a new sampling algorithm for estimating the number of tuples in the projection of a relation. The algorithm requires no assumptions about the distribution of values in the attributes of the relation, and it converges faster and more smoothly than previous sampling algorithms for the problem. We give both a sound theoretical basis for the algorithm and experimental data from an implementation.
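To make the estimated quantity concrete, the following minimal sketch computes the exact size of a projection (the number of distinct projected tuples) and the number of distinct projected tuples seen in a uniform random sample. It is not the algorithm of the paper, which the abstract does not describe; the names `projection_size` and `sample_distinct`, the toy relation, and the sample size are all hypothetical choices made for illustration.

```python
import random


def projection_size(relation, attrs):
    """Exact number of distinct tuples in the projection of `relation` onto `attrs`.

    `relation` is a list of tuples; `attrs` is a tuple of column positions.
    """
    return len({tuple(row[a] for a in attrs) for row in relation})


def sample_distinct(relation, attrs, n, seed=0):
    """Distinct projected tuples observed in a uniform sample of n rows.

    Sampling is without replacement. The result can never exceed the exact
    projection size, so it is only a lower bound, not an estimate.
    """
    rng = random.Random(seed)
    sample = rng.sample(relation, n)
    return projection_size(sample, attrs)


# Hypothetical toy relation with three attributes (illustration only).
relation = [(i, i % 50, i % 7) for i in range(1000)]

exact = projection_size(relation, (1, 2))      # projection onto columns 1 and 2
seen = sample_distinct(relation, (1, 2), 100)  # distinct tuples in a 100-row sample

print("exact projection size:", exact)
print("distinct tuples seen in sample:", seen)
```

The count observed in the sample is bounded by both the sample size and the true projection size, which is why a raw sample count cannot serve directly as an estimate and a dedicated estimator is needed.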

Copyright information

© Springer-Verlag 1990

Authors and Affiliations

  • Jeffrey F. Naughton (1)
  • S. Seshadri (1)
  1. Computer Sciences Department, University of Wisconsin-Madison, USA
