On the Generation of 2-Dimensional Index Workloads

  • Joseph M. Hellerstein
  • Lisa Hellerstein
  • George Kollios
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1540)

Abstract

A large number of database index structures have been proposed over the last two decades, and little consensus has emerged regarding their relative effectiveness. In order to empirically evaluate these indexes, it is helpful to have methodologies for generating random queries for performance testing. In this paper we propose a domain-independent approach to the generation of random queries: choose randomly among all logically distinct queries. We investigate this idea in the context of range queries over 2-dimensional points. We present an algorithm that chooses randomly among logically distinct 2-d range queries. It has constant-time expected performance over uniformly distributed data, and exhibited good performance in experiments over a variety of real and synthetic data sets. We observe nonuniformities in the way randomly chosen logical 2-d range queries are distributed over a variety of spatial properties. This raises questions about the quality of the workloads generated from such queries. We contrast our approach with previous work that generates workloads of random spatial ranges, and we sketch directions for future work on the robust generation of workloads for studying index performance.
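The idea of choosing among logically distinct range queries can be illustrated with a brute-force sketch. It is not the algorithm from the paper, whose constant-time expected behavior is not reproduced here. The sketch assumes that two range queries are logically distinct when they return different sets of points from the data set, that ranges are closed axis-aligned rectangles, and that the empty result is excluded. The function names `distinct_range_results` and `random_logical_query` are hypothetical.

```python
import random
from itertools import combinations_with_replacement


def distinct_range_results(points):
    """Enumerate the distinct non-empty result sets of closed rectangles.

    Assumption: any non-empty result equals the result of the bounding box
    of the points it contains, so only rectangles whose sides lie on point
    coordinates need to be tried. Brute force, for small inputs only.
    """
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    results = set()
    for x_lo, x_hi in combinations_with_replacement(xs, 2):
        for y_lo, y_hi in combinations_with_replacement(ys, 2):
            hit = frozenset(
                (x, y)
                for x, y in points
                if x_lo <= x <= x_hi and y_lo <= y <= y_hi
            )
            if hit:
                results.add(hit)
    return results


def random_logical_query(points, rng=random):
    """Pick one distinct result set uniformly at random.

    Returns the bounding box (x_lo, y_lo, x_hi, y_hi) that produces the
    chosen result, together with the result set itself.
    """
    # Sort so that a seeded generator gives a reproducible choice.
    candidates = sorted(distinct_range_results(points), key=sorted)
    chosen = rng.choice(candidates)
    xs = [x for x, _ in chosen]
    ys = [y for _, y in chosen]
    return (min(xs), min(ys), max(xs), max(ys)), chosen


if __name__ == "__main__":
    pts = [(0, 0), (1, 1), (2, 0)]
    box, result = random_logical_query(pts, random.Random(0))
    print(box, sorted(result))
```

Under these assumptions, each distinct result set is equally likely to be drawn, regardless of how much area the rectangles producing it cover. That differs from drawing random spatial rectangles, where many rectangles can map to the same result.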

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Joseph M. Hellerstein (1)
  • Lisa Hellerstein (2)
  • George Kollios (2)

  1. Berkeley EECS Computer Science Division, University of California, Berkeley, USA
  2. Dept. of Computer and Information Science, Six MetroTech Center, Polytechnic University, Brooklyn, USA
