Abstract
Query cost estimation is an important and well-studied problem in relational database systems. In this paper we study the cost estimation problem in the context of spatial database systems.
We introduce a new method that provides accurate cost estimation for spatial selections, or window queries, by building wavelet-based histograms for spatial data. Our method is based upon two techniques: (a) A representation transformation in which geometric objects are represented by points in higher-dimensional space and window queries correspond to semi-infinite range-sum queries, and (b) Multiresolution wavelet decomposition that provides a time-efficient and space-efficient approximation of the underlying distribution of the multidimensional point data, especially for semi-infinite range-sum queries. We also show for the first time how a wavelet decomposition of a dense multidimensional array derived from a sparse array through a partial-sum computation can still be computed efficiently using sparse techniques by doing the processing implicitly on the original data. Our method eliminates the drawbacks of the partition-based histogram methods in previous work, and even with very small space allocation it gives excellent cost estimation over a broad range of spatial data distributions and queries.
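The two techniques can be illustrated on a toy case. The sketch below is not the paper's implementation: it assumes 1-D intervals on a small integer grid instead of 2-D spatial objects, builds the partial-sum array densely rather than with the sparse technique the abstract describes, and keeps coefficients by raw magnitude. All names (`N`, `to_point`, `partial_sums`, `keep_top`, and so on) are hypothetical. Each interval [lo, hi] becomes the point (lo, N-1-hi), so "intersects the window [qlo, qhi]" turns into the semi-infinite condition x <= qhi and y <= N-1-qlo, and the answer is a single entry of the partial-sum array. A Haar decomposition of that array is then truncated to a few coefficients to give a compact approximate histogram.

```python
from itertools import accumulate

N = 8  # grid resolution, a power of two (toy value chosen for illustration)

def to_point(lo, hi):
    """Map interval [lo, hi] to a 2-D point; hi is flipped so that window
    intersection becomes a dominance (semi-infinite range) condition."""
    return lo, N - 1 - hi

def partial_sums(intervals):
    """P[u][v] = number of intervals whose point (x, y) has x <= u and y <= v."""
    freq = [[0] * N for _ in range(N)]
    for lo, hi in intervals:
        x, y = to_point(lo, hi)
        freq[x][y] += 1
    rows = [list(accumulate(r)) for r in freq]        # prefix sums along y
    cols = [list(accumulate(c)) for c in zip(*rows)]  # prefix sums along x
    return [list(r) for r in zip(*cols)]

def window_count(P, qlo, qhi):
    """Number of intervals intersecting [qlo, qhi], read from one array cell."""
    return P[qhi][N - 1 - qlo]

def haar(vec):
    """Unnormalized 1-D Haar transform: overall average, then details."""
    out = list(vec)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        det = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avg + det
        n = half
    return out

def inverse_haar(coef):
    """Inverse of haar(): rebuild the signal level by level."""
    out = list(coef)
    n = 1
    while n < len(out):
        rec = []
        for i in range(n):
            rec += [out[i] + out[n + i], out[i] - out[n + i]]
        out[:2 * n] = rec
        n *= 2
    return out

def transform_2d(mat, f):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [f(r) for r in mat]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def keep_top(coefs, k):
    """Zero all but the k largest-magnitude coefficients (a simplification)."""
    ranked = sorted(((abs(c), i, j) for i, row in enumerate(coefs)
                     for j, c in enumerate(row)), reverse=True)
    kept = {(i, j) for _, i, j in ranked[:k]}
    return [[c if (i, j) in kept else 0.0 for j, c in enumerate(row)]
            for i, row in enumerate(coefs)]

intervals = [(0, 1), (2, 5), (4, 4), (6, 7), (1, 6)]
P = partial_sums(intervals)
print(window_count(P, 3, 4))       # exact count: 3

W = transform_2d(P, haar)          # wavelet coefficients of the partial sums
approx = transform_2d(keep_top(W, 8), inverse_haar)
print(window_count(approx, 3, 4))  # estimate from only 8 coefficients
```

Keeping all N*N coefficients reproduces the exact partial sums; the space/accuracy trade-off comes entirely from how many coefficients survive `keep_top`.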
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, M., Vitter, J.S., Lim, L., Padmanabhan, S. (2001). Wavelet-Based Cost Estimation for Spatial Queries. In: Jensen, C.S., Schneider, M., Seeger, B., Tsotras, V.J. (eds) Advances in Spatial and Temporal Databases. SSTD 2001. Lecture Notes in Computer Science, vol 2121. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47724-1_10
DOI: https://doi.org/10.1007/3-540-47724-1_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42301-0
Online ISBN: 978-3-540-47724-2
eBook Packages: Springer Book Archive