Abstract
In high-dimensional query processing, optimizing the logical page size of index structures is an important research issue. Even very simple query processing techniques such as the sequential scan can outperform indexes that are not suitably optimized. Page-size optimization based on a cost model faces the problem that the optimum depends not only on static schema information, such as the dimension of the data space, but also on dynamically changing parameters, such as the number of objects stored in the database and the degree of clustering and correlation in the current data set. Therefore, we propose a method for adapting the page size of an index dynamically during insert processing. Our solution, called the DABS-tree, uses a flat directory whose entries consist of a minimum bounding rectangle (MBR), a pointer to the data page, and the size of the data page. Before a page is split during an insert operation, a cost model is consulted to estimate whether the split is beneficial. If it is not, the split is avoided and the logical page size is adapted instead. A similar rule applies to merging during delete operations. We present an algorithm for managing data pages of varying size in an index and show that all restructuring operations are locally restricted. Our experimental evaluation shows that the DABS-tree outperforms the X-tree by a factor of up to 4.6 and the sequential scan by a factor of up to 6.6.
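The split-or-grow rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the names `DirectoryEntry`, `on_overflow` and `toy_cost`, the median split, growing by one block, and the cost formula (a simple Minkowski-sum access probability times seek plus transfer time) are all assumptions made for illustration. Only the directory entry layout (MBR, page pointer, page size) and the idea of consulting a cost model before splitting come from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class DirectoryEntry:
    """One flat-directory entry: MBR, page pointer, page size."""
    mbr_low: Point
    mbr_high: Point
    page_id: int      # stands in for the pointer to the data page
    page_blocks: int  # logical page size, counted in disk blocks (assumption)


def mbr_of(points: Sequence[Point]) -> Tuple[Point, Point]:
    """Minimum bounding rectangle of a non-empty point set."""
    low = tuple(min(c) for c in zip(*points))
    high = tuple(max(c) for c in zip(*points))
    return low, high


def toy_cost(entries: Sequence[DirectoryEntry], query_side: float = 0.1,
             seek: float = 10.0, transfer: float = 1.0) -> float:
    """Hypothetical cost model, NOT the one from the paper.

    Access probability of a page is approximated by the volume of its MBR
    enlarged by the query side length, clipped to the unit data space.
    """
    total = 0.0
    for e in entries:
        prob = 1.0
        for lo, hi in zip(e.mbr_low, e.mbr_high):
            prob *= min(1.0, (hi - lo) + query_side)
        total += prob * (seek + transfer * e.page_blocks)
    return total


def on_overflow(entry: DirectoryEntry, points: Sequence[Point], next_page_id: int,
                cost: Callable[[Sequence[DirectoryEntry]], float] = toy_cost,
                ) -> List[Tuple[DirectoryEntry, List[Point]]]:
    """Decide between splitting an overflowing page and enlarging it."""
    low, high = mbr_of(points)
    # Alternative 1: keep one page and enlarge it by one block.
    grown = DirectoryEntry(low, high, entry.page_id, entry.page_blocks + 1)

    # Alternative 2: median split along the dimension with the largest extent.
    dim = max(range(len(low)), key=lambda d: high[d] - low[d])
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2
    halves = [ordered[:mid], ordered[mid:]]
    split = [DirectoryEntry(*mbr_of(h), pid, entry.page_blocks)
             for h, pid in zip(halves, (entry.page_id, next_page_id))]

    # Split only if the cost model predicts a benefit; otherwise adapt the size.
    if cost(split) < cost([grown]):
        return list(zip(split, halves))
    return [(grown, list(points))]
```

Under this toy model, two well-separated clusters lead to a split because each resulting MBR is rarely hit by a query, whereas a query that touches nearly every page (for example `query_side=1.0`) makes the single enlarged page cheaper. Only the affected entry is rewritten in either case, which mirrors the claim that restructuring stays local.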
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Böhm, C., Kriegel, HP. (2000). Dynamically Optimizing High-Dimensional Index Structures. In: Zaniolo, C., Lockemann, P.C., Scholl, M.H., Grust, T. (eds) Advances in Database Technology — EDBT 2000. EDBT 2000. Lecture Notes in Computer Science, vol 1777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46439-5_3
DOI: https://doi.org/10.1007/3-540-46439-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67227-2
Online ISBN: 978-3-540-46439-6
eBook Packages: Springer Book Archive