Dynamically Optimizing High-Dimensional Index Structures

  • Conference paper
Advances in Database Technology — EDBT 2000 (EDBT 2000)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1777))

Abstract

In high-dimensional query processing, optimizing the logical page size of index structures is an important research issue. Even very simple query processing techniques such as the sequential scan can outperform indexes that are not suitably optimized. Page-size optimization based on a cost model faces the problem that the optimum depends not only on static schema information, such as the dimension of the data space, but also on dynamically changing parameters, such as the number of objects stored in the database and the degree of clustering and correlation in the current data set. Therefore, we propose a method for adapting the page size of an index dynamically during insert processing. Our solution, called the DABS-tree, uses a flat directory whose entries consist of an MBR (minimum bounding rectangle), a pointer to the data page, and the size of the data page. Before a page is split during an insert operation, a cost model is consulted to estimate whether the split is beneficial. If it is not, the split is avoided and the logical page size is adapted instead. A similar rule applies to merging during delete operations. We present an algorithm for managing data pages of varying size in an index and show that all restructuring operations are locally restricted. Our experimental evaluation shows that the DABS-tree outperforms the X-tree by a factor of up to 4.6 and the sequential scan by a factor of up to 6.6.
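
The insert rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the names `DirEntry`, `insert`, `split_widest` and `toy_cost` are hypothetical, the cost function is a crude stand-in for the paper's cost model, and growing the page by doubling is an assumption made only for illustration. The sketch shows the decision itself: on overflow, the cost of the split layout is compared with the cost of one larger page, and the cheaper layout is kept.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, ...]


@dataclass
class DirEntry:
    """One flat-directory entry: MBR, pointer to the data page, page size."""
    mbr_low: Point
    mbr_high: Point
    page: List[Point]   # stands in for the pointer to the data page
    page_size: int      # current logical page size, counted in points


def mbr_of(points: List[Point]) -> Tuple[Point, Point]:
    """Minimum bounding rectangle of a non-empty list of points."""
    dims = range(len(points[0]))
    low = tuple(min(p[d] for p in points) for d in dims)
    high = tuple(max(p[d] for p in points) for d in dims)
    return low, high


def split_widest(points: List[Point]) -> Tuple[List[Point], List[Point]]:
    """Median split along the dimension with the largest extent (assumed rule)."""
    low, high = mbr_of(points)
    dim = max(range(len(low)), key=lambda i: high[i] - low[i])
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


def toy_cost(pages: List[List[Point]], seek: float = 10.0,
             transfer: float = 1.0, radius: float = 0.05) -> float:
    """Stand-in cost model (an assumption, not the paper's model).

    A page is read with probability equal to the volume of its MBR enlarged
    by the query radius, clipped to the unit data space; reading it costs
    one seek plus a transfer time proportional to its content.
    """
    total = 0.0
    for pts in pages:
        low, high = mbr_of(pts)
        prob = 1.0
        for lo, hi in zip(low, high):
            prob *= min(1.0, (hi - lo) + 2.0 * radius)
        total += prob * (seek + transfer * len(pts))
    return total


def insert(entry: DirEntry, point: Point,
           cost: Callable[[List[List[Point]]], float]) -> List[DirEntry]:
    """Insert a point and return the directory entries that replace `entry`."""
    points = entry.page + [point]
    if len(points) <= entry.page_size:
        # No overflow: only the MBR may need to grow.
        return [DirEntry(*mbr_of(points), points, entry.page_size)]
    left, right = split_widest(points)
    if cost([left, right]) < cost([points]):
        # The cost model estimates that the split pays off.
        return [DirEntry(*mbr_of(half), half, entry.page_size)
                for half in (left, right)]
    # Split not beneficial: keep one page and enlarge its logical size.
    return [DirEntry(*mbr_of(points), points, entry.page_size * 2)]
```

With a small query radius the two halves are rarely read together, so `insert` splits the page. With a large radius, as is typical when high-dimensional queries touch almost every page, the extra seek makes the split more expensive and the page is enlarged instead. Only the affected directory entry is replaced, in line with the abstract's statement that restructuring is locally restricted.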

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Böhm, C., Kriegel, HP. (2000). Dynamically Optimizing High-Dimensional Index Structures. In: Zaniolo, C., Lockemann, P.C., Scholl, M.H., Grust, T. (eds) Advances in Database Technology — EDBT 2000. EDBT 2000. Lecture Notes in Computer Science, vol 1777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46439-5_3

  • DOI: https://doi.org/10.1007/3-540-46439-5_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67227-2

  • Online ISBN: 978-3-540-46439-6

  • eBook Packages: Springer Book Archive
