Slim-Trees: High Performance Metric Trees Minimizing Overlap between Nodes

  • Caetano Traina Jr.
  • Agma Traina
  • Bernhard Seeger
  • Christos Faloutsos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1777)

Abstract

In this paper we present the Slim-tree, a dynamic tree for organizing metric datasets in pages of fixed size. The Slim-tree uses the “fat-factor”, which provides a simple way to quantify the degree of overlap between the nodes in a metric tree. It is well known that the degree of overlap directly affects the query performance of index structures. Many techniques have been proposed to reduce overlap in multidimensional index structures, but the Slim-tree is the first metric structure explicitly designed to reduce the degree of overlap.
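To make the link between overlap and query cost concrete, the following is a minimal sketch, not the paper's algorithm and not a computation of the fat-factor, which the abstract does not define. It assumes that each node covers a ball (a center object plus a covering radius) under a metric `d`. The names `Ball`, `may_overlap`, and `nodes_visited` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import dist
from typing import Callable, Sequence

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


@dataclass
class Ball:
    """Hypothetical node region: a center object and a covering radius."""
    center: Point
    radius: float


def may_overlap(a: Ball, b: Ball, d: Metric = dist) -> bool:
    # By the triangle inequality, two balls are certainly disjoint when
    # the distance between their centers exceeds the sum of their radii.
    return d(a.center, b.center) <= a.radius + b.radius


def nodes_visited(query: Point, balls: Sequence[Ball], d: Metric = dist) -> int:
    # A point query must descend into every node whose ball contains the
    # query object, so overlapping balls mean extra node accesses.
    return sum(1 for b in balls if d(query, b.center) <= b.radius)


# Two sibling regions with the same centers but different radii (assumed data).
disjoint = [Ball((0.0, 0.0), 1.0), Ball((3.0, 0.0), 1.0)]
overlapping = [Ball((0.0, 0.0), 2.0), Ball((3.0, 0.0), 2.0)]
query = (1.0, 0.0)

print(nodes_visited(query, disjoint))     # one node contains the query
print(nodes_visited(query, overlapping))  # both nodes contain the query
```

With the disjoint regions the query falls into a single node, while with the enlarged, overlapping regions the same query must visit both. Euclidean distance (`math.dist`) stands in here for an arbitrary metric; any function satisfying the metric axioms could be passed as `d`.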

Moreover, we present new algorithms for inserting objects and splitting nodes. The new insertion algorithm leads to a tree with high storage utilization and improved query performance, whereas the new split algorithm runs considerably faster than previous ones, generally without sacrificing search performance. Results obtained from experiments with real-world datasets show that the new algorithms of the Slim-tree consistently lead to performance improvements. After performing the Slim-down algorithm, we observed improvements of up to 35% for range queries.

Keywords

Minimal Span Tree; Index Structure; Distance Calculation; Range Query; Point Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Caetano Traina Jr. (1)
  • Agma Traina (1)
  • Bernhard Seeger (2)
  • Christos Faloutsos (3)

  1. Department of Computer Science, University of São Paulo at São Carlos, Brazil
  2. Fachbereich Mathematik und Informatik, Universität Marburg, Germany
  3. Department of Computer Science, Carnegie Mellon University, USA
