Abstract
In modern database applications, the similarity or dissimilarity of data objects is examined by performing distance-based queries (DBQs) on multidimensional data. The R-tree and its variations are commonly cited multidimensional access methods. In this paper, we investigate the performance of the most representative distance-based queries in multidimensional data spaces, where the point datasets are indexed by tree-like structures belonging to the R-tree family. To perform the K-nearest neighbor query (K-NNQ) and the K-closest pair query (K-CPQ), non-incremental recursive branch-and-bound algorithms are employed. The K-CPQ is shown to be a very expensive query for datasets of high cardinality, and it becomes even more costly as the dimensionality increases. We also give ε-approximate versions of the DBQ algorithms that run faster than the exact ones, at the expense of introducing a relative error in the distances of the result. Experimentation with synthetic multidimensional point datasets following Uniform and Gaussian distributions reveals that the best index structure for K-NNQ is the X-tree. For K-CPQ, however, the R*-tree outperforms the X-tree with respect to response time and number of disk accesses when an LRU buffer is used. Moreover, applying the ε-approximate technique to the recursive K-CPQ algorithm quickly yields acceptable approximations of the result, although the trade-off between cost and accuracy cannot be easily controlled by users.
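To make the idea of a recursive branch-and-bound K-NNQ with an approximation parameter concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy in-memory hierarchy of bounding boxes (the hypothetical `build` helper) standing in for an R-tree, and it assumes one common pruning rule: a subtree is skipped when its minimum possible distance, inflated by a factor (1 + eps), cannot beat the current K-th best distance. With `eps = 0` the search is exact; with `eps > 0` the K-th reported distance is at most (1 + eps) times the true K-th distance.

```python
import heapq
import math


def mindist(q, lo, hi):
    """Smallest Euclidean distance from point q to the box [lo, hi]."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))


def build(points, leaf_size=4, axis=0):
    """Hypothetical stand-in for an R-tree: split on alternating axes
    and store a minimum bounding box (lo, hi) in every node."""
    dim = len(points[0])
    lo = tuple(min(p[d] for p in points) for d in range(dim))
    hi = tuple(max(p[d] for p in points) for d in range(dim))
    if len(points) <= leaf_size:
        return {"lo": lo, "hi": hi, "points": points, "children": None}
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    nxt = (axis + 1) % dim
    return {"lo": lo, "hi": hi, "points": None,
            "children": [build(pts[:mid], leaf_size, nxt),
                         build(pts[mid:], leaf_size, nxt)]}


def knn(node, q, k, eps=0.0, best=None):
    """Recursive branch-and-bound K-NNQ.
    `best` is a max-heap of (-distance, point) holding at most k entries."""
    if best is None:
        best = []
    if node["children"] is None:
        # Leaf: offer every point to the current result set.
        for p in node["points"]:
            d = math.dist(q, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
        return best
    # Internal node: visit children in ascending order of mindist.
    for child in sorted(node["children"],
                        key=lambda c: mindist(q, c["lo"], c["hi"])):
        bound = -best[0][0] if len(best) == k else math.inf
        # Assumed pruning rule: eps > 0 prunes more aggressively.
        if mindist(q, child["lo"], child["hi"]) * (1.0 + eps) >= bound:
            continue
        knn(child, q, k, eps, best)
    return best
```

A call such as `knn(build(points), query, k=5, eps=0.5)` returns the heap of results; negating the first element of each entry recovers the distances. The same pruning pattern extends to K-CPQ by replacing the point-to-box distance with a box-to-box minimum distance over pairs of nodes, which is where the cost grows quickly with cardinality and dimensionality.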
The author has been partially supported by the Spanish CICYT (project TIC 2002-03968).
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Corral, A., Cañadas, J., Vassilakopoulos, M. (2003). Processing Distance-Based Queries in Multidimensional Data Spaces Using R-trees. In: Manolopoulos, Y., Evripidou, S., Kakas, A.C. (eds) Advances in Informatics. PCI 2001. Lecture Notes in Computer Science, vol 2563. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-38076-0_1
DOI: https://doi.org/10.1007/3-540-38076-0_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-07544-8
Online ISBN: 978-3-540-38076-4
eBook Packages: Springer Book Archive