Abstract
In this chapter we present the PK-tree, an index structure for high-dimensional point data. The proposed structure can be viewed as combining aspects of the PR-quad tree or K-D tree, but with unnecessary nodes eliminated. Such nodes are typically the result of skew in the point distribution, and we show that eliminating them makes the performance of the resulting index robust to skewed data distributions. The index structure is formally defined and efficiently updatable, and bounds on the number of nodes and the mean height of the tree can be proved. Bounds on the expected height of the tree can be given under certain mild constraints on the spatial distribution of points. Empirical evidence on both real and generated data sets shows that the PK-tree outperforms recently proposed spatial indexes based on the R-tree, such as the SR-tree and the X-tree, by a wide margin. Significantly, the relative performance advantage of the PK-tree grows with the dimensionality of the data set.
© 2000 Springer Science+Business Media New York
About this chapter
Cite this chapter
Wang, W., Yang, J., Muntz, R. (2000). PK-Tree: A Spatial Index Structure for High Dimensional Point Data. In: Tanaka, K., Ghandeharizadeh, S., Kambayashi, Y. (eds) Information Organization and Databases. The Springer International Series in Engineering and Computer Science, vol 579. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1379-7_20
DOI: https://doi.org/10.1007/978-1-4615-1379-7_20
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5524-3
Online ISBN: 978-1-4615-1379-7
eBook Packages: Springer Book Archive