Abstract
High-dimensional data, such as documents, digital images, and audio clips, can be regarded as spatial objects that induce a metric space, where the metric measures dissimilarity between objects. We propose a method for retrieving objects within some distance of a given object by using the R-tree, a spatial indexing/access method. Since the R-tree usually assumes a Euclidean metric, we have to embed objects into a Euclidean space. However, some naturally defined distance measures, such as the L1 distance (or Manhattan distance), cannot be embedded into any Euclidean space. First, we prove that objects in a discrete L1 metric space can be embedded into the vertices of a unit hypercube when the square root of the L1 distance is used as the distance. To take full advantage of R-tree spatial indexing, we have to project objects into a space of relatively low dimension. We adopt FastMap by Faloutsos and Lin to reduce the dimension of the object space. The range corresponding to a query (Q, h), which retrieves objects within distance h of an object Q, is naturally regarded as a hyper-sphere even after the FastMap projection, which is an orthogonal projection in Euclidean space. However, it turns out that, when FastMap is applied to objects embedded in the above-mentioned way, the query range is contracted into a hyper-box smaller than the hyper-sphere. Finally, we give a brief summary of experiments in applying our method to Japanese chess boards.
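The embedding claim can be illustrated with a minimal sketch. It assumes a unary (thermometer) encoding of each integer coordinate; the abstract does not say which construction the paper's proof uses, so this encoding and all function names below are illustrative assumptions. The last function is the standard FastMap coordinate formula (cosine law) from Faloutsos and Lin, shown only to indicate how the embedded distances would feed into the projection.

```python
import math
from typing import List, Sequence


def unary_embed(obj: Sequence[int], max_value: int) -> List[int]:
    """Map integer coordinates in [0, max_value] to a 0/1 hypercube vertex.

    Assumption: each coordinate x becomes x ones followed by
    (max_value - x) zeros (unary / thermometer code).
    """
    bits: List[int] = []
    for x in obj:
        if not 0 <= x <= max_value:
            raise ValueError(f"coordinate {x} outside [0, {max_value}]")
        bits.extend([1] * x + [0] * (max_value - x))
    return bits


def l1_distance(p: Sequence[int], q: Sequence[int]) -> int:
    """Manhattan (L1) distance between two integer vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Ordinary Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def fastmap_coordinate(d_ao: float, d_ab: float, d_bo: float) -> float:
    """Coordinate of object o on the line through pivots a and b.

    Inputs are Euclidean distances; here they would be square roots
    of L1 distances between the original objects.
    """
    return (d_ao ** 2 + d_ab ** 2 - d_bo ** 2) / (2 * d_ab)


if __name__ == "__main__":
    p, q = [3, 0, 2], [1, 4, 2]
    u, v = unary_embed(p, 4), unary_embed(q, 4)
    # The Euclidean distance between the embedded vertices equals
    # the square root of the L1 distance between the originals.
    print(l1_distance(p, q), euclidean_distance(u, v) ** 2)
```

Under this encoding the two unary codes of a coordinate differ in exactly |x - y| positions, so the squared Euclidean distance between vertices equals the L1 distance between the original objects.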
References
N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: An Efficient and Robust Access Method for Points and Rectangles. In Proc. ACM SIGMOD International Conference on Management of Data, 19(2):322–331, 1990.
T. Bially. Space-filling Curves: Their Generation and Their Application to Bandwidth Reduction. IEEE Trans. on Information Theory, IT-15(6):658–664, 1969.
C. Faloutsos and S. Roseman. Fractals for Secondary Key Retrieval. In Proc. 8th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 247–252, 1989.
C. Faloutsos and K.I. Lin. FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets. In Proc. ACM SIGMOD International Conference on Management of Data, 24(2):163–174, 1995.
A. Guttman. R-trees: A Dynamic Index Structure for Spatial Searching. In Proc. ACM SIGMOD, pp. 47–57, 1984.
I. Kamel and C. Faloutsos. On Packing R-trees. In Proc. 2nd. International Conference on Information and Knowledge Management, pp. 490–499, 1993.
M. Otterman. Approximate Matching with High Dimensionality R-trees. M. Sc. Scholarly paper, Dept. of Computer Science, Univ. of Maryland, 1992.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shinohara, T., An, J., Ishizaka, H. (1998). Approximate Retrieval of High-Dimensional Data by Spatial Indexing. In: Arikawa, S., Motoda, H. (eds) Discovery Science. DS 1998. Lecture Notes in Computer Science, vol 1532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49292-5_13
DOI: https://doi.org/10.1007/3-540-49292-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65390-5
Online ISBN: 978-3-540-49292-4
eBook Packages: Springer Book Archive