
The TV-tree: An index structure for high-dimensional data

The VLDB Journal

Abstract

We propose a file structure to index high-dimensionality data, which are typically points in some feature space. The idea is to use only a few of the features, using additional features only when the additional discriminatory power is absolutely necessary. We present in detail the design of our tree structure and the associated algorithms that handle such “varying length” feature vectors. Finally, we report simulation results, comparing the proposed structure with the R*-tree, which is one of the most successful methods for low-dimensionality spaces. The results illustrate the superiority of our method, which saves up to 80% in disk accesses.
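The idea of using extra features only when they are needed to tell items apart can be illustrated with a small sketch. This is not the TV-tree algorithm from the paper, whose details are not given in the abstract. It is a toy example under the assumption that features are already ordered by importance, and the function name `min_prefix_length` is hypothetical.

```python
from typing import Sequence


def min_prefix_length(vectors: Sequence[Sequence[float]]) -> int:
    """Return the smallest k such that the first k features distinguish all vectors.

    Illustrative only: assumes every vector has the same length and that
    features are ordered from most to least important.
    """
    if not vectors:
        return 0
    dims = len(vectors[0])
    for k in range(dims + 1):
        # Truncate every vector to its first k features and check for collisions.
        prefixes = {tuple(v[:k]) for v in vectors}
        if len(prefixes) == len(vectors):
            return k
    # Exact duplicates exist, so even the full vectors cannot separate them.
    return dims


points = [(1.0, 5.0, 9.0), (1.0, 6.0, 9.0), (2.0, 5.0, 0.0)]
print(min_prefix_length(points))
```

Here the three points collide on the first feature alone but are separated by the first two, so the third feature is never consulted.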



About this article

Cite this article

Lin, KI., Jagadish, H.V. & Faloutsos, C. The TV-tree: An index structure for high-dimensional data. VLDB Journal 3, 517–542 (1994). https://doi.org/10.1007/BF01231606

  • DOI: https://doi.org/10.1007/BF01231606
