
The VLDB Journal, Volume 3, Issue 4, pp 517–542

The TV-tree: An index structure for high-dimensional data

  • King-Ip Lin
  • H. V. Jagadish
  • Christos Faloutsos

Abstract

We propose a file structure to index high-dimensionality data, which are typically points in some feature space. The idea is to use only a few of the features, using additional features only when the additional discriminatory power is absolutely necessary. We present in detail the design of our tree structure and the associated algorithms that handle such “varying length” feature vectors. Finally, we report simulation results, comparing the proposed structure with the R*-tree, which is one of the most successful methods for low-dimensionality spaces. The results illustrate the superiority of our method, which saves up to 80% in disk accesses.
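
The intuition of consulting only a few features, and reading more only when they are needed to discriminate, can be illustrated with a toy sketch. This is not the TV-tree or any of its algorithms: it is a plain linear scan, assuming Euclidean distance, and the function name `nearest_neighbor` and the sample data are hypothetical. It relies on the fact that the squared distance over the first few features is a lower bound on the full squared distance, so a candidate can be discarded as soon as that partial sum reaches the best distance found so far.

```python
import math


def nearest_neighbor(query, points):
    """Toy linear-scan nearest neighbor that reads features lazily.

    Illustrative only (hypothetical helper, not the TV-tree): each candidate
    is compared feature by feature, and the comparison stops as soon as the
    partial squared distance rules the candidate out.
    Returns (index of nearest point, its distance, number of features read).
    """
    best_index, best_dist_sq = None, math.inf
    features_read = 0
    for index, point in enumerate(points):
        partial = 0.0
        pruned = False
        for q, p in zip(query, point):
            partial += (q - p) ** 2
            features_read += 1
            # The partial sum is a lower bound on the full squared distance,
            # so the remaining features cannot make this candidate better.
            if partial >= best_dist_sq:
                pruned = True
                break
        if not pruned:
            best_index, best_dist_sq = index, partial
    return best_index, math.sqrt(best_dist_sq), features_read


if __name__ == "__main__":
    points = [(0, 0, 0, 0), (5, 5, 5, 5), (1, 0, 0, 0)]
    print(nearest_neighbor((1, 0, 0, 1), points))
```

In this small example the second point is rejected after a single feature, so fewer feature values are read than a full comparison of every point would require.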

Key Words

Spatial index, similarity retrieval, query by content



Copyright information

© VLDB 1994

Authors and Affiliations

  • King-Ip Lin (1)
  • H. V. Jagadish (2)
  • Christos Faloutsos (1)

  1. Department of Computer Science, University of Maryland, College Park
  2. AT&T Bell Laboratories, Murray Hill
