Dimensionality Reduction Techniques for Nearest-Neighbor Computations

  • Reference work entry

Synonyms

Clustering; Karhunen-Loève transform (KLT); Multi-dimensional indexing; Nearest neighbors query; Principal component analysis (PCA); Singular value decomposition (SVD)

Definition

Representing objects such as images by feature vectors and retrieving similar objects through k-nearest-neighbor (k-NN) queries, which rank points in a high-dimensional space by their distance to a target image, is a popular paradigm. Applying singular value decomposition (SVD) to individual clusters of a dataset achieves greater dimensionality reduction for the same normalized mean square error (NMSE) than applying SVD to the whole dataset. The cost of processing k-NN queries is further reduced by suitable indexing structures such as the ordered partition (OP)-tree and the stepwise dimensionality increasing (SDI)-tree.
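
The comparison between per-cluster and whole-dataset SVD can be illustrated with a small NumPy sketch. It is not the entry's own procedure: the synthetic data, the helper names (`residual_energy`, `nmse_global`, `nmse_clustered`, `make_two_cluster_data`), and the NMSE normalization (squared reconstruction error divided by total squared deviation from the global mean) are all assumptions made for illustration, and cluster labels are taken as given rather than produced by a clustering algorithm.

```python
import numpy as np

def residual_energy(X, p):
    """Squared error left after projecting mean-centered X onto its top-p singular directions."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values, descending
    return float((s[p:] ** 2).sum())

def total_energy(X):
    """Total squared deviation from the global mean (assumed NMSE denominator)."""
    return float(((X - X.mean(axis=0)) ** 2).sum())

def nmse_global(X, p):
    """NMSE when one SVD is applied to the whole dataset and p dimensions are kept."""
    return residual_energy(X, p) / total_energy(X)

def nmse_clustered(X, labels, p):
    """NMSE when SVD is applied to each cluster separately, keeping p dimensions per cluster."""
    err = sum(residual_energy(X[labels == c], p) for c in np.unique(labels))
    return err / total_energy(X)

def make_two_cluster_data(n=200, dim=8, seed=0):
    """Hypothetical data: two clusters, each lying near its own 2-D subspace."""
    rng = np.random.default_rng(seed)
    parts = []
    for _ in range(2):
        basis = rng.normal(size=(2, dim))          # cluster-specific 2-D subspace
        coords = rng.normal(size=(n, 2))           # coordinates within that subspace
        center = rng.normal(scale=5.0, size=dim)   # cluster centroid
        noise = 0.05 * rng.normal(size=(n, dim))   # small off-subspace noise
        parts.append(center + coords @ basis + noise)
    return np.vstack(parts), np.repeat([0, 1], n)

X, labels = make_two_cluster_data()
p = 2
g = nmse_global(X, p)
c = nmse_clustered(X, labels, p)
print(f"retained dimensions: {p}")
print(f"NMSE, SVD on whole dataset: {g:.4f}")
print(f"NMSE, SVD per cluster:      {c:.4f}")
```

Under this normalization the per-cluster error cannot exceed the whole-dataset error for the same number of retained dimensions, because each cluster's own best-fitting subspace is at least as good for that cluster as the shared one. Equivalently, a target NMSE can be met with fewer dimensions per cluster.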

Historical Background

IBM’s Query by Image Content (QBIC) project, which utilized content-based image retrieval...

Correspondence to Alexander Thomasian.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

Thomasian, A. (2018). Dimensionality Reduction Techniques for Nearest-Neighbor Computations. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_80771
