mmLSH: A Practical and Efficient Technique for Processing Approximate Nearest Neighbor Queries on Multimedia Data

  • Conference paper
Similarity Search and Applications (SISAP 2020)

Abstract

Many large multimedia applications require efficient processing of nearest neighbor queries. Multimedia data are often represented as a collection of important high-dimensional feature vectors. Existing Locality Sensitive Hashing (LSH) techniques require users to find the top-k similar feature vectors for each feature vector that represents the query object. This leads to wasted and redundant work for two main reasons: (1) not all feature vectors contribute equally to finding the top-k similar multimedia objects, and (2) feature vectors are treated independently during query processing. Additionally, these techniques offer no theoretical guarantee on the returned multimedia results. In this work, we propose mmLSH, a practical and efficient LSH-based indexing approach for finding the top-k approximate nearest neighbors for multimedia data, which can provide theoretical guarantees on the returned multimedia results. We also present a buffer-conscious strategy to speed up query processing. Experimental evaluation shows significant gains in query processing time and accuracy on several real multimedia datasets when compared against state-of-the-art LSH techniques.
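The baseline that the abstract criticizes can be illustrated with a minimal sketch. This is not mmLSH and not the authors' code: it assumes a basic p-stable (Gaussian) LSH index over Euclidean distance and, as a further assumption, aggregates the per-vector results into object scores with a simple Borda count. All function names and parameter values (`build_index`, `baseline_query`, `num_tables`, `m`, `w`) are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import defaultdict


def make_table_hash(dim, m, w, rng):
    """Concatenate m p-stable hashes h(v) = floor((a.v + b) / w) into one bucket key."""
    funcs = [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
             for _ in range(m)]

    def key(v):
        return tuple(math.floor((sum(x * y for x, y in zip(a, v)) + b) / w)
                     for a, b in funcs)
    return key


def build_index(objects, dim, num_tables=8, m=2, w=4.0, seed=0):
    """objects: {object_id: [feature_vector, ...]}. Returns a list of (hash, buckets)."""
    rng = random.Random(seed)
    tables = []
    for _ in range(num_tables):
        key = make_table_hash(dim, m, w, rng)
        buckets = defaultdict(list)
        for obj_id, vectors in objects.items():
            for idx, v in enumerate(vectors):
                buckets[key(v)].append((obj_id, idx))
        tables.append((key, buckets))
    return tables


def baseline_query(tables, objects, query_vectors, k=2):
    """Run one top-k lookup per query vector, then rank objects by Borda points."""
    scores = defaultdict(int)
    for q in query_vectors:  # every query vector is processed independently
        candidates = set()
        for key, buckets in tables:
            candidates.update(buckets.get(key(q), ()))
        # Rank colliding vectors by true Euclidean distance (ties broken by id).
        ranked = sorted(candidates,
                        key=lambda c: (math.dist(q, objects[c[0]][c[1]]), c))
        for rank, (obj_id, _) in enumerate(ranked[:k]):
            scores[obj_id] += k - rank  # Borda points: k for rank 0, down to 1
    return sorted(scores, key=lambda o: (-scores[o], o))


# Toy data: two objects, each represented by two 3-dimensional feature vectors.
objects = {
    "a": [(0.0, 0.0, 0.0), (0.5, 0.1, 0.2)],
    "b": [(9.0, 9.0, 9.0), (9.5, 9.2, 9.1)],
}
tables = build_index(objects, dim=3)
result = baseline_query(tables, objects, objects["a"], k=2)
```

In this sketch the loop over `query_vectors` issues a full, independent lookup for every feature vector of the query object, regardless of how much that vector ends up contributing to the final object ranking. That is the kind of redundant work the abstract describes.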

Notes

  1. Supported by NSF Award #1337884.

  2. mmLSH can be implemented over any state-of-the-art LSH technique.

Author information

Corresponding author

Correspondence to Omid Jafari.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Jafari, O., Nagarkar, P., Montaño, J. (2020). mmLSH: A Practical and Efficient Technique for Processing Approximate Nearest Neighbor Queries on Multimedia Data. In: Satoh, S., et al. Similarity Search and Applications. SISAP 2020. Lecture Notes in Computer Science, vol 12440. Springer, Cham. https://doi.org/10.1007/978-3-030-60936-8_4

  • DOI: https://doi.org/10.1007/978-3-030-60936-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60935-1

  • Online ISBN: 978-3-030-60936-8

  • eBook Packages: Computer Science (R0)
