SIFTpack: A Compact Representation for Efficient SIFT Matching

Gilinsky, Alexandra; Zelnik-Manor, Lihi

doi:10.1007/978-3-319-23048-1_6



Abstract

Computing distances between large sets of SIFT descriptors is a basic step in numerous computer vision algorithms. When the number of descriptors is large, as is often the case, computing these distances can be extremely time-consuming. We propose the SIFTpack: a compact way of storing SIFT descriptors that enables significantly faster calculations between sets of SIFTs than current solutions. SIFTpack can be used to represent SIFTs densely extracted from a single image or sparsely extracted from multiple different images. We show that the SIFTpack representation saves both storage space and run time, both for finding nearest neighbors and for computing all distances between all descriptors. The usefulness of SIFTpack is demonstrated as an alternative implementation for K-means dictionaries of visual words and for image retrieval.
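To make the cost described in the abstract concrete, the following is a minimal sketch of the brute-force baseline: computing all distances between two descriptor sets and reading off nearest neighbors. It is not the SIFTpack method, which the abstract does not detail. The function names (`squared_l2`, `all_pairs_sq_dists`, `nearest_neighbors`) are hypothetical, and the toy 2-D vectors stand in for real SIFT descriptors, which are typically 128-dimensional.

```python
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def squared_l2(a: Descriptor, b: Descriptor) -> float:
    """Squared Euclidean distance between two descriptors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def all_pairs_sq_dists(
    set_a: Sequence[Descriptor], set_b: Sequence[Descriptor]
) -> List[List[float]]:
    """All squared distances; entry [i][j] compares set_a[i] with set_b[j]."""
    return [[squared_l2(a, b) for b in set_b] for a in set_a]


def nearest_neighbors(
    set_a: Sequence[Descriptor], set_b: Sequence[Descriptor]
) -> List[Tuple[int, float]]:
    """For each descriptor in set_a, the index and squared distance of its
    closest descriptor in set_b (exhaustive search, no acceleration)."""
    result = []
    for row in all_pairs_sq_dists(set_a, set_b):
        j = min(range(len(row)), key=row.__getitem__)
        result.append((j, row[j]))
    return result


# Toy data (assumption: 2-D vectors used only for illustration).
set_a = [[0.0, 0.0], [3.0, 4.0]]
set_b = [[0.0, 1.0], [3.0, 4.0], [10.0, 10.0]]
dists = all_pairs_sq_dists(set_a, set_b)
matches = nearest_neighbors(set_a, set_b)
```

With n descriptors in one set, m in the other, and d dimensions each, this baseline performs on the order of n·m·d operations, which is why large descriptor sets become expensive to compare.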

Acknowledgements

This research was supported in part by the Ollendorf Foundation and by the Israel Ministry of Science. We would also like to thank Prof. Michael Elad for useful conversations and good ideas.

Author information

Authors and Affiliations

Technion Israel Institute of Technology, Haifa, 32000, Israel
Alexandra Gilinsky & Lihi Zelnik-Manor

Authors

Alexandra Gilinsky
Lihi Zelnik-Manor

Corresponding author

Correspondence to Lihi Zelnik-Manor.

Editor information

Editors and Affiliations

The Open University of Israel, Raanana, Israel
Tal Hassner
Google Research, Cambridge, Massachusetts, USA
Ce Liu

About this chapter

Cite this chapter

Gilinsky, A., Zelnik-Manor, L. (2016). SIFTpack: A Compact Representation for Efficient SIFT Matching. In: Hassner, T., Liu, C. (eds) Dense Image Correspondences for Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-319-23048-1_6

DOI: https://doi.org/10.1007/978-3-319-23048-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23047-4
Online ISBN: 978-3-319-23048-1
eBook Packages: Engineering, Engineering (R0)
