Multimedia Tools and Applications, Volume 78, Issue 21, pp 30585–30598

Exploring geometric information in CNN for image retrieval

  • Ying Li
  • Xiangwei Kong (corresponding author)
  • Haiyan Fu

Abstract

Convolutional Neural Networks (CNNs) have brought significant improvements to various multimedia tasks. In contrast, image retrieval has not yet benefited as much, since no training database is available. In this paper, we propose an unsupervised weighting scheme for pre-trained CNN models that adaptively emphasizes the image center. In contrast to the general preference for fully connected layers, which represent abstract semantics, we aggregate the activations of convolutional layers on image patches to depict local patterns in detail. It is an empirical observation that the target of a search is naturally the focus of an image. We therefore pool the features with respect to their positions, since convolutional activations innately maintain the geometric layout of an image. Experimental results on two benchmarks demonstrate the effectiveness of our methods.
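
The idea of position-dependent pooling can be illustrated with a minimal sketch. The abstract does not state the form of the weighting, so the Gaussian center prior below, the `sigma_ratio` parameter, and all function names are assumptions made for illustration only; they are not the weighting scheme proposed in the paper. The sketch takes a convolutional activation tensor of shape C x H x W (as nested lists), weights each spatial position by its distance to the grid center, and sums over positions to obtain a C-dimensional descriptor.

```python
import math


def center_weights(height, width, sigma_ratio=0.5):
    """Gaussian weights that peak at the center of an H x W grid.

    The Gaussian form and sigma_ratio are illustrative assumptions.
    """
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    sigma = sigma_ratio * min(height, width)
    return [
        [
            math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def weighted_sum_pool(feature_map, weights):
    """Aggregate a C x H x W activation tensor into a C-dim descriptor.

    Each activation is multiplied by the weight of its spatial position
    before summation, so central responses contribute more.
    """
    descriptor = []
    for channel in feature_map:
        total = 0.0
        for row, weight_row in zip(channel, weights):
            for activation, weight in zip(row, weight_row):
                total += activation * weight
        descriptor.append(total)
    return descriptor


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps avoids division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


# Toy 2-channel, 3 x 3 activation map: channel 0 fires at the center,
# channel 1 fires with the same strength at a corner.
toy_map = [
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
]
toy_weights = center_weights(3, 3)
toy_descriptor = l2_normalize(weighted_sum_pool(toy_map, toy_weights))
```

In this toy case both channels have the same raw activation, but the centrally located response ends up with the larger descriptor entry, which is the behavior a center-emphasizing scheme is meant to produce. With uniform weights the function reduces to plain sum pooling.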

Keywords

Image retrieval; Spatial pooling; Feature weighting

Notes

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 61772111, in part by the Foundation for Innovative Research Groups of the NSFC under Grant 71421001, in part by the NSFC under Grant 61502073, and in part by the Fundamental Research Funds for the Central Universities under Grant DUT18JC02.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Information and Communication Engineering, Dalian University of Technology, Dalian, China
  2. Department of Data Science and Engineering Management, Zhejiang University, Hangzhou, China
