Abstract
Convolutional Neural Networks (CNNs) have brought significant improvements to various multimedia tasks. In contrast, image retrieval has not yet benefited as much, since no training database is available. In this paper, we propose an unsupervised weighting scheme for pre-trained CNN models that adaptively emphasizes the image center. In contrast to the general preference for fully connected layers, which represent abstract semantics, we aggregate the activations of convolutional layers on image patches to depict local patterns in detail. It is an empirical observation that the target of a search is naturally the focus of an image. We therefore pool the features with respect to their positions, since they innately maintain the geometric layout of an image. Experimental results on two benchmarks demonstrate the effectiveness of our methods.
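The idea of position-dependent pooling can be illustrated with a minimal sketch. The abstract does not give the form of the weighting, so the Gaussian center prior, the `sigma_frac` parameter, and the function names below are assumptions made purely for illustration. The weights here are fixed rather than adaptive, so this is not the paper's actual scheme. The random array stands in for the convolutional activations of a pre-trained CNN.

```python
import numpy as np


def center_weights(height, width, sigma_frac=0.33):
    """Gaussian weight map that peaks at the centre of an H x W grid.

    The Gaussian form and sigma_frac are illustrative assumptions,
    not the weighting defined in the paper.
    """
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    wy = np.exp(-(ys ** 2) / (2.0 * (sigma_frac * height) ** 2))
    wx = np.exp(-(xs ** 2) / (2.0 * (sigma_frac * width) ** 2))
    # Separable 2-D weight map: outer product of the row and column profiles.
    return np.outer(wy, wx)


def weighted_sum_pool(feature_map, weights):
    """Aggregate a (C, H, W) activation tensor into one C-dim descriptor.

    Each spatial position contributes in proportion to its weight, and the
    result is L2-normalised so descriptors can be compared by dot product.
    """
    pooled = np.tensordot(feature_map, weights, axes=([1, 2], [0, 1]))
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


# Stand-in for the last convolutional layer output of a pre-trained CNN.
rng = np.random.default_rng(0)
fmap = rng.random((512, 7, 7))
weights = center_weights(7, 7)
descriptor = weighted_sum_pool(fmap, weights)
```

Because convolutional activations keep the spatial layout of the input, weighting each position before summation is enough to bias the descriptor toward the image center without any training.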
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grants 61772111 and 61502073, in part by the Foundation for Innovative Research Groups of the NSFC under Grant 71421001, and in part by the Fundamental Research Funds for the Central Universities under Grant DUT18JC02.
Cite this article
Li, Y., Kong, X. & Fu, H. Exploring geometric information in CNN for image retrieval. Multimed Tools Appl 78, 30585–30598 (2019). https://doi.org/10.1007/s11042-018-6414-6
DOI: https://doi.org/10.1007/s11042-018-6414-6
Keywords
- Image retrieval
- Spatial pooling
- Feature weighting