Learning region-wise deep feature representation for image analysis

Zhu, Xiaobin; Wang, Qian; Li, Peng; Zhang, Xiao-Yu; Wang, Lei

doi:10.1007/s12652-018-0894-0

Original Research
Published: 07 June 2018

Volume 14, pages 14775–14784, (2023)

Journal of Ambient Intelligence and Humanized Computing

Xiaobin Zhu ORCID: orcid.org/0000-0003-3021-0546¹,
Qian Wang¹,
Peng Li²,
Xiao-Yu Zhang³ &
…
Lei Wang⁴

308 Accesses
2 Citations

Abstract

Effective feature representation plays an important role in image analysis tasks. In recent years, deep features have replaced hand-crafted features as the mainstream representation for these tasks. However, existing deep learning methods typically extract feature representations directly from the whole image. Such strategies concentrate on global features; they tend to fail to capture local geometric invariance and introduce noise from regions of no interest. In this paper, we propose a novel region-wise deep feature extraction framework that promotes local geometric invariance and reduces noise. In our algorithm, an object proposal method generates a set of foreground object bounding boxes, from which a pre-trained convolutional neural network model extracts region-wise deep features. Then, an improved vector of locally aggregated descriptors (VLAD) strategy with weighted multi-neighbor assignment is proposed to encode the local region-wise feature representations. The final feature representation is not restricted to the classification task and can also be quantized to hash codes for large-scale image retrieval. Extensive experiments on publicly available datasets demonstrate the promising performance of our method against state-of-the-art methods in both image retrieval and classification tasks.
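
The encoding step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the Gaussian weights over the k nearest codewords used here are an assumption, as are the function names `weighted_vlad` and `to_hash_code`, the parameters `k` and `sigma`, and the random-projection binarization (a simple stand-in for whatever quantizer the paper uses). Random vectors stand in for the CNN features of the proposed regions.

```python
import numpy as np

def weighted_vlad(features, centroids, k=3, sigma=1.0):
    """Encode region features (n, d) against a codebook (m, d) as an m*d vector.

    Each feature is softly assigned to its k nearest centroids. The Gaussian
    weighting is an assumption; the abstract does not specify the formula.
    """
    n, d = features.shape
    m = centroids.shape[0]
    k = min(k, m)
    # Residuals and squared distances between every feature and every centroid.
    diff = features[:, None, :] - centroids[None, :, :]   # (n, m, d)
    dist2 = np.sum(diff ** 2, axis=2)                     # (n, m)
    # Indices of the k nearest centroids for each feature.
    nearest = np.argsort(dist2, axis=1)[:, :k]            # (n, k)
    rows = np.arange(n)
    near_d2 = dist2[rows[:, None], nearest]               # (n, k)
    # Gaussian weights; subtracting the row minimum only improves numerical
    # stability and cancels in the normalisation below.
    w = np.exp(-(near_d2 - near_d2.min(axis=1, keepdims=True)) / (2 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)
    # Accumulate the weighted residuals on each assigned centroid.
    vlad = np.zeros((m, d))
    for j in range(k):
        idx = nearest[:, j]
        np.add.at(vlad, idx, w[:, j, None] * diff[rows, idx])
    # Signed square-root (power) normalisation followed by L2 normalisation.
    v = vlad.ravel()
    v = np.sign(v) * np.sqrt(np.abs(v))
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def to_hash_code(vec, projection):
    """Binarise a descriptor by the sign of a linear projection (stand-in quantizer)."""
    return (projection @ vec > 0).astype(np.uint8)

rng = np.random.default_rng(0)
region_features = rng.normal(size=(20, 8))   # stand-in for CNN features of 20 regions
codebook = rng.normal(size=(4, 8))           # stand-in for a learned codebook
descriptor = weighted_vlad(region_features, codebook, k=2)
bits = to_hash_code(descriptor, rng.normal(size=(16, descriptor.size)))
```

With `k=1` the function reduces to standard hard-assignment VLAD, which makes it easy to compare the two assignment schemes on the same codebook.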

Acknowledgements

This work was supported by the National Key R&D Program of China (2017YFB1401000) and the National Natural Science Foundation of China (61501457, 61602517). The corresponding authors are Peng Li and Xiao-Yu Zhang, who contributed equally to this paper.

Author information

Authors and Affiliations

Beijing Technology and Business University, Fangshan, China
Xiaobin Zhu & Qian Wang
College of Information and Control Engineering, China University of Petroleum, Qingdao, China
Peng Li
Institute of Information Engineering, Chinese Academy of Science, Beijing, China
Xiao-Yu Zhang
Academy of Broadcasting Science, SARFT, Beijing, China
Lei Wang

Corresponding author

Correspondence to Xiaobin Zhu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Peng Li and Xiao-Yu Zhang contributed equally to this paper.

About this article

Cite this article

Zhu, X., Wang, Q., Li, P. et al. Learning region-wise deep feature representation for image analysis. J Ambient Intell Human Comput 14, 14775–14784 (2023). https://doi.org/10.1007/s12652-018-0894-0

Received: 06 April 2018
Accepted: 31 May 2018
Published: 07 June 2018
Issue Date: November 2023
DOI: https://doi.org/10.1007/s12652-018-0894-0
