Multi-layer Weight-Aware Bilinear Pooling for Fine-Grained Image Classification

Li, Fenglei; Xu, Qin; Sun, Zehui; Mei, Yiming; Zhang, Qiang; Luo, Bin

doi:10.1007/978-3-030-39431-8_43

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11691)

Included in the following conference series:

International Conference on Brain Inspired Cognitive Systems

Abstract

Fine-grained images share a similar global structure but differ in local appearance. Bilinear pooling models have proven effective at modeling different semantic parts and learning discriminative features for fine-grained image classification. However, existing bilinear models overlook two points: convolutional neural networks (CNNs) may lose important semantic information during forward propagation, and interactions between features from different convolutional layers can enhance feature learning and thereby improve classification performance. We therefore propose a multi-layer weight-aware bilinear pooling method that models cross-layer interactions between object-part features as the feature representation and assigns a weight to each convolutional layer, adaptively adjusting the layer outputs to highlight the more discriminative features. The proposed method improves classification performance over previous state-of-the-art approaches. We demonstrate its effectiveness on the CUB-200-2011 and FGVC-Aircraft datasets.
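
As an illustration of the idea described in the abstract, the following is a minimal NumPy sketch and not the authors' implementation. It assumes that each layer's feature map has already been reshaped to (channels, locations) with the same number of spatial locations, that each layer receives a single scalar weight, and that the final descriptor uses the signed square root and L2 normalisation commonly applied to bilinear features. The function names, the scalar form of the weights, and the example weight values are all hypothetical.

```python
import numpy as np


def bilinear_pool(x, y):
    """Average outer product of two (channels, locations) feature maps, flattened."""
    # x @ y.T sums the outer products of the per-location feature vectors.
    return ((x @ y.T) / x.shape[1]).reshape(-1)


def weighted_cross_layer_features(layers, weights):
    """Concatenate bilinear features over all pairs of distinct layers.

    `layers` is a list of (channels, locations) arrays and `weights` holds one
    scalar per layer (an assumption made for this sketch).
    """
    # Scale each layer's output by its weight before the cross-layer interaction.
    scaled = [w * f for w, f in zip(weights, layers)]
    parts = []
    for i in range(len(scaled)):
        for j in range(i + 1, len(scaled)):
            parts.append(bilinear_pool(scaled[i], scaled[j]))
    z = np.concatenate(parts)
    # Signed square root, then L2 normalisation of the whole descriptor.
    z = np.sign(z) * np.sqrt(np.abs(z))
    norm = np.linalg.norm(z)
    return z / norm if norm > 0 else z


rng = np.random.default_rng(0)
# Three hypothetical layers, each with 8 channels over a 7x7 grid (49 locations).
layers = [rng.standard_normal((8, 49)) for _ in range(3)]
weights = np.array([0.2, 0.3, 0.5])  # hypothetical per-layer weights
features = weighted_cross_layer_features(layers, weights)
print(features.shape)  # (192,)
```

With three layers there are three layer pairs, each contributing an 8 x 8 interaction matrix, which gives the 192-dimensional descriptor. Because normalisation is applied once to the concatenated vector, the per-layer weights change the relative contribution of each layer pair rather than cancelling out.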

Acknowledgments

The authors would like to thank the anonymous referees for their constructive comments, which have helped improve the paper. This work was supported by the National Natural Science Foundation of China (61502003, 71501002, 61472002, 61671018, 61860206004) and by the Key Research Project of Humanities and Social Sciences in Colleges and Universities of Anhui Province under Grant SK2019A0013.

Author information

Authors and Affiliations

Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, 230601, China
Fenglei Li, Qin Xu, Zehui Sun, Yiming Mei, Qiang Zhang & Bin Luo

Corresponding author

Correspondence to Qin Xu.

Editor information

Editors and Affiliations

University of Strathclyde, Glasgow, UK
Jinchang Ren
Edinburgh Napier University, Edinburgh, UK
Amir Hussain
Guangdong Polytechnic Normal University, Guangzhou, China
Huimin Zhao
Xi’an Jiaotong-Liverpool University, Suzhou, China
Kaizhu Huang
Northwestern Polytechnical University, Xi'an, China
Jiangbin Zheng
Guangdong Polytechnic Normal University, Guangzhou, China
Jun Cai
Guangdong Polytechnic Normal University, Guangzhou, China
Rongjun Chen
Guangdong Polytechnic Normal University, Guangzhou, China
Yinyin Xiao

Cite this paper

Li, F., Xu, Q., Sun, Z., Mei, Y., Zhang, Q., Luo, B. (2020). Multi-layer Weight-Aware Bilinear Pooling for Fine-Grained Image Classification. In: Ren, J., et al. Advances in Brain Inspired Cognitive Systems. BICS 2019. Lecture Notes in Computer Science, vol 11691. Springer, Cham. https://doi.org/10.1007/978-3-030-39431-8_43

DOI: https://doi.org/10.1007/978-3-030-39431-8_43
Published: 01 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-39430-1
Online ISBN: 978-3-030-39431-8
eBook Packages: Computer Science; Computer Science (R0)
