A Visual Inductive Priors Framework for Data-Efficient Image Classification

Sun, Pengfei; Jin, Xuan; Su, Wei; He, Yuan; Xue, Hui; Lu, Quan

doi:10.1007/978-3-030-66096-3_35

Pengfei Sun ORCID: orcid.org/0000-0002-9719-8900

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12536)

Included in the following conference series:

European Conference on Computer Vision

Abstract

State-of-the-art classifiers rely heavily on large-scale datasets such as ImageNet, JFT-300M, MSCOCO, and Open Images, and their performance can degrade significantly when only a handful of samples are available for training. We present the Visual Inductive Priors Framework (VIPF), a framework that learns classifiers from scratch and is designed to maximize the effectiveness of limited data. We propose a novel neural network architecture, DSK-net, which is very effective when training on small datasets. The more discriminative features extracted by DSK-net alleviate overfitting. Furthermore, a loss function based on the positive class and an induced hierarchy are applied to further improve VIPF’s ability to learn from scratch. Finally, we won 1st place in the VIPriors image classification competition.

Author information

Authors and Affiliations

Alibaba Group, Hangzhou, China
Pengfei Sun, Xuan Jin, Wei Su, Yuan He, Hui Xue & Quan Lu

Corresponding author

Correspondence to Pengfei Sun.

Editor information

Editors and Affiliations

University of Clermont Auvergne, Clermont Ferrand, France
Adrien Bartoli
Università degli Studi di Udine, Udine, Italy
Andrea Fusiello

About this paper

Cite this paper

Sun, P., Jin, X., Su, W., He, Y., Xue, H., Lu, Q. (2020). A Visual Inductive Priors Framework for Data-Efficient Image Classification. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol 12536. Springer, Cham. https://doi.org/10.1007/978-3-030-66096-3_35

DOI: https://doi.org/10.1007/978-3-030-66096-3_35
Published: 03 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66095-6
Online ISBN: 978-3-030-66096-3
eBook Packages: Computer Science, Computer Science (R0)
