Abstract
Softening the labels of a training dataset with respect to the data representations has frequently been used to improve the training of deep neural networks. While this practice has been studied as a way to leverage "privileged information" about the data distribution, a well-trained learner with soft classification outputs must first be obtained as a prior in order to generate such privileged information. To resolve this "chicken-and-egg" problem, we propose COLAM, a framework that Co-Learns DNNs and soft labels through Alternating Minimization of two objectives, (a) the training loss subject to the soft labels and (b) the objective of learning improved soft labels, in a single end-to-end training procedure. We performed extensive experiments comparing the proposed method with a series of baselines. The results show that COLAM improves performance on many tasks, achieving better test classification accuracy. We also provide qualitative and quantitative analyses that explain why COLAM works well.
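The alternating scheme described above can be illustrated with a minimal sketch. This is not the paper's actual objective functions: the label-update rule below (a fixed convex combination of the one-hot ground truth and the current model predictions, with an assumed mixing weight `beta`) and the linear softmax model are our own simplifications, used only to show the two alternating steps.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, targets):
    return -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1))

rng = np.random.default_rng(0)
n, d, k = 100, 5, 3
X = rng.normal(size=(n, d))          # toy features
y = rng.integers(0, k, size=n)       # toy hard labels
onehot = np.eye(k)[y]

W = np.zeros((d, k))                 # linear softmax classifier (stand-in for a DNN)
soft = onehot.copy()                 # soft labels initialised to the one-hot targets
lr, beta = 0.2, 0.9                  # beta: assumed weight anchoring labels to ground truth

losses = []
for step in range(300):
    # Step (a): minimise the training loss w.r.t. the model, soft labels held fixed.
    p = softmax(X @ W)
    W -= lr * X.T @ (p - soft) / n   # gradient of cross-entropy w.r.t. W
    # Step (b): improve the soft labels, model held fixed.
    p = softmax(X @ W)
    soft = beta * onehot + (1 - beta) * p
    losses.append(cross_entropy(p, soft))
```

Each iteration alternates one gradient step on the model with one closed-form refresh of the soft labels, so neither a pre-trained teacher nor fixed privileged information is needed, which is the point of the co-learning formulation.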
Li, X., Xiong, H., An, H. et al. COLAM: Co-Learning of Deep Neural Networks and Soft Labels via Alternating Minimization. Neural Process Lett 54, 4735–4749 (2022). https://doi.org/10.1007/s11063-022-10830-9