Abstract
Robust loss functions are crucial for training models that generalize well in the presence of noisy labels. The commonly used Cross Entropy (CE) loss tends to overfit noisy labels, while symmetric losses that are robust to label noise are limited by their symmetry conditions. We analyze the gradient of CE and identify the main difficulty posed by label noise: the imbalance of gradient norm among samples. Inspired by long-tail learning, we propose a gradient prior (GP)-based logit adjustment method to mitigate the impact of label noise. This method uses per-sample gradient information to adjust the logits, enabling DNNs to effectively ignore noisy samples and instead focus on learning hard samples. Experiments on benchmark datasets demonstrate that our method significantly improves the performance of CE and outperforms existing methods, especially under symmetric noise. Experiments on the object detection dataset Pascal VOC further verify that our method is plug-and-play and robust.
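The core idea sketched in the abstract — damping the outsized CE gradients of likely-noisy samples by adjusting their logits — can be illustrated with a minimal NumPy example. This is a hypothetical sketch, not the paper's actual formula: the function name `gp_adjusted_ce`, the temperature `tau`, and the use of `1 - p_y` as a per-sample gradient-norm proxy are all illustrative assumptions.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the class axis
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def gp_adjusted_ce(logits, labels, tau=1.0):
    """Per-sample CE loss with a hypothetical gradient-based logit adjustment.

    For softmax CE, the gradient w.r.t. the logits is (p - onehot(y)),
    whose norm grows monotonically with (1 - p_y). A confidently
    mislabelled sample therefore contributes an extreme gradient. Here we
    boost the labelled-class logit in proportion to that proxy, which
    caps the sample's effective gradient norm before the loss is taken.
    """
    idx = np.arange(logits.shape[0])
    p = softmax(logits)
    g = 1.0 - p[idx, labels]          # gradient-norm proxy (treated as a constant)
    z_adj = logits.copy()
    z_adj[idx, labels] += tau * g     # large-gradient (likely noisy) samples
                                      # get the strongest damping
    p_adj = softmax(z_adj)
    return -np.log(p_adj[idx, labels] + 1e-12)
```

With `tau = 0` this reduces to plain CE; with `tau > 0`, a sample whose label disagrees sharply with the prediction receives a noticeably smaller loss and gradient, while confident, correctly-labelled samples are left essentially unchanged.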
Data availability
All datasets used in this study are publicly available benchmark datasets.
Acknowledgements
This work was supported by The National Natural Science Foundation of China (No. 61402537), Sichuan Science and Technology Program (Nos. 2019ZDZX0006, 2020YFQ0056), the West Light Foundation of Chinese Academy of Sciences (201899) and the Talents by Sichuan provincial Party Committee Organization Department, and Science and Technology Service Network Initiative (KFJ-STS-QYZD-2021-21-001).
Author information
Contributions
Boyi Fu performed conceptualization, methodology, software, validation, formal analysis, data curation, and visualization, and wrote the manuscript. Yungcong Peng performed formal analysis and investigation and reviewed the manuscript. Xiaolin Qin performed supervision, project administration, and funding acquisition and reviewed the manuscript.
Ethics declarations
Ethical and informed consent for data used
The data used are completely public and freely available, and their use does not require ethical approval or informed consent.
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Fu, B., Peng, Y. & Qin, X. Learning with noisy labels via logit adjustment based on gradient prior method. Appl Intell 53, 24393–24406 (2023). https://doi.org/10.1007/s10489-023-04609-1