
Learning with noisy labels via logit adjustment based on gradient prior method


Abstract

Robust loss functions are crucial for training models that generalize well in the presence of noisy labels. The commonly used Cross Entropy (CE) loss tends to overfit noisy labels, while symmetric losses that are robust to label noise are constrained by their symmetry conditions. We analyze the gradient of CE and identify the main difficulty posed by label noise: an imbalance of gradient norms across samples. Inspired by long-tail learning, we propose a gradient prior (GP)-based logit adjustment method to mitigate the impact of label noise. The method exploits per-sample gradients to adjust the logits, enabling DNNs to effectively ignore noisy samples and focus more on learning hard ones. Experiments on benchmark datasets demonstrate that our method significantly improves the performance of CE and outperforms existing methods, especially under symmetric noise. Experiments on the Pascal VOC object detection dataset further verify that our method is plug-and-play and robust.
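The abstract only sketches the mechanism, so the following is a minimal PyTorch illustration of what a gradient-prior logit adjustment of this kind could look like. It is a sketch under stated assumptions, not the paper's published formulation: the function name `gp_adjusted_ce`, the batch-relative prior, the clamp at 1, and the temperature `tau` are all illustrative choices. The only facts taken as given are standard: the CE gradient with respect to the logits equals softmax(logits) minus the one-hot target, and logit adjustment (as in long-tail learning) shifts logits by the log of a prior.

```python
import torch
import torch.nn.functional as F

def gp_adjusted_ce(logits: torch.Tensor, targets: torch.Tensor,
                   tau: float = 1.0) -> torch.Tensor:
    """Cross entropy on logits adjusted by a per-sample gradient prior.

    Illustrative sketch only -- `tau` and the batch-relative prior are
    assumptions, not the paper's published formulation.
    """
    one_hot = F.one_hot(targets, num_classes=logits.size(1)).float()
    with torch.no_grad():
        # For CE, d(loss)/d(logits) = softmax(logits) - one_hot, so each
        # sample's gradient norm is available without a backward pass.
        grad_norm = (F.softmax(logits, dim=1) - one_hot).norm(p=2, dim=1)
        # Relative prior: how far above the batch-average gradient norm a
        # sample sits; clamped at 1 so only outliers are damped.
        prior = (grad_norm / grad_norm.mean().clamp_min(1e-8)).clamp_min(1.0)
    # Raising the target-class logit by tau * log(prior) increases the
    # predicted target probability, which shrinks the gradient of samples
    # whose gradient norm is far above average.
    adjusted = logits + tau * torch.log(prior).unsqueeze(1) * one_hot
    return F.cross_entropy(adjusted, targets)
```

In a training loop this would simply replace `F.cross_entropy(model(x), y)`. Note that this sketch damps every large-gradient sample; distinguishing genuinely hard samples (to keep learning from) from noisy ones (to ignore) is exactly what the paper's full method addresses and this illustration does not attempt.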


Data availability

Due to privacy restrictions, the participants of this study did not agree to their data being shared publicly, so supporting data are not available.


Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61402537), the Sichuan Science and Technology Program (Nos. 2019ZDZX0006, 2020YFQ0056), the West Light Foundation of the Chinese Academy of Sciences (201899), the Talents Program of the Sichuan Provincial Party Committee Organization Department, and the Science and Technology Service Network Initiative (KFJ-STS-QYZD-2021-21-001).

Author information


Contributions

Boyi Fu performed the conceptualization, methodology, software, validation, formal analysis, data curation, and visualization, and wrote the manuscript. Yuncong Peng performed the formal analysis and investigation, and reviewed the manuscript. Xiaolin Qin performed the supervision, project administration, and funding acquisition, and reviewed the manuscript.

Corresponding author

Correspondence to Xiaolin Qin.

Ethics declarations

Ethical and informed consent for data used

The data used are completely public and freely available, and their use does not involve ethical approval or informed consent.

Conflict of interest

The authors declare there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Fu, B., Peng, Y. & Qin, X. Learning with noisy labels via logit adjustment based on gradient prior method. Appl Intell 53, 24393–24406 (2023). https://doi.org/10.1007/s10489-023-04609-1

