A Deep Model for Partial Multi-label Image Classification with Curriculum-based Disambiguation

  • Research Article
Machine Intelligence Research

Abstract

We study the partial multi-label (PML) image classification problem, in which each image is annotated with a candidate label set containing multiple relevant labels along with noisy labels. Existing PML methods typically design a disambiguation strategy that filters out noisy labels using prior knowledge under extra assumptions, and such knowledge is unfortunately unavailable in many real tasks. Furthermore, because the disambiguation objective is usually defined over the whole training set, it can hardly be optimized in a deep model with stochastic gradient descent (SGD) on mini-batches. In this paper, for the first time, we propose a deep model for PML to enhance the representation and discrimination ability. On the one hand, we propose a novel curriculum-based disambiguation strategy that progressively identifies ground-truth labels by accounting for the varied difficulties of different classes. On the other hand, we introduce consistency regularization into model training to balance fitting the identified easy labels against exploiting potentially relevant labels. Extensive experiments on commonly used benchmark datasets show that the proposed method significantly outperforms state-of-the-art methods.
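To make the idea of progressive, class-aware disambiguation concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given in this abstract. It assumes a candidate label matrix, per-label model scores, and one confidence threshold per class that is relaxed over time. The function names `select_labels` and `relax_thresholds`, the step size, and the floor value are all hypothetical.

```python
def select_labels(scores, candidates, thresholds):
    """Return a 0/1 mask of candidate labels currently treated as identified.

    A label is selected only if it is in the candidate set AND its score
    reaches the threshold of its class (hypothetical selection rule).
    """
    return [
        [1 if c == 1 and s >= t else 0
         for s, c, t in zip(score_row, cand_row, thresholds)]
        for score_row, cand_row in zip(scores, candidates)
    ]


def relax_thresholds(thresholds, step=0.25, floor=0.5):
    """Lower each per-class threshold so harder labels are admitted later."""
    return [max(floor, t - step) for t in thresholds]


# Toy data: two images, three classes (values are illustrative only).
candidates = [[1, 1, 0], [0, 1, 1]]          # candidate label sets
scores = [[0.9, 0.6, 0.8], [0.2, 0.7, 0.95]]  # assumed model confidences
thresholds = [0.8, 0.8, 0.8]                  # one threshold per class

early = select_labels(scores, candidates, thresholds)
later = select_labels(scores, candidates, relax_thresholds(thresholds))
```

In this toy run, only the most confident candidate labels are selected at the strict threshold, and the remaining candidates are admitted once the thresholds are relaxed. Non-candidate labels are never selected, whatever their score.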

Acknowledgements

This research was supported by the National Key R&D Program of China (No. 2020AAA0107000), the National Natural Science Foundation of China (NSFC) (No. 62222605), and the Natural Science Foundation of Jiangsu Province of China (Nos. BK20222012 and BK20211517).

Author information

Corresponding author

Correspondence to Sheng-Jun Huang.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Feng Sun received the B. Sc. degree in Internet of Things engineering from the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China in 2020. He is currently a third-year master's student in the MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing University of Aeronautics and Astronautics, China.

His research interests include weakly-supervised learning and multi-label learning.

E-mail: sunfeng@nuaa.edu.cn

Ming-Kun Xie received the B. Sc. degree in Internet of Things engineering from the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China in 2018. He is currently a Ph. D. degree candidate in the MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing University of Aeronautics and Astronautics, China. He has served as a PC member of NeurIPS 2022, ICML 2022, IJCAI 2020 and 2022, and AAAI 2021 and 2022, and is also a reviewer for TNNLS.

His research interests include machine learning, multi-label learning and weakly-supervised learning.

E-mail: mkxie@nuaa.edu.cn

Sheng-Jun Huang received the B. Sc. and the Ph. D. degrees in computer science from Nanjing University, China in 2008 and 2014, respectively. He is now a professor in the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China. He was selected for the Young Elite Scientists Sponsorship Program by CAST in 2016, and won the China Computer Federation Outstanding Doctoral Dissertation Award in 2015, the KDD Best Poster Award in 2012, and the Microsoft Fellowship Award in 2011. He is a Junior Associate Editor of Frontiers of Computer Science.

His research interests include machine learning and data mining.

E-mail: huangsj@nuaa.edu.cn (Corresponding author)

ORCID iD: 0000-0002-7673-5367

Cite this article

Sun, F., Xie, MK. & Huang, SJ. A Deep Model for Partial Multi-label Image Classification with Curriculum-based Disambiguation. Mach. Intell. Res. (2024). https://doi.org/10.1007/s11633-023-1439-3

  • DOI: https://doi.org/10.1007/s11633-023-1439-3
