Abstract
We study the partial multi-label (PML) image classification problem, in which each image is annotated with a candidate label set that contains multiple relevant labels together with some noisy labels. Existing PML methods typically design a disambiguation strategy that filters out noisy labels by exploiting prior knowledge under extra assumptions; unfortunately, such knowledge is unavailable in many real tasks. Furthermore, because the disambiguation objective is usually defined over the whole training set, it can hardly be optimized in a deep model with stochastic gradient descent (SGD) on mini-batches. In this paper, for the first time, we propose a deep model for PML to enhance representation and discrimination ability. On the one hand, we propose a novel curriculum-based disambiguation strategy that progressively identifies ground-truth labels while accounting for the varied difficulties of different classes. On the other hand, we introduce consistency regularization into model training to balance fitting the identified easy labels against exploiting potentially relevant labels. Extensive experiments on commonly used benchmark datasets show that the proposed method significantly outperforms state-of-the-art methods.
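The abstract gives no formulas, so the following is only an illustrative sketch of how the two components could fit together on a mini-batch. It is not the authors' implementation. It rests on several assumptions: per-class ranking of candidate scores stands in for class-wise difficulty, a `keep_frac` schedule stands in for the curriculum, and a squared difference between predictions on two augmented views stands in for consistency regularization. All function and parameter names are hypothetical.

```python
import numpy as np

def select_easy_labels(probs, candidates, keep_frac):
    """Per class, mark the top `keep_frac` fraction of candidate labels
    (ranked by predicted probability) as identified 'easy' labels.
    Ranking within each class is an assumed proxy for class difficulty."""
    selected = np.zeros_like(candidates, dtype=bool)
    for c in range(probs.shape[1]):
        idx = np.flatnonzero(candidates[:, c])
        if idx.size == 0:
            continue
        k = max(1, int(np.ceil(keep_frac * idx.size)))
        top = idx[np.argsort(-probs[idx, c])[:k]]
        selected[top, c] = True
    return selected

def masked_bce(probs, targets, mask, eps=1e-7):
    """Binary cross-entropy averaged over the entries where mask is True."""
    p = np.clip(probs, eps, 1 - eps)
    loss = -(targets * np.log(p) + (1 - targets) * np.log(1 - p))
    return float((loss * mask).sum() / max(mask.sum(), 1))

def pml_batch_loss(probs_weak, probs_strong, candidates, keep_frac, lam=1.0):
    """Assumed mini-batch objective: fit identified easy labels and
    non-candidate negatives, and regularize the remaining candidates."""
    cand = candidates.astype(bool)
    easy = select_easy_labels(probs_weak, cand, keep_frac)
    # Supervised term: easy labels are positives, non-candidates are negatives.
    sup = masked_bce(probs_weak, easy.astype(float), easy | ~cand)
    # Consistency term on candidates not yet identified (assumed form).
    rest = cand & ~easy
    sq_diff = (probs_weak - probs_strong) ** 2
    cons = float((sq_diff * rest).sum() / max(rest.sum(), 1))
    return sup + lam * cons, easy

# Toy batch: 3 images, 2 classes; `keep_frac` would grow over training epochs.
probs = np.array([[0.9, 0.2], [0.6, 0.8], [0.1, 0.7]])
cands = np.array([[1, 1], [1, 1], [0, 1]])
loss, easy = pml_batch_loss(probs, probs, cands, keep_frac=0.5)
```

Because selection depends only on the current mini-batch, this sketch is compatible with SGD, which is the constraint the abstract highlights. Raising `keep_frac` toward 1 over epochs would gradually admit harder candidate labels.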
Acknowledgements
This research was supported by the National Key R&D Program of China (No. 2020AAA0107000), the National Natural Science Foundation of China (NSFC) (No. 62222605), and the Natural Science Foundation of Jiangsu Province of China (Nos. BK20222012 and BK20211517).
Author information
Additional information
Colored figures are available in the online version at https://link.springer.com/journal/11633
Feng Sun received the B. Sc. degree in Internet of things engineering from the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China in 2020. He is currently a third-year master's student at the MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing University of Aeronautics and Astronautics, China.
His research interests include weakly-supervised learning and multi-label learning.
E-mail: sunfeng@nuaa.edu.cn
Ming-Kun Xie received the B. Sc. degree in Internet of things engineering from the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China in 2018. He is currently a Ph. D. candidate at the MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing University of Aeronautics and Astronautics, China. He has served as a PC member of NeurIPS 2022, ICML 2022, IJCAI 2020 and 2022, and AAAI 2021 and 2022, and is also a reviewer for TNNLS.
His research interests include machine learning, multi-label learning and weakly-supervised learning.
E-mail: mkxie@nuaa.edu.cn
Sheng-Jun Huang received the B. Sc. and Ph. D. degrees in computer science from Nanjing University, China in 2008 and 2014, respectively. He is now a professor in the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China. He was selected for the Young Elite Scientists Sponsorship Program by CAST in 2016, and won the China Computer Federation Outstanding Doctoral Dissertation Award in 2015, the KDD Best Poster Award in 2012, and the Microsoft Fellowship Award in 2011. He is a Junior Associate Editor of Frontiers of Computer Science.
His research interests include machine learning and data mining.
E-mail: huangsj@nuaa.edu.cn (Corresponding author)
ORCID iD: 0000-0002-7673-5367
About this article
Cite this article
Sun, F., Xie, MK. & Huang, SJ. A Deep Model for Partial Multi-label Image Classification with Curriculum-based Disambiguation. Mach. Intell. Res. (2024). https://doi.org/10.1007/s11633-023-1439-3
DOI: https://doi.org/10.1007/s11633-023-1439-3