Abstract
Self-training is an effective approach to semi-supervised learning, in which both labeled and unlabeled data are leveraged for training. However, existing self-training frameworks are mostly confined to single-label classification. Applying self-training in the multi-label setting is difficult: unlike single-label classification, there is no mutual-exclusion constraint over categories, and the vast number of possible label vectors makes it harder to discover credible predictions. To realize effective self-training in the multi-label setting, we propose ML-DST and ML-DST+, which utilize contextualized document representations from pretrained language models. We present a BERT-based multi-label classifier and newly designed weighted loss functions for finetuning. We also propose two label propagation-based algorithms, SemLPA and SemLPA+, to enhance multi-label prediction; their similarity measure is iteratively improved through semantic-space finetuning, in which the semantic space of document representations is finetuned to better reflect learnt label correlations. High-confidence label predictions are recognized by examining the prediction score on each category separately, and these predictions are in turn used for both classifier finetuning and semantic-space finetuning. Experimental results show that our approach steadily outperforms representative baselines under different label rates, demonstrating the superiority of the proposed approach.
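The two ideas the abstract combines — propagating labels over a semantic space of document embeddings, and accepting pseudo-labels only when each category's score is individually confident — can be illustrated with a minimal sketch. This is a generic, hypothetical implementation for illustration only, not the authors' exact SemLPA/SemLPA+ or ML-DST algorithms: the neighborhood size, thresholds, and normalization scheme are assumptions, and the embeddings would in practice come from a finetuned BERT encoder.

```python
import numpy as np

def propagate_labels(emb, Y, labeled_mask, alpha=0.9, iters=20, k=10):
    """Hypothetical label-propagation pass over document embeddings.

    emb: (n, d) document representations (e.g., from a BERT encoder)
    Y:   (n, c) multi-label matrix; rows of unlabeled docs start as zeros
    """
    # Cosine similarity in the semantic space of document embeddings.
    norm = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    S = norm @ norm.T
    np.fill_diagonal(S, 0.0)
    # Keep only the k most similar neighbours per document.
    if k < S.shape[1]:
        drop = np.argsort(S, axis=1)[:, :-k]
        np.put_along_axis(S, drop, 0.0, axis=1)
    # Symmetric normalization: W = D^{-1/2} S D^{-1/2}.
    d = S.sum(axis=1)
    d[d == 0] = 1.0
    dinv = 1.0 / np.sqrt(d)
    W = S * dinv[:, None] * dinv[None, :]
    # Iterate F <- alpha * W F + (1 - alpha) * Y, clamping labeled rows.
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = alpha * (W @ F) + (1 - alpha) * Y
        F[labeled_mask] = Y[labeled_mask]
    return F

def select_pseudo_labels(F, labeled_mask, pos_thr=0.7, neg_thr=0.05):
    """Per-category confidence check: accept an unlabeled document as a
    pseudo-labeled example only if every category's score is decisive,
    i.e., confidently positive or confidently negative."""
    pos = F >= pos_thr
    decided = (F >= pos_thr) | (F <= neg_thr)
    usable = decided.all(axis=1) & ~labeled_mask
    return pos, usable
```

In a self-training loop of the kind the abstract describes, the documents flagged by `select_pseudo_labels` would be added to the training set for the next round of classifier and semantic-space finetuning, after which the embeddings (and hence the similarity graph) are recomputed.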
Funding
The funding was provided by Japan Society for the Promotion of Science (JP22J12044).
About this article
Cite this article
Xu, Z., Iwaihara, M. Self-training involving semantic-space finetuning for semi-supervised multi-label document classification. Int J Digit Libr 25, 25–39 (2024). https://doi.org/10.1007/s00799-023-00355-4