
Improving text classification via a soft dynamical label strategy

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Labels play a central role in text classification tasks. However, most studies suffer from a lossy label encoding problem, in which each label is represented by a meaningless, mutually independent one-hot vector. This paper proposes a novel strategy that dynamically generates a soft pseudo label from the model's predictions during training. This history-based soft pseudo label is taken as the target for optimizing parameters by minimizing the distance between the target and the prediction. In addition, we augment the training data with Mix-up, a widely used method, to prevent overfitting on small datasets. Extensive experimental results demonstrate that the proposed dynamical soft label strategy significantly improves the performance of several widely used deep learning classification models on binary and multi-class text classification tasks. Not only is our simple and efficient strategy easier to implement and train, it also achieves substantial improvements (up to a 2.54% relative improvement on the FDCNews dataset with an LSTM encoder) over Label Confusion Learning (LCM), a state-of-the-art label smoothing model, under the same experimental setting. The experiments also demonstrate that Mix-up improves our method's performance on smaller datasets but introduces excess noise on larger datasets, diminishing the model's performance.
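The abstract names two mechanisms: a history-based soft pseudo label that replaces the one-hot target, and Mix-up augmentation. As a minimal sketch of how such a strategy typically fits together, the PyTorch snippet below assumes an exponential-moving-average prediction history, a KL-divergence "distance", and illustrative momentum and mixing weights; none of these specifics come from the text above, and the function names are hypothetical.

```python
# Illustrative sketch only: the EMA history, the momentum/alpha values, and all
# function names are assumptions made for exposition, not the paper's code.
import torch
import torch.nn.functional as F

def update_soft_label(history, logits, one_hot, momentum=0.9, alpha=0.5):
    """Fold the current prediction into a running history of predictions,
    then blend with the one-hot label to form the soft pseudo-label target."""
    probs = F.softmax(logits.detach(), dim=-1)             # current prediction
    history = momentum * history + (1 - momentum) * probs  # prediction history (assumed EMA form)
    target = alpha * one_hot + (1 - alpha) * history       # keep most mass on the true class
    return history, target

def soft_label_loss(logits, target):
    """Minimize KL(target || prediction), one common choice for the
    'distance between the target and the prediction'."""
    return F.kl_div(F.log_softmax(logits, dim=-1), target, reduction="batchmean")

def mixup(x, y, beta=0.2):
    """Standard Mix-up: convex combinations of example pairs and their labels;
    for text this is usually applied to embeddings rather than raw tokens."""
    lam = torch.distributions.Beta(beta, beta).sample().item()
    idx = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[idx], lam * y + (1 - lam) * y[idx]
```

In a training loop, the blended target would replace the one-hot vector in the loss, and Mix-up would be applied per batch (for text, typically at the embedding level) only on the smaller datasets where the abstract reports it helps.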




Author information

Corresponding author

Correspondence to Fu Lee Wang.

Ethics declarations

Funding

The research described in this article was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (UGC/FDS16/E01/19), the Lam Woo Research Fund (LWP20019), and the Faculty Research Grants (DB22A5 and DB22B4) of Lingnan University, Hong Kong. The authors declare no competing interests relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, J., Xie, H., Wang, F.L. et al. Improving text classification via a soft dynamical label strategy. Int. J. Mach. Learn. & Cyber. 14, 2395–2405 (2023). https://doi.org/10.1007/s13042-022-01770-w


