Multi-label sequence generating model via label semantic attention mechanism

Zhang, Xiuling; Tan, Xiaofei; Luo, Zhaoci; Zhao, Jun

doi:10.1007/s13042-022-01722-4

Original Article
Published: 18 November 2022

Volume 14, pages 1711–1723, (2023)

International Journal of Machine Learning and Cybernetics

Abstract

In recent years, a new attempt has been made to capture label co-occurrence by applying the sequence-to-sequence (Seq2Seq) model to multi-label text classification (MLTC). However, existing approaches frequently ignore the semantic information contained in the labels themselves. Besides, the Seq2Seq model is susceptible to the negative impact of label sequence order. Furthermore, it has been demonstrated that the traditional attention mechanism underperforms in MLTC. Therefore, we propose a novel Seq2Seq model with a different label semantic attention mechanism (S2S-LSAM), which generates fused information containing label and text information through the interaction of label semantics and text features in the label semantic attention mechanism. With the fused information, our model can select the text features that are most relevant to the labels more effectively. A combination of the cross-entropy loss function and the policy gradient-based loss function is employed to reduce the label sequence order effect. The experiments show that our model outperforms the baseline models.
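The idea of letting label semantics select relevant text features can be illustrated with a generic scaled dot-product attention from label embeddings to token features. This is only a minimal sketch under stated assumptions: the preview does not give the S2S-LSAM equations, so the scoring function, the names `label_attention`, `text_feats`, and `label_embs`, and the toy vectors below are all hypothetical and are not the authors' formulation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(text_feats, label_embs):
    """Generic label-to-text attention (assumed form, not the paper's).

    text_feats: list of token feature vectors, each of length d
    label_embs: list of label embedding vectors, each of length d
    Returns (weights, fused): weights[k][i] is how much label k attends
    to token i; fused[k] is the attention-weighted sum of token features.
    """
    d = len(text_feats[0])
    weights, fused = [], []
    for label in label_embs:
        # Scaled dot product between one label and every token.
        scores = [
            sum(l * t for l, t in zip(label, tok)) / math.sqrt(d)
            for tok in text_feats
        ]
        w = softmax(scores)
        weights.append(w)
        # Label-specific text representation: weighted sum of tokens.
        fused.append([
            sum(w[i] * text_feats[i][j] for i in range(len(text_feats)))
            for j in range(d)
        ])
    return weights, fused


# Toy inputs (hypothetical): three tokens and two labels in a 2-d space.
text_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
label_embs = [[2.0, 0.0], [0.0, 2.0]]
weights, fused = label_attention(text_feats, label_embs)
```

In this toy setup each label places more weight on the tokens whose features align with its embedding, which is the selection behaviour the abstract attributes to the fused label and text information.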

Acknowledgements

This work is supported by the Hebei Provincial Department of Education under its 2021 provincial postgraduate demonstration course construction project, Grant KCJSX2021024.

Author information

These authors contributed equally to this work.

Authors and Affiliations

Engineering Research Center of the Ministry of Education for Intelligent Control System and Intelligent Equipment, Yanshan University, Qinhuangdao, 066000, Hebei Province, China
Xiuling Zhang, Xiaofei Tan, Zhaoci Luo & Jun Zhao

Corresponding author

Correspondence to Zhaoci Luo.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, X., Tan, X., Luo, Z. et al. Multi-label sequence generating model via label semantic attention mechanism. Int. J. Mach. Learn. & Cyber. 14, 1711–1723 (2023). https://doi.org/10.1007/s13042-022-01722-4

Received: 18 March 2022
Accepted: 05 November 2022
Published: 18 November 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s13042-022-01722-4
