Skip to main content
Log in

Multi-label sequence generating model via label semantic attention mechanism

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics Aims and scope Submit manuscript

Abstract

In recent years, a new attempt has been made to capture label co-occurrence by applying the sequence-to-sequence (Seq2Seq) model to multi-label text classification (MLTC). However, existing approaches frequently ignore the semantic information contained in the labels themselves. Besides, the Seq2Seq model is susceptible to the negative impact of label sequence order. Furthermore, it has been demonstrated that the traditional attention mechanism underperforms in MLTC. Therefore, we propose a novel Seq2Seq model with a different label semantic attention mechanism (S2S-LSAM), which generates fused information containing label and text information through the interaction of label semantics and text features in the label semantic attention mechanism. With the fused information, our model can select the text features that are most relevant to the labels more effectively. A combination of the cross-entropy loss function and the policy gradient-based loss function is employed to reduce the label sequence order effect. The experiments show that our model outperforms the baseline models.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

Notes

  1. http://www.ai.mit.edu/projects/jmlr/papers/volume5/lewis04a/lyrl2004_rcv1v2_README.htm.

  2. https://github.com/lancopku/SGM/issues/25.

References

  1. Zhang ML (2014) A review on multi-label learning algorithms. IEEE Trans Knowl Data Eng 26:1819–1837

    Article  Google Scholar 

  2. Du J, Chen Q, Peng Y, Xiang Y, Tao C, Lu Z (2019) Ml-net: multi-label classification of biomedical texts with deep neural networks. J Am Med Inform Assoc 26(11):1279–1285

    Article  Google Scholar 

  3. Katakis I, Tsoumakas G, Vlahavas I (2008) Multilabel text classification for automated tag suggestion. In: Proceedings of the ECML/PKDD, vol. 18, p. 5 . Citeseer

  4. Cambria E, Olsher D, Rajagopal D (2014) Senticnet 3: a common and common-sense knowledge base for cognition-driven sentiment analysis. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, pp. 1515–1521

  5. MJ B (2014) Large scale multi-label text classification with semantic word vectors. In: Technical Report, pp. 1–8

  6. Chalkidis I, Fergadiotis E, Malakasiotis P, Androutsopoulos I (2019) Large-scale multi-label text classification on eu legislation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6314–6322

  7. Nam J, KHFJ Mencía EL (2017) Maximizing subset accuracy with recurrent neural networks in multi-label classification. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 5419–5429. Curran Associates Inc., Red Hook, NY, USA

  8. Yang PC, Lwea Sun X (2018) Sgm: sequence generation model for multi-label classification. In: In Proceedings of the 27th International Conference on Computational Linguistics, pp. 3915–3926

  9. Yang P, Luo F, Ma S, Lin J, Sun X (2019) A deep reinforced sequence-to-set model for multi-label classification. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5252–5258

  10. Lin J, Su Q, Yang P, Ma S, Sun X (2018) Semantic-unit-based dilated convolution for multi-label text classification. arXiv preprint arXiv:1808.08561

  11. Wang D, Cui P, Zhu W (2016) Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1225–1234

  12. Rennie SJ, Marcheret E, Mroueh Y, Ross J, Goel V (2016) Self-critical sequence training for image captioning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern RECOGNITION, pp. 7008–7024

  13. Yu L, Zhang W, Wang J, Yu Y (2017) Seqgan: Sequence generative adversarial nets with policy gradient. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31

  14. Zhang M-L, Zhou Z-H (2006) Multilabel neural networks with applications to functional genomics and text categorization. IEEE Trans Knowl Data Eng 18(10):1338–1351

    Article  Google Scholar 

  15. Nam J, Kim J (2014) Large-scale multi-label text classification-revisiting neural networks. Joint European Conference on machine learning and knowledge discovery in databases. Springer, Berlin, pp 437–452

    Chapter  Google Scholar 

  16. Y K (2014) Convolutional neural networks for sentence classification. In: EMNLP 2014-2014 Conference on Empirical Methods in Natural Language Processing, pp. 437–452 . Springer

  17. Kurata G, Xiang B, Zhou B (2016) Improved neural network-based multi-label classification with better initialization leveraging label co-occurrence. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 521–526

  18. Zhang X, Zhang Q-W, Yan Z, Liu R, Cao Y (2021) Enhancing label correlation feedback in multi-label text classification via multi-task learning. In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp. 1190–1200

  19. Maltoudoglou L, Paisios A, Lenc L, Martínek J, Král P, Papadopoulos H (2022) Well-calibrated confidence measures for multi-label text classification with a large number of labels. Pattern Recogn 122:108271

    Article  Google Scholar 

  20. Xiao L, Zhang X, Jing L, Huang C, Song M (2021) Does head label help for long-tailed multi-label text classification. Proc AAAI Conf Artif Intell 35:14103–14111

    Google Scholar 

  21. Chen G, Ye D, Xing Z, Chen J, Cambria E (2017) Ensemble application of convolutional and recurrent neural networks for multi-label text categorization. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 2377–2383 . IEEE

  22. Pappas N, Henderson J (2019) Gile: a generalized input-label embedding for text classification. Trans Assoc Comput Linguist 7:139–155

    Article  Google Scholar 

  23. Wang G, Li C, Wang W, Zhang Y, Shen D, Zhang X, Henao R, Carin L (2018) Joint embedding of words and labels for text classification. arXiv preprint arXiv:1805.04174

  24. Du C, Chen Z, Feng F, Zhu L, Gan T, Nie L (2019) Explicit interaction model towards text classification. Proc AAAI Conf Artif Intell 33:6359–6366

    Google Scholar 

  25. Zhang W, Yan J, Wang X, Zha H (2018) Deep extreme multi-label learning. In: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, pp. 100–107

  26. Huang X, Chen B, Xiao L, Yu J, Jing L (2021) Label-aware document representation via hybrid attention for extreme multi-label text classification. Neural Process Lett 2:1–17

    Google Scholar 

  27. Zhang X, Xu J, Soh C, Chen L (2022) La-hcn: Label-based attention for hierarchical multi-label text classification neural network. Expert Syst Appl 187:115922

    Article  Google Scholar 

  28. Peng H, Li J, Wang S, Wang L, Gong Q, Yang R, Li B, Philip SY, He L (2019) Hierarchical taxonomy-aware and attentional graph capsule rcnns for large-scale multi-label text classification. IEEE Trans Knowl Data Eng 33(6):2505–2519

    Article  Google Scholar 

  29. Wiseman S, Rush AM (2016) Sequence-to-sequence learning as beam-search optimization. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1296–1306

  30. Sutton RS, McAllester D, Singh S, Mansour Y (1999) Policy gradient methods for reinforcement learning with function approximation. Advances in neural information processing systems 12

  31. Lewis DD, Yang Y, Russell-Rose T, Li F (2004) Rcv1: a new benchmark collection for text categorization research. J Mach Learn Res 5:361–397

    Google Scholar 

  32. Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980

  33. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

    MathSciNet  MATH  Google Scholar 

  34. Pascanu R, Mikolov T, Bengio Y (2013) On the difficulty of training recurrent neural networks. In: International Conference on Machine Learning, pp. 1310–1318 . PMLR

  35. Schapire RE, Singer Y (1999) Improved boosting algorithms using confidence-rated predictions. Mach Learn 37(3):297–336

    Article  MATH  Google Scholar 

  36. Manning C, Raghavan P, Schütze H (2010) Introduction to information retrieval. Nat Lang Eng 16(1):100–103

    MATH  Google Scholar 

  37. Boutell MR (2004) Learning multi-label scene classification. Pattern Recogn 37(9):1757–1771

    Article  Google Scholar 

  38. Tsoumakas GKI (2006) Multi-label classification: an overview. Int J Data Warehouse Min 3(3):1–13

    Google Scholar 

  39. Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473

  40. Liu H, Yuan C, Wang X (2020) Label-wise document pre-training for multi-label text classification. In: CCF International Conference on Natural Language Processing and Chinese Computing, pp. 641–653 . Springer

  41. Pal A, Selvakumar M, Sankarasubbu M (2020) Multi-label text classification using attention-based graph neural network. arXiv preprint arXiv:2003.11644

  42. Wang R, Ridley R, Qu W, Dai X (2021) A novel reasoning mechanism for multi-label text classification. Inf Process Manag 58(2):102441

    Article  Google Scholar 

  43. Chen Z, Ren J (2021) Multi-label text classification with latent word-wise label information. Appl Intell 51(2):966–979

    Article  Google Scholar 

Download references

Acknowledgements

This work is supported by the Hebei Provincial Department of education in 2021 provincial postgraduate demonstration course project construction under Grant KCJSX2021024.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zhaoci Luo.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Zhang, X., Tan, X., Luo, Z. et al. Multi-label sequence generating model via label semantic attention mechanism. Int. J. Mach. Learn. & Cyber. 14, 1711–1723 (2023). https://doi.org/10.1007/s13042-022-01722-4

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13042-022-01722-4

Keywords

Navigation