
Label-Aware Document Representation via Hybrid Attention for Extreme Multi-Label Text Classification

Published in Neural Processing Letters

Abstract

Extreme multi-label text classification (XMTC) aims to tag a document with the most relevant labels from an extremely large label set. The task is especially challenging for tail labels, for which only a few training documents are available to build a classifier. This paper is motivated to better explore the semantic relationship between each document and the extreme label set by taking advantage of both document content and label correlation. Our objective is to establish an explicit label-aware representation for each document with a hybrid attention deep neural network model (LAHA). LAHA consists of three parts. The first part adopts a multi-label self-attention mechanism to detect the contribution of each word to every label. The second part exploits the label structure and document content to determine the semantic connection between words and labels in the same latent space. The third part applies an adaptive fusion strategy to obtain the final label-aware document representation, so that the outputs of the previous two parts are sufficiently integrated. Extensive experiments on six benchmark datasets, comparing against state-of-the-art methods, show the superiority of the proposed LAHA, especially on tail labels.
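The three-part architecture summarized above can be sketched in a few lines. The following is a minimal, illustrative NumPy sketch, not the authors' implementation: the names `W_s` (per-label self-attention weights), `C` (label embeddings), and the sigmoid gate `W_g`, `b_g` are assumptions standing in for the model's learned parameters, and the encoder that produces the word-level hidden states `H` is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def laha_representation(H, W_s, C, W_g, b_g):
    """Hypothetical LAHA-style label-aware representation.

    H:   (T, d) word-level hidden states for one document
    W_s: (L, d) per-label self-attention weights (part 1)
    C:   (L, d) label embeddings in the shared latent space (part 2)
    W_g: (2d, 1), b_g: scalar — gate parameters for fusion (part 3)
    Returns M: (L, d), one representation vector per label.
    """
    # Part 1: multi-label self-attention — each label scores every word
    # from the document content alone.
    A_s = softmax(W_s @ H.T, axis=-1)   # (L, T) attention weights
    M_s = A_s @ H                        # (L, d) content-based view

    # Part 2: label-aware attention — word/label similarity in the
    # shared latent space, using the label embeddings.
    A_l = softmax(C @ H.T, axis=-1)      # (L, T)
    M_l = A_l @ H                        # (L, d) label-structure view

    # Part 3: adaptive fusion — a per-label sigmoid gate mixes the
    # two views into the final label-aware representation.
    z = np.concatenate([M_s, M_l], axis=-1) @ W_g + b_g   # (L, 1)
    g = 1.0 / (1.0 + np.exp(-z))
    return g * M_s + (1.0 - g) * M_l
```

In the full model each row of the returned matrix would feed a per-label scorer; the gate lets the network lean on label embeddings for tail labels whose content-based attention is poorly trained.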


Notes

  1. https://github.com/HX-idiot/Hybrid_Attention_XML.

  2. https://biendata.com/competition/zhihu/.

  3. http://manikvarma.org/downloads/XC/XMLRepository.html.


Author information


Corresponding author

Correspondence to Liping Jing.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by the Fundamental Research Funds for the Central Universities (2018JBZ006).


About this article


Cite this article

Huang, X., Chen, B., Xiao, L. et al. Label-Aware Document Representation via Hybrid Attention for Extreme Multi-Label Text Classification. Neural Process Lett 54, 3601–3617 (2022). https://doi.org/10.1007/s11063-021-10444-7

