Reinforcement Learning for Extreme Multi-label Text Classification

Teng, Hui; Li, Yulei; Long, Fei; Xu, Meixia; Ling, Qiang

doi:10.1007/978-981-16-2336-3_22

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1397)

Included in the following conference series:

International Conference on Cognitive Systems and Signal Processing

Abstract

Extreme multi-label text classification (XMC) is an important yet challenging problem in the NLP community: the task of assigning to each document its most relevant subset of class labels from an extremely large label collection. For example, the input text could be a story document on chinastory.cn, and the labels could be story categories that imply its underlying meaning. However, naively applying standard neural network models to the XMC problem leads to sub-optimal performance because of the large output space and the label sparsity issue. In this paper, we present the first attempt at applying reinforcement learning to XMC. Experimental results on public datasets and our own engineering datasets demonstrate that our approach achieves the expected performance when compared with state-of-the-art methods.

H. Teng and Y. Li contributed equally.

References

Liu, J., Chang, W. C., Wu, Y., Yang, Y.: Deep learning for extreme multi-label text classification. In: International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 115–124 (2017)
Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)
Johnson, R., Zhang, T.: Effective use of word order for text categorization with convolutional neural networks. arXiv preprint arXiv:1412.1058 (2014)
Bhatia, K., Jain, H., Kar, P., Varma, M., Jain, P.: Sparse local embeddings for extreme multi-label classification. In: Advances in Neural Information Processing Systems, pp. 730–738 (2015)
Choo, J., Lee, C., Reddy, C.K., Park, H.: UTOPIAN: user-driven topic modeling based on interactive nonnegative matrix factorization. IEEE Trans. Vis. Comput. Graph. 19(12), 1992–2001 (2013)
Teng, H., Liu, H., Yu, L., Sun, F.: Representative video action discovery using interactive non-negative matrix factorization. In: Hu, X., Xia, Y., Zhang, Y., Zhao, D. (eds.) ISNN 2015. LNCS, vol. 9377, pp. 205–212. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25393-0_23
Joulin, A., Grave, E., Bojanowski, P., Mikolov, T.: Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759 (2016)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Prabhu, Y., Varma, M.: FastXML: a fast, accurate and stable tree-classifier for extreme multi-label learning. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 263–272 (2014)
Zhang, T., Huang, M., Zhao, L.: Learning structured representation for text classification via reinforcement learning. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Lewis, D.D., Yang, Y., Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)
Chinastory. https://www.chinastory.cn/english/index.html

Author information

Authors and Affiliations

Chinaso Inc., Beijing, 100077, China
Hui Teng, Yulei Li, Fei Long & Meixia Xu
State Key Laboratory of Media Convergence Production Technology and Systems, Xinhua News Agency, Beijing, China
Hui Teng, Yulei Li, Fei Long & Meixia Xu
Department of Computer Science, Northeast University, Shenyang, 110819, China
Fei Long
Department of Automation, University of Science and Technology of China, Hefei, 230027, China
Qiang Ling

Corresponding author

Correspondence to Hui Teng.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Fuchun Sun
Tsinghua University, Beijing, China
Huaping Liu
Tsinghua University, Beijing, China
Bin Fang

About this paper

Cite this paper

Teng, H., Li, Y., Long, F., Xu, M., Ling, Q. (2021). Reinforcement Learning for Extreme Multi-label Text Classification. In: Sun, F., Liu, H., Fang, B. (eds) Cognitive Systems and Signal Processing. ICCSIP 2020. Communications in Computer and Information Science, vol 1397. Springer, Singapore. https://doi.org/10.1007/978-981-16-2336-3_22

DOI: https://doi.org/10.1007/978-981-16-2336-3_22
Published: 05 May 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2335-6
Online ISBN: 978-981-16-2336-3
eBook Packages: Computer Science, Computer Science (R0)
