Reinforcement Learning for Extreme Multi-label Text Classification

  • Conference paper
Cognitive Systems and Signal Processing (ICCSIP 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1397)

Abstract

Extreme multi-label text classification (XMC) is an important yet challenging problem in the NLP community: the task of assigning each document its most relevant subset of class labels from an extremely large label collection. For example, the input text could be a story document on chinastory.cn, and the labels could be story categories that imply its potential meaning. However, naively applying standard neural network models to the XMC problem leads to sub-optimal performance because of the large output space and the label sparsity issue. In this paper, we present the first attempt at applying reinforcement learning to XMC. Experimental results on public datasets and on our own engineering datasets demonstrate that our approach achieves the expected performance when evaluated against state-of-the-art methods.
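The two difficulties named in the abstract, a large output space and label sparsity, can be illustrated with a small sketch. This is not the paper's method: the helper names `label_density` and `top_k_labels`, the toy label ids, and the label-space size of 1000 are all assumptions made for illustration only.

```python
from typing import Dict, List, Set


def label_density(doc_labels: List[Set[int]], num_labels: int) -> float:
    """Fraction of non-zero entries in the document-by-label matrix."""
    if not doc_labels or num_labels <= 0:
        raise ValueError("need at least one document and one label")
    positives = sum(len(labels) for labels in doc_labels)
    return positives / (len(doc_labels) * num_labels)


def top_k_labels(scores: Dict[int, float], k: int) -> List[int]:
    """Pick the k highest-scoring label ids (ties broken by smaller id)."""
    return sorted(scores, key=lambda label: (-scores[label], label))[:k]


# Hypothetical toy data: 3 documents, a label space of 1000 labels.
TOY_DOC_LABELS: List[Set[int]] = [{3, 17}, {17}, {42, 512, 999}]
TOY_NUM_LABELS = 1000

# Only 6 of the 3 * 1000 possible (document, label) pairs are positive.
toy_density = label_density(TOY_DOC_LABELS, TOY_NUM_LABELS)

# Hypothetical per-label scores for one document; a classifier would
# produce these, and the predicted subset is the top-k labels.
toy_scores: Dict[int, float] = {1: 0.2, 5: 0.9, 7: 0.9, 9: 0.1}
toy_prediction = top_k_labels(toy_scores, k=2)
```

Even in this toy setting the label matrix is almost entirely zeros, which is the sparsity that makes a dense output layer over every label hard to train well.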

H. Teng and Y. Li contributed equally.

Author information

Corresponding author

Correspondence to Hui Teng.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

Cite this paper

Teng, H., Li, Y., Long, F., Xu, M., Ling, Q. (2021). Reinforcement Learning for Extreme Multi-label Text Classification. In: Sun, F., Liu, H., Fang, B. (eds) Cognitive Systems and Signal Processing. ICCSIP 2020. Communications in Computer and Information Science, vol 1397. Springer, Singapore. https://doi.org/10.1007/978-981-16-2336-3_22

  • DOI: https://doi.org/10.1007/978-981-16-2336-3_22

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-2335-6

  • Online ISBN: 978-981-16-2336-3

  • eBook Packages: Computer Science, Computer Science (R0)
