Generating word and document matrix representations for document classification

  • Original Article
  • Neural Computing and Applications

Abstract

We present an effective word and document matrix representation architecture based on linear operations, referred to as doc2matrix, for learning representations for document-level classification. Unlike the traditional vector representation, it represents each word or document as a matrix. Doc2matrix partitions the text into subwindows of an appropriate scale, and a word matrix and a document matrix are generated by stacking the information from these subwindows. The resulting document matrix not only contains more fine-grained semantic and syntactic information than the original representation but also introduces rich two-dimensional features. Experiments on four document-level classification tasks demonstrate that the proposed architecture generates higher-quality word and document representations and outperforms previous models based on linear operations. Among the classifiers compared, a convolution-based classifier proves the best fit for our document matrix. Furthermore, we show through both theoretical and experimental analysis that the convolution operation better captures the two-dimensional features of the proposed document matrix.
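To make the construction concrete, here is a minimal sketch of how a subwindow-stacked document matrix might be built. The window size, stride, and mean-pooling aggregation are illustrative assumptions rather than the authors' published formulation, and `document_matrix` is a hypothetical helper.

```python
import numpy as np

def document_matrix(token_vectors, window=5, stride=1):
    """Stack summaries of sliding subwindows into a 2-D document matrix.

    token_vectors : (n_tokens, dim) array of pretrained word vectors.
    Each row of the result summarizes one subwindow, so local context
    is preserved along the first axis instead of being averaged away.
    """
    n, _ = token_vectors.shape
    rows = [token_vectors[i:i + window].mean(axis=0)   # linear op per subwindow
            for i in range(0, max(n - window + 1, 1), stride)]
    return np.stack(rows)                              # (n_windows, dim)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(40, 50))   # 40 tokens with 50-dim embeddings
doc_mat = document_matrix(tokens)
print(doc_mat.shape)                 # (36, 50)
```

Because the matrix carries meaningful structure along both axes (subwindow position and embedding dimension), two-dimensional convolutional filters can slide across both at once, which is why a convolution-based classifier suits this representation. The sketch below, again with assumed layer sizes, shows such a classifier in PyTorch.

```python
import torch
import torch.nn as nn

class MatrixCNN(nn.Module):
    """Toy convolutional classifier over a (windows x dim) document matrix."""

    def __init__(self, n_classes=2):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)  # 2-D filters see both axes
        self.pool = nn.AdaptiveMaxPool2d((4, 4))                # fixed-size summary
        self.fc = nn.Linear(16 * 4 * 4, n_classes)

    def forward(self, x):                            # x: (batch, windows, dim)
        h = torch.relu(self.conv(x.unsqueeze(1)))    # add a channel axis
        h = self.pool(h).flatten(1)
        return self.fc(h)

logits = MatrixCNN()(torch.from_numpy(doc_mat).float().unsqueeze(0))
print(logits.shape)                                  # torch.Size([1, 2])
```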

Acknowledgements

This work was supported by the Innovation Foundation of Science and Technology of Dalian under Grant No. 2018J12GX045.

Author information

Corresponding author

Correspondence to Shun Guo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Guo, S., Yao, N. Generating word and document matrix representations for document classification. Neural Comput & Applic 32, 10087–10108 (2020). https://doi.org/10.1007/s00521-019-04541-x
