
Hierarchical Neural Representation for Document Classification

Published in: Cognitive Computation

Abstract

Text representation, which converts text spans into real-valued vectors or matrices, is crucial for machines to understand the semantics of text. Most previous work relies either on classic statistical methods or on neural networks; the former can suffer from data sparsity, while the latter is often insensitive to text structure. To address these drawbacks, we propose a general, structure-sensitive framework: the hierarchical architecture. Specifically, we incorporate the hierarchical architecture into three existing neural network models for document representation, producing three new representation models for document classification: TextHFT, TextHRNN, and TextHCNN. Comprehensive experiments on two public datasets demonstrate the effectiveness of the hierarchical architecture. With a comparable (or substantially lower) time expense, our proposals achieve significant accuracy improvements over the baselines, ranging from 4.65% to 35.08%. We conclude that the hierarchical architecture enhances classification performance, and we find that its benefits grow as the document length increases.
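The hierarchical idea can be illustrated with a minimal sketch: encode each sentence from its word vectors, then encode the document from its sentence vectors, and classify the document vector. The sketch below uses simple averaging at both levels, in the spirit of a FastText-style hierarchical variant such as TextHFT; the vocabulary, dimensions, and weights are illustrative and not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and parameters (illustrative sizes only).
vocab = {"the": 0, "food": 1, "was": 2, "great": 3, "service": 4, "slow": 5}
emb_dim, n_classes = 8, 2
E = rng.normal(size=(len(vocab), emb_dim))  # word embedding table
W = rng.normal(size=(emb_dim, n_classes))   # classifier weights
b = np.zeros(n_classes)                     # classifier bias

def encode_document(sentences):
    """Hierarchical encoding: words -> sentence vectors -> document vector."""
    sent_vecs = []
    for sent in sentences:
        ids = [vocab[w] for w in sent if w in vocab]
        if ids:
            # Word level: average the word embeddings of one sentence.
            sent_vecs.append(E[ids].mean(axis=0))
    # Sentence level: average the sentence vectors into a document vector.
    return np.mean(sent_vecs, axis=0)

def classify(sentences):
    """Softmax over class scores of the document vector."""
    z = encode_document(sentences) @ W + b
    p = np.exp(z - z.max())
    return p / p.sum()

doc = [["the", "food", "was", "great"], ["service", "was", "slow"]]
probs = classify(doc)
```

The two-level averaging is what makes the representation structure-sensitive: sentence boundaries are respected before any document-level pooling, whereas a flat model would pool all words at once. Replacing the averages with RNN or CNN encoders yields the TextHRNN- and TextHCNN-style variants described above.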


Notes

  1. https://www.yelp.com/dataset/challenge

  2. http://jmcauley.ucsd.edu/data/amazon/


Author information


Corresponding author

Correspondence to Fei Cai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed Consent

Informed consent was not required, as no humans or animals were involved.

Human and Animal Rights

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jianming Zheng and Wanyu Chen are co-first authors of this article.


About this article


Cite this article

Zheng, J., Cai, F., Chen, W. et al. Hierarchical Neural Representation for Document Classification. Cogn Comput 11, 317–327 (2019). https://doi.org/10.1007/s12559-018-9621-6

