Cognitive Computation, Volume 11, Issue 2, pp 317–327

Hierarchical Neural Representation for Document Classification

  • Jianming Zheng
  • Fei Cai (corresponding author)
  • Wanyu Chen
  • Chong Feng
  • Honghui Chen


Text representation, which converts text spans into real-valued vectors or matrices, is crucial for machines to understand the semantics of text. Most previous work relies on classic methods based on either statistics or neural networks, which suffer from data sparsity and insensitivity to text structure, respectively. To address these drawbacks, we propose a general, structure-sensitive framework: the hierarchical architecture. Specifically, we incorporate the hierarchical architecture into three existing neural network models for document representation, producing three new representation models for document classification: TextHFT, TextHRNN, and TextHCNN. Comprehensive experiments on two public datasets demonstrate the effectiveness of the hierarchical architecture. With a comparable (or substantially lower) time expense, our proposals achieve significant improvements over the baselines, ranging from 4.65% to 35.08% in terms of accuracy. We conclude that the hierarchical architecture enhances classification performance, and we find that its benefits grow as document length increases.
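The paper's three models (TextHFT, TextHRNN, TextHCNN) are not reproduced here, but the core idea of the hierarchical architecture — compose word vectors into sentence vectors, then sentence vectors into a document vector — can be sketched minimally. The snippet below is a hypothetical illustration that uses plain averaging as the composition function at both levels (the actual models would substitute FastText-, RNN-, or CNN-style encoders), with toy 2-dimensional embeddings invented for the example:

```python
# Hypothetical sketch of a two-level hierarchical document encoder.
# Averaging stands in for the learned composition function at each level.

from typing import Dict, List

def average(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def encode_document(doc: List[List[str]],
                    embeddings: Dict[str, List[float]]) -> List[float]:
    """Level 1: words -> sentence vectors; level 2: sentences -> document vector."""
    sentence_vecs = [average([embeddings[w] for w in sent]) for sent in doc]
    return average(sentence_vecs)

# Toy embeddings and a two-sentence document (values are illustrative only).
emb = {"good": [1.0, 0.0], "movie": [0.0, 1.0], "great": [1.0, 1.0]}
doc = [["good", "movie"], ["great", "movie"]]
print(encode_document(doc, emb))  # -> [0.5, 0.75]
```

Because each sentence is encoded before the document is, the representation stays sensitive to sentence boundaries — the structural signal that flat bag-of-words or single-sequence models discard.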


Keywords: Document representation · Neural networks · Hierarchical architecture · Document classification


Compliance with Ethical Standards

Conflict of interest

The authors declare that they have no conflict of interest.

Informed Consent

Informed consent was not required, as no humans or animals were involved.

Human and Animal Rights

This article does not contain any studies with human participants performed by any of the authors.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Jianming Zheng (1)
  • Fei Cai (1, corresponding author)
  • Wanyu Chen (1)
  • Chong Feng (1)
  • Honghui Chen (1)

  1. Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, China
