TBCNN for Constituency Trees in Natural Language Processing

  • Lili Mou
  • Zhi Jin
Part of the SpringerBriefs in Computer Science book series (BRIEFSCOMPUTER)


In this chapter and the next, we apply the tree-based convolutional neural network (TBCNN) to natural language processing. This chapter deals with constituency trees of natural language sentences, whereas the next chapter deals with dependency trees. We propose a constituency tree-based convolutional neural network (c-TBCNN). Like TBCNN in general, c-TBCNN extracts structural information from constituency trees and aggregates it into one or a few vectors for further processing. We apply c-TBCNN to two sentence classification tasks: sentiment analysis and question classification. In both experiments, it achieves performance comparable to state-of-the-art models.
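As a rough illustration of the idea summarized above (a convolution window applied at every tree node, followed by pooling into a single vector), the following Python sketch may help. It is a simplified sketch under stated assumptions, not the chapter's exact formulation: the names `Node`, `conv_node`, and `tree_conv_max_pool` are hypothetical, every node is assumed to already carry a vector, all children share one weight matrix, and max pooling is assumed as the aggregation step.

```python
import math
from dataclasses import dataclass, field
from typing import Iterator, List

Vector = List[float]
Matrix = List[List[float]]


@dataclass
class Node:
    """A constituency-tree node that already carries a vector (assumption)."""
    vec: Vector
    children: List["Node"] = field(default_factory=list)


def matvec(W: Matrix, x: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def conv_node(node: Node, W_parent: Matrix, W_child: Matrix, bias: Vector) -> Vector:
    """Apply one convolution window covering a node and its direct children.

    Assumption: all children share W_child; a fuller model could use
    position-dependent weights instead.
    """
    total = [b + p for b, p in zip(bias, matvec(W_parent, node.vec))]
    for child in node.children:
        total = [t + c for t, c in zip(total, matvec(W_child, child.vec))]
    return [math.tanh(t) for t in total]


def iter_nodes(root: Node) -> Iterator[Node]:
    """Yield every node in the tree (depth-first, order not significant)."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node.children)


def tree_conv_max_pool(root: Node, W_parent: Matrix, W_child: Matrix, bias: Vector) -> Vector:
    """Convolve at every node, then max-pool each feature over all nodes."""
    features = [conv_node(n, W_parent, W_child, bias) for n in iter_nodes(root)]
    return [max(column) for column in zip(*features)]


# Toy usage: a root with two leaves and 2-dimensional vectors.
identity = [[1.0, 0.0], [0.0, 1.0]]
tree = Node([0.0, 0.0], [Node([1.0, -1.0]), Node([-1.0, 2.0])])
sentence_vector = tree_conv_max_pool(tree, identity, identity, [0.0, 0.0])
```

The pooled vector has a fixed size regardless of the tree's shape, so it could be passed to an ordinary classifier layer for a task such as sentiment analysis or question classification.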


Keywords: Tree-based convolution; Constituency parsing; Sentence modeling



Copyright information

© The Author(s) 2018

Authors and Affiliations

  1. AdeptMind Research, Toronto, Canada
  2. Institute of Software, Peking University, Beijing, China
