Abstract
The ability to explicitly represent sentences is central to natural language processing. Convolutional neural networks (CNNs), recurrent neural networks and recursive neural networks are the mainstream architectures for this task. We introduce a novel structure that combines their strengths for semantic modelling of sentences. Sentence representations are generated by a Dynamic CNN (DCNN, a variant of CNN). At the pooling stage, attention pooling is adopted to capture the most significant information, guided by sentence representations from a Tree-LSTM (a tree-structured variant of the LSTM). The pooling scheme, together with the combination of the convolutional layer and the tree long short-term memory, extracts comprehensive information from the sentence. We evaluate the model on a sentiment classification task. Experimental results show that this combination of Tree-LSTM and DCNN outperforms either model alone and achieves outstanding performance.
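To make the pooling idea concrete, below is a minimal sketch of attention pooling guided by a Tree-LSTM sentence vector. It assumes the convolutional layer yields a matrix of column features and that attention weights are derived from the similarity between each column and the Tree-LSTM representation; the function name, dimensions, and the use of cosine similarity are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def attention_pool(conv_features, tree_lstm_vec):
    """Pool convolutional feature columns into one sentence vector,
    weighting each column by its similarity to a Tree-LSTM sentence vector.

    conv_features: (seq_len, d) columns produced by a DCNN-style conv layer
    tree_lstm_vec: (d,) sentence representation from a Tree-LSTM (guide vector)
    """
    # Attention scores: cosine similarity between each column and the guide vector
    scores = F.cosine_similarity(conv_features, tree_lstm_vec.unsqueeze(0), dim=1)  # (seq_len,)
    weights = F.softmax(scores, dim=0)                                               # (seq_len,)
    # Weighted sum of the columns gives the pooled sentence representation
    return (weights.unsqueeze(1) * conv_features).sum(dim=0)                         # (d,)

# Usage: pool 12 hypothetical conv columns of width 64, guided by a 64-d Tree-LSTM vector
pooled = attention_pool(torch.randn(12, 64), torch.randn(64))
print(pooled.shape)  # torch.Size([64])
```

In this sketch the Tree-LSTM acts only as a guide for where to attend; the pooled vector itself is built from the convolutional features, matching the division of labour described in the abstract.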
Acknowledgements
This research was supported by the National High-tech R&D Program (863 Program, No. 2015AA015403) and the National Natural Science Foundation of China (No. 61370131).
Copyright information
© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Chen, L., Zeng, G., Zhang, Q., Chen, X. (2018). Tree-LSTM Guided Attention Pooling of DCNN for Semantic Sentence Modeling. In: Long, K., Leung, V., Zhang, H., Feng, Z., Li, Y., Zhang, Z. (eds) 5G for Future Wireless Networks. 5GWN 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 211. Springer, Cham. https://doi.org/10.1007/978-3-319-72823-0_6
DOI: https://doi.org/10.1007/978-3-319-72823-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72822-3
Online ISBN: 978-3-319-72823-0