Abstract
The ability to explicitly represent sentences is central to natural language processing. Convolutional neural networks (CNNs), recurrent neural networks and recursive neural networks are the mainstream architectures for this task. We introduce a novel structure that combines their strengths for semantic modelling of sentences. Sentence representations are generated by a Dynamic CNN (DCNN, a variant of CNN). At the pooling stage, attention pooling is adopted to capture the most significant information, guided by sentence representations from a Tree-LSTM (a tree-structured variant of the LSTM). The pooling scheme, together with the combination of the convolutional layer and the tree long short-term memory, extracts comprehensive information from the sentence. We evaluate the model on a sentiment classification task. Experimental results show that this combination of Tree-LSTM and DCNN outperforms either model alone and achieves outstanding performance.
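To make the pooling idea concrete, below is a minimal sketch of attention pooling guided by a Tree-LSTM sentence vector. It assumes the convolutional layer yields a matrix of column features and that attention weights are derived from the similarity between each column and the Tree-LSTM representation; the function name, dimensions, and the use of cosine similarity are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def attention_pool(conv_features, tree_lstm_vec):
    """Pool convolutional feature columns into one sentence vector,
    weighting each column by its similarity to a Tree-LSTM sentence vector.

    conv_features: (seq_len, d) columns produced by a DCNN-style conv layer
    tree_lstm_vec: (d,) sentence representation from a Tree-LSTM (guide vector)
    """
    # Attention scores: cosine similarity between each column and the guide vector
    scores = F.cosine_similarity(conv_features, tree_lstm_vec.unsqueeze(0), dim=1)  # (seq_len,)
    weights = F.softmax(scores, dim=0)                                               # (seq_len,)
    # Weighted sum of the columns gives the pooled sentence representation
    return (weights.unsqueeze(1) * conv_features).sum(dim=0)                         # (d,)

# Usage: pool 12 hypothetical conv columns of width 64, guided by a 64-d Tree-LSTM vector
pooled = attention_pool(torch.randn(12, 64), torch.randn(64))
print(pooled.shape)  # torch.Size([64])
```

In this sketch the Tree-LSTM acts only as a guide for where to attend; the pooled vector itself is built from the convolutional features, matching the division of labour described in the abstract.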
Acknowledgements
This research was supported by the National High-tech R&D Program (863 Program, No. 2015AA015403) and the National Natural Science Foundation of China (No. 61370131).
Copyright information
© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Chen, L., Zeng, G., Zhang, Q., Chen, X. (2018). Tree-LSTM Guided Attention Pooling of DCNN for Semantic Sentence Modeling. In: Long, K., Leung, V., Zhang, H., Feng, Z., Li, Y., Zhang, Z. (eds) 5G for Future Wireless Networks. 5GWN 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 211. Springer, Cham. https://doi.org/10.1007/978-3-319-72823-0_6
DOI: https://doi.org/10.1007/978-3-319-72823-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72822-3
Online ISBN: 978-3-319-72823-0