Abstract
Automatic question answering (QA) systems are an inevitable trend for future search engines. As essential steps of QA, question classification and text retrieval both require algorithms that capture the semantic information and syntactic structure of natural language. This paper proposes dependency-based convolutional networks to learn sentence representations. First, a dependency layer maps each word's discrete depth in the sentence's dependency tree into a continuous real-valued space. The mapping result then serves as a weight on the word vectors, and convolutional kernels act as feature extractors for downstream tasks. The proposed method allows convolutional networks to take advantage of the higher representational ability of dependency structure. Experiments on three tasks, namely text classification, duplicate classification, and text-pair ranking, confirm the advantages of our model.
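The core idea of the dependency layer, turning each word's discrete depth in the dependency tree into a continuous weight that scales its embedding before convolution, can be illustrated with a minimal sketch. The exponential-decay mapping, the `theta` parameter, and both function names below are illustrative assumptions, not the paper's learned mapping:

```python
import numpy as np

def dependency_depth_weights(depths, theta=0.5):
    # Hypothetical mapping: discrete depth d -> continuous weight exp(-theta * d),
    # so words closer to the root of the dependency tree are weighted more heavily.
    depths = np.asarray(depths, dtype=float)
    return np.exp(-theta * depths)

def weighted_embeddings(embeddings, depths):
    # Scale each word vector by its depth-derived weight; the weighted matrix
    # would then be fed to the convolutional feature extractors.
    w = dependency_depth_weights(depths)
    return embeddings * w[:, None]

# Toy example: a 4-word sentence with 5-dim embeddings and parse-tree depths.
emb = np.random.randn(4, 5)
depths = [0, 1, 1, 2]  # the root word has depth 0
x = weighted_embeddings(emb, depths)
print(x.shape)  # (4, 5): same shape, ready for convolution
```

In the paper the depth-to-weight mapping is learned rather than fixed; the sketch only shows where such a mapping sits in the pipeline, between the embedding lookup and the convolutional layer.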
Notes
http://nlp.stanford.edu/sentiment/ The data are in fact provided with phrase-level annotation; however, to enable a direct comparison of learned sentence representations, we do not use the phrase-level annotation here.
Acknowledgements
The authors are grateful for the financial support from the National Natural Science Foundation of China (U1636220, 61432008, 61472423), the Huawei Innovation Research Program (HO2017050001BI), and the Beijing Natural Science Foundation (4172063).
Cite this article
Zhang, S., Zhang, W. & Niu, J. Improving short-text representation in convolutional networks by dependency parsing. Knowl Inf Syst 61, 463–484 (2019). https://doi.org/10.1007/s10115-018-1312-9