Improving short-text representation in convolutional networks by dependency parsing

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Automatic question answering (QA) systems are the inevitable trend of future search engines. As essential steps of QA, question classification and text retrieval both require algorithms that capture the semantic information and syntactic structure of natural language. This paper proposes dependency-based convolutional networks for learning sentence representations. First, we use a dependency layer to map the discrete depth of each word in the dependency tree of a sentence into a continuous real space. The mapped depths then serve as weights for the word vectors, and convolutional kernels are employed as feature extractors for the downstream task. The proposed method allows convolutional networks to take advantage of the higher representational power of dependency structure. Experiments on three tasks, text classification, duplicate classification and text-pair ranking, confirm the advantages of our model.
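
To make the dependency-weighting idea concrete, here is a minimal sketch in PyTorch. It assumes one plausible realisation of the dependency layer, a learned embedding from tree depth to a scalar weight squashed by a sigmoid; the names (DepCNN, depth_emb) and the exact weighting scheme are illustrative assumptions rather than the paper's formulation, and in practice the depths would come from a dependency parser such as Stanford CoreNLP [41].

```python
# Minimal sketch of a dependency-weighted convolutional sentence encoder.
# Assumption: the "dependency layer" is realised as a learned embedding from
# discrete tree depth to a scalar weight in (0, 1); the paper may differ.
import torch
import torch.nn as nn


class DepCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, max_depth=32,
                 n_kernels=100, kernel_size=3, n_classes=2):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        # Dependency layer: discrete depth -> continuous scalar (assumed form).
        self.depth_emb = nn.Embedding(max_depth, 1)
        self.conv = nn.Conv1d(emb_dim, n_kernels, kernel_size, padding=1)
        self.fc = nn.Linear(n_kernels, n_classes)

    def forward(self, tokens, depths):
        # tokens, depths: (batch, seq_len) integer tensors; the depths would
        # be produced by a dependency parser such as Stanford CoreNLP [41].
        x = self.word_emb(tokens)                  # (B, L, E)
        w = torch.sigmoid(self.depth_emb(depths))  # (B, L, 1), weights in (0, 1)
        x = x * w                                  # depth-weighted word vectors
        x = self.conv(x.transpose(1, 2))           # (B, K, L) convolution features
        x = torch.relu(x).max(dim=2).values        # max-over-time pooling
        return self.fc(x)                          # task-specific logits


# Toy usage: classify a batch of two 5-token sentences.
model = DepCNN(vocab_size=1000)
tokens = torch.randint(0, 1000, (2, 5))
depths = torch.randint(0, 6, (2, 5))  # root word has depth 0
logits = model(tokens, depths)        # shape (2, n_classes)
```

Squashing a learned depth embedding through a sigmoid keeps each weight in (0, 1), one simple way to let the network emphasise or dampen words according to their distance from the dependency root; the paper's actual mapping may differ.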


Notes

  1. http://www.watsonclinic.com/.

  2. http://www.120ask.com/.

  3. https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#datasets.

  4. http://mpqa.cs.pitt.edu/corpora/mpqa_corpus/.

  5. http://nlp.stanford.edu/sentiment/. The data are provided with phrase-level annotation; however, to enable a direct comparison in learning sentence representations, we do not use the phrase-level annotation here.

  6. http://www.120ask.com/.

  7. http://hanlp.linrunsoft.com/.

  8. https://nlp.stanford.edu/projects/glove/.

  9. https://dumps.wikimedia.org/enwiki/20140102/.

  10. https://catalog.ldc.upenn.edu/LDC2011T07.

  11. More details: (1) for CR, the dataset used by Kim [18] is a subset of ours; (2) for SST2/SST5, Kim [18] and other state-of-the-art models obtained higher performance by using phrase-level annotation, which is out of our scope.

References

  1. Ferrucci DA (2012) Introduction to “This is Watson”. IBM J Res Dev 56(3/4):1:1–1:15

  2. Lally A, Prager JM, McCord MC et al (2012) Question analysis: how Watson reads a clue. IBM J Res Dev 56(3/4):2:1–2:14

  3. Chu-Carroll J, Fan J, Schlaefer N, Zadrozny W (2012) Textual resource acquisition and engineering. IBM J Res Dev 56(3/4):3:1–3:11

  4. Loni B (2011) A survey of state-of-the-art methods on question classification. Tech. Rep., Delft University of Technology, pp 1–40

  5. Li X, Roth D (2002) Learning question classifiers. In: Proceedings of COLING, pp 1–7

  6. Wen X, Zhang Y, Liu T et al (2006) Syntactic structure parsing based Chinese question classification. J Chin Inf Process 20(2):33–39

  7. Boot C, Meijman FJ (2010) Classifying health questions asked by the public using the ICPC-2 classification and a taxonomy of generic clinical questions: an empirical exploration of the feasibility. Health Commun 25(2):175–181

  8. Ely JW, Osheroff JA, Gorman PN et al (2000) A taxonomy of generic clinical questions: classification study. Br Med J 321(7258):429–432

  9. Bengio Y, Ducharme R, Vincent P, Jauvin C (2003) A neural probabilistic language model. J Mach Learn Res 3:1137–1155

  10. Mikolov T, Karafiat M, Burget L et al (2010) Recurrent neural network based language model. In: Proceedings of Interspeech, pp 1045–1048

  11. Mikolov T, Chen K, Corrado GS et al (2013) Efficient estimation of word representations in vector space. In: Proceedings of ICLR

  12. Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for modelling sentences. In: Proceedings of ACL, pp 655–665

  13. Socher R, Lin C, Manning CD et al (2011) Parsing natural scenes and natural language with recursive neural networks. In: Proceedings of ICML, pp 129–136

  14. Socher R, Huval B, Manning CD et al (2012) Semantic compositionality through recursive matrix-vector spaces. In: Proceedings of EMNLP, pp 1201–1211

  15. Socher R, Perelygin A, Wu JY et al (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of EMNLP, pp 1631–1642

  16. Tai KS, Socher R, Manning CD (2015) Improved semantic representations from tree-structured long short-term memory networks. In: Proceedings of ACL, pp 1556–1566

  17. Li X, Roth D (2004) Learning question classifiers: The role of semantic information. In: Proceedings of COLING, pp 556–562

  18. Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of EMNLP, pp 1746–1751

  19. Yih W, He X, Meek C (2014) Semantic parsing for single-relation question answering. In: Proceedings of ACL, pp 643–648

  20. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

  21. Silva J, Coheur L, Mendes A, Wichert A (2011) From symbolic to sub-symbolic information in question classification. Artif Intell Rev 35(2):137–154

  22. Hu B, Lu Z, Li H et al (2014) Convolutional neural network architectures for matching natural language sentences. In: Proceedings of NIPS, pp 2042–2050

  23. Severyn A, Moschitti A (2015) Learning to rank short text pairs with convolutional deep neural networks. In: Proceedings of SIGIR, pp 373–382

  24. Zhang D, Lee WS (2003) Question classification using support vector machines. In: Proceedings of SIGIR, pp 26–32

  25. Lees RB (1957) Review of Syntactic structures by Noam Chomsky. Language 33(3):375–408

  26. Farabet C, Couprie C, Najman L et al (2013) Learning hierarchical features for scene labeling. IEEE Trans Pattern Anal Mach Intell 35(8):1915–1929

  27. Echihabi A, Marcu D (2003) A noisy-channel approach to question answering. In: Proceedings of ACL, pp 16–23

  28. Bordes A, Weston J, Usunier N (2014) Open question answering with weakly supervised embedding models. In: Joint European conference on machine learning and knowledge discovery in databases, pp 165–180

  29. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Proceedings of NIPS, pp 1097–1105

  30. Collobert R, Weston J, Bottou L et al (2011) Natural language processing (almost) from scratch. J Mach Learn Res 12:2493–2537

  31. Srivastava N, Hinton GE, Krizhevsky A et al (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

  32. Hu M, Liu B (2004) Mining opinion features in customer reviews. In: Proceedings of AAAI, pp 755–760

  33. Ding X, Liu B, Yu PS (2008) A holistic lexicon-based approach to opinion mining. In: Proceedings of WSDM, pp 231–240

  34. Liu Q, Gao Z, Liu B et al (2015) Automated rule selection for aspect extraction in opinion mining. In: Proceedings of IJCAI, pp 1291–1297

  35. Wiebe J, Wilson T, Cardie C (2005) Annotating expressions of opinions and emotions in language. Lang Resour Eval 39(2–3):165–210

  36. Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: Proceedings of ICLR, pp 1–10

  37. Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: Proceedings of ICML, pp 1188–1196

  38. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  39. Cho K, van Merrienboer B, Gulcehre C et al (2014) Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proceedings of EMNLP

  40. Pennington J, Socher R, Manning C (2014) GloVe: global vectors for word representation. In: Proceedings of EMNLP, pp 1532–1543

  41. Manning CD, Surdeanu M, Bauer J et al (2014) The Stanford CoreNLP natural language processing toolkit. In: Proceedings of ACL: System Demonstrations

Acknowledgements

The authors gratefully acknowledge financial support from the National Natural Science Foundation of China (U1636220, 61432008, 61472423), the Huawei Innovation Research Program (HO2017050001BI), and the Beijing Natural Science Foundation (4172063).

Author information

Corresponding author

Correspondence to Wensheng Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, S., Zhang, W. & Niu, J. Improving short-text representation in convolutional networks by dependency parsing. Knowl Inf Syst 61, 463–484 (2019). https://doi.org/10.1007/s10115-018-1312-9
