Abstract
Automatic question answering (QA) systems are an inevitable trend for future search engines. As essential steps of QA, question classification and text retrieval both require algorithms that capture the semantic information and syntactic structure of natural language. This paper proposes dependency-based convolutional networks to learn sentence representations. First, a dependency layer maps each word's discrete depth in the sentence's dependency tree into a continuous real-valued space. The mapping result then serves as a weight on the word vectors, and convolutional kernels act as feature extractors for downstream tasks. The proposed method allows convolutional networks to take advantage of the higher representational ability of dependency structure. Experiments on three tasks, namely text classification, duplicate classification, and text-pair ranking, confirm the advantages of our model.
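The core idea of the dependency layer, turning each word's discrete depth in the dependency tree into a continuous weight that scales its embedding before convolution, can be illustrated with a minimal sketch. The exponential-decay mapping, the `theta` parameter, and both function names below are illustrative assumptions, not the paper's learned mapping:

```python
import numpy as np

def dependency_depth_weights(depths, theta=0.5):
    # Hypothetical mapping: discrete depth d -> continuous weight exp(-theta * d),
    # so words closer to the root of the dependency tree are weighted more heavily.
    depths = np.asarray(depths, dtype=float)
    return np.exp(-theta * depths)

def weighted_embeddings(embeddings, depths):
    # Scale each word vector by its depth-derived weight; the weighted matrix
    # would then be fed to the convolutional feature extractors.
    w = dependency_depth_weights(depths)
    return embeddings * w[:, None]

# Toy example: a 4-word sentence with 5-dim embeddings and parse-tree depths.
emb = np.random.randn(4, 5)
depths = [0, 1, 1, 2]  # the root word has depth 0
x = weighted_embeddings(emb, depths)
print(x.shape)  # (4, 5): same shape, ready for convolution
```

In the paper the depth-to-weight mapping is learned rather than fixed; the sketch only shows where such a mapping sits in the pipeline, between the embedding lookup and the convolutional layer.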
Notes
http://nlp.stanford.edu/sentiment/ The data are in fact provided with phrase-level annotation; however, to enable a direct comparison of learned sentence representations, we do not use the phrase-level annotation here.
Acknowledgements
The authors are grateful for the financial support from the National Natural Science Foundation of China (U1636220, 61432008, 61472423), the Huawei Innovation Research Program (HO2017050001BI), and the Beijing Natural Science Foundation (4172063).
Cite this article
Zhang, S., Zhang, W. & Niu, J. Improving short-text representation in convolutional networks by dependency parsing. Knowl Inf Syst 61, 463–484 (2019). https://doi.org/10.1007/s10115-018-1312-9