Research on Chinese Parsing Based on the Improved Compositional Vector Grammar

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9332)


The basic task of syntactic parsing is to determine the syntactic structure of a sentence. Because natural language is highly complex, syntactic structure is often ambiguous. Resolving this ambiguity requires a large amount of information, and the Compositional Vector Grammar (CVG) captures fine-grained syntactic and compositional-semantic information about phrases and words well. In this paper, we first apply a standard CVG model to Chinese parsing and then improve the model. To introduce more information, we make three changes: we add part-of-speech information to the word vectors; we add the basic type of the temporary node to the type of each new node created by binarization; and we add node type information when computing node scores. We also propose a solution for unknown words, which are replaced with structural vectors. Our CVG parser improves on the standard CVG parser by nearly 1% F1 on the development set of CTB8.0.
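The following is a minimal sketch of two of the ideas summarized in the abstract, not the authors' implementation. It assumes that part-of-speech information is appended to a word vector as a one-hot block, and that an unknown word falls back to one shared vector per POS tag; the abstract does not define the "structural vector", so this fallback is only a stand-in for it. The composition step follows the standard CVG form, p = tanh(W [b; c]), with one weight matrix per pair of child categories. All names (`leaf_vector`, `unknown_by_pos`, `compose`), the dimensions, and the random initial values are hypothetical.

```python
import math
import random

random.seed(0)

DIM = 4                          # hypothetical word-vector size
POS_TAGS = ["PN", "VV", "NN"]    # small illustrative subset of CTB POS tags
LEAF_DIM = DIM + len(POS_TAGS)   # word vector plus one-hot POS block


def rand_vec(n):
    """Return n small random values (stand-in for trained parameters)."""
    return [random.uniform(-0.1, 0.1) for _ in range(n)]


# Toy vocabulary; a real system would load pretrained word vectors.
word_vectors = {"我": rand_vec(DIM), "喜欢": rand_vec(DIM)}

# Assumption: one shared fallback vector per POS tag for unknown words.
unknown_by_pos = {tag: rand_vec(DIM) for tag in POS_TAGS}


def leaf_vector(word, tag):
    """Word vector (or POS-specific fallback) followed by a one-hot POS block."""
    base = word_vectors.get(word, unknown_by_pos[tag])
    one_hot = [1.0 if t == tag else 0.0 for t in POS_TAGS]
    return base + one_hot


def compose(W, left, right):
    """Standard CVG composition: p = tanh(W [left; right])."""
    children = left + right
    return [math.tanh(sum(w * x for w, x in zip(row, children))) for row in W]


# One weight matrix and one scoring vector per (left, right) category pair.
W = {("PN", "VV"): [rand_vec(2 * LEAF_DIM) for _ in range(LEAF_DIM)]}
v = {("PN", "VV"): rand_vec(LEAF_DIM)}

left = leaf_vector("我", "PN")
right = leaf_vector("喜欢", "VV")
parent = compose(W[("PN", "VV")], left, right)

# Neural part of the node score only; the PCFG rule log-probability that
# a full CVG adds to this value is omitted here.
neural_score = sum(a * b for a, b in zip(v[("PN", "VV")], parent))

# A word outside the toy vocabulary uses the fallback vector for its tag.
unseen = leaf_vector("未登录词", "NN")
```

Because the POS block is appended at the leaves, every phrase vector built on top of them has size `LEAF_DIM` rather than `DIM`. How the paper feeds node type information into the score is not specified in the abstract, so that step is left out of the sketch.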


Parsing, Ambiguity, CVG, Word vector, Unknown word





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. College of Information Engineering, Zhengzhou University, Zhengzhou, China
