Parsing the Penn Chinese Treebank with Semantic Knowledge

  • Deyi Xiong
  • Shuanglong Li
  • Qun Liu
  • Shouxun Lin
  • Yueliang Qian
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3651)


We build a class-based selection preference sub-model to incorporate external semantic knowledge from two Chinese electronic semantic dictionaries. This sub-model is combined with the modifier-head generation sub-model. After being optimized on held-out data by the EM algorithm, our improved parser achieves an F1 measure of 79.4% on the Penn Chinese Treebank (CTB), a 4.4% relative decrease in error rate. Further analysis of the performance improvement indicates that semantic knowledge helps with nominal compounds, coordination, and N⋄V tagging disambiguation, and also alleviates the sparseness of information available in the treebank.
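The abstract does not say how the two sub-models are combined or what EM optimizes. One common reading is a linear interpolation whose mixing weight is fitted on held-out events; the sketch below illustrates that reading only. The function names, the single global weight, and the toy probabilities are all assumptions for illustration and are not taken from the paper.

```python
from typing import Sequence


def interpolate(lam: float, p_semantic: float, p_lexical: float) -> float:
    """Mix two sub-model probabilities with weight lam (assumed form)."""
    return lam * p_semantic + (1.0 - lam) * p_lexical


def em_interpolation_weight(
    p_semantic: Sequence[float],
    p_lexical: Sequence[float],
    iterations: int = 50,
    lam: float = 0.5,
) -> float:
    """Fit the mixing weight on held-out events with EM.

    p_semantic[i] and p_lexical[i] are the probabilities that the two
    (hypothetical) sub-models assign to the i-th held-out event.
    """
    if len(p_semantic) != len(p_lexical) or not p_semantic:
        raise ValueError("need two non-empty sequences of equal length")
    for _ in range(iterations):
        # E-step: posterior responsibility of the semantic sub-model
        # for each held-out event under the current weight.
        total = 0.0
        for a, b in zip(p_semantic, p_lexical):
            mix = interpolate(lam, a, b)
            if mix > 0.0:
                total += lam * a / mix
        # M-step: the new weight is the average responsibility.
        lam = total / len(p_semantic)
    return lam


if __name__ == "__main__":
    # Toy held-out probabilities, invented for this example.
    sem = [0.30, 0.05, 0.20, 0.10]
    lex = [0.10, 0.15, 0.20, 0.02]
    print(em_interpolation_weight(sem, lex))
```

A single global weight keeps the sketch short; a real parser would more likely tie the weight to the conditioning context, which this example does not attempt.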


Keywords: Ambiguity Resolution, Semantic Category, Semantic Knowledge, Heuristic Rule, Nominal Compound




Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Deyi Xiong (1, 2)
  • Shuanglong Li (1, 3)
  • Qun Liu (1)
  • Shouxun Lin (1)
  • Yueliang Qian (1)

  1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
  2. Graduate School of Chinese Academy of Sciences
  3. University of Science and Technology Beijing
