Zero Anaphora Resolution in Chinese and Its Application in Chinese-English Machine Translation

  • Jing Peng
  • Kenji Araki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4592)

Abstract

In this paper, we propose a maximum entropy (ME) based learning classifier for resolving zero anaphora (ZA) in Chinese. In addition to the usual grammatical, lexical, positional and semantic features, we develop two novel Web-based features that extract additional semantic information about ZA from the Web. Our study shows that the Web, used as a knowledge source, can be incorporated effectively into the learning framework and significantly improves its performance. When applied to machine translation (MT), ZA resolution is treated as a detachable pre-processing module that is independent of the MT system. The experimental results demonstrate a significant improvement in BLEU/NIST scores once ZA resolution is employed.
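To make the classifier idea concrete, the following is a minimal sketch of a binary maximum entropy model (equivalent to logistic regression) that scores candidate antecedents for a zero anaphor. It is an illustration under stated assumptions, not the authors' implementation: the three features (`same_sentence`, `is_subject`, `web_score`), the toy training data, and the function names are all hypothetical, and `web_score` merely stands in for some normalized Web-derived statistic whose actual definition the abstract does not give.

```python
import math

def predict_proba(weights, bias, features):
    """P(candidate is the antecedent | features) under a binary ME model."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train_maxent(examples, labels, epochs=500, lr=0.5):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(examples, labels):
            error = label - predict_proba(weights, bias, features)
            weights = [w + lr * error * f for w, f in zip(weights, features)]
            bias += lr * error
    return weights, bias

def resolve(weights, bias, candidates):
    """Return the name of the highest-scoring candidate antecedent."""
    return max(candidates, key=lambda c: predict_proba(weights, bias, c[1]))[0]

# Invented toy data. Each row describes one (zero anaphor, candidate) pair as
# [same_sentence, is_subject, web_score]; all three features are assumptions.
examples = [
    [1, 1, 0.9],
    [1, 0, 0.2],
    [0, 1, 0.8],
    [0, 0, 0.1],
    [1, 0, 0.7],
    [0, 1, 0.3],
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = candidate is the true antecedent

weights, bias = train_maxent(examples, labels)

# Hypothetical candidates for one zero anaphor in a new sentence.
candidates = [
    ("Xiao Ming", [1, 1, 0.85]),
    ("the book", [1, 0, 0.15]),
]
best = resolve(weights, bias, candidates)
print(best)
```

In a pre-processing setting such as the one the abstract describes, the selected antecedent could be inserted into the source sentence before it is passed to any MT system, which is what keeps the module detachable.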

Keywords

zero anaphora resolution; Web-based features; ME-based classifier; machine translation

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Jing Peng (1)
  • Kenji Araki (1)
  1. Language Media Laboratory, Hokkaido University, Kita-14, Nishi-9, Kita-ku, Sapporo, Japan
