Improving Noun Phrase Coreference Resolution by Matching Strings

  • Xiaofeng Yang
  • Guodong Zhou
  • Jian Su
  • Chew Lim Tan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3248)

Abstract

In this paper we present a noun phrase coreference resolution system that aims to improve the identification of coreference realized by string matching. To this end, we make two extensions to the standard learning-based resolution framework. First, to improve recall, we introduce an additional set of features that capture the different matching patterns between noun phrases. Second, to improve precision, we modify the instance selection strategy so that non-anaphors are included during training instance generation. An evaluation on the MEDLINE data set shows that combining the two extensions yields significant gains in F-measure.
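
To make the idea of string-matching features concrete, the following is a minimal sketch of how matching patterns between two noun phrases might be encoded as features for a learning-based resolver. The abstract does not list the paper's actual features, so the function name `string_match_features`, the four feature names, and the crude "last token is the head" heuristic are all assumptions chosen for illustration only.

```python
def tokens(np_text):
    """Lowercase and split a noun phrase on whitespace (a simplifying assumption)."""
    return np_text.lower().split()


def string_match_features(antecedent, anaphor):
    """Return hypothetical string-matching features for a candidate NP pair.

    These features are illustrative only; they are not the feature set
    described in the paper.
    """
    a, b = tokens(antecedent), tokens(anaphor)
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return {
        # Both phrases consist of exactly the same token sequence.
        "exact_match": a == b,
        # Assumption: the last token approximates the head noun.
        "head_match": bool(a) and bool(b) and a[-1] == b[-1],
        # Every token of the anaphor also occurs in the antecedent.
        "anaphor_contained": bool(b) and set_b <= set_a,
        # Jaccard overlap of the two token sets.
        "overlap_ratio": len(set_a & set_b) / len(union) if union else 0.0,
    }


if __name__ == "__main__":
    print(string_match_features("the human T cell receptor", "the receptor"))
```

In a pairwise setup, a dictionary like this would be computed for each candidate antecedent and anaphor pair and passed to the classifier alongside the other features.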

Keywords

Noun Phrase · Relative Clause · Recall Rate · Training Instance · String Match
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Xiaofeng Yang (1, 2)
  • Guodong Zhou (1)
  • Jian Su (1)
  • Chew Lim Tan (2)
  1. Institute for Infocomm Research, Singapore
  2. Department of Computer Science, National University of Singapore, Singapore
