Co-reference Resolution in Farsi Corpora

Conference paper
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 312)


Natural Language Processing (NLP) includes tasks such as Information Extraction (IE), text summarization, and question answering, all of which require identifying all the information about an entity that exists in the discourse. A system that performs Co-reference Resolution (CR) therefore contributes to the successful completion of these tasks. In this paper we study the process of Co-reference Resolution and present, for the first time, a system capable of identifying co-referent mentions in Farsi corpora. Our study rests on three main components: a Farsi corpus with co-reference annotation, a Mention Recognition system and its domain, and an algorithm for predicting co-referent mentions. In the first step, we prepare a corpus with suitable labels; as the first Farsi corpus with mention and co-reference labels, it can serve as a basis for further research on Mention Detection (MD) and CR. Using this corpus and studying the rules and priorities among mentions, we then present a system that identifies the mentions and the positive and negative examples. Finally, we apply learning algorithms such as SVM, Neural Network, and Decision Tree to the extracted samples and evaluate the resulting models for predicting co-referent mentions in Farsi. We conclude that the neural network performs better than the other learners.
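
The abstract does not say how the positive and negative examples are formed. As a minimal sketch, assuming a common mention-pair scheme in which every ordered pair of mentions in a document becomes one training example, the extraction step might look like the following. The `Mention` class, its `entity_id` field, and `make_pair_examples` are hypothetical names introduced only for illustration, and English glosses stand in for Farsi text.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    text: str        # surface form of the mention
    entity_id: int   # gold co-reference chain label (hypothetical field)


def make_pair_examples(mentions):
    """Return (antecedent, anaphor, label) triples for every ordered pair.

    The label is 1 when both mentions share an entity_id, otherwise 0.
    Mentions are assumed to be given in document order, so the first
    element of each pair always precedes the second.
    """
    examples = []
    for antecedent, anaphor in combinations(mentions, 2):
        label = 1 if antecedent.entity_id == anaphor.entity_id else 0
        examples.append((antecedent, anaphor, label))
    return examples


# Toy document; the annotation values are invented for this illustration.
doc = [
    Mention("Sara", 1),
    Mention("the teacher", 2),
    Mention("she", 1),
    Mention("her student", 3),
]

pairs = make_pair_examples(doc)
positives = [(a.text, b.text) for a, b, y in pairs if y == 1]
negatives = [(a.text, b.text) for a, b, y in pairs if y == 0]
```

Labelled pairs of this kind, once converted to feature vectors, are the sort of input that learners such as an SVM, a neural network, or a decision tree could be trained on. The features used in the paper are not given in the abstract and are not assumed here.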


Co-reference Resolution · Mention Detection · SVM · Neural Network and Decision Tree · Farsi Corpus





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Information Technology, University of Qom, Qom, Iran
  2. Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
  3. Department of Industrial Engineering, Islamic Azad University, Lenjan, Iran
