Abstract
Natural Language Processing (NLP) includes tasks such as Information Extraction (IE), text summarization, and question answering, all of which require identifying all the information about an entity that exists in the discourse. A system capable of Co-reference Resolution (CR) therefore contributes to the successful completion of these tasks. In this paper we study the process of co-reference resolution and present a system capable of identifying co-referent mentions, for the first time in Farsi corpora. Our study rests on three main components: a Farsi corpus with co-reference annotation, a mention recognition system and its domain, and an algorithm for predicting co-referent mentions. In the first step, we prepare a corpus with suitable labels; as the first Farsi corpus with mention and co-reference labels, it can serve as the basis for further research on Mention Detection (MD) and CR. Using this corpus and studying the rules and priorities among mentions, we present a system that identifies the mentions and extracts positive and negative examples. We then apply learning algorithms such as SVM, neural network, and decision tree to the extracted samples and evaluate the resulting models for predicting co-referent mentions in Farsi. Finally, we conclude that the neural network performs better than the other learners.
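The extraction of positive and negative examples mentioned above can be illustrated with a minimal sketch. The abstract does not say how the pairs are built, so this assumes a scheme common in mention-pair systems: each mention is paired with its closest preceding co-referent mention (positive), and with every mention in between (negative). The `Mention` class, the `make_pairs` function, and the English sample mentions are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Mention:
    position: int   # order of the mention in the document
    text: str
    entity_id: int  # gold co-reference chain label (assumed to come from the annotated corpus)


def make_pairs(mentions: List[Mention]) -> List[Tuple[Mention, Mention, int]]:
    """Return (antecedent, anaphor, label) triples; label 1 = co-referent, 0 = not."""
    ordered = sorted(mentions, key=lambda m: m.position)
    pairs = []
    for j, anaphor in enumerate(ordered):
        # Walk backwards to find the closest preceding mention of the same entity.
        closest = None
        for i in range(j - 1, -1, -1):
            if ordered[i].entity_id == anaphor.entity_id:
                closest = i
                break
        if closest is None:
            continue  # first mention of its entity: produces no pairs
        pairs.append((ordered[closest], anaphor, 1))
        # Every mention between the antecedent and the anaphor is a negative example.
        for k in range(closest + 1, j):
            pairs.append((ordered[k], anaphor, 0))
    return pairs


# Hypothetical toy document with three entities.
mentions = [
    Mention(0, "Sara", 1),
    Mention(1, "the book", 2),
    Mention(2, "Ali", 3),
    Mention(3, "she", 1),
    Mention(4, "it", 2),
]
pairs = make_pairs(mentions)
for antecedent, anaphor, label in pairs:
    print(antecedent.text, "->", anaphor.text, label)
```

Each labeled pair would then be turned into a feature vector and passed to a classifier such as an SVM, a neural network, or a decision tree; which features the authors use is not stated in the abstract.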
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Nazaridoust, M., Bidgoli, B.M., Nazaridoust, S. (2014). Co-reference Resolution in Farsi Corpora. In: Jamshidi, M., Kreinovich, V., Kacprzyk, J. (eds) Advance Trends in Soft Computing. Studies in Fuzziness and Soft Computing, vol 312. Springer, Cham. https://doi.org/10.1007/978-3-319-03674-8_15
DOI: https://doi.org/10.1007/978-3-319-03674-8_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03673-1
Online ISBN: 978-3-319-03674-8
eBook Packages: Engineering, Engineering (R0)