Abstract
Anaphora resolution algorithms have long made use of the reliable agreement between pronouns and their antecedents in properties such as gender and number. To apply constraints or preferences for anaphoric agreement, real systems need ways to automatically determine these properties for arbitrary noun phrases, in context. This chapter describes a variety of algorithms for extracting noun gender and number, ranging from simple heuristics to large-scale machine learning approaches. We describe the drawbacks and advantages of the different algorithms, focusing mostly on English anaphora resolution. We pay special attention to recent methods for extracting agreement information directly from large volumes of raw text.
Notes
1. See chapter “Linguistic and Cognitive Evidence About Anaphora” of this book for further discussion on the role of agreement information in human anaphora interpretation.
2.
3. Their model’s complexity only allowed training on a very small fraction of the total number of articles in Wikipedia. It would be interesting to assess the feasibility of using the resolution-to-article-topic heuristic on its own to learn a gender/number model from all of Wikipedia.
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bergsma, S. (2016). Extracting Anaphoric Agreement Properties from Corpora. In: Poesio, M., Stuckardt, R., Versley, Y. (eds) Anaphora Resolution. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47909-4_12
DOI: https://doi.org/10.1007/978-3-662-47909-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-47908-7
Online ISBN: 978-3-662-47909-4
eBook Packages: Computer Science, Computer Science (R0)