SANAPHOR: Ontology-Based Coreference Resolution

  • Roman Prokofyev
  • Alberto Tonon
  • Michael Luggen
  • Loic Vouilloz
  • Djellel Eddine Difallah
  • Philippe Cudré-Mauroux
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9366)

Abstract

We tackle the problem of resolving coreferences in textual content by leveraging Semantic Web techniques. Specifically, we focus on noun phrases that coreference identifiable entities that appear in the text; the challenge in this context is to improve the coreference resolution by leveraging potential semantic annotations that can be added to the identified mentions. Our system, SANAPHOR, first applies state-of-the-art techniques to extract entities, noun phrases, and candidate coreferences. Then, we propose an approach to type noun phrases using an inverted index built on top of a Knowledge Graph (e.g., DBpedia). Finally, we use the semantic relatedness of the introduced types to improve the state-of-the-art techniques by splitting and merging coreference clusters. We evaluate SANAPHOR on CoNLL datasets, and show how our techniques consistently improve the state of the art in coreference resolution.
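The typing and splitting steps described above can be illustrated with a minimal sketch. It is not the SANAPHOR implementation: the toy knowledge graph `TOY_KG`, the token-overlap matching, and the helper names `build_inverted_index`, `type_mention`, and `split_cluster` are all assumptions made for illustration, and exact type equality stands in for the semantic relatedness measure mentioned in the abstract. Cluster merging is not shown.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph (entity label -> type).
# A real system would draw labels and types from a resource such as DBpedia.
TOY_KG = {
    "Barack Obama": "Person",
    "Michelle Obama": "Person",
    "Apple Inc.": "Organization",
    "Sydney": "Place",
}


def build_inverted_index(kg):
    """Map each lowercased label token to the set of entity labels containing it."""
    index = defaultdict(set)
    for label in kg:
        for token in label.lower().split():
            index[token.strip(".,")].add(label)
    return index


def type_mention(mention, index, kg):
    """Return the type of the entity sharing the most tokens with the mention.

    Returns None when no token of the mention appears in the index.
    Ties are broken alphabetically by entity label to keep results deterministic.
    """
    votes = defaultdict(int)
    for token in mention.lower().split():
        for label in index.get(token.strip(".,"), ()):
            votes[label] += 1
    if not votes:
        return None
    best = max(sorted(votes), key=lambda label: votes[label])
    return kg[best]


def split_cluster(cluster, index, kg):
    """Group the mentions of one candidate coreference cluster by assigned type.

    Mentions that cannot be typed (for example pronouns) are grouped under None.
    """
    groups = defaultdict(list)
    for mention in cluster:
        groups[type_mention(mention, index, kg)].append(mention)
    return dict(groups)


if __name__ == "__main__":
    idx = build_inverted_index(TOY_KG)
    print(split_cluster(["Apple", "Sydney", "it"], idx, TOY_KG))
```

In this sketch a candidate cluster that mixes an organization and a place is separated into one group per type, which is the intuition behind splitting clusters whose mentions carry unrelated types. How untyped mentions are handled here is a simplification chosen for the example.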

Keywords

Noun Phrase, Computational Linguistic, Semantic Annotation, Name Entity Recognition, Inverted Index
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Roman Prokofyev (1), email author
  • Alberto Tonon (1)
  • Michael Luggen (1)
  • Loic Vouilloz (2)
  • Djellel Eddine Difallah (1)
  • Philippe Cudré-Mauroux (1)

  1. eXascale Infolab, University of Fribourg, Fribourg, Switzerland
  2. Linguistics Department, University of Fribourg, Fribourg, Switzerland
