Score-Based Approach for Anaphora Resolution in Drug-Drug Interactions Documents

  • Isabel Segura-Bedmar
  • Mario Crespo
  • Cesar de Pablo-Sánchez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5723)


Drug-drug interactions are frequently reported in the biomedical literature, and Information Extraction (IE) techniques have been devised as a useful instrument for managing this knowledge. Nevertheless, IE at the sentence level has limited effect because texts frequently refer back to entities introduced earlier in the discourse, a phenomenon known as ‘anaphora’. This paper addresses the problem of resolving pronominal and nominal anaphora in order to improve a system that detects drug interactions. To our knowledge, this is the first research article that tackles this issue. A corpus and a system for the evaluation of drug anaphora resolution have been developed, and an analysis of the phenomenon is also included. The system uses a domain-specific syntactic and semantic parser, UMLS MetaMap Transfer (MMTx) [1], to select anaphoric expressions and candidate references. It is shown that combining domain-specific syntactic and semantic information with generic heuristics produces results comparable to those reported in other related domains. Furthermore, the error analysis suggests that additional semantic knowledge is needed to improve results and to deal with this linguistic phenomenon in this particular domain.
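
As an illustration of what a score-based resolver of this kind might look like, the following minimal Python sketch ranks candidate references for an anaphoric expression by summing a few heuristic scores. It is not the system described in the paper: the `Mention` fields, the three features (semantic-type match, number agreement, sentence distance) and all weights are assumptions chosen for the example, and the semantic labels stand in for whatever a parser such as MMTx would provide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    text: str
    sentence: int        # index of the sentence containing the mention
    is_plural: bool      # grammatical number of the mention
    semantic_type: str   # coarse semantic label (hypothetical, e.g. "drug")


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"semantic_match": 2.0, "number_agreement": 1.0, "distance_penalty": 0.5}


def score(anaphor: Mention, candidate: Mention) -> float:
    """Sum simple heuristic evidence that `candidate` is the antecedent."""
    total = 0.0
    if candidate.semantic_type == anaphor.semantic_type:
        total += WEIGHTS["semantic_match"]
    if candidate.is_plural == anaphor.is_plural:
        total += WEIGHTS["number_agreement"]
    # Candidates in nearer sentences are preferred.
    total -= WEIGHTS["distance_penalty"] * (anaphor.sentence - candidate.sentence)
    return total


def resolve(anaphor: Mention, candidates: List[Mention]) -> Optional[Mention]:
    """Return the highest-scoring candidate that does not follow the anaphor."""
    preceding = [c for c in candidates if c.sentence <= anaphor.sentence]
    if not preceding:
        return None
    return max(preceding, key=lambda c: score(anaphor, c))


candidates = [
    Mention("aspirin", 0, False, "drug"),
    Mention("patients", 0, True, "population"),
    Mention("anticoagulants", 1, True, "drug"),
]
anaphor = Mention("these drugs", 2, True, "drug")
best = resolve(anaphor, candidates)
print(best.text if best else None)  # prints "anticoagulants"
```

In this toy example the plural nominal anaphor "these drugs" is linked to "anticoagulants", which matches it in semantic type and number and is the closest candidate. A real system would need richer features and weights tuned on an annotated corpus.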


Keywords: Information Extraction, Anaphora Resolution, Drug-Drug Interactions




References

  1. Aronson, A.R.: Effective mapping of biomedical text to the UMLS metathesaurus: the Metamap program. In: Proceedings of AMIA Symp., pp. 17–21 (2001)
  2. WHO: The Importance of Pharmacovigilance: Safety Monitoring of Medicinal Products. World Health Organization (2002)
  3. Stockley, I.: Stockley’s Drug Interactions. Pharmaceutical Press (2007)
  4. Jankel, C., McMillan, J., Martin, B.: Effects of drug interactions on outcomes of patients receiving warfarin or theophylline. Am. J. Hosp. Pharm. 51, 661–666 (1994)
  5. Aronson, J.K.: Communicating information about drug interactions. British Journal of Clinical Pharmacology 63(6), 637–639 (2007)
  6. Duda, S., Aliferis, C., Miller, R., Statnikov, A., Johnson, K.: Extracting Drug-Drug Interaction Articles from MEDLINE to Improve the Content of Drug Databases. In: AMIA Annual Symposium Proceedings (2005)
  7. Wishart, D.S., Knox, C., Guo, A.C., Cheng, D., Shrivastava, S., Tzur, D., Gautam, B., Hassanali, M.: Drugbank: a knowledgebase for drugs, drug actions and drug targets. Nucl. Acids Res. (2007), doi:10.1093/nar/gkm958
  8. Kim, J.J., Park, J.C.: BioAR: Anaphora Resolution for Relating Protein Names to Proteome Database Entries. In: Proceedings of ACL, pp. 79–86 (2004)
  9. Castaño, J., Zhang, J., Pustejovsky, J.: Anaphora resolution in biomedical literature. In: Int’l Symp. Reference Resolution in NLP, Alicante, Spain (2002)
  10. Lin, Y.H., Liang, T.: Pronominal and sortal anaphora resolution for biomedical literature. In: Proceedings of ROCLING XVI: Conference on Computational Linguistics and Speech Processing (2004)
  11. Temkin, J., Gilder, M.: Extraction of protein interaction information from unstructured text using a context-free grammar. Bioinformatics 19(16), 2046–2053 (2003)
  12. Fundel, K., Kuffner, R., Zimmer, R.: RelEx-Relation extraction using dependency parse trees. Bioinformatics 23(3), 365 (2007)
  13. Kolarik, C., Hofmann-Apitius, M., Zimmermann, M., Fluck, J.: Identification of new drug classification terms in textual resources. Bioinformatics 23(13), i264 (2007)
  14. Pustejovsky, J., Castaño, J., Saurí, R., Rumshisky, A., Zhang, J., Luo, W.: Medstract: Creating Large-scale Information Servers for Biomedical Libraries. In: Proceedings of ACL 2002 Workshop on Natural Language Processing in the Biomedical Domain, Philadelphia (2002)
  15. Liang, T., Lin, Y.: Anaphora Resolution for Biomedical Literature by Exploiting Multiple Resources. In: Proceedings of IJCNLP, pp. 742–753 (2005)
  16. Gasperin, C.: Semi-supervised anaphora resolution in biomedical texts. In: Proceedings of BioNLP in HLT-NAACL, New York, pp. 96–103 (2006)
  17.
  18. Eilbeck, K., Lewis, S.E., Mungall, C.J., Yandell, M., Stein, L., Durbin, R., et al.: The Sequence Ontology: a tool for the unification of genome annotations. Genome Biol. (2005)
  19. Briscoe, T., Carroll, J.: Robust accurate statistical annotation of general text (2002)
  20. Kim, J.J., Park, J.C.: BioIE: retargetable information extraction and ontological annotation of biological interactions from the literature. J. Bioinformatics and Computational Biology 2(3), 551–568 (2004)
  21. Bairoch, A., Apweiler, R.: The swiss-prot protein sequence database and its supplement TrEMBL in 2000. Nucl. Acids Res. 28(1), 45–48 (2000)
  22. Grosz, B.J., Joshi, A.K., Weinstein, S.: Centering: A framework for modelling the local coherence of discourse. Computational Linguistics 21(2), 203–225 (1995)
  23. Sanchez, O., Poesio, M., Kabadjov, M., Tesar, R.: What kind of problems do protein interactions raise for anaphora resolution? - A preliminary analysis. In: SMBM, Jena, Germany (2006)
  24. Poesio, M., Kabadjov, M.A.: A general-purpose, off-the-shelf anaphora resolution module: implementation and preliminary evaluation. In: Proceedings of LREC, Lisbon, Portugal (2004)
  25. Segura-Bedmar, I., Martínez, P., Segura-Bedmar, M.: Drug Name Recognition and classification in biomedical texts. Drug Discovery Today 13(17), 816–823 (2008)
  26. Poprat, M., Hahn, U.: Quantitative Data on Referring Expressions in Biomedical Abstracts. In: Proceedings of the Workshop on BioNLP 2007, pp. 193–194 (2007)
  27. Grosz, B.J., Joshi, A.K., Weinstein, S.: Centering: A framework for modelling the local coherence of discourse. In: Computational Linguistics, pp. 203–225 (1995)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Isabel Segura-Bedmar
    • 1
  • Mario Crespo
    • 1
  • Cesar de Pablo-Sánchez
    • 1
  1. Computer Science Department, Universidad Carlos III de Madrid, Leganés, Madrid, Spain
