Abstract
Drug-drug interactions are frequently reported in the biomedical literature, and Information Extraction (IE) techniques have been devised as a useful instrument for managing this knowledge. Nevertheless, IE at the sentence level has limited effect because texts frequently refer back to entities mentioned earlier in the discourse, a phenomenon known as ‘anaphora’. This paper addresses the problem of resolving pronominal and nominal anaphora in order to improve a system that detects drug interactions. To our knowledge, this is the first research article to tackle this issue. A corpus and a system for evaluating drug anaphora resolution have been developed, and an analysis of the phenomenon is also included. The system uses a domain-specific syntactic and semantic parser, UMLS Metamap Transfer (MMTx) [1], to select anaphoric expressions and candidate references. The results show that combining domain-specific syntactic and semantic information with generic heuristics yields results comparable to those reported in related domains. Furthermore, the error analysis suggests that additional semantic knowledge is needed to improve results and to deal with this linguistic phenomenon in this particular domain.
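To make the idea of score-based antecedent selection concrete, the following is a minimal Python sketch, not the system described in the paper. It assumes that each mention already carries a semantic flag (for instance, derived from a parser such as MMTx) and a number feature. The `Mention` class, the three weights, and the two-sentence search window are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    text: str
    sentence: int   # index of the sentence containing the mention
    is_drug: bool   # hypothetical flag: semantic type is a drug
    plural: bool    # grammatical number of the mention


# Hypothetical weights, chosen only for illustration.
W_SEMANTIC = 2.0   # reward candidates with a drug semantic type
W_NUMBER = 1.0     # reward number agreement with the anaphor
W_DISTANCE = 0.5   # penalty per sentence of distance


def score(anaphor, candidate):
    """Combine semantic, agreement, and distance evidence into one score."""
    s = 0.0
    if candidate.is_drug:
        s += W_SEMANTIC
    if candidate.plural == anaphor.plural:
        s += W_NUMBER
    s -= W_DISTANCE * (anaphor.sentence - candidate.sentence)
    return s


def resolve(anaphor, candidates, window=2):
    """Return the best-scoring candidate within `window` preceding sentences."""
    eligible = [
        c for c in candidates
        if 0 <= anaphor.sentence - c.sentence <= window
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: score(anaphor, c))
```

Under these assumed weights, a singular anaphor such as "this drug" would prefer a singular drug mention in the previous sentence over a plural drug mention in the same sentence, because number agreement outweighs the one-sentence distance penalty.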
This research paper is supported by the Regional Government of Madrid under the Research Network MAVIR (S-0505/TIC-0267) and by the Spanish Ministry of Science and Innovation under the project BRAVO (TIN2007-67407-C03-01).
References
Aronson, A.R.: Effective mapping of biomedical text to the UMLS metathesaurus: the Metamap program. In: Proceedings of AMIA Symp., pp. 17–21 (2001)
WHO. The Importance of Pharmacovigilance: Safety Monitoring of Medicinal Products. World Health Organization (2002)
Stockley, I.: Stockley’s Drug Interactions. Pharmaceutical Press (2007)
Jankel, C., McMillan, J., Martin, B.: Effects of drug interactions on outcomes of patients receiving warfarin or theophylline. Am. J. Hosp. Pharm. 51, 661–666 (1994)
Aronson, J.K.: Communicating information about drug interactions. British Journal of Clinical Pharmacology 63(6), 637–639 (2007)
Duda, S., Aliferis, C., Miller, R., Statnikov, A., Johnson, K.: Extracting Drug-Drug Interaction Articles from MEDLINE to Improve the Content of Drug Databases. In: AMIA Annual Symposium Proceedings (2005)
Wishart, D.S., Knox, C., Guo, A.C., Cheng, D., Shrivastava, S., Tzur, D., Gautam, B., Hassanali, M.: Drugbank: a knowledgebase for drugs, drug actions and drug targets. Nucl. Acids Res. (2007), doi:10.1093/nar/gkm958
Kim, J.J., Park, J.C.: BioAR: Anaphora Resolution for Relating Protein Names to Proteome Database Entries. In: Proceedings of ACL, pp. 79–86 (2004)
Castaño, J., Zhang, J., Pustejovsky, J.: Anaphora resolution in biomedical literature. In: Int’l Symp. Reference Resolution in NLP, Alicante, Spain (2002)
Lin, Y.H., Liang, T.: Pronominal and sortal anaphora resolution for biomedical literature. In: Proceedings of ROCLING XVI: Conference on Computational Linguistics and Speech Processing (2004)
Temkin, J., Gilder, M.: Extraction of protein interaction information from unstructured text using a context-free grammar. Bioinformatics 19(16), 2046–2053 (2003)
Fundel, K., Kuffner, R., Zimmer, R.: RelEx-Relation extraction using dependency parse trees. Bioinformatics 23(3), 365 (2007)
Kolarik, C., Hofmann-Apitius, M., Zimmermann, M., Fluck, J.: Identification of new drug classification terms in textual resources. Bioinformatics 23(13), i264 (2007)
Pustejovsky, J., Castaño, J., Saurí, R., Rumshisky, A., Zhang, J., Luo, W.: Medstract: Creating Large-scale Information Servers for Biomedical Libraries. In: Proceedings of ACL 2002 Workshop on Natural Language Processing in the Biomedical Domain, Philadelphia (2002)
Liang, T., Lin, Y.: Anaphora Resolution for Biomedical Literature by Exploiting Multiple Resources. In: Proceedings of IJCNLP, pp. 742–753 (2005)
Gasperin, C.: Semi-supervised anaphora resolution in biomedical texts. In: Proceedings of BioNLP in HLT-NAACL, New York, pp. 96–103 (2006)
FlyBase, http://www.flybase.org
Eilbeck, K., Lewis, S.E., Mungall, C.J., Yandell, M., Stein, L., Durbin, R., et al.: The Sequence Ontology: a tool for the unification of genome annotations. Genome Biol. (2005)
Briscoe, T., Carroll, J.: Robust accurate statistical annotation of general text (2002)
Kim, J.J., Park, J.C.: BioIE: retargetable information extraction and ontological annotation of biological interactions from the literature. J. Bioinformatics and Computational Biology 2(3), 551–568 (2004)
Bairoch, A., Apweiler, R.: The swiss-prot protein sequence database and its supplement TrEMBL in 2000. Nucl. Acids Res. 28(1), 45–48 (2000)
Grosz, B.J., Joshi, A.K., Weinstein, S.: Centering: A framework for modelling the local coherence of discourse. Computational Linguistics 21(2), 203–225 (1995)
Sanchez, O., Poesio, M., Kabadjov, M., Tesar, R.: What kind of problems do protein interactions raise for anaphora resolution? - A preliminary analysis. In: SMBM, Jena, Germany (2006)
Poesio, M., Kabadjov, M.A.: A general-purpose, off-the-shelf anaphora resolution module: implementation and preliminary evaluation. In: Proceedings of LREC, Lisbon, Portugal (2004)
Segura-Bedmar, I., Martínez, P., Segura-Bedmar, M.: Drug Name Recognition and classification in biomedical texts. Drug Discovery Today 13(17), 816–823 (2008)
Poprat, M., Hahn, U.: Quantitative Data on Referring Expressions in Biomedical Abstracts. In: Proceedings of the Workshop on BioNLP 2007, pp. 193–194 (2007)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Segura-Bedmar, I., Crespo, M., de Pablo-Sánchez, C. (2010). Score-Based Approach for Anaphora Resolution in Drug-Drug Interactions Documents. In: Horacek, H., Métais, E., Muñoz, R., Wolska, M. (eds) Natural Language Processing and Information Systems. NLDB 2009. Lecture Notes in Computer Science, vol 5723. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12550-8_8
DOI: https://doi.org/10.1007/978-3-642-12550-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12549-2
Online ISBN: 978-3-642-12550-8
eBook Packages: Computer Science, Computer Science (R0)