Abstract
We introduce an approach to the automatic generation of biological pathway diagrams from scientific literature. The approach consists of two steps: automatically extracting single interaction relations, which are typically found in the full text (rather than the abstract) of a scientific publication, and subsequently integrating them into a complex pathway diagram. Our focus here is on relation extraction from full-text documents. To validate the extracted data, which serve as input for the pathway integration procedure, we compare the performance of automatic full-text extraction procedures with a manually generated gold standard.
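The two steps named in the abstract, validation against a gold standard and integration of single relations into a pathway, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each relation is a directed (regulator, target) pair, that a relation counts as correct only on an exact match with a gold-standard pair, and that precision, recall and F1 are the measures of interest. The gene names and the functions `evaluate` and `integrate` are hypothetical.

```python
from collections import defaultdict


def evaluate(extracted, gold):
    """Score extracted relations against a gold standard by exact match.

    Assumption: both inputs are collections of (regulator, target) pairs.
    Returns (precision, recall, f1).
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def integrate(relations):
    """Merge single pairwise relations into one directed graph.

    The graph is an adjacency mapping: regulator -> sorted list of targets.
    """
    graph = defaultdict(set)
    for regulator, target in relations:
        graph[regulator].add(target)
    return {node: sorted(targets) for node, targets in graph.items()}


# Hypothetical data, for illustration only.
gold = {("geneA", "geneB"), ("geneB", "geneC"), ("geneA", "geneD")}
extracted = {("geneA", "geneB"), ("geneB", "geneC"), ("geneC", "geneD")}

precision, recall, f1 = evaluate(extracted, gold)
pathway = integrate(extracted)
```

In this toy case two of the three extracted pairs appear in the gold standard, so precision and recall are both 2/3. A real evaluation would also have to settle how gene names are normalized before matching and whether the relation type is part of the match, neither of which the abstract specifies.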
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Buyko, E., Linde, J., Priebe, S., Hahn, U. (2011). Towards Automatic Pathway Generation from Biological Full-Text Publications. In: Gama, J., Bradley, E., Hollmén, J. (eds) Advances in Intelligent Data Analysis X. IDA 2011. Lecture Notes in Computer Science, vol 7014. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24800-9_9
DOI: https://doi.org/10.1007/978-3-642-24800-9_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24799-6
Online ISBN: 978-3-642-24800-9
eBook Packages: Computer Science (R0)