Supporting Biological Pathway Curation Through Text Mining

Ananiadou, Sophia; Thompson, Paul

doi:10.1007/978-3-319-57135-5_5

Conference paper
First Online: 23 April 2017

Part of the book series: Communications in Computer and Information Science (CCIS, volume 706)

Abstract

Text mining technology performs automated analysis of large document collections to detect information about their structure and meaning. This information can be used to develop systems that make it much easier than standard search mechanisms for researchers to locate information relevant to their needs in huge volumes of text. We focus on the challenging task of constructing biological pathway models, which typically involves gathering, interpreting and combining complex information from a large number of publications, and show how text mining applications can provide various levels of support to ease the burden placed on pathway curators. Such support ranges from applications that help in searching and exploring the literature for evidence relevant to pathway reactions to those that make automated suggestions about how to construct and update pathway models.

Acknowledgements

The work described in this article has been supported by the BBSRC-funded EMPATHY project (Grant No. BB/M006891/1) and by the DARPA-funded Big Mechanism project (Grant No. DARPA-BAA-14-14).

Author information

Authors and Affiliations

School of Computer Science, National Centre for Text Mining, University of Manchester, Manchester, UK
Sophia Ananiadou & Paul Thompson

Corresponding author

Correspondence to Sophia Ananiadou.

Editor information

Editors and Affiliations

Federal Research Center “Computer Science and Control” of RAS, Moscow, Russia
Leonid Kalinichenko
National Research University Higher School of Economics, Moscow, Russia
Sergei O. Kuznetsov
Aristotle University of Thessaloniki, Thessaloniki, Greece
Yannis Manolopoulos

About this paper

Cite this paper

Ananiadou, S., Thompson, P. (2017). Supporting Biological Pathway Curation Through Text Mining. In: Kalinichenko, L., Kuznetsov, S., Manolopoulos, Y. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2016. Communications in Computer and Information Science, vol 706. Springer, Cham. https://doi.org/10.1007/978-3-319-57135-5_5

DOI: https://doi.org/10.1007/978-3-319-57135-5_5
Published: 23 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57134-8
Online ISBN: 978-3-319-57135-5
eBook Packages: Computer Science, Computer Science (R0)
