Abstract
The study of biological networks is playing an increasingly important role in the life sciences. Many different kinds of biological system can be modelled as networks; perhaps the most important examples are protein–protein interaction (PPI) networks, metabolic pathways, gene regulatory networks, and signalling networks. Although much useful information is easily accessible in public databases, a great deal of additional relevant data lies scattered across numerous published papers. Hence there is a pressing need for automated text-mining methods capable of extracting such information from full-text articles. Here we present practical guidelines for constructing a text-mining pipeline, built from existing code and software components, that is capable of extracting PPI networks from full-text articles. This approach can be adapted to tackle other types of biological network.
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Czarnecki, J., Shepherd, A.J. (2014). Mining Biological Networks from Full-Text Articles. In: Kumar, V., Tipney, H. (eds) Biomedical Literature Mining. Methods in Molecular Biology, vol 1159. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0709-0_8
DOI: https://doi.org/10.1007/978-1-4939-0709-0_8
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-0708-3
Online ISBN: 978-1-4939-0709-0
eBook Packages: Springer Protocols