Text Mining for Systems Modeling

Kowald, Axel; Schmeier, Sebastian

doi:10.1007/978-1-60761-987-1_19

Part of the book series: Methods in Molecular Biology ((MIMB,volume 696))

Abstract

The yearly output of scientific papers keeps rising, often making it impossible for an individual researcher to keep up. Text mining of scientific publications is therefore an attractive way to automate the retrieval of knowledge and data from the literature. In this chapter, we discuss the specific tasks that text mining requires, together with their problems and limitations. The second half of the chapter illustrates these aspects with a practical example: publications are transformed into a vector space representation, and support vector machines then classify the papers according to the kinetic parameters they contain, which are required for model building in systems biology.
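
The two steps named in the abstract, a vector space representation followed by support vector machine classification, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the chapter's implementation: the toy sentences, the stop-word list, the TF-IDF weighting, the bias-free linear SVM trained by subgradient descent, and all function names are hypothetical choices made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; the chapter's real training data are not shown here.
DOCS = [
    "The Km of hexokinase for glucose was 0.1 mM and Vmax was 12 U/mg",
    "Kinetic measurements gave a kcat of 50 per second and a Km of 2 mM",
    "The gene is expressed in liver tissue of adult mice",
    "Phylogenetic comparison of the protein family across bacterial species",
]
LABELS = [1, 1, -1, -1]  # +1: reports kinetic parameters, -1: does not

# Assumed stop-word list, chosen only for this example.
STOP_WORDS = {"the", "of", "and", "a", "in", "was", "for", "is", "were"}


def tokenize(text):
    """Lowercase, keep alphabetic runs of length >= 2, drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if len(w) > 1 and w not in STOP_WORDS]


def fit_idf(docs):
    """Return {term: idf} with idf = ln(N / df) + 1 over the training docs."""
    df = Counter(t for d in docs for t in set(tokenize(d)))
    return {t: math.log(len(docs) / n) + 1.0 for t, n in df.items()}


def vectorize(text, idf):
    """Map text to a unit-length sparse TF-IDF vector {term: weight}."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on lam/2*|w|^2 + mean hinge loss (no bias)."""
    w = {}
    n = len(X)
    for _ in range(epochs):
        # Margin violators are found with the weights from the previous epoch.
        violators = [(x, t) for x, t in zip(X, y) if t * dot(w, x) < 1.0]
        # Shrink all weights (regularisation), then add the hinge subgradient.
        w = {term: (1.0 - lr * lam) * v for term, v in w.items()}
        for x, t in violators:
            for term, v in x.items():
                w[term] = w.get(term, 0.0) + lr * t * v / n
    return w


def predict(w, x):
    """Return +1 for a positive score, otherwise -1 (also for unknown text)."""
    return 1 if dot(w, x) > 0.0 else -1


if __name__ == "__main__":
    idf = fit_idf(DOCS)
    weights = train_linear_svm([vectorize(d, idf) for d in DOCS], LABELS)
    query = "Vmax and Km were determined for the purified enzyme"
    print(predict(weights, vectorize(query, idf)))
```

Each document becomes a point in a space with one axis per term, so a linear classifier reduces to learning one weight per term. A real application would need a much larger labelled corpus, a bias term or kernel where appropriate, and held-out evaluation.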

Author information

Authors and Affiliations

Protagen AG, Dortmund, Germany
Axel Kowald
South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
Sebastian Schmeier

Editor information

Editors and Affiliations

Lead Discovery Center GmbH, Emil-Figge-Straße 76a, Dortmund, 44227, Germany
Michael Hamacher
Medizinisches Proteom-Center, Ruhr-Universität Bochum, Universitätsstraße 150, Bochum, 44801, Germany
Martin Eisenacher
Medizinisches Proteom-Center, Ruhr-Universität Bochum, Universitätsstraße 150, Bochum, 44801, Germany
Christian Stephan

About this protocol

Cite this protocol

Kowald, A., Schmeier, S. (2011). Text Mining for Systems Modeling. In: Hamacher, M., Eisenacher, M., Stephan, C. (eds) Data Mining in Proteomics. Methods in Molecular Biology, vol 696. Humana Press. https://doi.org/10.1007/978-1-60761-987-1_19

DOI: https://doi.org/10.1007/978-1-60761-987-1_19
Published: 13 October 2010
Publisher Name: Humana Press
Print ISBN: 978-1-60761-986-4
Online ISBN: 978-1-60761-987-1
eBook Packages: Springer Protocols
