Text Mining for Systems Modeling
The number of scientific papers published each year keeps rising, which often makes it impossible for an individual researcher to keep up. Text mining of scientific publications is therefore an attractive way to automate the retrieval of knowledge and data from the literature. In this chapter, we discuss the specific tasks that text mining requires, together with their problems and limitations. The second half of the chapter demonstrates these aspects of text mining with a practical example: publications are transformed into a vector space representation, and support vector machines are then used to classify papers according to their kinetic parameter content, which is required for model building in systems biology.
Keywords: Support Vector Machine, Text Mining, Receiver Operator Curve, Biological Entity, Vector Space Model
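The pipeline named in the abstract (a vector space representation followed by support vector machine classification) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: the toy corpus, the stop-word list, the tf-idf weighting with idf = log(N/df), and the choice of a linear SVM without a bias term, trained by plain subgradient descent on the regularised hinge loss. The chapter itself may use a different weighting scheme, kernel, or solver.

```python
import math
from collections import Counter

# Hypothetical stop-word list and toy corpus, invented for illustration only.
# Label +1 = mentions kinetic parameters, -1 = does not.
STOP_WORDS = {"the", "and", "was", "were", "of", "in", "for", "by"}

TRAIN = [
    ("km and kcat values were measured for the enzyme", 1),
    ("the michaelis constant km of the kinase was determined", 1),
    ("the gene was cloned and expressed in yeast", -1),
    ("protein localization in yeast was observed by microscopy", -1),
]


def tokenize(text):
    """Lower-case, split on whitespace, and drop stop words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def fit_idf(texts):
    """Compute idf = log(N / df) for every term seen in the training texts."""
    n_docs = len(texts)
    df = Counter(term for text in texts for term in set(tokenize(text)))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def vectorize(text, idf):
    """Map a text to a sparse, L2-normalised tf-idf vector (term -> weight)."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else vec


def dot(w, x):
    """Sparse dot product between weight dict w and document vector x."""
    return sum(w.get(t, 0.0) * v for t, v in x.items())


def train_linear_svm(samples, lr=0.1, lam=0.01, epochs=50):
    """Subgradient descent on the L2-regularised hinge loss (no bias term)."""
    w = {}
    for _ in range(epochs):
        for x, y in samples:
            margin = y * dot(w, x)
            # Regularisation shrinks every weight slightly on each step.
            for t in w:
                w[t] *= 1.0 - lr * lam
            # Samples inside the margin pull w towards y * x.
            if margin < 1.0:
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
    return w


def predict(w, x):
    """Return +1 if the document falls on the positive side, else -1."""
    return 1 if dot(w, x) > 0 else -1


idf = fit_idf([text for text, _ in TRAIN])
samples = [(vectorize(text, idf), label) for text, label in TRAIN]
weights = train_linear_svm(samples)
train_predictions = [predict(weights, x) for x, _ in samples]
```

A new abstract would be classified with `predict(weights, vectorize(new_text, idf))`. Terms that never occurred in the training texts are ignored by `vectorize`, so a document made up entirely of unseen words scores zero and falls into the negative class in this sketch.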