Development of a Machine Learning Framework for Biomedical Text Mining

Conference paper

DOI: 10.1007/978-3-319-40126-3_5

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 477)
Cite this paper as:
Rodrigues R., Costa H., Rocha M. (2016) Development of a Machine Learning Framework for Biomedical Text Mining. In: Saberi Mohamad M., Rocha M., Fdez-Riverola F., Domínguez Mayo F., De Paz J. (eds) 10th International Conference on Practical Applications of Computational Biology & Bioinformatics. Advances in Intelligent Systems and Computing, vol 477. Springer, Cham

Abstract

Biomedical text mining (BTM) aims to create methods for searching and structuring knowledge extracted from biomedical literature. Named entity recognition (NER), a BTM task, seeks to identify mentions of biological entities in texts. Dictionaries, regular expressions, natural language processing and machine learning (ML) algorithms are used in this task. In recent years, @Note2, an open-source software framework that includes user-friendly interfaces for important BTM tasks, has been developed, but it did not include ML-based methods. This work proposes BioTML, a framework that includes a number of ML-based approaches for NER, to fill the gap between @Note2 and state-of-the-art ML approaches. BioTML was integrated into @Note2 as a novel plug-in, in which Hidden Markov Models, Conditional Random Fields and Support Vector Machines were implemented to address NER tasks, working with a set of over 60 feature types used to train ML models. The implementation builds on open-source software such as MALLET, LibSVM, ClearNLP and OpenNLP. Several manually annotated corpora were used to validate BioTML. The results are promising, although there is room for improvement.

Keywords

Biomedical text mining; Named entity recognition; Machine learning

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Centre of Biological Engineering, University of Minho, Braga, Portugal
  2. Silicolife, Lda, Braga, Portugal
