Evolutionary Approach for Classifier Ensemble: An Application to Bio-molecular Event Extraction

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 182)

Abstract

The main goal of Biomedical Natural Language Processing (BioNLP) is to capture biomedical phenomena from textual data by extracting relevant entities, information and relations between biomedical entities (i.e. proteins and genes). Most previous work focused on extracting binary relations among proteins. In recent years, the focus has shifted towards extracting more complex relations in the form of bio-molecular events that may include several entities or other relations. In this paper we propose a classifier ensemble based on an evolutionary approach, namely differential evolution, that enables extraction, i.e. identification and classification, of relatively complex bio-molecular events. The ensemble is built on three base classifiers, namely Support Vector Machine, naïve Bayes and IBk. Based on these individual classifiers, we generate 15 models by considering various subsets of features. We identify and implement a rich set of statistical and linguistic features that represent various morphological, syntactic and contextual information of the candidate bio-molecular trigger words. Evaluation on the BioNLP 2009 shared task datasets shows overall recall, precision and F-measure values of 42.76%, 49.21% and 45.76%, respectively, for three-fold cross validation. This is better than the best-performing SVM-based individual classifier by 4.10 F-measure points.
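
The abstract does not say how differential evolution (DE) encodes the ensemble, so the following is only a minimal sketch under stated assumptions: each candidate solution is a vector of per-classifier vote weights in [0, 1], the combiner is weighted voting, the DE variant is the common rand/1/bin scheme, and the fitness is plain accuracy on synthetic toy labels. The function names, the three toy classifiers and all parameter values are hypothetical and are not taken from the paper, which combines 15 models and reports F-measure.

```python
import random

def weighted_vote(weights, predictions, n_classes):
    """Combine label predictions; each classifier adds its weight to its chosen class."""
    combined = []
    for i in range(len(predictions[0])):
        scores = [0.0] * n_classes
        for w, preds in zip(weights, predictions):
            scores[preds[i]] += w
        combined.append(scores.index(max(scores)))
    return combined

def accuracy(pred, gold):
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)

def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def differential_evolution(fitness, dim, pop_size=20, f=0.5, cr=0.9,
                           generations=50, seed=0):
    """Assumed DE/rand/1/bin maximiser over weight vectors in [0, 1]^dim."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(1.0, max(0.0, v))  # clip back into [0, 1]
                else:
                    v = pop[i][j]
                trial.append(v)
            s = fitness(trial)
            if s >= scores[i]:  # greedy selection: keep the trial if not worse
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=lambda k: scores[k])
    return pop[best], scores[best]

def noisy_copy(gold, hit_rate, n_classes, rng):
    """Toy stand-in for a classifier: correct with probability of about hit_rate."""
    return [g if rng.random() < hit_rate else rng.randrange(n_classes)
            for g in gold]

# Synthetic data only; a real setup would score weights on held-out folds.
data_rng = random.Random(42)
n_classes = 3
gold = [data_rng.randrange(n_classes) for _ in range(200)]
predictions = [noisy_copy(gold, rate, n_classes, data_rng)
               for rate in (0.70, 0.60, 0.55)]

best_weights, best_score = differential_evolution(
    lambda w: accuracy(weighted_vote(w, predictions, n_classes), gold),
    dim=len(predictions),
)
print("weights:", [round(w, 2) for w in best_weights])
print("ensemble accuracy on toy data:", round(best_score, 3))
```

As a side check, `f_measure(49.21, 42.76)` evaluates to about 45.76, so the reported F-measure agrees with the harmonic mean of the reported precision and recall. In a realistic setting the fitness would be computed on data not used to train the base classifiers; otherwise the learned weights would simply favour whichever model overfits most.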

Keywords

Support Vector Machine; Differential Evolution; Output Class; Shared Task; Event Extraction

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India