Abstract
In this paper, we propose a two-stage technique for named entity recognition (NER) based on multiobjective simulated annealing (MOSA). In the first stage, MOSA is used for feature selection under two statistical classifiers, viz. conditional random field (CRF) and support vector machine (SVM). Each solution on the final Pareto optimal front provides a different classifier. In the second stage, these classifiers are combined using a new MOSA-based classifier ensemble technique. Several different versions of the objective functions are exploited. We hypothesize that the reliability of each classifier’s predictions differs among the output classes. Thus, in an ensemble system, it is necessary to find the appropriate vote weight for each output class in each classifier. We propose a MOSA-based technique to determine these vote weights automatically. The proposed two-stage technique is evaluated for NER in Bengali, a resource-poor language, as well as in English. Evaluation yields the highest recall, precision and F-measure values of 93.95, 95.15 and 94.55 %, respectively, for Bengali and 89.01, 89.35 and 89.18 %, respectively, for English. Experiments also suggest that the classifier ensemble identified by the proposed multiobjective optimization (MOO)-based approach, optimizing the F-measure values of named entity (NE) boundary detection, outperforms all the individual classifiers and four conventional baseline models.
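The idea of class-specific vote weights can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_vote`, the tag set, and the weight values are hypothetical, and the weights are simply given here rather than found by a MOSA search. Each classifier votes for the class it predicts, and that vote counts with the weight assigned to that classifier for that class.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine the labels predicted for one token by several classifiers.

    predictions: list of class labels, one per classifier.
    weights: list of dicts; weights[i][label] is the vote weight of
             classifier i when it predicts `label` (hypothetical values,
             assumed to come from some earlier optimization step).
    """
    scores = defaultdict(float)
    for clf_index, label in enumerate(predictions):
        # A classifier's vote counts only for the class it predicted,
        # scaled by its weight for that class.
        scores[label] += weights[clf_index].get(label, 0.0)
    # Iterate labels in sorted order so ties are broken deterministically.
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical per-class weights for three classifiers.
weights = [
    {"PER": 0.9, "LOC": 0.4, "O": 0.6},
    {"PER": 0.3, "LOC": 0.8, "O": 0.7},
    {"PER": 0.5, "LOC": 0.5, "O": 0.5},
]

agree_on_loc = weighted_vote(["PER", "LOC", "LOC"], weights)
all_disagree = weighted_vote(["PER", "LOC", "O"], weights)
```

In the second call every classifier predicts a different class, so unweighted majority voting would be tied; the class-specific weights break the tie in favour of the classifier assumed to be most reliable for the class it predicted.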
Notes
We use ‘ensemble classifier’ and ‘classifier ensemble’ interchangeably.
Additional information
Asif Ekbal and Sriparna Saha contributed equally to this work.
About this article
Cite this article
Ekbal, A., Saha, S. Combining feature selection and classifier ensemble using a multiobjective simulated annealing approach: application to named entity recognition. Soft Comput 17, 1–16 (2013). https://doi.org/10.1007/s00500-012-0885-6
DOI: https://doi.org/10.1007/s00500-012-0885-6