Abstract
Combining accurate neural networks (NNs) whose errors are negatively correlated in an ensemble greatly improves generalization. Mixture of experts (ME) is a popular combining method that employs a specialized error function to train NN experts simultaneously so that they become negatively correlated. However, unlike the negative correlation learning (NCL) method, ME provides no explicit control parameter for adjusting the degree of this negative correlation. In this study, an approach is proposed to bring this advantage of NCL into the training algorithm of ME, yielding the mixture of negatively correlated experts (MNCE). In the proposed method, a control parameter analogous to that of NCL is incorporated into the error function of ME, enabling the training algorithm to strike a better balance in the bias-variance-covariance trade-off and thereby improve generalization. The proposed hybrid ensemble method, MNCE, is compared with its constituent methods, ME and NCL, on several benchmark problems. The experimental results show that the proposed method significantly outperforms the original ensemble methods.
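To make the role of the control parameter concrete, the sketch below implements the per-expert NCL error function of Liu and Yao (1999), whose penalty coefficient lambda is the quantity the abstract proposes to carry over into the ME error function. This is a minimal illustration of standard NCL, not of the authors' MNCE formulation; the function name and NumPy realization are assumptions for illustration.

```python
import numpy as np

def ncl_errors(expert_outputs, target, lam):
    """Per-expert NCL error: squared error plus a lambda-weighted
    negative-correlation penalty (Liu & Yao, 1999).

    expert_outputs: outputs of the M ensemble members for one sample.
    lam: control parameter lambda; 0 recovers independent MSE training,
         larger values push experts toward negatively correlated errors.
    """
    f = np.asarray(expert_outputs, dtype=float)
    f_bar = f.mean()  # simple-average ensemble output
    # Penalty p_i = (f_i - f_bar) * sum_{j != i} (f_j - f_bar),
    # which simplifies to -(f_i - f_bar)^2 since deviations sum to zero.
    p = -(f - f_bar) ** 2
    return 0.5 * (f - target) ** 2 + lam * p
```

At lam = 0 each expert minimizes its own squared error independently; increasing lam trades individual accuracy for ensemble diversity, which is exactly the bias-variance-covariance balance the abstract refers to. The paper's contribution is to embed an analogous tunable penalty inside ME's error function, where the gating network's soft partitioning is trained jointly with the correlation penalty.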
Cite this article
Masoudnia, S., Ebrahimpour, R. & Arani, S.A.A.A. Incorporation of a Regularization Term to Control Negative Correlation in Mixture of Experts. Neural Process Lett 36, 31–47 (2012). https://doi.org/10.1007/s11063-012-9221-5