
Incorporation of a Regularization Term to Control Negative Correlation in Mixture of Experts


Abstract

Combining accurate neural networks (NN) whose errors are negatively correlated in an ensemble greatly improves generalization ability. Mixture of experts (ME) is a popular combining method that employs a special error function to train NN experts simultaneously and produce negatively correlated experts. Although ME can produce negatively correlated experts, unlike the negative correlation learning (NCL) method it provides no control parameter to adjust the degree of negative correlation explicitly. In this study, an approach is proposed that introduces this advantage of NCL into the training algorithm of ME, yielding the mixture of negatively correlated experts (MNCE). In the proposed method, the control parameter of NCL is incorporated into the error function of ME, which enables the training algorithm to strike a better balance in the bias-variance-covariance trade-off and thus improves generalization ability. The proposed hybrid ensemble method, MNCE, is compared with its constituent methods, ME and NCL, on several benchmark problems. The experimental results show that the proposed ensemble method significantly outperforms the original ensemble methods.
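The abstract's central idea, augmenting ME's error function with an NCL-style penalty governed by an explicit control parameter, can be sketched in a few lines. The following Python fragment is a rough illustration under our own assumptions, not the paper's exact formulation: the function name, the scalar-output setting, and the use of raw gating weights in place of ME's posterior responsibilities are all simplifications introduced here for brevity.

```python
import numpy as np

def mnce_error_signals(expert_outputs, gates, target, lam):
    """Per-expert error signals for one scalar training example.

    A minimal sketch, not the paper's exact update rules: an ME-style
    term pulls each expert toward the target in proportion to its gate,
    while an NCL-style penalty, weighted by the explicit control
    parameter lam, pushes each expert away from the gated ensemble
    output so that the experts' errors become negatively correlated.
    """
    expert_outputs = np.asarray(expert_outputs, dtype=float)
    gates = np.asarray(gates, dtype=float)       # assumed to sum to 1
    f_bar = gates @ expert_outputs               # gated ensemble output
    me_term = gates * (expert_outputs - target)  # ME fitting pressure
    ncl_term = -lam * (expert_outputs - f_bar)   # decorrelation pressure
    return me_term + ncl_term                    # lam = 0 recovers plain ME

# Example: three experts, one dominant gate, moderate decorrelation strength.
signals = mnce_error_signals([0.9, 1.4, 0.7], [0.6, 0.3, 0.1],
                             target=1.0, lam=0.5)
print(signals)
```

In actual training these signals would be backpropagated through each expert network, and in the ME framework the gates themselves come from a trainable gating network; setting lam to zero removes the penalty and recovers ordinary gated training, which is what makes the correlation explicitly controllable.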




Author information

Correspondence to Reza Ebrahimpour.



Cite this article

Masoudnia, S., Ebrahimpour, R. & Arani, S.A.A.A. Incorporation of a Regularization Term to Control Negative Correlation in Mixture of Experts. Neural Process Lett 36, 31–47 (2012). https://doi.org/10.1007/s11063-012-9221-5
