The Dynamics of Negative Correlation Learning
In this paper we combine two points made in previous papers on negative correlation learning (NC) by different authors, both of which have theoretical implications for the optimal setting of λ, a parameter of the method whose correct choice is critical for stability and good performance. We derive an expression for the optimal value λ*, which depends only on the number of classifiers in the ensemble. This result arises from the form of the ambiguity decomposition of the ensemble error and the close links between that decomposition and the error function used in NC. By analyzing the dynamics of the outputs we find dramatically different behavior for λ < λ*, λ = λ* and λ > λ*, providing further motivation for our choice of λ and theoretical explanations for empirical observations reported in other papers on NC. We illustrate these results using well-known synthetic and medical datasets.
Keywords: negative correlation learning, dynamics, stability, ensemble methods, combination, classification, neural networks
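To make the role of λ concrete, the following is a minimal sketch of the standard NC penalty (in the Liu–Yao form commonly used in this literature) and its per-member gradient; the exact error function and the derived λ* are given in the paper itself, and the function and variable names here are our own.

```python
def nc_gradient(outputs, target, lam):
    """Gradient of the NC error of each ensemble member w.r.t. its output.

    Standard NC error for member i (Liu-Yao form):
        E_i = 0.5 * (f_i - d)**2 + lam * p_i,
        p_i = (f_i - fbar) * sum_{j != i} (f_j - fbar) = -(f_i - fbar)**2,
    where fbar is the ensemble mean. Treating fbar as a function of f_i
    (d fbar / d f_i = 1/M) gives
        dE_i/df_i = (f_i - d) - 2 * lam * (1 - 1/M) * (f_i - fbar).
    """
    M = len(outputs)
    fbar = sum(outputs) / M
    return [(f - target) - 2.0 * lam * (1.0 - 1.0 / M) * (f - fbar)
            for f in outputs]

# Example: three member outputs f_i, scalar target d.
outputs = [0.2, 0.5, 0.9]
target = 0.4
grads = nc_gradient(outputs, target, lam=0.5)
```

At λ = 0 each member trains independently on its own squared error; as λ grows, the (f_i − fbar) term increasingly pushes members apart, which is why, as the abstract notes, the choice of λ governs both diversity and stability.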