Treating Harmful Collinearity in Neural Network Ensembles

Sharkey, Amanda J. C.

doi:10.1007/978-1-4471-0793-4_5

Chapter

Part of the book series: Perspectives in Neural Computing (PERSPECT.NEURAL)

Summary

In the last decade, several techniques have been developed for combining neural networks [48, 49]. Combining a number of trained neural networks to form what is often referred to as a neural network ensemble may yield better model accuracy without requiring extensive effort in training the individual networks or optimising their architecture [21, 48]. However, because the corresponding outputs of the individual networks approximate the same physical quantity (or quantities), they may be highly positively correlated or collinear (linearly dependent). Thus, the estimation of the optimal weights for combining such networks may be subject to the harmful effects of collinearity, resulting in a neural network ensemble with inferior generalisation ability compared to the individual networks [20, 42, 48].
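The effect described above can be illustrated with a minimal sketch, which is not taken from the chapter. It assumes just two networks, a combination without a constant term, and weights estimated by ordinary least squares; the function names and the simulated "network outputs" (a sine target plus independent noise) are hypothetical stand-ins for real trained networks.

```python
import math
import random


def pearson(u, v):
    """Sample Pearson correlation of two equal-length, non-constant sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def combination_weights(y1, y2, target):
    """Least-squares weights (no constant term) for target ~ w1*y1 + w2*y2."""
    a11 = sum(a * a for a in y1)
    a12 = sum(a * b for a, b in zip(y1, y2))
    a22 = sum(b * b for b in y2)
    b1 = sum(a * t for a, t in zip(y1, target))
    b2 = sum(b * t for b, t in zip(y2, target))
    # Determinant of the 2x2 normal equations; it approaches zero as the
    # two output vectors become linearly dependent.
    det = a11 * a22 - a12 * a12
    w1 = (b1 * a22 - b2 * a12) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2


def simulated_outputs(seed, n=100, noise=0.05):
    """Hypothetical stand-in for two networks: one sine target plus independent noise."""
    rng = random.Random(seed)
    target = [math.sin(2 * math.pi * i / n) for i in range(n)]
    y1 = [t + rng.gauss(0.0, noise) for t in target]
    y2 = [t + rng.gauss(0.0, noise) for t in target]
    return y1, y2, target


if __name__ == "__main__":
    # Re-estimate the weights on three independent draws of the noise.
    for seed in (0, 1, 2):
        y1, y2, target = simulated_outputs(seed)
        w1, w2 = combination_weights(y1, y2, target)
        print(f"seed={seed} corr={pearson(y1, y2):.3f} w1={w1:.2f} w2={w2:.2f}")
```

Under these assumptions the two outputs are almost perfectly correlated, so the determinant is small relative to the diagonal terms of the normal equations, and one would expect the individual weights to shift noticeably from one draw to the next even though their sum stays close to one.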

In this chapter, we discuss the harmful effects of collinearity on the estimation of the optimal combination-weights for combining the networks. We describe an approach for treating collinearity by the proper selection of the component networks, and test two algorithms for selecting the component networks in order to improve the generalisation ability of the ensemble. We present experimental results to demonstrate the effectiveness of optimal linear combinations, guided by the selection algorithms, in improving model accuracy.
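The summary does not describe the two selection algorithms, so the following is only a hypothetical illustration of what selecting component networks to limit collinearity might look like, not the chapter's method. It assumes a greedy filter: networks are visited from lowest to highest mean squared error, and a network is kept only if its outputs are not too strongly correlated with those of any network already kept. The function names and the `max_corr` threshold are assumptions made for this sketch.

```python
import math


def mse(y, target):
    """Mean squared error of one network's outputs against the target."""
    return sum((a - t) ** 2 for a, t in zip(y, target)) / len(target)


def pearson(u, v):
    """Sample Pearson correlation; returns 0.0 if either sequence is constant."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    denom = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return cov / denom if denom > 0.0 else 0.0


def select_components(outputs, target, max_corr=0.99):
    """Return sorted indices of the networks kept by a greedy correlation filter.

    outputs is a list with one output sequence per network. A network is kept
    only if the absolute correlation of its outputs with every network already
    kept is at most max_corr (a hypothetical threshold).
    """
    order = sorted(range(len(outputs)), key=lambda i: mse(outputs[i], target))
    kept = []
    for i in order:
        if all(abs(pearson(outputs[i], outputs[j])) <= max_corr for j in kept):
            kept.append(i)
    return sorted(kept)


if __name__ == "__main__":
    target = [0.0, 1.0, 2.0, 3.0]
    outputs = [
        [0.1, 1.0, 2.1, 3.0],  # accurate network
        [0.2, 2.0, 4.2, 6.0],  # exact multiple of the first: perfectly collinear
        [0.0, 1.5, 1.5, 3.0],  # less accurate, but less correlated
    ]
    print(select_components(outputs, target))
```

The combination weights would then be estimated only for the networks that survive the filter. A lower `max_corr` discards more networks, trading potential accuracy gains from combining for more stable weight estimates.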

References

[1] E. Alpaydin. Multiple networks for function learning. In Proceedings of the 1993 IEEE International Conference on Neural Networks, volume I, pages 9–14. IEEE Press, Apr. 1993.
[2] R. Battiti and A. M. Colla. Democracy in neural nets: Voting schemes for classification. Neural Networks, 7 (4): 691–707, 1994.
[3] W. G. Baxt. Improving the accuracy of an artificial neural network using multiple differently trained networks. Neural Computation, 4: 772–780, 1992.
[4] D. A. Belsley. Assessing the presence of harmful collinearity and other forms of weak data through a test for signal-to-noise. Journal of Econometrics, 20: 211–253, 1982.
[5] D. A. Belsley. Conditioning Diagnostics: Collinearity and Weak Data in Regression. John Wiley & Sons, New York, 1991.
[6] D. A. Belsley, E. Kuh, and R. E. Welsch. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley & Sons, New York, 1980.
[7] J. A. Benediktsson, J. R. Sveinsson, O. K. Ersoy, and P. H. Swain. Parallel consensual neural networks. IEEE Transactions on Neural Networks, 8 (1): 54–64, 1997.
[8] L. Breiman. Stacked regressions. Technical Report 367, Department of Statistics, University of California, Berkeley, California 94720, USA, Aug. 1992. Revised June 1994.
[9] D. W. Bunn. Statistical efficiency in the linear combination of forecasts. International Journal of Forecasting, 1: 151–163, 1985.
[10] D. W. Bunn. Forecasting with more than one model. Journal of Forecasting, 8: 161–166, 1989.
[11] V. Cherkassky, D. Gehring, and F. Mulier. Pragmatic comparison of statistical and neural network methods for function estimation. In Proceedings of the 1995 World Congress on Neural Networks, volume II, pages 917–926, 1995.
[12] V. Cherkassky and H. Lari-Najafi. Constrained topological mapping for non-parametric regression analysis. Neural Networks, 4: 27–40, 1991.
[13] R. T. Clemen. Combining forecasts: A review and annotated bibliography. International Journal of Forecasting, 5: 559–583, 1989.
[14] R. T. Clemen and R. L. Winkler. Combining economic forecasts. Journal of Business & Economic Statistics, 4 (1): 39–46, Jan. 1986.
[15] L. Cooper. Hybrid neural network architectures: Equilibrium systems that pay attention. In R. J. Mammone and Y. Y. Zeevi, editors, Neural Networks: Theory and Applications, pages 81–96. Academic Press, 1991.
[16] C. W. J. Granger. Combining forecasts–twenty years later. Journal of Forecasting, 8: 167–173, 1989.
[17] J. B. Guerard Jr. and R. T. Clemen. Collinearity and the use of latent root regression for combining GNP forecasts. Journal of Forecasting, 8: 231–238, 1989.
[18] L. K. Hansen and P. Salamon. Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12 (10): 993–1001, 1990.
[19] S. Hashem. Optimal Linear Combinations of Neural Networks. PhD thesis, School of Industrial Engineering, Purdue University, Dec. 1993.
[20] S. Hashem. Effects of collinearity on combining neural networks. Connection Science, 8 (3 & 4): 315–336, 1996. Special issue on Combining Neural Networks: Ensemble Approaches.
[21] S. Hashem. Optimal linear combinations of neural networks. Neural Networks, 10 (4): 599–614, 1997.
[22] S. Hashem and B. Schmeiser. Approximating a function and its derivatives using MSE-optimal linear combinations of trained feedforward neural networks. In Proceedings of the 1993 World Congress on Neural Networks, volume I, pages 617–620, New Jersey, 1993. Lawrence Erlbaum Associates.
[23] S. Hashem and B. Schmeiser. Improving model accuracy using optimal linear combinations of trained neural networks. IEEE Transactions on Neural Networks, 6 (3): 792–794, 1995.
[24] S. Hashem, B. Schmeiser, and Y. Yih. Optimal linear combinations of neural networks: An overview. In Proceedings of the 1994 IEEE International Conference on Neural Networks, volume III, pages 1507–1512. IEEE Press, 1994.
[25] W. W. Hines and D. C. Montgomery. Probability and Statistics in Engineering and Management Science. John Wiley & Sons, 1990.
[26] J.-N. Hwang, S.-R. Lay, M. Maechler, R. D. Martin, and J. Schimert. Regression modeling in back-propagation and projection pursuit learning. IEEE Transactions on Neural Networks, 5 (3): 342–353, May 1994.
[27] R. A. Jacobs. Bias/variance analysis of mixtures-of-experts architectures. Neural Computation, 9: 369–383, 1997.
[28] R. A. Jacobs and M. Jordan. A competitive modular connectionist architecture. In R. Lippmann, J. Moody, and D. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 767–773. Morgan Kaufmann, 1991.
[29] R. A. Jacobs and M. Jordan. Hierarchical mixtures of experts and the EM algorithm. Neural Computation, 6: 181–214, 1994.
[30] A. Krogh and J. Vedelsby. Neural network ensembles, cross validation, and active learning. In G. Tesauro, D. Touretzky, and T. Leen, editors, Advances in Neural Information Processing Systems 7, pages 231–238. MIT Press, 1995.
[31] M. Maechler, D. Martin, J. Schimert, M. Csoppenszky, and J. Hwang. Projection pursuit learning networks for regression. In Proceedings of the 2nd International Conference on Tools for Artificial Intelligence, Washington D.C., pages 350–358. IEEE Press, November 1990.
[32] G. Mani. Lowering variance of decisions by using artificial neural networks portfolios. Neural Computation, 3: 484–486, 1991.
[33] L. Menezes and D. Bunn. Specification of predictive distribution from a combination of forecasts. Methods of Operations Research, 64: 397–405, 1991.
[34] H. Moskowitz and G. P. Wright. Statistics for Management and Economics. Charles Merrill Publishing Company, Ohio, 1985.
[35] J. Neter, W. Wasserman, and M. H. Kutner. Applied Linear Statistical Models. Irwin, Homewood, IL, 1990. 3rd Edition.
[36] L. Ohno-Machado and M. A. Musen. Hierarchical neural networks for partial diagnosis in medicine. In Proceedings of the 1994 World Congress on Neural Networks, volume I, pages 291–296. Lawrence Erlbaum Associates, 1994.
[37] D. W. Opitz and J. W. Shavlik. Actively searching for an effective neural network ensemble. Connection Science, 8 (3 & 4): 337–353, Dec. 1996. Special issue on Combining Neural Networks: Ensemble Approaches.
[38] B. Parmanto, P. W. Munro, and H. R. Doyle. Reducing variance of committee prediction with resampling techniques. Connection Science, 8 (3 & 4): 405–425, 1996. Special issue on Combining Neural Networks: Ensemble Approaches.
[39] B. Parmanto, P. W. Munro, H. R. Doyle, C. Doria, L. Aldrighetti, I. R. Marino, S. Mitchel, and J. J. Fung. Neural network classifier for hepatoma detection. In Proceedings of the 1994 World Congress on Neural Networks, volume I, pages 285–290, New Jersey, 1994. Lawrence Erlbaum Associates.
[40] B. A. Pearlmutter and R. Rosenfeld. Chaitin-Kolmogorov complexity and generalization in neural networks. In Advances in Neural Information Processing Systems 3, pages 925–931, 1991.
[41] M. P. Perrone. Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization. PhD thesis, Department of Physics, Brown University, May 1993.
[42] M. P. Perrone and L. N. Cooper. When networks disagree: Ensemble methods for hybrid neural networks. In R. J. Mammone, editor, Neural Networks for Speech and Image Processing. Chapman & Hall, 1993.
[43] Y. Raviv and N. Intrator. Bootstrapping with noise: An effective regularization technique. Connection Science, 8 (3 & 4): 355–372, 1996. Special issue on Combining Neural Networks: Ensemble Approaches.
[44] G. Rogova. Combining the results of several neural network classifiers. Neural Networks, 7 (5): 777–781, 1994.
[45] B. E. Rosen. Ensemble learning using decorrelated neural networks. Connection Science, 8 (3 & 4): 373–383, 1996. Special issue on Combining Neural Networks: Ensemble Approaches.
[46] R. L. Scheaffer and J. T. McClave. Probability and Statistics for Engineers. PWS-KENT Publishing Company, Boston, 1990.
[47] D. C. Schmittlein, J. Kim, and D. G. Morrison. Combining forecasts: Operational adjustments to theoretically optimal rules. Management Science, 36 (9): 1044–1056, Sept. 1990.
[48] A. J. Sharkey. On combining artificial neural nets. Connection Science, 8 (3 & 4): 299–313, 1996. Special issue on Combining Neural Networks: Ensemble Approaches.
[49] A. J. Sharkey. Modularity, combining and artificial neural nets. Connection Science, 9 (1): 3–10, 1997. Special issue on Combining Neural Networks: Modular Approaches.
[50] I. M. Sobol’. The Monte Carlo method. University of Chicago Press, 1974. Translated and adapted from the 2nd Russian edition by R. Messer, J. Stone, and P. Fortini.
[51] K. Turner and J. Ghosh. Error correction and error reduction in ensemble classifiers. Connection Science, 8 (3 & 4): 385–404, 1996. Special issue on Combining Neural Networks: Ensemble Approaches.
[52] C. T. West. System-based weights versus series-specific weights in the combination of forecasts. Journal of Forecasting, 15: 369–383, 1996.
[53] R. L. Winkler and R. T. Clemen. Sensitivity of weights in combining forecasts. Operations Research, 40 (3): 609–614, May-June 1992.
[54] D. H. Wolpert. Stacked generalization. Neural Networks, 5: 241–259, 1992.

Author information

Authors and Affiliations

Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield, S1 4DP, UK
Amanda J. C. Sharkey

Editor information

Editors and Affiliations

Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield, S1 4DP, UK
Amanda J. C. Sharkey

Cite this chapter

Sharkey, A.J.C. (1999). Treating Harmful Collinearity in Neural Network Ensembles. In: Sharkey, A.J.C. (eds) Combining Artificial Neural Nets. Perspectives in Neural Computing. Springer, London. https://doi.org/10.1007/978-1-4471-0793-4_5

DOI: https://doi.org/10.1007/978-1-4471-0793-4_5
Publisher Name: Springer, London
Print ISBN: 978-1-85233-004-0
Online ISBN: 978-1-4471-0793-4
eBook Packages: Springer Book Archive
