Linear unlearning for cross-validation
The leave-one-out cross-validation scheme for generalization assessment of neural network models is computationally expensive because it requires replicated training sessions. In this paper we suggest linear unlearning of examples as an approach to approximate cross-validation. Further, we discuss the possibility of exploiting the ensemble of networks offered by leave-one-out for performing ensemble predictions. We show that the generalization performance of the equally weighted ensemble predictor is identical to that of the network trained on the whole training set.
Numerical experiments on the sunspot time series prediction benchmark demonstrate the potential of the linear unlearning technique.
Keywords: Neural Network, Time Series, Numerical Experiment, Network Model, Training Session
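To illustrate the idea behind linear unlearning, the following is a minimal sketch (not the paper's implementation) on a linear-in-parameters least-squares model, where a single Newton-style correction of the full-sample weights reproduces the leave-one-out solution exactly; for a trained neural network the analogous update would be applied to the linearized model, using the Hessian and gradients evaluated at the full-sample weight estimate. All variable names and the synthetic data are illustrative assumptions.

```python
# Sketch of linear unlearning for leave-one-out cross-validation (LOO-CV),
# shown on a linear least-squares model where the one-step correction is exact.
import numpy as np

rng = np.random.default_rng(0)
N, p = 50, 3
X = rng.normal(size=(N, p))                        # inputs (design matrix)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=N)

# Full-sample solution; A = X^T X plays the role of the Hessian.
A = X.T @ X
A_inv = np.linalg.inv(A)
w_full = A_inv @ (X.T @ y)

# Linear unlearning of example i without retraining:
#   w_{-i} = w_full - A^{-1} x_i e_i / (1 - h_ii),
# which gives the LOO residual e_{-i} = e_i / (1 - h_ii).
resid = y - X @ w_full                             # full-sample residuals e_i
h = np.einsum('ij,jk,ik->i', X, A_inv, X)          # leverages h_ii
loo_resid = resid / (1.0 - h)                      # LOO prediction errors
cv_estimate = np.mean(loo_resid ** 2)              # LOO generalization estimate

# Brute-force check against explicit retraining with each example left out.
brute = []
for i in range(N):
    mask = np.arange(N) != i
    w_i = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
    brute.append(y[i] - X[i] @ w_i)
assert np.allclose(loo_resid, np.array(brute))
print("LOO estimate of generalization error:", cv_estimate)
```

In the linear case the unlearning step is exact; for a nonlinear network it is only a first-order approximation around the trained weights, which is what makes the approximate cross-validation cheap compared with N retraining sessions.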