Computational Properties of Cyclic and Almost-Cyclic Learning with Momentum for Feedforward Neural Networks
Two backpropagation algorithms with momentum for feedforward neural networks with a single hidden layer are considered. The training samples are assumed to be supplied to the network in either a cyclic or an almost-cyclic fashion during learning; the two variants are referred to as CMFNN and ACMFNN, respectively. A restart strategy for the momentum is adopted in which the momentum coefficient is set to zero at the beginning of each training cycle. Corresponding weak and strong convergence results are presented for the two algorithms, respectively. The convergence conditions on the learning rate, the momentum coefficient, and the activation functions are much weaker than those required by existing results. Numerical examples support the theoretical results and demonstrate that ACMFNN clearly outperforms CMFNN in both convergence speed and generalization ability.
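To make the training procedure concrete, the following Python sketch (our illustration; the paper gives no code, and the one-dimensional model, the per-sample gradient `grad`, and all hyperparameter values are placeholder assumptions) implements incremental gradient descent with momentum, contrasts the cyclic and almost-cyclic presentation orders, and restarts the momentum term at the beginning of each training cycle:

```python
import numpy as np

def train_with_momentum(w, samples, grad, eta=0.02, alpha=0.8,
                        cycles=200, almost_cyclic=True, seed=0):
    """Incremental gradient descent with momentum over repeated passes
    (training cycles) through the samples.

    almost_cyclic=True : the presentation order is reshuffled before
                         every cycle (the almost-cyclic variant, ACMFNN);
    almost_cyclic=False: one fixed order is reused every cycle (the
                         cyclic variant, CMFNN).
    The momentum term is restarted (zeroed) at the start of each cycle,
    so the first update of every cycle is a plain gradient step.
    """
    rng = np.random.default_rng(seed)
    n = len(samples)
    for _ in range(cycles):
        order = rng.permutation(n) if almost_cyclic else np.arange(n)
        delta = np.zeros_like(w)          # momentum restart at cycle start
        for j in order:
            # new increment = -eta * grad_j(w) + alpha * (previous increment)
            delta = -eta * grad(w, samples[j]) + alpha * delta
            w = w + delta
    return w

# Toy usage (assumed example): minimise the averaged per-sample error
# (w - x_j)^2 / 2, whose minimiser is the sample mean.
samples = np.array([1.0, 2.0, 3.0])
grad = lambda w, x: w - x                 # per-sample error gradient
print(train_with_momentum(np.array(0.0), samples, grad))
# prints a value near the sample mean 2.0 (the bias shrinks as eta -> 0)
```

Zeroing the accumulated increment at the start of a cycle makes the momentum term vanish in that cycle's first update, which corresponds to the restart strategy described in the abstract.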
Keywords: Backpropagation, momentum, cyclic, almost-cyclic, convergence