Transients and Asymptotics of Natural Gradient Learning
Conference paper
Abstract
We analyse natural gradient learning in a two-layer feed-forward neural network using a statistical mechanics framework which is appropriate for large input dimension. We find significant improvement over standard gradient descent in both the transient and asymptotic phases of learning.
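As a rough illustration of the update being analysed, natural gradient learning replaces the plain step w ← w − η∇ε with w ← w − η F⁻¹∇ε, where F is the Fisher information matrix. The sketch below is a toy only: it assumes a two-parameter linear model with unit-variance Gaussian output noise (so F is the mean of x xᵀ) and full-batch updates, not the two-layer network or the on-line setting studied in the paper. All function names and data values are hypothetical.

```python
# Toy illustration only: a 2-parameter linear model with unit-variance Gaussian
# output noise, NOT the two-layer network analysed in the paper.

def fisher_matrix(xs):
    """Empirical Fisher information for y = w.x + unit Gaussian noise: mean of x x^T."""
    n = len(xs)
    f = [[0.0, 0.0], [0.0, 0.0]]
    for x in xs:
        for i in range(2):
            for j in range(2):
                f[i][j] += x[i] * x[j] / n
    return f

def gradient(w, xs, ys):
    """Gradient of the error 0.5 * mean((w.x - y)^2) with respect to w."""
    n = len(xs)
    g = [0.0, 0.0]
    for x, y in zip(xs, ys):
        err = w[0] * x[0] + w[1] * x[1] - y
        g[0] += err * x[0] / n
        g[1] += err * x[1] / n
    return g

def solve_2x2(f, g):
    """Return F^{-1} g for a 2x2 matrix using the closed-form inverse."""
    det = f[0][0] * f[1][1] - f[0][1] * f[1][0]
    return [
        (f[1][1] * g[0] - f[0][1] * g[1]) / det,
        (f[0][0] * g[1] - f[1][0] * g[0]) / det,
    ]

def gd_step(w, xs, ys, eta):
    """One standard gradient descent step."""
    g = gradient(w, xs, ys)
    return [w[0] - eta * g[0], w[1] - eta * g[1]]

def natural_gd_step(w, xs, ys, eta):
    """One natural gradient step: the gradient is premultiplied by F^{-1}."""
    nat = solve_2x2(fisher_matrix(xs), gradient(w, xs, ys))
    return [w[0] - eta * nat[0], w[1] - eta * nat[1]]

# Hypothetical, deliberately badly scaled inputs (second component is small).
xs = [(1.0, 0.0), (0.0, 0.1), (1.0, 0.1), (2.0, -0.1)]
true_w = (3.0, -2.0)
ys = [true_w[0] * x[0] + true_w[1] * x[1] for x in xs]

w_gd = gd_step([0.0, 0.0], xs, ys, eta=0.5)
w_ng = natural_gd_step([0.0, 0.0], xs, ys, eta=1.0)
```

In this noiseless quadratic toy, a single natural gradient step with η = 1 lands on the generating weights, while the plain gradient step barely moves the poorly scaled second weight. That behaviour is specific to the linear example and should not be read as a reproduction of the paper's results.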
Keywords
Learning Rate, Gradient Descent, Fisher Information, Fisher Information Matrix, Generalization Error
Copyright information
© Springer-Verlag London 1998