Sampling Hidden Parameters from Oracle Distribution
A new sampling learning method for neural networks is proposed. An oracle probability distribution of hidden parameters is introduced, derived from an integral representation of neural networks. Because rigorous sampling from the oracle distribution is in general numerically difficult, a linear-time sampling algorithm is also developed. Numerical experiments showed that when hidden parameters were initialized from the oracle distribution, subsequent backpropagation converged faster and to better parameters than when parameters were initialized from a normal distribution.
Keywords: integral representation, neural networks, sampling learning, oracle distribution, backpropagation, weight initialization