Hamiltonian Monte Carlo based on the evidence framework for Bayesian learning of neural networks
The multilayer perceptron is a widely used artificial neural network for approximating nonlinear functions in many fields, but determining suitable weights and regularization parameters is a fundamental problem because of their direct impact on network convergence and generalization performance. The Bayesian approach to neural networks treats all network parameters as random variables; the posterior distribution is then computed from the prior over all parameters and the likelihood function via Bayes’ theorem. In this paper, we train the network weights by means of Hamiltonian Monte Carlo (HMC); for the hyperparameters, we propose to sample from the posterior distribution using HMC in order to approximate the derivative of the evidence, which allows the hyperparameters to be re-estimated. The case studies in this paper include a regression problem and a classification problem. The results illustrate the advantages of our approach in terms of accuracy compared with the classical Bayesian approach for neural networks.
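The weight-sampling step described above can be sketched in code. The following is a minimal, self-contained illustration (not the authors' implementation): HMC with a leapfrog integrator samples the weights of a tiny one-hidden-layer perceptron under a Gaussian prior with precision α and a Gaussian likelihood with noise precision β. The data set, network size, step size, and the use of finite-difference gradients are all assumptions made for the sketch; a practical implementation would compute gradients by backpropagation and re-estimate α and β rather than fixing them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D regression data for the demo
X = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * X) + 0.1 * rng.standard_normal(20)

H = 5                      # hidden units
alpha, beta = 0.1, 25.0    # prior / noise precisions, fixed for this sketch

def mlp(w, x):
    # Unpack flat weight vector: input weights, hidden biases, output weights, output bias
    w1, b1, w2, b2 = w[:H], w[H:2*H], w[2*H:3*H], w[3*H]
    h = np.tanh(np.outer(x, w1) + b1)
    return h @ w2 + b2

def neg_log_post(w):
    # -log posterior = beta/2 * sum-of-squares error + alpha/2 * ||w||^2 (+ const)
    err = y - mlp(w, X)
    return 0.5 * beta * err @ err + 0.5 * alpha * w @ w

def grad(w, eps=1e-5):
    # Central finite differences; a real implementation would use backprop
    g = np.zeros_like(w)
    for i in range(w.size):
        d = np.zeros_like(w)
        d[i] = eps
        g[i] = (neg_log_post(w + d) - neg_log_post(w - d)) / (2 * eps)
    return g

def hmc_step(w, step=0.01, n_leap=20):
    p = rng.standard_normal(w.size)               # sample auxiliary momentum
    H0 = neg_log_post(w) + 0.5 * p @ p            # initial Hamiltonian
    wn = w.copy()
    pn = p - 0.5 * step * grad(wn)                # leapfrog half-step for momentum
    for _ in range(n_leap - 1):
        wn = wn + step * pn
        pn = pn - step * grad(wn)
    wn = wn + step * pn
    pn = pn - 0.5 * step * grad(wn)               # final half-step
    H1 = neg_log_post(wn) + 0.5 * pn @ pn
    if rng.random() < np.exp(min(0.0, H0 - H1)):  # Metropolis accept/reject
        return wn
    return w

w = 0.1 * rng.standard_normal(3 * H + 1)
samples = []
for _ in range(200):
    w = hmc_step(w)
    samples.append(w.copy())

# Posterior-mean prediction from the second half of the chain
pred = np.mean([mlp(s, X) for s in samples[100:]], axis=0)
```

Averaging the network outputs over the retained samples approximates the posterior predictive mean; in the full evidence framework, the same chain of hyperparameter samples would also be used to approximate the derivative of the evidence and update α and β between sampling phases.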
Keywords: Multilayer perceptron · Bayesian learning · Hamiltonian Monte Carlo · Evidence framework · Hyperparameter
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.