Hamiltonian Monte Carlo based on evidence framework for Bayesian learning to neural network

Methodologies and Application

Abstract

The multilayer perceptron is among the most widely used artificial neural networks for approximating nonlinear functions in various fields, but determining suitable weights and regularization parameters is a fundamental problem because they directly affect the network's convergence and generalization performance. The Bayesian approach to neural networks treats all network parameters as random variables; the posterior distribution is then computed from the prior over all parameters and the likelihood function via Bayes’ theorem. In this paper, we train the network weights by means of Hamiltonian Monte Carlo (HMC); for the hyperparameters, we propose to sample from the posterior distribution using HMC in order to approximate the derivative of the evidence, which allows the hyperparameters to be re-estimated. The case studies in this paper comprise a regression problem and a classification problem. The results illustrate the advantages of our approach in terms of accuracy compared with earlier Bayesian approaches for neural networks.
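To make the sampling step mentioned in the abstract concrete, the following is a minimal sketch of a generic HMC sampler with a leapfrog integrator. It is not the authors' implementation. As a simplifying assumption, a linear model stands in for the multilayer perceptron, and the hyperparameters `alpha` (prior precision) and `beta` (noise precision) are held fixed rather than re-estimated. The function names `leapfrog` and `hmc`, the synthetic data, and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np


def leapfrog(q, p, grad_U, step_size, n_steps):
    """Integrate Hamiltonian dynamics with the leapfrog scheme."""
    q = q.copy()
    p = p - 0.5 * step_size * grad_U(q)       # opening half step for momentum
    for i in range(n_steps):
        q = q + step_size * p                 # full step for position
        if i < n_steps - 1:
            p = p - step_size * grad_U(q)     # full step for momentum
    p = p - 0.5 * step_size * grad_U(q)       # closing half step for momentum
    return q, p


def hmc(U, grad_U, q0, n_samples, step_size, n_steps, seed=0):
    """Draw samples from exp(-U(q)) with unit-mass Gaussian momenta."""
    rng = np.random.default_rng(seed)
    q = np.asarray(q0, dtype=float)
    samples = np.empty((n_samples, q.size))
    n_accept = 0
    for t in range(n_samples):
        p0 = rng.standard_normal(q.size)      # resample the momentum
        q_new, p_new = leapfrog(q, p0, grad_U, step_size, n_steps)
        h_old = U(q) + 0.5 * (p0 @ p0)        # Hamiltonian before the move
        h_new = U(q_new) + 0.5 * (p_new @ p_new)
        if np.log(rng.uniform()) < h_old - h_new:   # Metropolis correction
            q = q_new
            n_accept += 1
        samples[t] = q
    return samples, n_accept / n_samples


# Synthetic regression data (assumption: a linear model replaces the MLP).
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((40, 2))
w_true = np.array([1.0, -2.0])
y = X @ w_true + 0.5 * data_rng.standard_normal(40)

alpha, beta = 1.0, 4.0   # fixed hyperparameters (assumed, not re-estimated)


def U(w):
    """Negative log posterior of the weights, up to an additive constant."""
    resid = y - X @ w
    return 0.5 * beta * (resid @ resid) + 0.5 * alpha * (w @ w)


def grad_U(w):
    """Gradient of U with respect to the weights."""
    return -beta * (X.T @ (y - X @ w)) + alpha * w


samples, accept_rate = hmc(U, grad_U, np.zeros(2), n_samples=2000,
                           step_size=0.05, n_steps=15)

# For this linear-Gaussian stand-in the posterior mean is known in closed
# form, which gives a reference value for the sampler's estimate.
w_post_mean = np.linalg.solve(alpha * np.eye(2) + beta * (X.T @ X),
                              beta * (X.T @ y))
print("HMC estimate:", samples[200:].mean(axis=0))
print("closed form: ", w_post_mean)
print("acceptance rate:", accept_rate)
```

For a real network, `U` and `grad_U` would be replaced by the network's negative log posterior and its gradient obtained by backpropagation. The hyperparameter re-estimation step described in the abstract is not shown here.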

Keywords

Multilayer perceptron, Bayesian learning, Hamiltonian Monte Carlo, Evidence framework, Hyperparameter

Notes

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Laboratory of Modelling and Scientific Computing, Department of Mathematics and Informatics, Faculty of Sciences and Techniques, University of Sidi Mohamed Ben Abdellah, Fez, Morocco
