Solving Bayesian inverse problems from the perspective of deep generative networks

  • Original Paper
  • Published in Computational Mechanics

Abstract

Deep generative networks have achieved great success in high-dimensional density approximation, especially in applications to natural images and language. In this paper, we investigate their capability to capture the posterior distribution of Bayesian inverse problems by learning a transport map. Because only the unnormalized density of the posterior is available, training methods that learn from posterior samples, such as variational autoencoders and generative adversarial networks, are not applicable in our setting. We propose a class of network training methods that can be combined with sample-based Bayesian inference algorithms, such as various MCMC algorithms, the ensemble Kalman filter, and Stein variational gradient descent. Our experimental results show the pros and cons of deep generative networks for Bayesian inverse problems, and they reveal the potential of the proposed methodology for capturing high-dimensional probability distributions.
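The following is a minimal sketch of the kind of pipeline the abstract describes, written here for illustration and not taken from the paper: Stein variational gradient descent (SVGD) produces particles from a posterior known only up to a normalizing constant, and a small network is then trained, via a kernel maximum mean discrepancy (MMD) loss, to transport Gaussian noise to those particles. The toy density, network architecture, kernel bandwidths, and step sizes below are all assumptions, not the authors' choices.

```python
import torch

torch.manual_seed(0)

def log_post(x):
    # Unnormalized 2-D toy posterior (banana-shaped); only its gradient is
    # used, mirroring the setting where the posterior density is known
    # only up to a normalizing constant. (Illustrative assumption.)
    return -0.5 * (x[:, 0] ** 2 + 4.0 * (x[:, 1] - x[:, 0] ** 2) ** 2)

def rbf(x, y, h):
    # Gaussian RBF kernel matrix; squared distances are computed directly
    # so the gradient is well defined even at coincident points.
    d2 = ((x.unsqueeze(1) - y.unsqueeze(0)) ** 2).sum(-1)
    return torch.exp(-d2 / (2.0 * h ** 2))

# Step 1: SVGD moves a set of particles toward the unnormalized posterior.
x = torch.randn(300, 2, requires_grad=True)
for _ in range(500):
    score = torch.autograd.grad(log_post(x).sum(), x)[0]  # grad log p(x)
    with torch.no_grad():
        h = 0.5                      # fixed kernel bandwidth (assumption)
        k = rbf(x, x, h)
        # Kernel-smoothed score plus the repulsive term that spreads
        # particles apart (gradient of the RBF kernel in closed form).
        repulsion = (k.sum(1, keepdim=True) * x - k @ x) / h ** 2
        x += 0.05 * (k @ score + repulsion) / x.shape[0]

# Step 2: fit a generative network (a transport map from Gaussian noise)
# to the SVGD particles by minimizing a biased kernel MMD estimate.
particles = x.detach()
G = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.Tanh(),
                        torch.nn.Linear(64, 2))
opt = torch.optim.Adam(G.parameters(), lr=1e-3)
for _ in range(2000):
    gx = G(torch.randn(300, 8))      # push latent Gaussian noise forward
    mmd2 = (rbf(gx, gx, 1.0).mean()
            + rbf(particles, particles, 1.0).mean()
            - 2.0 * rbf(gx, particles, 1.0).mean())
    opt.zero_grad(); mmd2.backward(); opt.step()

# The trained network now maps fresh noise to approximate posterior samples.
print(G(torch.randn(5, 8)))
```

Once trained, such a network produces new approximate posterior samples at the cost of a single forward pass, which is the practical appeal of learning a transport map instead of rerunning the sampler for every batch of samples.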



Acknowledgements

The research of T. Y. Hou, K. C. Lam, and S. Zhang was supported in part by NSF Grant DMS-1613861. We would also like to thank Microsoft Research for providing the computing facilities used to carry out some of the computations reported in this paper.

Author information


Corresponding authors

Correspondence to Ka Chun Lam or Shumao Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Hou, T.Y., Lam, K.C., Zhang, P. et al. Solving Bayesian inverse problems from the perspective of deep generative networks. Comput Mech 64, 395–408 (2019). https://doi.org/10.1007/s00466-019-01739-7

