
Gradient Convergence of Deep Learning-Based Numerical Methods for BSDEs

Chinese Annals of Mathematics, Series B

Abstract

The authors prove the gradient convergence of the deep learning-based numerical method for high-dimensional parabolic partial differential equations and backward stochastic differential equations (BSDEs for short), which is based on the time discretization of stochastic differential equations (SDEs for short) and the stochastic approximation method for nonconvex stochastic programming problems. The neural network is set up with the stochastic gradient descent method, a quadratic loss function, and sigmoid activation functions. Combining classical techniques of randomized stochastic gradients, the Euler scheme for SDEs, and the convergence of neural networks, they obtain a gradient convergence rate of \(O(K^{-\frac{1}{4}})\), where \(K\) is the total number of iterative steps.
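
For orientation, the scheme behind this result can be sketched as follows; the notation \(b\), \(\sigma\), \(f\), \(g\) for the drift, diffusion, driver, and terminal condition is the generic notation of the deep BSDE literature, not taken from the paper itself. On a time grid \(0 = t_0 < t_1 < \cdots < t_N = T\) with increments \(\Delta t_n\) and Brownian increments \(\Delta W_n\), the Euler scheme discretizes the forward-backward system as
\[
X_{t_{n+1}} = X_{t_n} + b(t_n, X_{t_n})\,\Delta t_n + \sigma(t_n, X_{t_n})\,\Delta W_n,
\qquad
Y_{t_{n+1}} = Y_{t_n} - f(t_n, X_{t_n}, Y_{t_n}, Z_{t_n})\,\Delta t_n + Z_{t_n}\,\Delta W_n,
\]
where \(Y_0\) and the maps \(X_{t_n} \mapsto Z_{t_n}\) are parametrized by sigmoid neural networks with parameters \(\theta\) and trained by stochastic gradient descent on the quadratic loss \(L(\theta) = \mathbb{E}\,|g(X_{t_N}) - Y_{t_N}|^2\). In the randomized stochastic gradient framework, \(K\) iterations give \(\mathbb{E}\,\|\nabla L\|^2 = O(K^{-\frac{1}{2}})\) at a randomly selected iterate, i.e., a gradient norm of order \(K^{-\frac{1}{4}}\).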

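A minimal runnable sketch of such a training loop is given below, assuming PyTorch for automatic differentiation; the zero drift, identity diffusion, and the particular driver f and terminal condition g are placeholder choices for illustration, not the paper's test problem. The optimizer, loss, and activations match the abstract's setting: plain SGD, a quadratic loss, and sigmoid units.

    # Sketch of a deep BSDE training loop (assumes PyTorch; f, g, and the
    # forward dynamics below are placeholders, not the paper's test problem).
    import torch

    d, N, T = 10, 20, 1.0          # state dimension, time steps, horizon
    dt = T / N
    batch = 64

    def f(t, x, y, z):             # driver of the BSDE (placeholder choice)
        return -0.5 * (z ** 2).sum(dim=1, keepdim=True)

    def g(x):                      # terminal condition Y_T = g(X_T) (placeholder)
        return torch.log(0.5 * (1.0 + (x ** 2).sum(dim=1, keepdim=True)))

    # One sigmoid subnetwork per time step approximates Z_{t_n} as a function of X_{t_n}.
    nets = torch.nn.ModuleList(
        torch.nn.Sequential(torch.nn.Linear(d, d + 10), torch.nn.Sigmoid(),
                            torch.nn.Linear(d + 10, d))
        for _ in range(N))
    y0 = torch.nn.Parameter(torch.zeros(1))   # trainable initial value Y_0
    opt = torch.optim.SGD(list(nets.parameters()) + [y0], lr=1e-2)

    K = 2000                                  # total number of SGD iterations
    for k in range(K):
        x = torch.zeros(batch, d)             # X_0 = 0; Euler scheme below
        y = y0.expand(batch, 1)
        for n in range(N):
            dw = torch.randn(batch, d) * dt ** 0.5
            z = nets[n](x)                    # Z_{t_n} from the n-th network
            y = y - f(n * dt, x, y, z) * dt + (z * dw).sum(dim=1, keepdim=True)
            x = x + dw                        # b = 0, sigma = identity
        loss = ((g(x) - y) ** 2).mean()       # quadratic terminal loss
        opt.zero_grad()
        loss.backward()
        opt.step()

Tracking the parameter gradient norms after loss.backward() over the \(K\) iterations is how the \(O(K^{-\frac{1}{4}})\) rate would be observed empirically.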


Acknowledgement

The authors would like to thank the anonymous reviewers for their careful work and many useful comments.

Author information

Correspondence to Zixuan Wang.

Additional information

This work was supported by the National Key R&D Program of China (No. 2018YFA0703900) and the National Natural Science Foundation of China (No. 11631004).

About this article

Cite this article

Wang, Z., Tang, S. Gradient Convergence of Deep Learning-Based Numerical Methods for BSDEs. Chin. Ann. Math. Ser. B 42, 199–216 (2021). https://doi.org/10.1007/s11401-021-0253-x

