
Deep PDE solution to BSDE

  • Original Article, Digital Finance

Abstract

We numerically solve a high-dimensional backward stochastic differential equation (BSDE) by solving the corresponding partial differential equation (PDE) instead. To obtain a good approximation of the gradient of the PDE solution, we numerically solve a coupled PDE, consisting of the original semilinear parabolic PDE and the PDEs for its derivatives. We prove the existence and uniqueness of the classical solution of this coupled PDE, and show how to truncate the unbounded domain to a bounded one so that the error between the original solution and the solution of the same coupled PDE on the bounded domain is small. We then solve the coupled PDE using neural networks and establish convergence of the numerical solution to the true solution. Finally, we test the method on a 100-dimensional Allen–Cahn equation, a nonlinear Black–Scholes equation, and other examples, and compare our results to those obtained by solving the BSDE directly.


Data availability

The authors declare that the data supporting the findings of this study are available within the paper.

References

  • Allen, S., & Cahn, J. (1972). Ground state structures in ordered binary alloys with second neighbor interactions. Acta Metallurgica, 20(3), 423–433.

  • Amann, H. (1985). Global existence for semilinear parabolic systems. Journal für die reine und angewandte Mathematik, 360, 47–83.

  • Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein generative adversarial networks. In D. Precup & Y. W. Teh (Eds.), Proceedings of the 34th International Conference on Machine Learning, Volume 70 of Proceedings of Machine Learning Research (pp. 214–223). PMLR.

  • Bally, V., & Pagès, G. (2003). A quantization algorithm for solving multi-dimensional discrete-time optimal stopping problems. Bernoulli, 9(6), 1003–1049.

  • Beck, C., Becker, S., Cheridito, P., Jentzen, A., & Neufeld, A. (2021). Deep splitting method for parabolic PDEs. SIAM Journal on Scientific Computing, 43(5), A3135–A3154.

  • Beck, C., Becker, S., Grohs, P., Jaafari, N., & Jentzen, A. (2021). Solving the Kolmogorov PDE by means of deep learning. Journal of Scientific Computing, 88(3), 1–28.

  • Beck, C., Weinan, E., & Jentzen, A. (2019). Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations. Journal of Nonlinear Science, 29, 1–57.

  • Becker, S., Cheridito, P., & Jentzen, A. (2019). Deep optimal stopping. Journal of Machine Learning Research, 20(74), 1–25.

  • Bender, C., & Denk, R. (2007). A forward scheme for backward SDEs. Stochastic Processes and their Applications, 117(12), 1793–1812.

  • Bender, C., Schweizer, N., & Zhuo, J. (2017). A primal–dual algorithm for BSDEs. Mathematical Finance, 27(3), 866–901.

  • Berg, J., & Nyström, K. (2017). A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing, 317, 28–41.

  • Bottou, L. (2012). Large-scale machine learning with stochastic gradient descent. In Statistical Learning and Data Science, Chapman & Hall/CRC Computer Science and Data Analysis Series (pp. 17–25). CRC Press.

  • Bouchard, B., & Touzi, N. (2004). Discrete-time approximation and Monte Carlo simulation of backward stochastic differential equations. Stochastic Processes and their Applications, 111(2), 175–206.

  • Braess, D. (2007). Finite Elements: Theory, Fast Solvers, and Applications in Elasticity Theory (3rd ed., translated from the German by Larry L. Schumaker). Cambridge University Press.

  • Brezis, H. (2011). Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer.

  • Chan-Wai-Nam, Q., Mikael, J., & Warin, X. (2019). Machine learning for semilinear PDEs. Journal of Scientific Computing, 79, 1667–1712.

  • Chassagneux, J.-F. (2014). Linear multistep schemes for BSDEs. SIAM Journal on Numerical Analysis, 52(6), 2815–2836.

  • Chassagneux, J.-F., & Crisan, D. (2014). Runge–Kutta schemes for backward stochastic differential equations. The Annals of Applied Probability, 24, 679–720.

  • Cohen, S. N., Jiang, D., & Sirignano, J. (2022). Neural Q-learning for solving elliptic PDEs. arXiv preprint arXiv:2203.17128.

  • Delarue, F. (2002). On the existence and uniqueness of solutions to FBSDEs in a non-degenerate case. Stochastic Processes and their Applications, 99(2), 209–286.

  • Douglas, J., Ma, J., & Protter, P. (1996). Numerical methods for forward–backward stochastic differential equations. The Annals of Applied Probability, 6(3), 940–968.

  • Duffie, D., Schroder, M., & Skiadas, C. (1996). Recursive valuation of defaultable securities and the timing of resolution of uncertainty. The Annals of Applied Probability, 6(4), 1075–1090.

  • Farahmand, A.-m., Nabi, S., & Nikovski, D. N. (2017). Deep reinforcement learning for partial differential equation control. In 2017 American Control Conference (ACC) (pp. 3120–3127).

  • Friedman, A. (1964). Partial Differential Equations of Parabolic Type. Prentice-Hall.

  • Giga, Y., Goto, S., Ishii, H., & Sato, M.-H. (1991). Comparison principle and convexity preserving properties for singular degenerate parabolic equations on unbounded domains. Indiana University Mathematics Journal, 40(2), 443–470.

  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. Adaptive Computation and Machine Learning. MIT Press.

  • Han, J., & Jentzen, A. (2017). Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Communications in Mathematics and Statistics, 5(4), 349–380.

  • Han, J., Jentzen, A., & Weinan, E. (2018). Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences, 115(34), 8505–8510.

  • Hinton, G., Deng, L., Yu, D., Dahl, G., Mohamed, A.-r., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T., & Kingsbury, B. (2012). Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine, 29(6), 82–97.

  • Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.

  • Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2), 251–257.

  • Huré, C., Pham, H., & Warin, X. (2019). Some machine learning schemes for high-dimensional nonlinear PDEs. Mathematics of Computation, 89, 1547–1579.

  • Kangro, R., & Nicolaides, R. (2000). Far field boundary conditions for Black–Scholes equations. SIAM Journal on Numerical Analysis, 38(4), 1357–1368.

  • Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In ICLR (Poster).

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems (Vol. 25). Curran Associates, Inc.

  • Larsson, S., & Thomée, V. (2003). Partial Differential Equations with Numerical Methods, Volume 45 of Texts in Applied Mathematics. Springer.

  • LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.

  • Mooney, C. Z. (1997). Monte Carlo Simulation. Sage University Papers Series: Quantitative Applications in the Social Sciences, No. 07-116. Sage Publications.

  • Negyesi, B., Andersson, K., & Oosterlee, C. W. (2021). The one step Malliavin scheme: New discretization of BSDEs implemented with deep learning regressions.

  • Pardoux, E., & Peng, S. (1992). Backward stochastic differential equations and quasilinear parabolic partial differential equations. In Stochastic Partial Differential Equations and their Applications (Charlotte, NC, 1991), Volume 176 of Lecture Notes in Control and Information Sciences (pp. 200–217). Springer.

  • Ruder, S. (2017). An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747.

  • Sirignano, J., & Spiliopoulos, K. (2018). DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics, 375, 1339–1364.

  • Thomée, V. (2006). Galerkin Finite Element Methods for Parabolic Problems, Volume 25 of Springer Series in Computational Mathematics (2nd ed.). Springer.

  • Weinan, E., & Yu, B. (2018). The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics, 6(1), 1–12.

  • Zhang, C., Liao, Q., Rakhlin, A., Miranda, B., Golowich, N., & Poggio, T. (2017). Theory of deep learning IIb: Optimization properties of SGD. CoRR.


Acknowledgements

This work was partially supported by NSF Grant DMS-1736414. Research was also partially supported by the Acheson J. Duncan Fund for the Advancement of Research in Statistics. Jiahao Hou is grateful to the two anonymous referees, whose comments and suggestions greatly improved the paper.

Author information

Correspondence to Maxim Bichuch.

Appendix

In this appendix, we present the proofs of all theorems and lemmas from the paper.

Proof of Theorem 4.1

First, we prove that \(\lim \nolimits _{\left| \left| x \right| \right| \rightarrow \infty }u(t,x)= 0\). Recall that u is the solution of the PDE:

$$\begin{aligned} -u_t+{\mathcal {L}}u=-f(t,x,u,\sigma \nabla u),~ (t,x) \in (0,T]\times {\mathbb {R}}^d, \quad u(0,x)=g(x), ~x\in {\mathbb {R}}^d, \end{aligned}$$
(15)

where

$$\begin{aligned} {\mathcal {L}} =\frac{1}{2}\sum _{i,j=1}^d(\sigma \sigma ^T)_{i,j}(t,x)\frac{\partial ^2}{\partial x_i \partial x_j}+\sum _{i=1}^db_i(t,x)\frac{\partial }{\partial x_i}. \end{aligned}$$
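
The operator \({\mathcal {L}}\) above is exactly what a deep-PDE solver must evaluate at sampled points. As a minimal illustrative sketch (not the authors' implementation; the network shape, the constant \(\sigma\) and the zero drift b below are assumptions for brevity), \({\mathcal {L}}u\) can be computed for a neural network u via automatic differentiation:

```python
# A minimal sketch (not the authors' code) of evaluating the operator
# \mathcal{L}u at a sample point via automatic differentiation, assuming
# a small network u(t, x) and constant sigma, b purely for illustration.
import torch

d = 4                                   # illustrative dimension
sigma = torch.eye(d)                    # assumed constant diffusion
b = torch.zeros(d)                      # assumed zero drift

u_net = torch.nn.Sequential(
    torch.nn.Linear(d + 1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

def L_u(t, x):
    """Evaluate (1/2) tr(sigma sigma^T Hess u) + b . grad u at (t, x)."""
    x = x.clone().requires_grad_(True)
    tx = torch.cat([t, x]).unsqueeze(0)
    u = u_net(tx).squeeze()
    grad = torch.autograd.grad(u, x, create_graph=True)[0]
    a = sigma @ sigma.T
    trace_term = 0.0
    for i in range(d):                  # assemble row i of the Hessian
        hess_row = torch.autograd.grad(grad[i], x, create_graph=True)[0]
        trace_term = trace_term + 0.5 * (a[i] * hess_row).sum()
    return trace_term + (b * grad).sum()

print(L_u(torch.tensor([0.5]), torch.randn(d)))
```

The Hessian trace is assembled row by row, which avoids materializing the full \(d\times d\) Hessian at once for moderate d.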

By Delarue (2002) [Lemma 2.1], there exists a unique classical solution of PDE (15). By Assumption 4.1, there exist constant vectors \(K_1,K_2\in {\mathbb {R}}^d\), such that

$$\begin{aligned} f(t,x,r,0)+K_1^Tp\le f(t,x,r, p)\le f(t,x,r,0)+K_2^Tp \end{aligned}$$

for any \((t,x,r,p)\in [0,T]\times {\mathbb {R}}^d\times {\mathbb {R}}\times {\mathbb {R}}^d\). Consider two auxiliary problems

$$\begin{aligned} -{w_1}_t+{\mathcal {L}}w_1 = -f(t,x,w_1,0)-K_1^T\nabla w_1,~ (t,x) \in (0,T]\times {\mathbb {R}}^d, \quad w_1(0,x) = g(x), ~x\in {\mathbb {R}}^d, \end{aligned}$$
(16)

and

$$\begin{aligned} -{w_2}_t+{\mathcal {L}}w_2 = -f(t,x,w_2,0)-K_2^T\nabla w_2,~ (t,x) \in (0,T]\times {\mathbb {R}}^d, \quad w_2(0,x) = g(x), ~x\in {\mathbb {R}}^d. \end{aligned}$$
(17)

Also by Delarue (2002) [Lemma 2.1], there exist classical solutions for these two PDEs. By PDEs (15), (16) and (17), we have that

$$\begin{aligned}&u_t+F(t,x,u,\nabla u,\nabla ^2 u)=0,~ (t,x) \in (0,T]\times {\mathbb {R}}^d,\\&{w_1}_t+F(t,x,w_1,\nabla w_1,\nabla ^2 w_1)\le {w_1}_t-{\mathcal {L}}w_1-f(t,x,w_1,0)-K_1^T\nabla w_1 =0,\\&{w_2}_t+F(t,x,w_2,\nabla w_2,\nabla ^2 w_2)\ge {w_2}_t-{\mathcal {L}}w_2-f(t,x,w_2,0)-K_2^T\nabla w_2=0, \end{aligned}$$
(18)

where \(F(t,x,p,X,Y):=-{\mathcal {L}}p-f(t,x,p,X)\). Here, u is a solution of the PDE in (18), \(w_1\) is a sub-solution and \(w_2\) is a super-solution. By Giga et al. (1991) [Theorem 4.1], we have the following comparison result:

$$\begin{aligned} w_1(t,x)\le u(t,x)\le w_2(t,x),~ (t,x)\in [0,T]\times {\mathbb {R}}^d. \end{aligned}$$
(19)

Note that \(w_1\) and \(w_2\) are classical solutions of semilinear PDEs. Together with the integrability condition and the uniform Lipschitz condition of \(f(t,x,r,0)\) as a function of r from Assumption 4.1, by Amann (1985) [Theorem 2.1], \(w_1\) and \(w_2\) are also global weak solutions of the corresponding semilinear PDEs. By Amann (1985) [Corollary 2.2], for all \(t\in [0,T]\), \(w_1(t,x)\) and \(w_2(t,x)\) vanish as \(\left| \left| x \right| \right|\) approaches infinity. Combining this with the comparison result (19), we conclude that u(t, x) vanishes there as well for all \(t\in [0,T]\).

Next, we prove \(\lim \nolimits _{\left| \left| x \right| \right| \rightarrow \infty }{\textbf{v}}(t,x)=0\) for any \(t\in [0,T]\). Recall that the \(v_i\) are solutions of the PDEs (5). The functions \(f_w(t,x,u,\sigma \nabla u)\) and \(f_p(t,x,u,\sigma \nabla u)\) are bounded, since f is Lipschitz as a function of both w and p; \(\frac{\partial \sigma }{\partial x_i}\) is bounded by Assumption 3.1; \(f_{x_i}(t,x,u,\sigma \nabla u)\in L^{d+1}({\mathbb {R}}^d)\) by Assumption 4.1; and \({\mathcal {L}}_i u\in L^{d+1}({\mathbb {R}}^d)\) by the argument above. Together with the assumption that \(g\in W^{2,d+1}({\mathbb {R}}^d)\), by Amann (1985) [Corollary 2.2], we have that \(\lim \nolimits _{\left| \left| x \right| \right| \rightarrow \infty }{\textbf{v}}(t,x)=0\) for any \(t\in [0,T]\). \(\square\)
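
For the reader's convenience, we recall the form of the PDEs (5) satisfied by the derivatives \(v_i=\partial u/\partial x_i\), as they are used repeatedly below. This is a reconstruction consistent with the display in the proof of Theorem 4.3 (set \(\epsilon =0\) in the PDE for \(\alpha _i\)); the authoritative statement is (5) in the main text:

$$\begin{aligned} -\partial _t v_i+{\mathcal {L}}v_i&=-f_w(t,x,u,\sigma \nabla u)v_i-f_{x_i}(t,x,u,\sigma \nabla u)\\&\quad -f_p^T(t,x,u,\sigma \nabla u)\left( \frac{\partial \sigma }{\partial x_i}\nabla u+\sigma \nabla v_i\right) -{\mathcal {L}}_i u,\qquad v_i(0,x)=g_{x_i}(x). \end{aligned}$$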

Proof of Theorem 4.2

By Assumption 4.1 and Amann (1985) [Theorem 2.1 (i)(ii)], we have that

$$\begin{aligned} u\in C^1((0,T],L^{d+1}({\mathbb {R}}^d))\cap C((0,T],W^{2,d+1}({\mathbb {R}}^d))\cap C([0,T],W^{1,d+1}({\mathbb {R}}^d)). \end{aligned}$$

Since \(u\in C([0,T],W^{1,d+1}({\mathbb {R}}^d))\) and u is a classical solution, the map \(t\mapsto u(t,\cdot )\) is continuous on the compact set [0, T] and thus uniformly continuous. For fixed \(\epsilon >0\), there exists \(\delta >0\) such that for \(|t_1-t_2|<\delta\), we have \(||u(t_1,\cdot )-u(t_2,\cdot )||_{W^{1,d+1}}<\epsilon\). By Brezis (2011) [Corollary 9.13], we have that

$$\begin{aligned} ||u(t_1,\cdot )-u(t_2,\cdot )||_{L^\infty ({\mathbb {R}}^d)} \le C||u(t_1,\cdot )-u(t_2,\cdot )||_{W^{1,d+1}({\mathbb {R}}^d)}<C\epsilon , \end{aligned}$$

where \(C>0\) is a constant that depends only on d. Since u is a classical solution, i.e., continuous with respect to t and x, we have

$$\begin{aligned} |u(t_1,x)-u(t_2,x)|<C\epsilon \end{aligned}$$
(20)

for \(|t_1-t_2|<\delta\) and \(x\in {\mathbb {R}}^d\). We consider a partition of [0, T], \(t_0=0<t_1<\cdots <t_{k}=T\), such that \(\max _{0\le j\le k-1} \left| t_{j+1} -t_j \right| \le \delta\). By Theorem 4.1, for all \(t\in [0,T]\) we have \(\lim \nolimits _{\left| \left| x \right| \right| \rightarrow \infty }u(t,x)=0\). Thus for \(j=0,\ldots ,k-1\), there exists \(N_j>0\) such that for \(\left| \left| x \right| \right| >N_j\), we have

$$\begin{aligned} |u(t_j,x)|<\epsilon . \end{aligned}$$
(21)

Combining (20) and (21), by the triangle inequality, we have that

$$\begin{aligned} |u(t,x)|<(C+1)\epsilon \end{aligned}$$

for \((t,x)\in [t_j,t_{j+1}]\times {\mathbb {R}}^d\backslash {\text {Ball}}(0,N_j)\). We define \(N_{\max }=\max _{j=0,\ldots ,k-1}N_j\). Then we can define \(U_1:={\text {Ball}}(0,N_{\max })\) and conclude that \(|u(t,x)|<(C+1)\epsilon\) for \((t,x)\in [0,T]\times \partial U_1\).

Let \(i\in \{1,\ldots , d\}\). We now prove that for fixed \(\epsilon >0\), there exists \(U^i_2\) such that \(|v_i(t,x)|<\epsilon\) for \((t,x)\in [0,T]\times \partial U^i_2\). As shown in the proof of Theorem 4.1, the \(v_i\) are global weak solutions of the PDEs (5), by Amann (1985) [Theorem 2.1 (i)(ii)]. So we have that

$$\begin{aligned} v_i\in C^1((0,T],L^{d+1}({\mathbb {R}}^d))\cap C((0,T],W^{2,{d+1}}({\mathbb {R}}^d))\cap C([0,T],W^{1,{d+1}}({\mathbb {R}}^d)). \end{aligned}$$

Since \(v_i\in C([0,T],W^{1,d+1}({\mathbb {R}}^d))\) and \(v_i\) is a classical solution, the map \(t\mapsto v_i(t,\cdot )\) is continuous on the compact set [0, T] and thus uniformly continuous. For fixed \(\epsilon >0\), there exists \(\delta _i>0\) such that for \(|t_1-t_2|<\delta _i\), we have \(||v_i(t_1,\cdot )-v_i(t_2,\cdot )||_{W^{1,{d+1}}}<\epsilon\). By Brezis (2011) [Corollary 9.13], we have that

$$\begin{aligned} ||v_i(t_1,\cdot )-v_i(t_2,\cdot )||_{L^\infty ({\mathbb {R}}^d)} \le C||v_i(t_1,\cdot )-v_i(t_2,\cdot )||_{W^{1,{d+1}}({\mathbb {R}}^d)}<C\epsilon , \end{aligned}$$

where \(C>0\) is a constant that depends only on d. Since \(v_i\) is a classical solution, i.e., continuous with respect to t and x, we have

$$\begin{aligned} |v_i(t_1,x)-v_i(t_2,x)|<C\epsilon \end{aligned}$$
(22)

for \(|t_1-t_2|<\delta _i\) and \(x\in {\mathbb {R}}^d\). We consider a partition of [0, T], \(t_0=0<t_1<\cdots <t_{k}=T\), such that \(\max _{0\le j\le k-1} \left| t_{j+1} -t_j \right| \le \delta _i\). By Theorem 4.1, for all \(t\in [0,T]\) we have \(\lim \nolimits _{\left| \left| x \right| \right| \rightarrow \infty }v_i(t,x)=0\). Thus for \(j=0,\ldots ,k-1\), there exists \(N^i_j>0\) such that for \(\left| \left| x \right| \right| >N^i_j\), we have

$$\begin{aligned} |v_i(t_j,x)|<\epsilon . \end{aligned}$$
(23)

Combining (22) and (23), by the triangle inequality, we have that

$$\begin{aligned} |v_i(t,x)|<(C+1)\epsilon \end{aligned}$$

for \((t,x)\in [t_j,t_{j+1}]\times {\mathbb {R}}^d\backslash {\text {Ball}}(0,N^i_j)\). We define \(N^i_{\max }=\max _{j=0,\ldots ,k-1}N^i_j\). Then we can define \(U^i_2:={\text {Ball}}(0,N^i_{\max })\) and conclude that \(|v_i(t,x)|<(C+1)\epsilon\) for \((t,x)\in [0,T]\times \partial U^i_2\). We define \(U_2:=\cup _{i=1}^d U^i_2\); then \(\left| \left| {\textbf{v}}(t,x) \right| \right| <(C+1)\epsilon\) for \((t,x)\in [0,T]\times \partial U_2\). Note that since the \(U^i_2\) are all balls centered at 0, \(U_2\) is also a ball centered at 0. Combining this with the first part of this proof, we define \(U:= U_1\cup U_2\) and conclude that \(|u(t,x)|<(C+1)\epsilon\) and \(\left| \left| {\textbf{v}}(t,x) \right| \right| <(C+1)\epsilon\) for \((t,x)\in [0,T]\times \partial U\). Furthermore, since \(U_1\) and \(U_2\) are balls centered at 0, U is also a ball centered at 0. \(\square\)

Proof of Theorem 4.3

First, we prove this theorem for u. Let \(\alpha (t,x)=\bar{u}(t,x)+\epsilon (e^{Lt}-\frac{1}{2})\), where L is the global Lipschitz constant of f(txwp) with respect to w. We then have that \(\alpha (t,x)\) is the classical solution to the following PDE:

$$\begin{aligned} \begin{array}{lr} -\alpha _t+{\mathcal {L}}\alpha =-f(t,x,\bar{u},\sigma \nabla \alpha )-L\epsilon e^{Lt},~~~~ (t,x)\in (0,T]\times U,\\ \alpha (0,x)=g(x)+\epsilon \frac{1}{2},~~~~ x\in U,\\ \alpha (t,x)=\epsilon \left( e^{Lt}-\frac{1}{2}\right) ,~~~~(t,x)\in (0,T]\times \partial U. \end{array} \end{aligned}$$
(24)

Since f(txwp) is Lipschitz with respect to w, we have that \(|f(t,x,\bar{u},\sigma \nabla \alpha )-f(t,x,\alpha ,\sigma \nabla \alpha )|\le L|\bar{u}-\alpha |=L\epsilon \left( e^{Lt}-\frac{1}{2}\right)\), and thus

$$\begin{aligned} -f(t,x,\bar{u},\sigma \nabla \alpha )-L\epsilon e^{Lt}\le -f(t,x,\alpha ,\sigma \nabla \alpha )-L\epsilon \frac{1}{2}< -f(t,x,\alpha ,\sigma \nabla \alpha ). \end{aligned}$$

So, for the PDE (24) satisfied by \(\alpha\), we have

$$\begin{aligned} -\alpha _t+{\mathcal {L}}\alpha =-f(t,x,\bar{u},\sigma \nabla \alpha )-L\epsilon e^{Lt}<-f(t,x,\alpha ,\sigma \nabla \alpha ). \end{aligned}$$

Let \(H(t,x,\phi ,D\phi ,D^2\phi ):={\mathcal {L}}\phi +f(t,x,\phi ,\sigma \nabla \phi )\). As a classical solution to PDE (2), we have that \(u_t(t,x)= {\mathcal {L}}u(t,x)+f(t,x,u(t,x),\sigma \nabla u(t,x))\) for \((t,x)\in (0,T] \times {\mathbb {R}}^d\). Then we have that

$$\begin{aligned} \alpha _t>H(t,x,\alpha ,D\alpha ,D^2\alpha ),~ u_t\le H(t,x,u,Du,D^2u),~(t,x)\in (0,T]\times U. \end{aligned}$$

We have that \(||u||_{L^\infty ((0,T]\times \partial U)}< \frac{\epsilon }{2}<\epsilon \left( e^{Lt}-\frac{1}{2}\right) =\alpha (t,x)\) for \((t,x)\in (0,T]\times \partial U\), and that \(\alpha (0,x)=g(x)+\frac{1}{2}\epsilon >g(x)=u(0,x)\) for any \(x\in U\). Thus, we have \(\alpha >u\) on \((0,T]\times \partial U\) and on \(\{0\}\times U\). By the comparison theorem (Friedman, 1964) [Theorem 2.16], we have that

$$\begin{aligned} u(t,x)<\alpha (t,x)=\bar{u}(t,x)+\epsilon \left( e^{Lt}-\frac{1}{2}\right) ,~(t,x)\in (0,T]\times U. \end{aligned}$$

In the same way, we can prove that

$$\begin{aligned} u(t,x)>\bar{u}(t,x)-\epsilon \left( e^{Lt}-\frac{1}{2}\right) ,~(t,x)\in (0,T]\times U. \end{aligned}$$

Finally, we have \(||\bar{u}-u||_{L^\infty ((0,T]\times U )}<\epsilon \left( e^{LT}-\frac{1}{2}\right) ,\) which is the desired result in (10) for u.
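
To put the constant in perspective (with illustrative values \(L=1\), \(T=1\), not taken from the paper), an \(\epsilon\)-accurate boundary approximation yields an interior error of at most

$$\begin{aligned} \epsilon \left( e^{LT}-\frac{1}{2}\right) =\epsilon \left( e-\frac{1}{2}\right) \approx 2.22\,\epsilon , \end{aligned}$$

so the truncation inflates the boundary error by only a moderate, dimension-free factor.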

Next, we prove that \(||\bar{v}_i-v_i||_{L^\infty ([0,T]\times U )}<\epsilon \left( e^{LT}-\frac{1}{2}\right)\), given that \(||v_i||_{L^\infty ([0,T]\times \partial U )}<\frac{\epsilon }{2}\). Let \(i\in \{1,\ldots ,d\}\) and let \(\alpha _i(t,x)=\bar{v}_i(t,x)+\epsilon (e^{Lt}-\frac{1}{2})\). Then \(\alpha _i\) is the classical solution to the following PDE:

$$\begin{aligned}&-\partial _t \alpha _i+{\mathcal {L}}\alpha _i=-f_w(t,x,u,\sigma \nabla u)\left( \alpha _i-\epsilon \left( e^{Lt}-\frac{1}{2}\right) \right) -f_{x_i}(t,x,u,\sigma \nabla u)\\&\quad -f_p^T(t,x,u,\sigma \nabla u)\left( \frac{\partial \sigma }{\partial x_i}\nabla u+ \sigma \nabla \alpha _i\right) -{\mathcal {L}}_i u -L\epsilon e^{Lt},~~~~ (t,x) \in (0,T]\times U,\\&\alpha _i(0,x)=g_{x_i}(x)+\epsilon \frac{1}{2},~~~~x\in U,\\&\alpha _i(t,x)=\epsilon \left( e^{Lt}-\frac{1}{2}\right) ,~~~ (t,x)\in (0,T]\times \partial U. \end{aligned}$$

Since L is the Lipschitz constant of f with respect to w, we have \(|f_w(t,x,w,p)|\le L\) on \([0,T]\times {\mathbb {R}}^d\times {\mathbb {R}}\times {\mathbb {R}}^d\), and therefore

$$\begin{aligned} f_w\epsilon (e^{Lt}-\frac{1}{2})\le L\epsilon \left( e^{Lt}-\frac{1}{2}\right) < L\epsilon e^{Lt}. \end{aligned}$$

Thus,

$$\begin{aligned}&-\partial _t \alpha _i+{\mathcal {L}}\alpha _i<-f_w(t,x,u,\sigma \nabla u)\alpha _i-f_{x_i}(t,x,u,\sigma \nabla u)\\&\quad -f_p^T(t,x,u,\sigma \nabla u)\left( \frac{\partial \sigma }{\partial x_i}\nabla u+ \sigma \nabla \alpha _i\right) -{\mathcal {L}}_i u. \end{aligned}$$

Define

$$\begin{aligned} H_i(t,x,\phi ,D\phi ,D^2\phi )&:={\mathcal {L}}\phi +f_w(t,x,u,\sigma \nabla u)\phi +f_{x_i}(t,x,u,\sigma \nabla u)\\&\quad +f_p^T(t,x,u,\sigma \nabla u)(\frac{\partial \sigma }{\partial x_i}\nabla u+ \sigma \nabla \phi )+{\mathcal {L}}_i u. \end{aligned}$$

It follows that

$$\begin{aligned} \partial _t \alpha _i>H_i(t,x,\alpha _i,D\alpha _i,D^2\alpha _i),~ \partial _t v_i\le H_i(t,x,v_i,Dv_i,D^2v_i),~(t,x)\in (0,T]\times U. \end{aligned}$$

Since \(||v_i||_{L^\infty ((0,T]\times \partial U)}< \frac{\epsilon }{2}<\epsilon \left( e^{Lt}-\frac{1}{2}\right) =\alpha _i(t,x)\) on \((0,T]\times \partial U\) and \(\alpha _i(0,x)=g_{x_i}(x)+\frac{1}{2}\epsilon >g_{x_i}(x)=v_i(0,x)\) for any \(x\in U\), we have \(\alpha _i>v_i\) on \((0,T]\times \partial U\) and on \(\{0\}\times U\). By the Comparison Theorem (Friedman, 1964) [Theorem 2.16], we have that

$$\begin{aligned} v_i(t,x)<\alpha _i(t,x)=\bar{v}_i(t,x)+\epsilon \left( e^{Lt}-\frac{1}{2}\right) ,~(t,x)\in (0,T]\times U. \end{aligned}$$

In the same way, we can prove that

$$\begin{aligned} v_i(t,x)>\bar{v}_i(t,x)-\epsilon \left( e^{Lt}-\frac{1}{2}\right) ,~(t,x)\in (0,T]\times U. \end{aligned}$$

The desired result in (10) for \(v_i, \bar{v}_i\) now follows:

$$\begin{aligned} ||\bar{v}_i-v_i||_{L^\infty ((0,T]\times U )}<\epsilon \left( e^{LT}-\frac{1}{2}\right) . \end{aligned}$$

\(\square\)

Proof of Theorem 4.4

By Friedman (1964) [Theorems 7.4.6 and 7.4.10], there exists a unique classical solution of PDE (11). Note that by Delarue (2002) [Equation 2.15], \(u\ge {\bar{u}}\) on \([0,T]\times \partial \Omega\). By the comparison theorem (Friedman, 1964) [Theorem 2.6.16], we have that \(u\ge \bar{u}\) on \([0,T]\times \Omega\). So \(u-\bar{u}=|u-\bar{u}|\); in particular, \(|u(t,x)-\bar{u}(t,x)|\) is also twice continuously differentiable as a function of x. Now we want to show that \(|u-\bar{u}|\) is a super-solution of the PDE \(-u_t+{\mathcal {L}}u=f(t,x,u,\sigma \nabla u)\) with boundary condition \(u-h\) and initial condition 0. By Assumption 4.2, for any \(p_1,p_2\in {\mathbb {R}}^d\), \((t,x)\in [0,T]\times {\mathbb {R}}^d\) and \(v_1,v_2\in {\mathbb {R}}\), we have \(f(t,x,v_1,p_1)-f(t,x,v_2,p_2)\ge f(t,x,v_1-v_2,p_1-p_2)\). Therefore, the PDE that \(u-\bar{u}\) satisfies is:

$$\begin{aligned} -(u-\bar{u})_t+{\mathcal {L}}(u-\bar{u})=f(t,x,u,\sigma \nabla u)-f(t,x,\bar{u},\sigma \nabla \bar{u})\ge f(t,x,u-\bar{u},\sigma \nabla (u-\bar{u})) \end{aligned}$$

on \((0,T]\times \Omega\), which means \(|u-\bar{u}|\) is a super-solution.

Next, we find a sub-solution of the same PDE. We define a function \(W(t,x):=\sum _{i=1}^d W_i(t,x_i)\), where \(W_i\), \(i=1,\ldots , d\), is defined as

$$\begin{aligned} W_i(t,x_i)=\frac{1}{\sqrt{t+\epsilon _i}}e^{(-\gamma _i\frac{(-k_i-\sqrt{M^2+1}+\sqrt{x_i^2+1})^2}{t+\epsilon _i})},~(t,x_i)\in [0,T]\times {\mathbb {R}},~i=1, \ldots , d, \end{aligned}$$
(25)

in which the parameters \(\epsilon _i\ge 0\), \(\gamma _i\ge 0\) and \(k_i\ge 0\) need to be chosen so that \(W_i\) satisfies the inequality

$$\begin{aligned} -W_{it}+{\mathcal {L}}W_i\le q_i\frac{\partial W_i}{\partial x_i}-L W_i+\frac{B}{d}. \end{aligned}$$

Here, \(B\ge 0\) is the lower bound of f(t, x, 0, 0), L is the Lipschitz constant of f with respect to u, and \(q:=(q_1,\ldots ,q_d)^T\) is a column vector in \({\mathbb {R}}^d\) that satisfies \(f(t,x,v,0)+q^T p\le f(t,x,v, p)\) for all \((t,x,v,p)\in [0,T]\times {\mathbb {R}}^d\times {\mathbb {R}}\times {\mathbb {R}}^d\).
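
As an illustrative special case (an assumption for exposition, not required by the proof), if f is affine in p, say \(f(t,x,v,p)={\tilde{f}}(t,x,v)+c^Tp\) for some fixed \(c\in {\mathbb {R}}^d\), then one may simply take \(q=c\), since

$$\begin{aligned} f(t,x,v,0)+q^Tp={\tilde{f}}(t,x,v)+c^Tp=f(t,x,v,p). \end{aligned}$$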

Given the form of \(W_i\) in (25), we can compute \(-W_{it}+{\mathcal {L}}W_i-q_i\sigma _i\frac{\partial W_i}{\partial x_i}+L W_i\) explicitly on \([0,T]\times [-M,M]\):

$$\begin{aligned}&-W_{it}+{\mathcal {L}}W_i-q_i\sigma _i\frac{\partial W_i}{\partial x_i}+L W_i\nonumber \\&\quad =\frac{W_i}{(t+\epsilon _i)^{5/2}} \Bigg [-\gamma _i Y_i^2+\frac{1}{2}(t+\epsilon _i)+\frac{2(b_i-q_i\sigma _i)x_iY_i\gamma _i(t+\epsilon _i)}{\sqrt{1+x_i^2}}+L(t+\epsilon _i)^2\nonumber \\&\qquad +\frac{4\sigma ^T\sigma _{ii}x_i^2Y_i^2\gamma _i^2}{1+x_i^2}-\frac{2\sigma ^T\sigma _{ii}x_i^2\gamma _i(t+\epsilon _i)}{1+x_i^2}+\frac{2\sigma ^T\sigma _{ii}x_i^2 Y_i \gamma _i(t+\epsilon _i)}{(1+x_i^2)^{3/2}} -\frac{2\sigma ^T\sigma _{ii}Y_i \gamma _i(t+\epsilon _i)}{\sqrt{1+x_i^2}}\Bigg ]\nonumber \\&\quad =\frac{W_i}{(t+\epsilon _i)^{5/2}}\Bigg [ Y_i^2\left( -\gamma _i+\frac{4\sigma ^T\sigma _{ii}x_i^2\gamma _i^2}{1+x_i^2}\right) +Y_i\gamma _i\left( \frac{2(b_i-q_i\sigma _i)x_i}{\sqrt{1+x_i^2}}+\frac{2\sigma ^T\sigma _{ii}x_i^2}{(1+x_i^2)^{3/2}}-\frac{2\sigma ^T\sigma _{ii}}{\sqrt{1+x_i^2}}\right) \nonumber \\&\qquad \times (t+\epsilon _i)+\left( \frac{1}{2}+L(t+\epsilon _i)-\frac{2\sigma ^T\sigma _{ii}x_i^2\gamma _i}{1+x_i^2}\right) (t+\epsilon _i) \Bigg ], \end{aligned}$$
(26)

where \(Y_i:=-k_i-\sqrt{M^2+1}+\sqrt{x_i^2+1}\le -k_i\). Take \(\gamma _i=\frac{1}{4||\sigma ^T\sigma _{ii}||_{L^\infty ([0,T]\times {\mathbb {R}}^d)}}\). Recall that by Assumption 3.1.1, \(\sigma\) is bounded; therefore \(0<\gamma _i<\infty\), and \(-\gamma _i+\frac{4\sigma ^T\sigma _{ii}x_i^2\gamma _i^2}{1+x_i^2}< 0\). Since \(B\ge 0\), it suffices to find parameters that make (26) less than or equal to 0. Because \(\frac{W_i}{(t+\epsilon _i)^{5/2}}>0\), this is equivalent to finding parameters such that

$$\begin{aligned} \begin{aligned}&Y_i^2\left( -\gamma _i+\frac{4\sigma ^T\sigma _{ii}x_i^2\gamma _i^2}{1+x_i^2}\right) +Y_i\gamma _i\left( \frac{2(b_i-q_i\sigma _i)x_i}{\sqrt{1+x_i^2}}+\frac{2\sigma ^T\sigma _{ii}x_i^2}{(1+x_i^2)^{3/2}}-\frac{2\sigma ^T\sigma _{ii}}{\sqrt{1+x_i^2}}\right) (t+\epsilon _i)\\&\quad +\left( \frac{1}{2}+L(t+\epsilon _i)-\frac{2\sigma ^T\sigma _{ii}x_i^2\gamma _i}{1+x_i^2}\right) (t+\epsilon _i)\le 0, \end{aligned} \end{aligned}$$
(27)

on \([0,T]\times [-M,M]\). The left-hand side is a quadratic function of \(Y_i\) with negative leading coefficient, and \(Y_i\le -k_i\). With \(\epsilon _i=1\) fixed, we can therefore always find \(k_i\) large enough that (27) holds. Therefore,

$$\begin{aligned} -W_{it}+{\mathcal {L}}W_i-q_i\sigma _i\frac{\partial W_i}{\partial x_i}+L W_i\le 0 \le \frac{B}{d}. \end{aligned}$$
(28)
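
The existence of such a \(k_i\) can also be checked numerically. The following is a minimal sketch under assumed constant scalar coefficients (the values of \(\sigma ^T\sigma _{ii}\), \(b_i\), \(q_i\), \(\sigma _i\), L, M and T below are placeholders, not taken from the paper); it doubles \(k_i\) until the quadratic in (27) is nonpositive on a sampled grid:

```python
# Minimal numerical sanity check of inequality (27) under assumed
# constant scalar coefficients (illustrative values, not from the paper).
import numpy as np

a_ii, b_i, q_i, s_i, L, T = 1.0, 0.5, 0.3, 1.0, 1.0, 1.0  # assumed constants
gamma = 1.0 / (4.0 * a_ii)   # gamma_i as chosen in the proof
eps_i = 1.0                  # epsilon_i fixed to 1 as in the proof

def lhs27(Y, x, t):
    """Left-hand side of (27), written as a quadratic in Y."""
    c2 = -gamma + 4.0 * a_ii * x**2 * gamma**2 / (1.0 + x**2)      # < 0
    c1 = gamma * (2.0 * (b_i - q_i * s_i) * x / np.sqrt(1.0 + x**2)
                  + 2.0 * a_ii * x**2 / (1.0 + x**2)**1.5
                  - 2.0 * a_ii / np.sqrt(1.0 + x**2)) * (t + eps_i)
    c0 = (0.5 + L * (t + eps_i)
          - 2.0 * a_ii * x**2 * gamma / (1.0 + x**2)) * (t + eps_i)
    return c2 * Y**2 + c1 * Y + c0

M, k = 5.0, 1.0
xs = np.linspace(-M, M, 201)[None, :, None]
ts = np.linspace(0.0, T, 51)[None, None, :]
while k < 1e6:
    Ys = np.linspace(-k - 50.0, -k, 201)[:, None, None]  # Y_i <= -k_i
    if lhs27(Ys, xs, ts).max() <= 0.0:
        break
    k *= 2.0
print(f"(27) holds on the sampled grid for k_i = {k}")
```

Since the leading coefficient in \(Y_i\) is negative, checking a grid that extends to more negative \(Y_i\) than the true range \([-k_i-\sqrt{M^2+1}+1,-k_i]\) is conservative.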

Next, we prove that W satisfies \(-W_t+{\mathcal {L}}W\le f(t,x,W,\sigma \nabla W)\). From (28), we have that W satisfies the following inequality:

$$\begin{aligned} -W_t+{\mathcal {L}}W=\sum _{i=1}^d \left( -W_{it}+{\mathcal {L}}W_i\right) \le \sum _{i=1}^d q_i\sigma _i\frac{\partial W_i}{\partial x_i} -LW+B. \end{aligned}$$
(29)

By definition (25), we have that \(W_i\ge 0\), and therefore \(W=|W|\). So (29) becomes:

$$\begin{aligned} \begin{aligned} -W_t+{\mathcal {L}}W&< \sum _{i=1}^d q_i\sigma _i\frac{\partial W_i}{\partial x_i} -L|W|+f(t,x,0,0)\le \sum _{i=1}^d q_i\sigma _i\frac{\partial W_i}{\partial x_i}+ f(t,x,W,0)\\&\le q^T\sigma \nabla W+f(t,x,W,0)\le f(t,x,W,\sigma \nabla W). \end{aligned} \end{aligned}$$

We define a function

$$\begin{aligned} \alpha (t,x):=\sup _{(t^\prime ,x^\prime )\in ((0,T]\times \partial \Omega ) \cup (\{0\}\times \Omega )}\left\{ \frac{|u(t^\prime ,x^\prime )-\bar{u}(t^\prime ,x^\prime )|}{W(t^\prime ,x^\prime )}\right\} W(t,x). \end{aligned}$$

Therefore, since \(\alpha\) is proportional to W, it is also a sub-solution, i.e., \(-\alpha _t+{\mathcal {L}}\alpha \le f(t,x,\alpha ,\sigma \nabla \alpha )\); moreover, it dominates \(u-\bar{u}\) on the boundary and is strictly positive at the initial time \(t=0\). By the Comparison Theorem (Friedman, 1964) [Theorem 2.6.16], we have that

$$\begin{aligned} |u(t,x)-\bar{u}(t,x)|&\le \sup _{(t^\prime ,x^\prime )\in ((0,T]\times \partial \Omega ) \cup (\{0\}\times \Omega )}\left\{ \frac{|u(t^\prime ,x^\prime )-\bar{u}(t^\prime ,x^\prime )|}{W(t^\prime ,x^\prime )}\right\} W(t,x)\\&\le ||u-\bar{u}||_{L^\infty ([0,T]\times \partial \Omega )} \cdot \frac{W(t,x)}{\inf _{(t^\prime ,x^\prime )\in [0,T]\times \partial \Omega } W(t^\prime ,x^\prime )}\\&\le ||u-\bar{u}||_{L^\infty ([0,T]\times \partial \Omega )} \cdot \frac{W(t,x)}{\sum _{i=1}^d\inf _{(t^\prime ,x^\prime )\in [0,T]\times \partial \Omega } W_i(t^\prime ,x^\prime )}\\&\le ||u-\bar{u}||_{L^\infty ([0,T]\times \partial \Omega )} \cdot \frac{W(t,x)}{\sum _{i=1}^d\inf _{{t^\prime \in [0,T]}} W_i(t^\prime ,M)}\\&\le C||u-\bar{u}||_{L^\infty ([0,T]\times \partial \Omega )}W(t,x)\\&\le C ||u-\bar{u}||_{L^\infty ([0,T]\times \partial \Omega )}\sum _{i=1}^d \frac{1}{\sqrt{t+1}}e^{\left( -\gamma _i\frac{(-k_i-\sqrt{M^2+1}+\sqrt{x_i^2+1})^2}{t+1}\right) }. \end{aligned}$$

This shows (13).

Next, we prove (14). Fix \(i\in \{1,\ldots ,d\}\). Existence and uniqueness of the classical solution of PDE (12) follow, for example, from Friedman (1964) [Theorems 7.4.6 and 7.4.10]. Note that by Delarue (2002) [Equation 2.15], \(v_i\ge \bar{v}_i\) on \([0,T]\times \partial \Omega\). By the Comparison Theorem (Friedman, 1964) [Theorem 2.6.16], we have that \(v_i\ge \bar{v}_i\) on \([0,T]\times \Omega\). So \(v_i-\bar{v}_i=|v_i-\bar{v}_i|\); in particular, \(|v_i(t,x)-\bar{v}_i(t,x)|\) is also twice continuously differentiable w.r.t. x. Note that \(|v_i-\bar{v}_i|\) is a super-solution of the PDE

$$\begin{aligned} -(v_{it}-{\bar{v}}_{it})+{\mathcal {L}}(v_i-\bar{v}_i)+f_w(t,x,{\bar{u}},\sigma \nabla {\bar{u}})(v_i-\bar{v}_i)+f_p^T(t,x,{\bar{u}},\sigma \nabla {\bar{u}})\sigma \nabla (v_i-\bar{v}_i)=0, \end{aligned}$$
(30)

with boundary condition \(v_i-{\bar{v}}_i\) and initial condition 0. Next, we want to find a sub-solution of PDE (30). Similar to the construction in (25), we define \(W^i\) as:

$$\begin{aligned} W^i=\sum _{j=1}^d \frac{1}{\sqrt{t+1}}e^{\left( -\gamma ^i\frac{(-k^i_j-\sqrt{M^2+1}+\sqrt{x_j^2+1})^2}{t+1}\right) }, \end{aligned}$$

which satisfies \(-W^i_t+{\mathcal {L}}W^i+f_w(t,x,{\bar{u}},\sigma \nabla {\bar{u}})W^i+f_p^T(t,x,{\bar{u}},\sigma \nabla {\bar{u}})\sigma \nabla W^i\le 0\). Again, we define

$$\begin{aligned} \alpha ^i(t,x):=\sup _{(t^\prime ,x^\prime )\in ((0,T]\times \partial \Omega ) \cup (\{0\}\times \Omega )}\left\{ \frac{|v_i(t^\prime ,x^\prime )-\bar{v}_i(t^\prime ,x^\prime )|}{W^i(t^\prime ,x^\prime )}\right\} W^i(t,x). \end{aligned}$$

It follows that \(\alpha ^i\) is a sub-solution, i.e., \(-\alpha ^i_t+{\mathcal {L}}\alpha ^i+ f_w(t,x,{\bar{u}},\sigma \nabla \bar{u})\alpha ^i+f_p^T(t,x,{\bar{u}},\sigma \nabla {\bar{u}})\sigma \nabla \alpha ^i\le 0\), that \(\alpha ^i\ge v_i-\bar{v}_i\) on the boundary \((0,T]\times \partial \Omega\), and that \(\alpha ^i\) is strictly positive at the initial time \(t=0\). By the Comparison Theorem (Friedman, 1964) [Theorem 2.6.16], we have that

$$\begin{aligned} |v_i(t,x)-{\bar{v}}_i(t,x)|&\le \sup _{(t^\prime ,x^\prime )\in ((0,T]\times \partial \Omega ) \cup (\{0\}\times \Omega )}\left\{ \frac{|v_i(t^\prime ,x^\prime )-{\bar{v}}_i(t^\prime ,x^\prime )|}{W^i(t^\prime ,x^\prime )}\right\} W^i(t,x)\\&\le ||v_i-{\bar{v}}_i||_{L^\infty ([0,T]\times \partial \Omega )} \cdot \frac{W^i(t,x)}{\inf _{(t^\prime ,x^\prime )\in [0,T]\times \partial \Omega } W^i(t^\prime ,x^\prime )}\\&\le ||v_i-{\bar{v}}_i||_{L^\infty ([0,T]\times \partial \Omega )} \cdot \frac{W^i(t,x)}{\sum _{j=1}^d\inf _{(t^\prime ,x^\prime )\in [0,T]\times \partial \Omega } W^i_j(t^\prime ,x^\prime )}\\&\le ||v_i-{\bar{v}}_i||_{L^\infty ([0,T]\times \partial \Omega )} \cdot \frac{W^i(t,x)}{\sum _{j=1}^d\inf _{{t^\prime \in [0,T]}} W^i_j(t^\prime ,M)}\\&\le C||v_i-{\bar{v}}_i||_{L^\infty ([0,T]\times \partial \Omega )}W^i(t,x)\\&\le C ||v_i-{\bar{v}}_i||_{L^\infty ([0,T]\times \partial \Omega )}\sum _{j=1}^d \frac{1}{\sqrt{t+1}}e^{\left( -\gamma ^i\frac{(-k^i_j-\sqrt{M^2+1}+\sqrt{x_j^2+1})^2}{t+1}\right) }. \end{aligned}$$

This shows (14). \(\square\)


Cite this article

Bichuch, M., Hou, J. Deep PDE solution to BSDE. Digit Finance (2023). https://doi.org/10.1007/s42521-023-00098-6
