A branching particle system approximation for solving partially observed stochastic optimal control problems via stochastic maximum principle

Stochastics and Partial Differential Equations: Analysis and Computations

Abstract

This paper develops an efficient numerical algorithm for solving a class of partially observed stochastic optimal control problems with correlated noises. The main contribution of this paper is threefold: first, we introduce a relaxed system and assume the Roxin condition (convexity requirement) on coefficients. Then, an optimal relaxed system provides an optimal admissible control in a broader sense, and a relaxed control turns out to be a usual admissible control. Second, we transform the optimal control problem into an optimization problem for a convex functional by employing a projection operator. A stochastic gradient descent approach is then proposed and its convergence properties are demonstrated. Last but not least, we present a branching particle system (branching particle filter) to approximate the optimal filter. Due to the random nature of the coefficients in the Zakai equation, neither the dual approach nor the mild solution approach can be used. We devise a novel method for establishing the convergence of the branching particle system approximation, as well as its rate of convergence. This branching-type particle filter algorithm allows us to tackle non-Markovian environments. The major body of this paper concludes with a numerical case study.


Data availability

Enquiries about data availability should be directed to the authors.

References

  1. Archibald, R., Bao, F., Yong, J.M., Zhou, T.: An efficient numerical algorithm for solving data driven feedback control problems. J. Sci. Comput. 85(51), 1–27 (2020)

  2. Bain, A., Crisan, D.: Fundamentals of Stochastic Filtering. Stochastic Modelling and Applied Probability. Springer, New York (2009)

  3. Bensoussan, A.: Stochastic Control of Partially Observable Systems. Cambridge University Press, Cambridge (1992)

  4. Bensoussan, A., Glowinski, R., Răşcanu, A.: Approximation of the Zakai equation by the splitting up method. SIAM J. Control Optim. 28(6), 1420–1431 (1990)

  5. Bensoussan, A., Viot, M.: Optimal control of stochastic linear distributed parameter systems. SIAM J. Control 13, 904–926 (1975)

  6. Buckdahn, R., Li, J., Ma, J.: A mean-field stochastic control problem with partial observations. Ann. Appl. Probab. 27(5), 3201–3245 (2017)

  7. Chang, D.J., Liu, H.L., Xiong, J.: A branching particle system approximation for a class of FBSDEs. Probab. Uncertain. Quant. Risk 1(9), 1–34 (2016)

  8. Charalambous, C.D., Elliott, R.J.: Classes of nonlinear partially observable stochastic optimal control problems with explicit optimal control laws. SIAM J. Control Optim. 36(2), 542–578 (1998)

  9. Crisan, D.: Particle approximations for a class of stochastic partial differential equations. Appl. Math. Optim. 54(3), 293–314 (2006)

  10. Crisan, D., Li, K.: Generalised particle filters with Gaussian mixtures. Stoch. Process. Appl. 125(7), 2643–2673 (2015)

  11. Da Prato, G., Zabczyk, J.: Stochastic Equations in Infinite Dimensions, 2nd edn. Cambridge University Press, Cambridge (2014)

  12. Du, K.: \(W^{2, p}\)-solutions of parabolic SPDEs in general domains. Stoch. Process. Appl. 130(1), 1–19 (2020)

  13. El Karoui, N., Du, N.H., Monique, J.-P.: Existence of an optimal Markovian filter for the control under partial observations. SIAM J. Control Optim. 26(5), 1025–1061 (1988)

  14. Evans, L.C.: Partial Differential Equations, 2nd edn. American Mathematical Society, Providence (2010)

  15. Fleming, W.H., Pardoux, É.: Optimal control for partially observed diffusions. SIAM J. Control Optim. 20(2), 261–285 (1982)

  16. Fleming, W.H., Soner, H.M.: Controlled Markov Processes and Viscosity Solutions, 2nd edn. Springer, New York (2006)

  17. Florentin, J.J.: Partial observability and optimal control. Int. J. Electron. 13, 263–279 (1962)

  18. Gong, B., Liu, W.B., Tang, T., Zhao, W.D., Zhou, T.: An efficient gradient projection method for stochastic optimal control problems. SIAM J. Numer. Anal. 55(6), 2982–3005 (2017)

  19. Gozzi, F., Świȩch, A.: Hamilton–Jacobi–Bellman equations for the optimal control of the Duncan–Mortensen–Zakai equation. J. Funct. Anal. 172(2), 466–510 (2000)

  20. Gyöngy, I., Millet, A.: On discretization schemes for stochastic evolution equations. Potential Anal. 23(2), 99–134 (2005)

  21. Haussmann, U.G.: The maximum principle for optimal control of diffusions with partial information. SIAM J. Control Optim. 25(2), 341–361 (1987)

  22. Huang, J.H., Wang, G.C., Xiong, J.: A maximum principle for partial information backward stochastic control problems with applications. SIAM J. Control Optim. 48(4), 2106–2117 (2009)

  23. Kallianpur, G., Xiong, J.: Stochastic Differential Equations in Infinite Dimensional Spaces. IMS Lecture Notes—Monograph Series 26. Institute of Mathematical Statistics (1995)

  24. Krylov, N.V.: On \(L_p\)-theory of stochastic partial differential equations in the whole space. SIAM J. Math. Anal. 27(2), 313–340 (1996)

  25. Krylov, N.V., Rozovskii, B.L.: Stochastic evolution equations. J. Sov. Math. 16, 1233–1277 (1981)

  26. Kurtz, T.G., Xiong, J.: Particle representations for a class of nonlinear SPDEs. Stoch. Process. Appl. 83(1), 103–126 (1999)

  27. Kurtz, T.G., Xiong, J.: Numerical solutions for a class of SPDEs with application to filtering. In: Stochastics in Finite and Infinite Dimensions, pp. 233–258. Birkhäuser, Boston (2001)

  28. Kushner, H.J.: Probability Methods for Approximations in Stochastic Control and for Elliptic Equations. Academic Press, New York (1977)

  29. Kushner, H.J., Dupuis, P.: Numerical Methods for Stochastic Control Problems in Continuous Time, 2nd edn. Springer, New York (2001)

  30. Li, X.J., Tang, S.J.: General necessary conditions for partially observed optimal stochastic controls. J. Appl. Probab. 32(4), 1118–1137 (1995)

  31. Liu, H.L., Xiong, J.: A branching particle system approximation for nonlinear stochastic filtering. Sci. China Math. 56(8), 1521–1541 (2013)

  32. Lototsky, S., Mikulevicius, R., Rozovskii, B.L.: Nonlinear filtering revisited: a spectral approach. SIAM J. Control Optim. 35(2), 435–461 (1997)

  33. Liu, W., Röckner, M.: Stochastic Partial Differential Equations: An Introduction. Springer, Cham (2015)

  34. Milstein, G.N., Tretyakov, M.V.: Numerical algorithms for forward-backward stochastic differential equations. SIAM J. Sci. Comput. 28(2), 561–582 (2006)

  35. Nagase, N., Nisio, M.: Optimal controls for stochastic partial differential equations. SIAM J. Control Optim. 28(1), 186–213 (1990)

  36. Nisio, M.: Stochastic Control Theory. Dynamic Programming Principle, 2nd edn. Springer, Tokyo (2015)

  37. Pardoux, É.: Stochastic partial differential equations and filtering of diffusion processes. Stochastics 3(2), 127–167 (1979)

  38. Pardoux, É.: Stochastic Partial Differential Equations. An introduction. Springer Briefs in Mathematics. Springer, Cham (2021)

  39. Pardoux, É., Răşcanu, A.: Stochastic Differential Equations, Backward SDEs, Partial Differential Equations. Springer, Cham (2014)

  40. Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes, and Martingales, vol. 1. Foundations, vol. 2. Itô calculus. Cambridge University Press, Cambridge (2000)

  41. Rozovsky, B.L., Lototsky, S.V.: Stochastic Evolution Systems. Linear Theory and Applications to Non-linear Filtering, 2nd edn. Springer, Cham (2018)

  42. Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Reprint of the 1997 edition. Springer, Berlin (2006)

  43. Tang, S.J.: The maximum principle for partially observed optimal control of stochastic differential equations. SIAM J. Control Optim. 36(5), 1596–1617 (1998)

  44. Wang, G.C., Wu, Z.: Kalman–Bucy filtering equations of forward and backward stochastic systems and applications to recursive optimal control problems. J. Math. Anal. Appl. 342(2), 1280–1296 (2008)

  45. Wang, G.C., Wu, Z., Xiong, J.: Maximum principles for forward-backward stochastic control systems with correlated state and observation noises. SIAM J. Control Optim. 51(1), 491–524 (2013)

  46. Wang, G.C., Wu, Z., Xiong, J.: A linear-quadratic optimal control problem of forward-backward stochastic differential equations with partial information. IEEE Trans. Autom. Control 60(11), 2904–2916 (2015)

  47. Wang, G.C., Wu, Z., Xiong, J.: An Introduction to Optimal Control of FBSDE with Incomplete Information. Springer, Cham (2018)

  48. Wonham, W.M.: On the separation theorem of stochastic control. SIAM J. Control 6, 312–326 (1968)

  49. Xiong, J.: An Introduction to Stochastic Filtering Theory. Oxford University Press, Oxford (2008)

  50. Xiong, J.: Particle Approximations to the Filtering Problem in Continuous Time. The Oxford Handbook of Nonlinear Filtering, pp. 635–655. Oxford University Press, Oxford (2011)

  51. Xiong, J., Zhou, X.Y.: Mean-variance portfolio selection under partial information. SIAM J. Control Optim. 46(1), 156–175 (2007)

  52. Yong, J.M., Zhou, X.Y.: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer, New York (1999)

  53. Zhang, J.F.: Backward Stochastic Differential Equations. From Linear to Fully Nonlinear Theory. Springer, New York (2017)

  54. Zhang, Q.: Controlled partially observed diffusions with correlated noise. Appl. Math. Optim. 22(3), 265–285 (1990)

  55. Zhao, W.D., Chen, L.F., Peng, S.G.: A new kind of accurate numerical method for backward stochastic differential equations. SIAM J. Sci. Comput. 28(4), 1563–1581 (2006)

  56. Zhou, X.Y.: On the existence of optimal relaxed controls of stochastic partial differential equations. SIAM J. Control Optim. 30(2), 247–261 (1992)


Acknowledgements

H.X. Wan thanks Professor J. Xiong for hosting a visit to the Department of Mathematics at SUSTech in 2021, during which this work was initiated. The authors would like to thank the referees for suggesting improvements to the original manuscript.

The research of G.C. Wang is supported partially by the NSFC under Grant Nos. 61925306, 61821004 and 11831010, the National Key R&D Program of China under Grant No. 2022YFA1006103, and the NSF of Shandong Province under Grant Nos. ZR2019ZD42 and ZR2020ZD24. The research of J. Xiong is supported partially by the NSFC under Grant No. 11831010, and the National Key R&D Program of China under Grant No. 2022YFA1006102.

Author information

Corresponding author

Correspondence to Jie Xiong.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Proof of Theorem 6

The proof proceeds in two steps. Throughout this appendix, \(\mathbb {E}\) denotes \(\mathbb {E}^{\mathbb {Q}}\).

We first prove the claim in the special case where \(\tau \) takes values only in a countable set of times \(\{t_1,t_2,\ldots \}\). We need to show that \(\mathbb {E}[\pi _{\tau }(\phi )\textbf{1}_{A}]=\mathbb {E}[\phi (\mathcal {X}_{\tau })\textbf{1}_{A}]\) for every \(A\in \mathcal {F}_{\tau }^{Y}\). The disjoint decomposition \(A=\bigcup _{i\ge 1} A\cap \{\tau =t_i\} \triangleq \bigcup _{i\ge 1} A_i\) implies that

$$\begin{aligned} \begin{aligned}&\mathbb {E}[\pi _{\tau }(\phi )\textbf{1}_{A}] = \sum _{i=1}^{\infty } \mathbb {E}\left[ \pi _{\tau }(\phi )\textbf{1}_{A\cap \{\tau =t_i\}} \right] , \\&\mathbb {E}[\phi (\mathcal {X}_{\tau })\textbf{1}_{A}] = \sum _{i=1}^{\infty } \mathbb {E}\left[ \phi (\mathcal {X}_{\tau })\textbf{1}_{A\cap \{\tau =t_i\}} \right] . \end{aligned} \end{aligned}$$

Then it follows from \(A_i\in \mathcal {F}_{t_i}^{Y}\) that

$$\begin{aligned} \begin{aligned} \mathbb {E}[\pi _{\tau }(\phi )\textbf{1}_{A_i}] = \mathbb {E}[\pi _{t_i}(\phi )\textbf{1}_{A_i}] = \mathbb {E}[\phi (\mathcal {X}_{t_i})\textbf{1}_{A_i}] = \mathbb {E}[\phi (\mathcal {X}_{\tau })\textbf{1}_{A_i}]. \end{aligned} \end{aligned}$$

Moreover,

$$\begin{aligned} \begin{aligned} \{ \pi _{\tau }(\phi )\in B \} = \bigcup _{i\ge 1} \{ \pi _{\tau }(\phi )\in B~\text {and}~\tau =t_i \} = \bigcup _{i\ge 1} \{ \pi _{t_i}(\phi )\in B \} \cap \{ \tau =t_i \} \end{aligned} \end{aligned}$$

for every Borel set B; hence \(\{ \pi _{\tau }(\phi )\in B \} \cap \{ \tau =t_j \} \in \mathcal {F}_{t_j}^{Y}\subset \mathcal {F}_{t_i}^{Y}\) whenever \(t_j\le t_i\). Consequently, \(\{ \pi _{\tau }(\phi )\in B \} \cap \{ \tau \le t_i \}=\bigcup _{j:\,t_j\le t_i} \{ \pi _{\tau }(\phi )\in B \} \cap \{ \tau =t_j \}\in \mathcal {F}_{t_i}^{Y}\) for every i, so \(\pi _{\tau }(\phi )\) is \(\mathcal {F}_{\tau }^{Y}\)-measurable. This completes the proof in the discrete-time case.

We now turn to the continuous-time case. Define the stopping times \(\tau _n=([2^n\tau ]+1)/2^n\), where \([x]\) denotes the integer part of x. Then \(\tau _n\downarrow \tau \), and each \(\tau _n\) takes countably many values. It follows immediately from the first step that

$$\begin{aligned} \begin{aligned} \pi _{\tau _n}(\phi )=\mathbb {E}[\phi (\mathcal {X}_{\tau _n})|\mathcal {F}_{\tau _n}^{Y}] \end{aligned} \end{aligned}$$
(48)

for every n. By the continuity of \(\pi _{t}(\phi )\), we take the limit in the left-hand side of (48) and obtain

$$\begin{aligned} \begin{aligned} \lim _{n\rightarrow \infty } \pi _{\tau _n}(\phi ) = \pi _{\tau }(\phi ). \end{aligned} \end{aligned}$$
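The dyadic approximation \(\tau _n=([2^n\tau ]+1)/2^n\) used above can be checked numerically on a single sample value. The following is a minimal sketch only: the function name `dyadic_upper` and the value `tau = 0.3` are illustrative assumptions and are not part of the paper. It shows that \(\tau _n\) lies on the grid \(\{j/2^n\}\), stays strictly above \(\tau \), is nonincreasing in n, and satisfies \(\tau _n-\tau \le 2^{-n}\).

```python
import math


def dyadic_upper(tau: float, n: int) -> float:
    """Return tau_n = (floor(2^n * tau) + 1) / 2^n, the dyadic grid point strictly above tau."""
    return (math.floor(2**n * tau) + 1) / 2**n


# Illustrative value of the stopping time on one sample path (an assumption).
tau = 0.3
levels = range(1, 11)
approximations = [dyadic_upper(tau, n) for n in levels]

for n, tau_n in zip(levels, approximations):
    # Each tau_n lies on the grid {j / 2^n} and satisfies tau < tau_n <= tau + 2^{-n}.
    print(f"n={n:2d}  tau_n={tau_n:.10f}  gap={tau_n - tau:.3e}")
```

Because \(\tau _n>\tau \) strictly, the event \(\{\tau _n=j/2^n\}=\{(j-1)/2^n\le \tau <j/2^n\}\) is determined by the information available before time \(j/2^n\), which is why rounding up, rather than down, preserves the stopping-time property.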

It remains to tackle the right-hand side of (48). We claim that \(\mathcal {F}_{\tau _{n+1}}^{Y}\subset \mathcal {F}_{\tau _{n}}^{Y}\). Since \(\tau _{n+1}\le \tau _n\) and \(\tau _n\) is an \(\mathcal {F}_{t}^{Y}\)-stopping time, we have

$$\begin{aligned} \begin{aligned} C\cap \{ \tau _n\le t \} = C \cap \{ \tau _{n+1}\le t \} \cap \{ \tau _n\le t \} \in \mathcal {F}_{t}^{Y} \end{aligned} \end{aligned}$$

for every \(C\in \mathcal {F}_{\tau _{n+1}}^{Y}\). Therefore, the claim follows from the definition of \(\mathcal {F}_{\tau _n}^{Y}\). By Hunt’s lemma, we conclude that

$$\begin{aligned} \begin{aligned} \mathbb {E}[\phi (\mathcal {X}_{\tau _n})|\mathcal {F}_{\tau _n}^{Y}] \rightarrow \mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}], \quad \text {in } L^2, \text { as } n\rightarrow \infty , \end{aligned} \end{aligned}$$

where \(\mathcal {G}=\bigcap _{n\ge 1}\mathcal {F}_{\tau _n}^{Y}\). We now have to show that \(\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]=\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {F}_{\tau }^{Y}]\). It is clear that \(\mathcal {F}_{\tau }^{Y}\subset \mathcal {G}\). Hence it remains to prove \(\mathcal {F}_{\tau }^{Y}\supset \mathcal {G}\). Note that \(\pi _{\tau }(\phi )=\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]\). We wish to verify that \(\pi _{\tau }(\phi )\) is \(\mathcal {F}_{\tau }^{Y}\)-measurable. Define the stopping times \(\sigma _n=\tau _n-2^{-n}\). Then \(\sigma _n\le \tau \), and \(\sigma _n\uparrow \tau \) as \(n\rightarrow \infty \). Since \(\pi _{\sigma _n}(\phi )\) is \(\mathcal {F}_{\sigma _n}^{Y}\)-measurable, it is \(\mathcal {F}_{\tau }^{Y}\)-measurable for every n. Thus, \(\pi _{\tau }(\phi )=\lim _{n\rightarrow \infty }\pi _{\sigma _n}(\phi )\) must be \(\mathcal {F}_{\tau }^{Y}\)-measurable. We conclude that \(\mathbb {E}[\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]|\mathcal {F}_{\tau }^{Y}] =\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]\), i.e., \(\mathcal {F}_{\tau }^{Y}\supset \mathcal {G}\). This completes the proof. \(\square \)

Appendix B: Proof of Lemma 5

Let us define the process \(q_{m}^{n}(t)\) as the unique \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};V_m)\)-valued solution to the system

$$\begin{aligned} \begin{aligned}&(q_{m}^{n}(t),e_{\ell })_{H^0} \\&\quad = (q_{m}^{n}(k\delta ),e_{\ell })_{H^0} + \int _{k\delta }^{t} {_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),e_{\ell } \right\rangle _{H^1} \textrm{d}r + \int _{k\delta }^{t} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \textrm{d}Y_r \\&\qquad + \frac{1}{n} \sum _{j=1}^{n} \int _{k\delta }^{t} \xi _{k\delta }^{n} M_j^n(r) \left[ R_{r}^{1}(e_{\ell })\textrm{d}r + R_{r}^{2}(e_{\ell })\textrm{d}Y_r + R_{r}^{3}(e_{\ell }) \textrm{d}W_r^{j} \right] . \end{aligned} \end{aligned}$$
(49)

The well-posedness of the system (49) does not follow directly from the standard theory for SODEs, since the coefficients of that equation need not be Lipschitz. However, results for non-Lipschitz coefficients, for example Theorem 3.21 in Pardoux and Răşcanu [39], cover this situation. Taking the well-posedness of (49) for granted for the time being, the next step is to establish a uniform bound on the family \(\{q_{m}^{n}(t)\}_{m\ge 1}\).

In this appendix, \(\mathbb {E}\) stands for \(\mathbb {E}^{\mathbb {P}}\). We now establish a sufficiently strong a priori bound

$$\begin{aligned} \begin{aligned} \sup _{m\ge 1} \mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} + \int _{k\delta }^{(k+1)\delta } \Vert q_{m}^{n}(r) \Vert _{H^1}^{2} \textrm{d}r \right] < \infty . \end{aligned} \end{aligned}$$
(50)

With the use of (49) and Itô’s formula, we can get the identity for all \(1\le \ell \le m\)

$$\begin{aligned}&(q_{m}^{n}(t),e_{\ell })_{H^0}^2 \\&\quad = (q_{m}^{n}(k\delta ),e_{\ell })_{H^0}^2 + \int _{k\delta }^{t} \left[ 2\left( q_{m}^{n}(r),e_{\ell }\right) _{H^0} {_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),e_{\ell } \right\rangle _{H^1} + \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0}^{2} \right] \textrm{d}r \\&\qquad + \int _{k\delta }^{t} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{1}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \int _{k\delta }^{t} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \int _{k\delta }^{t} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r \\&\qquad + \int _{k\delta }^{t} \left[ 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} + 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \right. \\&\qquad \left. \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \right] \textrm{d}Y_r \\&\qquad + \int _{k\delta }^{t} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell })\right) \textrm{d}W_{r}^{j}. \end{aligned}$$

Summing over \(\ell =1,\ldots ,m\), we obtain

$$\begin{aligned} \begin{aligned}&\left\| q_{m}^{n}(t) \right\| _{H^0}^2 \\&\quad = \sum _{\ell =1}^{m} (q_{m}^{n}(k\delta ),e_{\ell })_{H^0}^2 + \int _{k\delta }^{t} \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),q_{m}^{n}(r) \right\rangle _{H^1} + \left\| \mathcal {B} q_{m}^{n}(r) \right\| _{H^0}^{2} \right] \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \int _{k\delta }^{t} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{1}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \int _{k\delta }^{t} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \int _{k\delta }^{t} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r \\&\qquad + \int _{k\delta }^{t} \left[ 2\left( \mathcal {B} q_{m}^{n}(r),q_{m}^{n}(r) \right) _{H^0} + 2\sum _{\ell =1}^{m} \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \right] \textrm{d}Y_r \\&\qquad + \sum _{\ell =1}^{m} \int _{k\delta }^{t} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell })\right) \textrm{d}W_{r}^{j}. \end{aligned} \end{aligned}$$
(51)
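The summation over \(\ell \) rests on the orthonormal expansion in \(V_m\). As a brief sketch, assume (as the notation suggests, although it is not restated in this appendix) that \(\{e_{\ell }\}_{\ell \ge 1}\subset H^1\) is an orthonormal basis of \(H^0\) and that \(V_m=\textrm{span}\{e_1,\ldots ,e_m\}\). Then every \(q\in V_m\) satisfies

$$\begin{aligned} \begin{aligned} q = \sum _{\ell =1}^{m} (q,e_{\ell })_{H^0}\, e_{\ell }, \qquad \Vert q \Vert _{H^0}^{2} = \sum _{\ell =1}^{m} (q,e_{\ell })_{H^0}^{2}, \qquad \sum _{\ell =1}^{m} (q,e_{\ell })_{H^0}\, {_{H^{-1}}}\left\langle \mathcal {A}^v q,e_{\ell } \right\rangle _{H^1} = {_{H^{-1}}}\left\langle \mathcal {A}^v q,q \right\rangle _{H^1}. \end{aligned} \end{aligned}$$

For the term involving \(\mathcal {B}\), the same expansion gives \(\sum _{\ell =1}^{m} (\mathcal {B}q,e_{\ell })_{H^0}^{2} = \Vert P_m\mathcal {B}q \Vert _{H^0}^{2} \le \Vert \mathcal {B}q \Vert _{H^0}^{2}\), where \(P_m\) denotes the \(H^0\)-orthogonal projection onto \(V_m\) (a symbol introduced here for illustration only). Equality holds when \(\mathcal {B}q\in V_m\); in general the \(\mathcal {B}\)-term is an upper bound, which is the direction needed for the a priori estimates.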

In order to take the expectation and eliminate the stochastic integrals, we first introduce a family of stopping times

$$\begin{aligned} \begin{aligned} \tau _R:= \inf \left\{ t\in [k\delta ,(k+1)\delta ): \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} \vee \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^1}^{2} \textrm{d}r \ge R \right\} . \end{aligned} \end{aligned}$$

We observe that \(\tau _R\rightarrow \infty \), \(\mathbb {P}\)-a.s., as \(R\rightarrow \infty \). Consider the stopped process \(q_{m}^{n}(t\wedge \tau _R)\), for which we have the bound

$$\begin{aligned} \begin{aligned}&\mathbb {E} \Vert q_{m}^{n}(t\wedge \tau _R) \Vert _{H^0}^2 \\&\quad \le \mathbb {E} \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^2 + \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),q_{m}^{n}(r) \right\rangle _{H^1} + \left\| \mathcal {B} q_{m}^{n}(r) \right\| _{H^0}^{2} \right] \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{1}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r. \end{aligned} \end{aligned}$$

Assuming that \(n\gg m\) (for instance, replacing n by \({\widetilde{n}}=n\cdot m\)), it is straightforward to show that

$$\begin{aligned} \begin{aligned}&\sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{1}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r \\&\quad \lesssim \frac{1}{n} + \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r. \end{aligned} \end{aligned}$$

Combining the resulting inequality with the coercivity condition (see (9)) yields

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \Vert q_{m}^{n}(t\wedge \tau _R) \Vert _{H^0}^2 + C_1 \int _{k\delta }^{t\wedge \tau _R} \Vert q_{m}^{n}(r) \Vert _{H^1}^{2} \textrm{d}r \right] \\&\quad \lesssim \mathbb {E} \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^2 + \frac{1}{n} + (1+C)\mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \Vert q_{m}^{n}(r) \Vert _{H^0}^2 \textrm{d}r. \end{aligned} \end{aligned}$$

Thus, we infer that

$$\begin{aligned} \begin{aligned} \sup _{m\ge 1}\sup _{t\in [k\delta ,(k+1)\delta )} \mathbb {E}\left[ \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} + C_1 \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^1}^{2} \textrm{d}r \right] < \infty . \end{aligned} \end{aligned}$$
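The passage from the preceding estimate to this bound is not spelled out. One standard route, sketched here under the assumption that it is the intended argument, is Gronwall's inequality: if a nonnegative, bounded, measurable function f satisfies \(f(t)\le a+b\int _{k\delta }^{t} f(r)\,\textrm{d}r\) on \([k\delta ,(k+1)\delta )\) for constants \(a,b\ge 0\) (symbols introduced here for illustration), then

$$\begin{aligned} \begin{aligned} f(t) \le a\, e^{b(t-k\delta )} \le a\, e^{b\delta }, \qquad t\in [k\delta ,(k+1)\delta ). \end{aligned} \end{aligned}$$

One may take \(f(t)=\mathbb {E}\Vert q_{m}^{n}(t\wedge \tau _R) \Vert _{H^0}^{2}\), which is bounded by R, and use \(\mathbb {E}\int _{k\delta }^{t\wedge \tau _R} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2}\,\textrm{d}r \le \int _{k\delta }^{t} f(r)\,\textrm{d}r\), with a proportional to \(\mathbb {E}\Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^{2}+1/n\). The resulting bound does not depend on R, and it is uniform in m provided \(\mathbb {E}\Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^{2}\) is bounded uniformly in m. Letting \(R\rightarrow \infty \) and applying Fatou's lemma then removes the stopping time.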

It remains to move the supremum over \([k\delta ,(k+1)\delta )\) inside the expectation. It follows from the Burkholder–Davis–Gundy inequality and Hölder’s inequality that

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \left| \int _{k\delta }^{t} \left( \mathcal {B}q_{m}^{n}(r),q_{m}^{n}(r) \right) _{H^0} \textrm{d}Y_r \right| \right] \\&\quad \le C \mathbb {E} \left[ \int _{k\delta }^{(k+1)\delta } \left( \mathcal {B}q_{m}^{n}(r),q_{m}^{n}(r) \right) _{H^0}^{2} \textrm{d}r \right] ^{1/2} \\&\quad \le C \mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0} \sqrt{\int _{k\delta }^{(k+1)\delta } |\mathcal {B}q_{m}^{n}(r)|^2 \textrm{d}r} \right] \\&\quad \le \frac{1}{2} \mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} \right] + \frac{C^2}{2} \mathbb {E} \int _{k\delta }^{(k+1)\delta } |\mathcal {B}q_{m}^{n}(r)|^2 \textrm{d}r. \end{aligned} \end{aligned}$$
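The last two inequalities in this display combine the Cauchy–Schwarz inequality in \(H^0\) with Young's inequality. As a short worked step, in the notation of the display and with a and b introduced here only for illustration:

$$\begin{aligned} \begin{aligned}&\left( \mathcal {B}q_{m}^{n}(r),q_{m}^{n}(r) \right) _{H^0}^{2} \le \Vert q_{m}^{n}(r) \Vert _{H^0}^{2}\, |\mathcal {B}q_{m}^{n}(r)|^2 \le \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}^{2}\, |\mathcal {B}q_{m}^{n}(r)|^2, \\&ab \le \frac{a^2}{2} + \frac{b^2}{2}, \quad a = \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}, \quad b = C \left( \int _{k\delta }^{(k+1)\delta } |\mathcal {B}q_{m}^{n}(r)|^2 \textrm{d}r \right) ^{1/2}. \end{aligned} \end{aligned}$$

Integrating the first line over r and taking square roots gives the second inequality of the display, and the second line gives the third. The factor \(\frac{1}{2}\) in front of the supremum is what allows that term to be absorbed into the left-hand side once the supremum is taken in (51).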

Similarly, one deduces that

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \left| \sum _{\ell =1}^{m} \int _{k\delta }^{t} 2 \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}Y_r \right| \right] \\&\quad \le C \mathbb {E}\left[ \sum _{\ell =1}^{m} \int _{k\delta }^{t} \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0}^{2} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) ^2 \textrm{d}r \right] \\&\quad \lesssim \frac{1}{n} \mathbb {E} \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \left| \sum _{\ell =1}^{m} \int _{k\delta }^{t} 2 \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) \textrm{d}W_{r}^{j} \right| \right] \\&\quad \le C \mathbb {E}\left[ \sum _{\ell =1}^{m} \int _{k\delta }^{t} \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0}^{2} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r \right] \\&\quad \lesssim \frac{1}{n} \mathbb {E} \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r. \end{aligned} \end{aligned}$$

Taking the supremum over \(t\in [k\delta ,(k+1)\delta )\) before taking the expectation in (51), we therefore obtain

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} \right] \\&\quad \lesssim \mathbb {E} \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^{2} \\&\qquad + \mathbb {E} \int _{k\delta }^{t} \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),q_{m}^{n}(r) \right\rangle _{H^1} + (1+C^2)\left\| \mathcal {B} q_{m}^{n}(r) \right\| _{H^0}^{2} \right] \textrm{d}r \\&\qquad + \frac{1}{n} \mathbb {E} \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r. \end{aligned} \end{aligned}$$

Employing the coercive condition, we have

$$\begin{aligned} \begin{aligned} \sup _{m\ge 1} \mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} \right] \lesssim \mathbb {E} \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^{2} + \sup _{m\ge 1} \mathbb {E} \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r. \end{aligned} \end{aligned}$$

Hence, we ultimately establish the a priori bound (50).

According to (50) and the linear growth of \(\mathcal {A}^v\) (see (11)), one obtains that

  (i) \(\{q_{m}^{n}(t)\}_{m\ge 1}\) is uniformly bounded in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)\);

  (ii) \(\{\mathcal {A}^v q_{m}^{n}(t)\}_{m\ge 1}\) is uniformly bounded in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^{-1})\);

  (iii) \(\{\mathcal {B} q_{m}^{n}(t)\}_{m\ge 1}\) is uniformly bounded in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^0)\).

Thus, by the Banach–Alaoglu theorem, there exist weakly convergent subsequences, which we do not relabel, such that as \(m\rightarrow \infty \)

  (i) \(q_{m}^{n}(t)\rightharpoonup q^n(t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)\);

  (ii) \(\mathcal {A}^v q_{m}^{n}(t)\rightharpoonup \zeta (t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^{-1})\);

  (iii) \(\mathcal {B} q_{m}^{n}(t)\rightharpoonup \eta (t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^0)\).

It remains to verify that \(\zeta (t)=\mathcal {A}^v q^n(t)\) and \(\eta (t)=\mathcal {B} q^n(t)\). Additionally, we have \(q_{m}^{n}(t)\overset{*}{\rightharpoonup }q^n(t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};L^{\infty })\). By the dominated convergence theorem, we may pass to the limit as \(m\rightarrow \infty \) in (51) (and likewise as \(n\rightarrow \infty \)) to obtain

$$\begin{aligned} \begin{aligned} \mathbb {E}\left[ \Vert q^{n}((k+1)\delta ) \Vert _{H^0}^{2} - \Vert q^{n}(k\delta ) \Vert _{H^0}^{2} \right]&= \mathbb {E} \int _{k\delta }^{(k+1)\delta } \left[ 2{_{H^{-1}}}\left\langle \zeta (r),q^{n}(r) \right\rangle _{H^1} + \left\| \eta (r) \right\| _{H^0}^{2} \right] \textrm{d}r. \end{aligned} \end{aligned}$$

Since the map \(\varrho \mapsto \mathbb {E}\Vert \varrho \Vert _{H^0}^{2}\) is convex and continuous, and hence weakly lower semicontinuous, we get that

$$\begin{aligned} \begin{aligned} \mathbb {E}\left[ \Vert q_{}^{n}((k+1)\delta ) \Vert _{H^0}^{2} - \Vert q_{}^{n}(k\delta ) \Vert _{H^0}^{2} \right]&\le \liminf _{m\rightarrow \infty } \mathbb {E}\left[ \Vert q_{m}^{n}((k+1)\delta ) \Vert _{H^0}^{2} - \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^{2} \right] . \end{aligned} \end{aligned}$$

Then we have the inequality

$$\begin{aligned} \begin{aligned}&\mathbb {E} \int _{k\delta }^{(k+1)\delta } \left[ 2{_{H^{-1}}}\left\langle \zeta (r),q^{n}(r) \right\rangle _{H^1} + \left\| \eta (r) \right\| _{H^0}^{2} \right] \textrm{d}r \\&\quad \le \liminf _{m\rightarrow \infty } \mathbb {E} \int _{k\delta }^{(k+1)\delta } \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),q_{m}^{n}(r) \right\rangle _{H^1} + \left\| \mathcal {B} q_{m}^{n}(r) \right\| _{H^0}^{2} \right] \textrm{d}r. \end{aligned} \end{aligned}$$
(52)

From the monotonicity condition (see (10)) with \(\lambda =0\), for all \(p(t)\in L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)\) and \(m\ge 1\), we derive that

$$\begin{aligned} \begin{aligned} \mathbb {E} \int _{k\delta }^{(k+1)\delta } \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r) - \mathcal {A}^v p(r),q_{m}^{n}(r) - p(r) \right\rangle _{H^1} + \left\| \mathcal {B}q_{m}^{n}(r) - \mathcal {B}p(r) \right\| _{H^0}^{2} \right] \textrm{d}r \le 0. \end{aligned} \end{aligned}$$
(53)

Using (52) and (53) and passing to the limit as \(m\rightarrow \infty \) in the same manner, we obtain

$$\begin{aligned} \begin{aligned} \mathbb {E} \int _{k\delta }^{(k+1)\delta } \left[ 2{_{H^{-1}}}\left\langle \zeta (r) - \mathcal {A}^v p(r),q^{n}(r) - p(r) \right\rangle _{H^1} + \left\| \eta (r) - \mathcal {B}p(r) \right\| _{H^0}^{2} \right] \textrm{d}r \le 0. \end{aligned} \end{aligned}$$
(54)

Setting \(p(t)=q^{n}(t)\) in (54), we immediately obtain \(\eta (t)=\mathcal {B}q^{n}(t)\). To show that \(\zeta (t)=\mathcal {A}^v q^{n}(t)\), let \(p(t)=q^{n}(t) - \theta w(t)\) for \(\theta >0\) and \(w(t)\in L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)\). Dropping the nonnegative \(\mathcal {B}\)-term and dividing both sides by \(\theta \), we see that

$$\begin{aligned} \begin{aligned} \mathbb {E} \int _{k\delta }^{(k+1)\delta } {_{H^{-1}}}\left\langle \zeta (r) - \mathcal {A}^v(q_{}^{n}(r)-\theta w(r)),w(r) \right\rangle _{H^1} \textrm{d}r \le 0. \end{aligned} \end{aligned}$$

Letting \(\theta \rightarrow 0\) and utilizing the weak continuity of \(\mathcal {A}^v\) (see (12)), we deduce that

$$\begin{aligned} \begin{aligned} \mathbb {E} \int _{k\delta }^{(k+1)\delta } {_{H^{-1}}}\left\langle \zeta (r) - \mathcal {A}^v q_{}^{n}(r),w(r) \right\rangle _{H^1} \textrm{d}r \le 0, \quad \forall ~w(t)\in L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1). \end{aligned} \end{aligned}$$

Since w is arbitrary, replacing w by \(-w\) turns the inequality into an equality, and it follows that \(\zeta (t)=\mathcal {A}^v q^n(t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^{-1})\).

This completes the proof of Lemma 5. \(\square \)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wan, H., Wang, G. & Xiong, J. A branching particle system approximation for solving partially observed stochastic optimal control problems via stochastic maximum principle. Stoch PDE: Anal Comp 12, 675–735 (2024). https://doi.org/10.1007/s40072-023-00294-w


  • DOI: https://doi.org/10.1007/s40072-023-00294-w
