A branching particle system approximation for solving partially observed stochastic optimal control problems via stochastic maximum principle

Wan, Hexiang; Wang, Guangchen; Xiong, Jie

doi:10.1007/s40072-023-00294-w

Published: 24 March 2023

Volume 12, pages 675–735, (2024)

Stochastics and Partial Differential Equations: Analysis and Computations


Abstract

This paper develops an efficient numerical algorithm for solving a class of partially observed stochastic optimal control problems with correlated noises. The main contribution of this paper is threefold: first, we introduce a relaxed system and assume the Roxin condition (convexity requirement) on coefficients. Then, an optimal relaxed system provides an optimal admissible control in a broader sense, and a relaxed control turns out to be a usual admissible control. Second, we transform the optimal control problem into an optimization problem for a convex functional by employing a projection operator. A stochastic gradient descent approach is then proposed and its convergence properties are demonstrated. Last but not least, we present a branching particle system (branching particle filter) to approximate the optimal filter. Due to the random nature of the coefficients in the Zakai equation, neither the dual approach nor the mild solution approach can be used. We devise a novel method for establishing the convergence of the branching particle system approximation, as well as its rate of convergence. This branching-type particle filter algorithm allows us to tackle non-Markovian environments. The major body of this paper concludes with a numerical case study.
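
As a rough illustration of the branching idea mentioned above, the following sketch shows one generic branching (resampling) step of the kind commonly used in branching particle filters: each particle is replaced by a random number of offspring whose mean is proportional to its weight. It is not taken from the paper; the function name `branch`, the floor-plus-Bernoulli offspring rule, and the toy weights are assumptions made purely for illustration.

```python
import math
import random

def branch(particles, weights, rng=random):
    """One generic branching step (illustrative, not the paper's algorithm).

    Particle i receives floor(n*w_i) offspring plus one extra with
    probability frac(n*w_i), where w_i is its normalised weight, so the
    expected offspring number is n*w_i.
    """
    n = len(particles)
    total = sum(weights)
    offspring = []
    for x, w in zip(particles, weights):
        mean = n * w / total          # expected number of offspring
        base = math.floor(mean)       # deterministic part
        extra = 1 if rng.random() < mean - base else 0  # randomised remainder
        offspring.extend([x] * (base + extra))
    return offspring

# Toy usage: four particles, the first carrying half of the total weight.
print(branch([0.1, 0.4, 0.7, 0.9], [0.5, 0.25, 0.125, 0.125]))
```

In this sketch the population size after branching is random and only equals the original size on average, which is the feature that distinguishes a branching step from fixed-size resampling.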

Data availability

Enquiries about data availability should be directed to the authors.

References

Archibald, R., Bao, F., Yong, J.M., Zhou, T.: An efficient numerical algorithm for solving data driven feedback control problems. J. Sci. Comput. 85(51), 1–27 (2020)
Bain, A., Crisan, D.: Fundamentals of Stochastic Filtering. Stochastic Modelling and Applied Probability. Springer, New York (2009)
Bensoussan, A.: Stochastic Control of Partially Observable Systems. Cambridge University Press, Cambridge (1992)
Bensoussan, A., Glowinski, R., Răşcanu, A.: Approximation of the Zakai equation by the splitting up method. SIAM J. Control Optim. 28(6), 1420–1431 (1990)
Bensoussan, A., Viot, M.: Optimal control of stochastic linear distributed parameter systems. SIAM J. Control 13, 904–926 (1975)
Buckdahn, R., Li, J., Ma, J.: A mean-field stochastic control problem with partial observations. Ann. Appl. Probab. 27(5), 3201–3245 (2017)
Chang, D.J., Liu, H.L., Xiong, J.: A branching particle system approximation for a class of FBSDEs. Probab. Uncertain. Quant. Risk 1(9), 1–34 (2016)
Charalambous, C.D., Elliott, R.J.: Classes of nonlinear partially observable stochastic optimal control problems with explicit optimal control laws. SIAM J. Control Optim. 36(2), 542–578 (1998)
Crisan, D.: Particle approximations for a class of stochastic partial differential equations. Appl. Math. Optim. 54(3), 293–314 (2006)
Crisan, D., Li, K.: Generalised particle filters with Gaussian mixtures. Stoch. Process. Appl. 125(7), 2643–2673 (2015)
Da Prato, G., Zabczyk, J.: Stochastic Equations in Infinite Dimensions, 2nd edn. Cambridge University Press, Cambridge (2014)
Du, K.: $W^{2, p}$-solutions of parabolic SPDEs in general domains. Stoch. Process. Appl. 130(1), 1–19 (2020)
El Karoui, N., Du, N.H., Monique, J.-P.: Existence of an optimal Markovian filter for the control under partial observations. SIAM J. Control Optim. 26(5), 1025–1061 (1988)
Evans, L.C.: Partial Differential Equations, 2nd edn. American Mathematical Society, Providence (2010)
Fleming, W.H., Pardoux, É.: Optimal control for partially observed diffusions. SIAM J. Control Optim. 20(2), 261–285 (1982)
Fleming, W.H., Soner, H.M.: Controlled Markov Processes and Viscosity Solutions, 2nd edn. Springer, New York (2006)
Florentin, J.J.: Partial observability and optimal control. Int. J. Electron. 13, 263–279 (1962)
Gong, B., Liu, W.B., Tang, T., Zhao, W.D., Zhou, T.: An efficient gradient projection method for stochastic optimal control problems. SIAM J. Numer. Anal. 55(6), 2982–3005 (2017)
Gozzi, F., Świȩch, A.: Hamilton-Jacobi-Bellman equations for the optimal control of the Duncan–Mortensen–Zakai equation. J. Funct. Anal. 172(2), 466–510 (2000)
Gyöngy, I., Millet, A.: On discretization schemes for stochastic evolution equations. Potential Anal. 23(2), 99–134 (2005)
Haussmann, U.G.: The maximum principle for optimal control of diffusions with partial information. SIAM J. Control Optim. 25(2), 341–361 (1987)
Huang, J.H., Wang, G.C., Xiong, J.: A maximum principle for partial information backward stochastic control problems with applications. SIAM J. Control Optim. 48(4), 2106–2117 (2009)
Kallianpur, G., Xiong, J.: Stochastic Differential Equations in Infinite Dimensional Spaces. IMS Lecture Notes-Monograph Series 26. Institute of Mathematical Statistics (1995)
Krylov, N.V.: On $L_p$-theory of stochastic partial differential equations in the whole space. SIAM J. Math. Anal. 27(2), 313–340 (1996)
Krylov, N.V., Rozovskii, B.L.: Stochastic evolution equations. J. Sov. Math. 16, 1233–1277 (1981)
Kurtz, T.G., Xiong, J.: Particle representations for a class of nonlinear SPDEs. Stoch. Process. Appl. 83(1), 103–126 (1999)
Kurtz, T.G., Xiong, J.: Numerical solutions for a class of SPDEs with application to filtering. In: Stochastics in Finite and Infinite Dimensions, pp. 233–258. Birkhäuser, Boston (2001)
Kushner, H.J.: Probability Methods for Approximations in Stochastic Control and for Elliptic Equations. Academic Press, New York (1977)
Kushner, H.J., Dupuis, P.: Numerical Methods for Stochastic Control Problems in Continuous Time, 2nd edn. Springer, New York (2001)
Li, X.J., Tang, S.J.: General necessary conditions for partially observed optimal stochastic controls. J. Appl. Probab. 32(4), 1118–1137 (1995)
Liu, H.L., Xiong, J.: A branching particle system approximation for nonlinear stochastic filtering. Sci. China Math. 56(8), 1521–1541 (2013)
Lototsky, S., Mikulevicius, R., Rozovskii, B.L.: Nonlinear filtering revisited: a spectral approach. SIAM J. Control Optim. 35(2), 435–461 (1997)
Liu, W., Röckner, M.: Stochastic Partial Differential Equations: An Introduction. Springer, Cham (2015)
Milstein, G.N., Tretyakov, M.V.: Numerical algorithms for forward-backward stochastic differential equations. SIAM J. Sci. Comput. 28(2), 561–582 (2006)
Nagase, N., Nisio, M.: Optimal controls for stochastic partial differential equations. SIAM J. Control Optim. 28(1), 186–213 (1990)
Nisio, M.: Stochastic Control Theory. Dynamic Programming Principle, 2nd edn. Springer, Tokyo (2015)
Pardoux, É.: Stochastic partial differential equations and filtering of diffusion processes. Stochastics 3(2), 127–167 (1979)
Pardoux, É.: Stochastic Partial Differential Equations. An introduction. Springer Briefs in Mathematics. Springer, Cham (2021)
Pardoux, É., Răşcanu, A.: Stochastic Differential Equations, Backward SDEs, Partial Differential Equations. Springer, Cham (2014)
Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes, and Martingales, vol. 1. Foundations, vol. 2. Itô calculus. Cambridge University Press, Cambridge (2000)
Rozovsky, B.L., Lototsky, S.V.: Stochastic Evolution Systems. Linear Theory and Applications to Non-linear Filtering, 2nd edn. Springer, Cham (2018)
Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Reprint of the 1997 edition. Springer, Berlin (2006)
Tang, S.J.: The maximum principle for partially observed optimal control of stochastic differential equations. SIAM J. Control Optim. 36(5), 1596–1617 (1998)
Wang, G.C., Wu, Z.: Kalman–Bucy filtering equations of forward and backward stochastic systems and applications to recursive optimal control problems. J. Math. Anal. Appl. 342(2), 1280–1296 (2008)
Wang, G.C., Wu, Z., Xiong, J.: Maximum principles for forward-backward stochastic control systems with correlated state and observation noises. SIAM J. Control Optim. 51(1), 491–524 (2013)
Wang, G.C., Wu, Z., Xiong, J.: A linear-quadratic optimal control problem of forward-backward stochastic differential equations with partial information. IEEE Trans. Autom. Control 60(11), 2904–2916 (2015)
Wang, G.C., Wu, Z., Xiong, J.: An Introduction to Optimal Control of FBSDE with Incomplete Information. Springer, Cham (2018)
Wonham, W.M.: On the separation theorem of stochastic control. SIAM J. Control 6, 312–326 (1968)
Xiong, J.: An Introduction to Stochastic Filtering Theory. Oxford University Press, Oxford (2008)
Xiong, J.: Particle Approximations to the Filtering Problem in Continuous Time. The Oxford Handbook of Nonlinear Filtering, pp. 635–655. Oxford University Press, Oxford (2011)
Xiong, J., Zhou, X.Y.: Mean-variance portfolio selection under partial information. SIAM J. Control Optim. 46(1), 156–175 (2007)
Yong, J.M., Zhou, X.Y.: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer, New York (1999)
Zhang, J.F.: Backward Stochastic Differential Equations. From Linear to Fully Nonlinear Theory. Springer, New York (2017)
Zhang, Q.: Controlled partially observed diffusions with correlated noise. Appl. Math. Optim. 22(3), 265–285 (1990)
Zhao, W.D., Chen, L.F., Peng, S.G.: A new kind of accurate numerical method for backward stochastic differential equations. SIAM J. Sci. Comput. 28(4), 1563–1581 (2006)
Zhou, X.Y.: On the existence of optimal relaxed controls of stochastic partial differential equations. SIAM J. Control Optim. 30(2), 247–261 (1992)


Acknowledgements

H.X. Wan thanks Professor J. Xiong for hosting a visit to the Department of Mathematics at SUSTech in 2021, during which this work was initiated. The authors would like to thank the referees for suggesting improvements to the original manuscript.

The research of G.C. Wang is supported partially by the NSFC under Grant Nos. 61925306, 61821004 and 11831010, the National Key R&D Program of China under Grant No. 2022YFA1006103, and the NSF of Shandong Province under Grant Nos. ZR2019ZD42 and ZR2020ZD24. The research of J. Xiong is supported partially by the NSFC under Grant No. 11831010, and the National Key R&D Program of China under Grant No. 2022YFA1006102.

Author information

Authors and Affiliations

School of Control Science and Engineering, Shandong University, Jinan, 250061, People’s Republic of China
Hexiang Wan & Guangchen Wang
Department of Mathematics and SUSTech International Center for Mathematics, Southern University of Science and Technology, Shenzhen, 518055, People’s Republic of China
Jie Xiong

Authors

Hexiang Wan
Guangchen Wang
Jie Xiong

Corresponding author

Correspondence to Jie Xiong.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Proof of Theorem 6

We will use two steps to finish the proof. In this appendix, $\mathbb {E}$ represents $\mathbb {E}^{\mathbb {Q}}$.

We begin by proving it for the special case that $\tau $ takes values only in a countable set of times $\{t_1,t_2,\ldots \}$. We need to show that $\mathbb {E}[\pi _{\tau }(\phi )\textbf{1}_{A}]=\mathbb {E}[\phi (\mathcal {X}_{\tau })\textbf{1}_{A}]$ for every $A\in \mathcal {F}_{\tau }^{Y}$. Note that $A=\bigcup _{i\ge 1} A\cap \{\tau =t_i\} \triangleq \bigcup _{i\ge 1} A_i$ implies that

$$\begin{aligned} \begin{aligned}&\mathbb {E}[\pi _{\tau }(\phi )\textbf{1}_{A}] = \sum _{i=1}^{\infty } \mathbb {E}\left[ \pi _{\tau }(\phi )\textbf{1}_{A\cap \{\tau =t_i\}} \right] , \\&\mathbb {E}[\phi (\mathcal {X}_{\tau })\textbf{1}_{A}] = \sum _{i=1}^{\infty } \mathbb {E}\left[ \phi (\mathcal {X}_{\tau })\textbf{1}_{A\cap \{\tau =t_i\}} \right] . \end{aligned} \end{aligned}$$

Then it follows from $A_i\in \mathcal {F}_{t_i}^{Y}$ that

$$\begin{aligned} \begin{aligned} \mathbb {E}[\pi _{\tau }(\phi )\textbf{1}_{A_i}] = \mathbb {E}[\pi _{t_i}(\phi )\textbf{1}_{A_i}] = \mathbb {E}[\phi (\mathcal {X}_{t_i})\textbf{1}_{A_i}] = \mathbb {E}[\phi (\mathcal {X}_{\tau })\textbf{1}_{A_i}]. \end{aligned} \end{aligned}$$

Moreover,

$$\begin{aligned} \begin{aligned} \{ \pi _{\tau }(\phi )\in B \} = \bigcup _{i\ge 1} \{ \pi _{\tau }(\phi )\in B~\text {and}~\tau =t_i \} = \bigcup _{i\ge 1} \{ \pi _{t_i}(\phi )\in B \} \cap \{ \tau =t_i \} \end{aligned} \end{aligned}$$

for every Borel set B; hence $\{ \pi _{\tau }(\phi )\in B \} \cap \{ \tau =t_j \} \in \mathcal {F}_{t_j}^{Y}\subset \mathcal {F}_{t_i}^{Y}$ for every $j\le i$. Consequently, $\{ \pi _{\tau }(\phi )\in B \} \cap \{ \tau \le t_i \}=\bigcup _{j\le i} \{ \pi _{\tau }(\phi )\in B \} \cap \{ \tau =t_j \}\in \mathcal {F}_{t_i}^{Y}$ for every $i$, that is, $\pi _{\tau }(\phi )$ is $\mathcal {F}_{\tau }^{Y}$-measurable. This completes the proof in the discrete-time case.

We now turn to the continuous-time case. Define the stopping times $\tau _n=([2^n\tau ]+1)/2^n$. Then $\tau _n\downarrow \tau $, and each $\tau _n$ takes a countable number of values. It follows immediately that

$$\begin{aligned} \begin{aligned} \pi _{\tau _n}(\phi )=\mathbb {E}[\phi (\mathcal {X}_{\tau _n})|\mathcal {F}_{\tau _n}^{Y}] \end{aligned} \end{aligned}$$

(48)

for every n. By the continuity of $\pi _{t}(\phi )$, we take the limit in the left-hand side of (48) and obtain

$$\begin{aligned} \begin{aligned} \lim _{n\rightarrow \infty } \pi _{\tau _n}(\phi ) = \pi _{\tau }(\phi ). \end{aligned} \end{aligned}$$

It remains to tackle the right-hand side of (48). We claim that $\mathcal {F}_{\tau _{n+1}}^{Y}\subset \mathcal {F}_{\tau _{n}}^{Y}$. Since $\tau _{n+1}\le \tau _n$ and $\tau _n$ is an $\mathcal {F}_{t}^{Y}$-stopping time, we have

$$\begin{aligned} \begin{aligned} C\cap \{ \tau _n\le t \} = C \cap \{ \tau _{n+1}\le t \} \cap \{ \tau _n\le t \} \in \mathcal {F}_{t}^{Y} \end{aligned} \end{aligned}$$

for every $C\in \mathcal {F}_{\tau _{n+1}}^{Y}$. Therefore, the claim follows from the definition of $\mathcal {F}_{\tau _n}^{Y}$. According to Hunt’s Lemma, we draw a nontrivial conclusion

$$\begin{aligned} \begin{aligned} \mathbb {E}[\phi (\mathcal {X}_{\tau _n})|\mathcal {F}_{\tau _n}^{Y}] \rightarrow \mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}], \quad \text {in } L^2, \text { as } n\rightarrow \infty , \end{aligned} \end{aligned}$$

where $\mathcal {G}=\bigcap _{n\ge 1}\mathcal {F}_{\tau _n}^{Y}$. We now have to show that $\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]=\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {F}_{\tau }^{Y}]$. It is clear that $\mathcal {F}_{\tau }^{Y}\subset \mathcal {G}$. Hence it remains to prove $\mathcal {F}_{\tau }^{Y}\supset \mathcal {G}$. Note that $\pi _{\tau }(\phi )=\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]$. We wish to verify that $\pi _{\tau }(\phi )$ is $\mathcal {F}_{\tau }^{Y}$-measurable. Define the stopping times $\sigma _n=\tau _n-2^{-n}$. Then $\sigma _n\le \tau $, and $\sigma _n\uparrow \tau $ as $n\rightarrow \infty $. But $\pi _{\sigma _n}(\phi )$ is $\mathcal {F}_{\sigma _n}^{Y}$-measurable, so it is $\mathcal {F}_{\tau }^{Y}$-measurable for every n. Thus, $\pi _{\tau }(\phi )=\lim _{n\rightarrow \infty }\pi _{\sigma _n}(\phi )$ must be $\mathcal {F}_{\tau }^{Y}$-measurable. We conclude that $\mathbb {E}[\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]|\mathcal {F}_{\tau }^{Y}] =\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]$, i.e., $\mathcal {F}_{\tau }^{Y}\supset \mathcal {G}$. This completes the proof. $\square $
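
The two dyadic approximations used in the continuous-time step can be checked numerically. The sketch below evaluates $\tau _n=([2^n\tau ]+1)/2^n$ and $\sigma _n=\tau _n-2^{-n}$ for one fixed deterministic value of $\tau $; the function names and the sample value 0.3 are illustrative assumptions, and the check says nothing about the measurability questions treated in the proof.

```python
import math

def tau_n(tau, n):
    """Upper dyadic approximation ([2^n tau] + 1) / 2^n, strictly above tau."""
    return (math.floor(2**n * tau) + 1) / 2**n

def sigma_n(tau, n):
    """Lower dyadic approximation tau_n - 2^{-n}, never above tau."""
    return tau_n(tau, n) - 2.0**(-n)

tau = 0.3  # illustrative value standing in for one realisation of the stopping time
for n in range(1, 6):
    # sigma_n increases towards tau while tau_n decreases towards tau
    print(n, sigma_n(tau, n), tau_n(tau, n))
```

Each printed row brackets the sample value, with the gap between the two approximations equal to $2^{-n}$.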

Appendix B: Proof of Lemma 5

Let us define the process $q_{m}^{n}(t)$ as a unique $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};V_m)$ valued solution to the system

$$\begin{aligned} \begin{aligned}&(q_{m}^{n}(t),e_{\ell })_{H^0} \\&\quad = (q_{m}^{n}(k\delta ),e_{\ell })_{H^0} + \int _{k\delta }^{t} {_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),e_{\ell } \right\rangle _{H^1} \textrm{d}r + \int _{k\delta }^{t} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \textrm{d}Y_r \\&\qquad + \frac{1}{n} \sum _{j=1}^{n} \int _{k\delta }^{t} \xi _{k\delta }^{n} M_j^n(r) \left[ R_{r}^{1}(e_{\ell })\textrm{d}r + R_{r}^{2}(e_{\ell })\textrm{d}Y_r + R_{r}^{3}(e_{\ell }) \textrm{d}W_r^{j} \right] . \end{aligned} \end{aligned}$$

(49)

The well-posedness of the system (49) does not follow directly from the standard theory for SODEs, since the coefficients of that equation need not be Lipschitz. However, various conclusions, for example, Theorem 3.21 in Pardoux and Răşcanu [39], allow us to deal with the current predicament. Assuming that the system (49) is well-posed for the time being, the next step is to build a uniform bound on the family $\{q_{m}^{n}(t)\}_{m\ge 1}$.

In this appendix, $\mathbb {E}$ stands for $\mathbb {E}^{\mathbb {P}}$. We now establish a sufficiently strong a priori bound

$$\begin{aligned} \begin{aligned} \sup _{m\ge 1} \mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} + \int _{k\delta }^{(k+1)\delta } \Vert q_{m}^{n}(r) \Vert _{H^1}^{2} \textrm{d}r \right] < \infty . \end{aligned} \end{aligned}$$

(50)

With the use of (49) and Itô’s formula, we can get the identity for all $1\le \ell \le m$

$$\begin{aligned}&(q_{m}^{n}(t),e_{\ell })_{H^0}^2 \\&\quad = (q_{m}^{n}(k\delta ),e_{\ell })_{H^0}^2 + \int _{k\delta }^{t} \left[ 2\left( q_{m}^{n}(r),e_{\ell }\right) _{H^0} {_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),e_{\ell } \right\rangle _{H^1} + \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0}^{2} \right] \textrm{d}r \\&\qquad + \int _{k\delta }^{t} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{1}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \int _{k\delta }^{t} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \int _{k\delta }^{t} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r \\&\qquad + \int _{k\delta }^{t} \left[ 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} + 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \right. \\&\qquad \left. \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \right] \textrm{d}Y_r \\&\qquad + \int _{k\delta }^{t} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell })\right) \textrm{d}W_{r}^{j}. \end{aligned}$$

By adding the values from $\ell =1$ to $\ell =m$, we obtain the following result

$$\begin{aligned} \begin{aligned}&\left\| q_{m}^{n}(t) \right\| _{H^0}^2 \\&\quad = \sum _{\ell =1}^{m} (q_{m}^{n}(k\delta ),e_{\ell })_{H^0}^2 + \int _{k\delta }^{t} \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),q_{m}^{n}(r) \right\rangle _{H^1} + \left\| \mathcal {B} q_{m}^{n}(r) \right\| _{H^0}^{2} \right] \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \int _{k\delta }^{t} 2(q_{m}^{n}(r),e_{\ell }) \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{1}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \int _{k\delta }^{t} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \int _{k\delta }^{t} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r \\&\qquad + \int _{k\delta }^{t} \left[ 2\left( \mathcal {B} q_{m}^{n}(r),q_{m}^{n}(r) \right) _{H^0} + 2\sum _{\ell =1}^{m} \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \right] \textrm{d}Y_r \\&\qquad + \sum _{\ell =1}^{m} \int _{k\delta }^{t} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell })\right) \textrm{d}W_{r}^{j}. \end{aligned} \end{aligned}$$

(51)
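
The summation step uses only that $q_{m}^{n}(r)$ lies in $V_m$. Assuming, as the notation suggests but as is not restated in this appendix, that $\{e_{\ell }\}_{\ell =1}^{m}$ is an $H^0$-orthonormal basis of $V_m$, the two identities behind the first line of (51) can be written out as:

$$\begin{aligned} \begin{aligned}&\sum _{\ell =1}^{m} (q_{m}^{n}(r),e_{\ell })_{H^0}^{2} = \Vert q_{m}^{n}(r) \Vert _{H^0}^{2}, \\&\sum _{\ell =1}^{m} (q_{m}^{n}(r),e_{\ell })_{H^0} \, {_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),e_{\ell } \right\rangle _{H^1} = {_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r), \sum _{\ell =1}^{m} (q_{m}^{n}(r),e_{\ell })_{H^0} e_{\ell } \right\rangle _{H^1} = {_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),q_{m}^{n}(r) \right\rangle _{H^1}. \end{aligned} \end{aligned}$$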

In order to take the expectation and eliminate the stochastic integrals, we first need to create a sequence of stopping times

$$\begin{aligned} \begin{aligned} \tau _R:= \inf \left\{ t\in [k\delta ,(k+1)\delta ): \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} \vee \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^1}^{2} \textrm{d}r \ge R \right\} . \end{aligned} \end{aligned}$$

We observe that $\tau _R\rightarrow \infty $, $\mathbb {P}$-a.s., as $R\rightarrow \infty $. Consider the stopped process $q_{m}^{n}(t\wedge \tau _R)$, for which we have the bound

$$\begin{aligned} \begin{aligned}&\mathbb {E} \Vert q_{m}^{n}(t\wedge \tau _R) \Vert _{H^0}^2 \\&\quad \le \mathbb {E} \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^2 + \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),q_{m}^{n}(r) \right\rangle _{H^1} + \left\| \mathcal {B} q_{m}^{n}(r) \right\| _{H^0}^{2} \right] \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{1}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r. \end{aligned} \end{aligned}$$
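
The localisation by $\tau _R$ can be pictured on a time grid. The sketch below is a discrete analogue only: it takes a sampled path of a nonnegative quantity (standing in for the maximum of the $H^0$ norm and the accumulated $H^1$ norm), finds the first grid index at which it reaches the level $R$, and freezes the path from there on. The helper names and the toy path are assumptions for illustration.

```python
def first_hit(path, level):
    """Index of the first sample >= level, or len(path) - 1 if never reached."""
    for i, value in enumerate(path):
        if value >= level:
            return i
    return len(path) - 1

def stopped(path, level):
    """Discrete analogue of t -> path(t ^ tau_R): freeze the path at the hit index."""
    k = first_hit(path, level)
    return path[:k + 1] + [path[k]] * (len(path) - k - 1)

path = [0.0, 0.5, 1.2, 3.0, 2.0, 5.0]
print(stopped(path, 2.5))   # frozen at the first value >= 2.5
print(stopped(path, 10.0))  # level never reached, path unchanged
```

For a continuous path the stopped quantity stays bounded by $R$, which is the usual reason the stochastic integrals of the stopped process have zero expectation.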

Note that $n\gg m$ (for instance, take ${\widetilde{n}}=n\cdot m$); it is then straightforward to show that

$$\begin{aligned} \begin{aligned}&\sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} 2\left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{1}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left( \mathcal {B} q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}r \\&\qquad + \sum _{\ell =1}^{m} \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r \\&\quad \lesssim \frac{1}{n} + \mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r. \end{aligned} \end{aligned}$$

Combine the resulting inequality with the coercive condition (see (9)) to yield

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \Vert q_{m}^{n}(t\wedge \tau _R) \Vert _{H^0}^2 + C_1 \int _{k\delta }^{t\wedge \tau _R} \Vert q_{m}^{n}(r) \Vert _{H^1}^{2} \textrm{d}r \right] \\&\quad \lesssim \mathbb {E} \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^2 + \frac{1}{n} + (1+C)\mathbb {E} \int _{k\delta }^{t\wedge \tau _R} \Vert q_{m}^{n}(r) \Vert _{H^0}^2 \textrm{d}r. \end{aligned} \end{aligned}$$

Thus, we infer that

$$\begin{aligned} \begin{aligned} \sup _{m\ge 1}\sup _{t\in [k\delta ,(k+1)\delta )} \mathbb {E}\left[ \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} + C_1 \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^1}^{2} \textrm{d}r \right] < \infty . \end{aligned} \end{aligned}$$
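
The passage from the previous inequality to this bound is a Gronwall argument. In a hedged form, write $f(t)=\mathbb {E}\Vert q_{m}^{n}(t\wedge \tau _R) \Vert _{H^0}^{2}$ and let $a,b\ge 0$ be generic constants, introduced here only for illustration and assumed not to depend on $m$ or $R$. Then

$$\begin{aligned} \begin{aligned} f(t) \le a + b \int _{k\delta }^{t} f(r) \textrm{d}r \quad \text {for all } t\in [k\delta ,(k+1)\delta ) \quad \Longrightarrow \quad f(t) \le a e^{b(t-k\delta )} \le a e^{b\delta }. \end{aligned} \end{aligned}$$

Since the right-hand side is independent of $R$, the localisation can be removed by letting $R\rightarrow \infty $ and applying Fatou's lemma.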

It remains to move the supremum over $[k\delta ,(k+1)\delta )$ inside the expectation. It follows from the Burkholder–Davis–Gundy inequality and Hölder’s inequality that

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \left| \int _{k\delta }^{t} \left( \mathcal {B}q_{m}^{n}(r),q_{m}^{n}(r) \right) _{H^0} \textrm{d}Y_r \right| \right] \\&\quad \le C \mathbb {E} \left[ \int _{k\delta }^{(k+1)\delta } \left( \mathcal {B}q_{m}^{n}(r),q_{m}^{n}(r) \right) _{H^0}^{2} \textrm{d}r \right] ^{1/2} \\&\quad \le C \mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0} \sqrt{\int _{k\delta }^{(k+1)\delta } |\mathcal {B}q_{m}^{n}(r)|^2 \textrm{d}r} \right] \\&\quad \le \frac{1}{2} \mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} \right] + \frac{C^2}{2} \mathbb {E} \int _{k\delta }^{(k+1)\delta } |\mathcal {B}q_{m}^{n}(r)|^2 \textrm{d}r. \end{aligned} \end{aligned}$$
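
The last step above is Young's inequality $ab\le \tfrac{1}{2}a^2+\tfrac{1}{2}b^2$, applied pathwise before taking expectations with

$$\begin{aligned} \begin{aligned} a = \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}, \qquad b = C \left( \int _{k\delta }^{(k+1)\delta } |\mathcal {B}q_{m}^{n}(r)|^2 \textrm{d}r \right) ^{1/2}, \end{aligned} \end{aligned}$$

which is what produces the factors $\tfrac{1}{2}$ and $\tfrac{C^2}{2}$ in the final line.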

Similarly, one deduces that

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \left| \sum _{\ell =1}^{m} \int _{k\delta }^{t} 2 \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) \textrm{d}Y_r \right| \right] \\&\quad \le C \mathbb {E}\left[ \sum _{\ell =1}^{m} \int _{k\delta }^{t} \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0}^{2} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{2}(e_{\ell }) \right) ^2 \textrm{d}r \right] \\&\quad \lesssim \frac{1}{n} \mathbb {E} \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \left| \sum _{\ell =1}^{m} \int _{k\delta }^{t} 2 \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) \textrm{d}W_{r}^{j} \right| \right] \\&\quad \le C \mathbb {E}\left[ \sum _{\ell =1}^{m} \int _{k\delta }^{t} \left( q_{m}^{n}(r),e_{\ell } \right) _{H^0}^{2} \left( \frac{1}{n} \sum _{j=1}^{n} \xi _{k\delta }^{n} M_j^n(r)R_{r}^{3}(e_{\ell }) \right) ^2 \textrm{d}r \right] \\&\quad \lesssim \frac{1}{n} \mathbb {E} \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r. \end{aligned} \end{aligned}$$

Taking the supremum over $t\in [k\delta ,(k+1)\delta )$ in (51) before taking the expectation, we obtain

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} \right] \\&\quad \lesssim \mathbb {E} \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^{2} \\&\qquad + \mathbb {E} \int _{k\delta }^{t} \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),q_{m}^{n}(r) \right\rangle _{H^1} + (1+C^2)\left\| \mathcal {B} q_{m}^{n}(r) \right\| _{H^0}^{2} \right] \textrm{d}r \\&\qquad + \frac{1}{n} \mathbb {E} \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r. \end{aligned} \end{aligned}$$

Employing the coercive condition, we have

$$\begin{aligned} \begin{aligned} \sup _{m\ge 1} \mathbb {E}\left[ \sup _{t\in [k\delta ,(k+1)\delta )} \Vert q_{m}^{n}(t) \Vert _{H^0}^{2} \right] \lesssim \mathbb {E} \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^{2} + \sup _{m\ge 1} \mathbb {E} \int _{k\delta }^{t} \Vert q_{m}^{n}(r) \Vert _{H^0}^{2} \textrm{d}r. \end{aligned} \end{aligned}$$

Hence, we ultimately establish the a priori bound (50).

According to (50) and the linear growth of $\mathcal {A}^v$ (see (11)), one obtains that

(i)
$\{q_{m}^{n}(t)\}_{m\ge 1}$ is uniformly bounded in $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)$;
(ii)
$\{\mathcal {A}^v q_{m}^{n}(t)\}_{m\ge 1}$ is uniformly bounded in $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^{-1})$;
(iii)
$\{\mathcal {B} q_{m}^{n}(t)\}_{m\ge 1}$ is uniformly bounded in $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^0)$.

Thus, the Banach–Alaoglu theorem implies that there are weakly convergent subsequences. Without relabelling the index m, we may therefore assume that

(i)
$q_{m}^{n}(t)\rightharpoonup q^n(t)$ in $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)$;
(ii)
$\mathcal {A}^v q_{m}^{n}(t)\rightharpoonup \zeta (t)$ in $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^{-1})$;
(iii)
$\mathcal {B} q_{m}^{n}(t)\rightharpoonup \eta (t)$ in $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^0)$.
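
Weak convergence is strictly weaker than norm convergence, which is why the limits $\zeta $ and $\eta $ still have to be identified. A standard toy example, unrelated to the filtering equations and with all names below chosen for illustration, is $f_m(x)=\sin (m\pi x)$ on $[0,1]$: its pairing with a fixed test function tends to zero while its $L^2$ norm does not.

```python
import math

def integrate(f, n=20000):
    """Midpoint rule on [0, 1] with n cells."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def pairing(m):
    """Inner product of sin(m*pi*x) with the test function g(x) = x."""
    return integrate(lambda x: math.sin(m * math.pi * x) * x)

def norm_sq(m):
    """Squared L^2(0,1) norm of sin(m*pi*x)."""
    return integrate(lambda x: math.sin(m * math.pi * x) ** 2)

for m in (1, 10, 50):
    # the pairing shrinks like 1/(m*pi); the squared norm stays near 1/2
    print(m, pairing(m), norm_sq(m))
```

The sequence therefore converges weakly to zero without converging in norm, so information about nonlinear quantities can be lost in a weak limit.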

It merely remains to verify that $\zeta (t)=\mathcal {A}^v q^n(t)$ and $\eta (t)=\mathcal {B} q^n(t)$. Additionally, we have the fact that $q_{m}^{n}(t)\overset{*}{\rightharpoonup }q^n(t)$ in $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};L^{\infty })$. By the dominated convergence theorem, we may pass to the limit as $m\rightarrow \infty $ in (51) $(n\rightarrow \infty ~\text {as well})$

$$\begin{aligned} \begin{aligned} \mathbb {E}\left[ \Vert q_{}^{n}((k+1)\delta ) \Vert _{H^0}^{2} - \Vert q_{}^{n}(k\delta ) \Vert _{H^0}^{2} \right]&= \mathbb {E} \int _{k\delta }^{t} \left[ 2{_{H^{-1}}}\left\langle \zeta (r),q_{}^{n}(r) \right\rangle _{H^1} + \left\| \eta (r) \right\| _{H^0}^{2} \right] \textrm{d}r. \end{aligned} \end{aligned}$$

Since the map $\varrho \rightarrow \mathbb {E}\Vert \varrho \Vert _{H^0}^{2}$ is convex, we get that

$$\begin{aligned} \begin{aligned} \mathbb {E}\left[ \Vert q_{}^{n}((k+1)\delta ) \Vert _{H^0}^{2} - \Vert q_{}^{n}(k\delta ) \Vert _{H^0}^{2} \right]&\le \liminf _{m\rightarrow \infty } \mathbb {E}\left[ \Vert q_{m}^{n}((k+1)\delta ) \Vert _{H^0}^{2} - \Vert q_{m}^{n}(k\delta ) \Vert _{H^0}^{2} \right] . \end{aligned} \end{aligned}$$

Then we have the inequality

$$\begin{aligned} \begin{aligned}&\mathbb {E} \int _{k\delta }^{t} \left[ 2{_{H^{-1}}}\left\langle \zeta (r),q_{}^{n}(r) \right\rangle _{H^1} + \left\| \eta (r) \right\| _{H^0}^{2} \right] \textrm{d}r \\&\quad \le \liminf _{m\rightarrow \infty } \mathbb {E} \int _{k\delta }^{t} \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r),q_{m}^{n}(r) \right\rangle _{H^1} + \left\| \mathcal {B} q_{m}^{n}(r) \right\| _{H^0}^{2} \right] \textrm{d}r. \end{aligned} \end{aligned}$$

(52)

From the monotonicity condition (see (10)) with $\lambda =0$, for all $p(t)\in L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)$ and $m\ge 1$, we derive that

$$\begin{aligned} \begin{aligned} \mathbb {E} \int _{k\delta }^{(k+1)\delta } \left[ 2{_{H^{-1}}}\left\langle \mathcal {A}^v q_{m}^{n}(r) - \mathcal {A}^v p(r),q_{m}^{n}(r) - p(r) \right\rangle _{H^1} + \left\| \mathcal {B}q_{m}^{n}(r) - \mathcal {B}p(r) \right\| _{H^0}^{2} \right] \textrm{d}r \le 0. \end{aligned} \end{aligned}$$

(53)

Combining (52) and (53) and passing to the limit as $m\rightarrow \infty $ in the same manner, we obtain

$$\begin{aligned} \begin{aligned} \mathbb {E} \int _{k\delta }^{(k+1)\delta } \left[ 2{_{H^{-1}}}\left\langle \zeta (r) - \mathcal {A}^v p(r),q^{n}(r) - p(r) \right\rangle _{H^1} + \left\| \eta (r) - \mathcal {B}p(r) \right\| _{H^0}^{2} \right] \textrm{d}r \le 0. \end{aligned} \end{aligned}$$

(54)
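
For readers who want the intermediate algebra, the passage from (52) and (53) to (54) can be sketched as follows. The sketch assumes, in addition to the weak convergences of $\mathcal {A}^v q_{m}^{n}$ and $\mathcal {B} q_{m}^{n}$ stated above, that $q_{m}^{n}\rightharpoonup q^{n}$ weakly in $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)$. The shorthand $\langle \cdot ,\cdot \rangle $ for the pairing between $H^{-1}$ and $H^1$, $(\cdot ,\cdot )$ for the $H^0$ inner product, and $I=[k\delta ,(k+1)\delta ]$ is introduced here only for brevity. Expanding (53) and rearranging gives

$$\begin{aligned} \begin{aligned}&\mathbb {E} \int _{I} \left[ 2\left\langle \mathcal {A}^v q_{m}^{n},q_{m}^{n} \right\rangle + \left\| \mathcal {B} q_{m}^{n} \right\| _{H^0}^{2} \right] \textrm{d}r \\&\quad \le \mathbb {E} \int _{I} \left[ 2\left\langle \mathcal {A}^v q_{m}^{n},p \right\rangle + 2\left\langle \mathcal {A}^v p,q_{m}^{n}-p \right\rangle + 2\left( \mathcal {B} q_{m}^{n},\mathcal {B}p \right) - \left\| \mathcal {B}p \right\| _{H^0}^{2} \right] \textrm{d}r. \end{aligned} \end{aligned}$$

For fixed $p$, each term on the right contains at most one weakly convergent factor paired with a fixed element, so the right-hand side converges as $m\rightarrow \infty $, while the limit inferior of the left-hand side is bounded below by (52). Hence

$$\begin{aligned} \begin{aligned}&\mathbb {E} \int _{I} \left[ 2\left\langle \zeta ,q^{n} \right\rangle + \left\| \eta \right\| _{H^0}^{2} \right] \textrm{d}r \\&\quad \le \mathbb {E} \int _{I} \left[ 2\left\langle \zeta ,p \right\rangle + 2\left\langle \mathcal {A}^v p,q^{n}-p \right\rangle + 2\left( \eta ,\mathcal {B}p \right) - \left\| \mathcal {B}p \right\| _{H^0}^{2} \right] \textrm{d}r, \end{aligned} \end{aligned}$$

and moving all terms to the left and completing the square in $\eta - \mathcal {B}p$ yields (54).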

Setting $p(t)=q^{n}(t)$ in (54), we immediately obtain $\eta (t)=\mathcal {B}q^{n}(t)$. To show that $\zeta (t)=\mathcal {A}^v q^{n}(t)$, let $p(t)=q^{n}(t) - \theta w(t)$ for $\theta >0$ and $w(t)\in L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)$. Discarding the nonnegative norm term in (54) and dividing both sides by $2\theta $, we see that

$$\begin{aligned} \begin{aligned} \mathbb {E} \int _{k\delta }^{(k+1)\delta } {_{H^{-1}}}\left\langle \zeta (r) - \mathcal {A}^v(q^{n}(r)-\theta w(r)),w(r) \right\rangle _{H^1} \textrm{d}r \le 0. \end{aligned} \end{aligned}$$
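
The step from (54) to this inequality can be spelled out as follows. This is a sketch that uses only $\theta >0$ and the nonnegativity of the norm term. With $p=q^{n}-\theta w$ we have $q^{n}-p=\theta w$, so (54) reads

$$\begin{aligned} \begin{aligned} \mathbb {E} \int _{k\delta }^{(k+1)\delta } \left[ 2\theta \,{_{H^{-1}}}\left\langle \zeta (r) - \mathcal {A}^v(q^{n}(r)-\theta w(r)),w(r) \right\rangle _{H^1} + \left\| \eta (r) - \mathcal {B}(q^{n}(r)-\theta w(r)) \right\| _{H^0}^{2} \right] \textrm{d}r \le 0. \end{aligned} \end{aligned}$$

Since the norm term is nonnegative, the pairing term alone is already nonpositive, and dividing it by $2\theta >0$ gives the inequality above.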

Letting $\theta \rightarrow 0$ and utilizing the weak continuity of $\mathcal {A}^v$ (see (12)), we deduce that

$$\begin{aligned} \begin{aligned} \mathbb {E} \int _{k\delta }^{(k+1)\delta } {_{H^{-1}}}\left\langle \zeta (r) - \mathcal {A}^v q^{n}(r),w(r) \right\rangle _{H^1} \textrm{d}r \le 0, \quad \forall ~w(t)\in L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1). \end{aligned} \end{aligned}$$

Since $w$ is arbitrary, replacing $w$ by $-w$ turns this inequality into an equality, and it follows that $\zeta (t)=\mathcal {A}^v q^n(t)$ in $L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^{-1})$.

This completes the proof. $\square $

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wan, H., Wang, G. & Xiong, J. A branching particle system approximation for solving partially observed stochastic optimal control problems via stochastic maximum principle. Stoch PDE: Anal Comp 12, 675–735 (2024). https://doi.org/10.1007/s40072-023-00294-w

Received: 03 May 2022
Revised: 15 January 2023
Accepted: 09 March 2023
Published: 24 March 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s40072-023-00294-w
