Abstract
This paper develops an efficient numerical algorithm for solving a class of partially observed stochastic optimal control problems with correlated noises. The contribution is threefold. First, we introduce a relaxed system and assume the Roxin condition (a convexity requirement) on the coefficients. An optimal relaxed system then provides an optimal admissible control in a broader sense, and a relaxed control turns out to be a usual admissible control. Second, we transform the optimal control problem into an optimization problem for a convex functional by employing a projection operator. A stochastic gradient descent approach is then proposed and its convergence properties are established. Third, we present a branching particle system (branching particle filter) to approximate the optimal filter. Because the coefficients in the Zakai equation are random, neither the dual approach nor the mild solution approach can be used. We devise a novel method for establishing the convergence of the branching particle system approximation, together with its rate of convergence. This branching-type particle filter algorithm allows us to tackle non-Markovian environments. The main body of the paper concludes with a numerical case study.
Data availability
Enquiries about data availability should be directed to the authors.
References
Archibald, R., Bao, F., Yong, J.M., Zhou, T.: An efficient numerical algorithm for solving data driven feedback control problems. J. Sci. Comput. 85(51), 1–27 (2020)
Bain, A., Crisan, D.: Fundamentals of Stochastic Filtering. Stochastic Modelling and Applied Probability. Springer, New York (2009)
Bensoussan, A.: Stochastic Control of Partially Observable Systems. Cambridge University Press, Cambridge (1992)
Bensoussan, A., Glowinski, R., Răşcanu, A.: Approximation of the Zakai equation by the splitting up method. SIAM J. Control Optim. 28(6), 1420–1431 (1990)
Bensoussan, A., Viot, M.: Optimal control of stochastic linear distributed parameter systems. SIAM J. Control 13, 904–926 (1975)
Buckdahn, R., Li, J., Ma, J.: A mean-field stochastic control problem with partial observations. Ann. Appl. Probab. 27(5), 3201–3245 (2017)
Chang, D.J., Liu, H.L., Xiong, J.: A branching particle system approximation for a class of FBSDEs. Probab. Uncertain. Quant. Risk 1(9), 1–34 (2016)
Charalambous, C.D., Elliott, R.J.: Classes of nonlinear partially observable stochastic optimal control problems with explicit optimal control laws. SIAM J. Control Optim. 36(2), 542–578 (1998)
Crisan, D.: Particle approximations for a class of stochastic partial differential equations. Appl. Math. Optim. 54(3), 293–314 (2006)
Crisan, D., Li, K.: Generalised particle filters with Gaussian mixtures. Stoch. Process. Appl. 125(7), 2643–2673 (2015)
Da Prato, G., Zabczyk, J.: Stochastic Equations in Infinite Dimensions, 2nd edn. Cambridge University Press, Cambridge (2014)
Du, K.: \(W^{2, p}\)-solutions of parabolic SPDEs in general domains. Stoch. Process. Appl. 130(1), 1–19 (2020)
El Karoui, N., Du, N.H., Monique, J.-P.: Existence of an optimal Markovian filter for the control under partial observations. SIAM J. Control Optim. 26(5), 1025–1061 (1988)
Evans, L.C.: Partial Differential Equations, 2nd edn. American Mathematical Society, Providence (2010)
Fleming, W.H., Pardoux, É.: Optimal control for partially observed diffusions. SIAM J. Control Optim. 20(2), 261–285 (1982)
Fleming, W.H., Soner, H.M.: Controlled Markov Processes and Viscosity Solutions, 2nd edn. Springer, New York (2006)
Florentin, J.J.: Partial observability and optimal control. Int. J. Electron. 13, 263–279 (1962)
Gong, B., Liu, W.B., Tang, T., Zhao, W.D., Zhou, T.: An efficient gradient projection method for stochastic optimal control problems. SIAM J. Numer. Anal. 55(6), 2982–3005 (2017)
Gozzi, F., Świȩch, A.: Hamilton-Jacobi-Bellman equations for the optimal control of the Duncan–Mortensen–Zakai equation. J. Funct. Anal. 172(2), 466–510 (2000)
Gyöngy, I., Millet, A.: On discretization schemes for stochastic evolution equations. Potential Anal. 23(2), 99–134 (2005)
Haussmann, U.G.: The maximum principle for optimal control of diffusions with partial information. SIAM J. Control Optim. 25(2), 341–361 (1987)
Huang, J.H., Wang, G.C., Xiong, J.: A maximum principle for partial information backward stochastic control problems with applications. SIAM J. Control Optim. 48(4), 2106–2117 (2009)
Kallianpur, G., Xiong, J.: Stochastic Differential Equations in Infinite Dimensional Spaces. IMS Lecture Notes—Monograph Series 26. Institute of Mathematical Statistics (1995)
Krylov, N.V.: On \(L_p\)-theory of stochastic partial differential equations in the whole space. SIAM J. Math. Anal. 27(2), 313–340 (1996)
Krylov, N.V., Rozovskii, B.L.: Stochastic evolution equations. J. Sov. Math. 16, 1233–1277 (1981)
Kurtz, T.G., Xiong, J.: Particle representations for a class of nonlinear SPDEs. Stoch. Process. Appl. 83(1), 103–126 (1999)
Kurtz, T.G., Xiong, J.: Numerical solutions for a class of SPDEs with application to filtering. In: Stochastics in Finite and Infinite Dimensions, pp. 233–258. Birkhäuser, Boston (2001)
Kushner, H.J.: Probability Methods for Approximations in Stochastic Control and for Elliptic Equations. Academic Press, New York (1977)
Kushner, H.J., Dupuis, P.: Numerical Methods for Stochastic Control Problems in Continuous Time, 2nd edn. Springer, New York (2001)
Li, X.J., Tang, S.J.: General necessary conditions for partially observed optimal stochastic controls. J. Appl. Probab. 32(4), 1118–1137 (1995)
Liu, H.L., Xiong, J.: A branching particle system approximation for nonlinear stochastic filtering. Sci. China Math. 56(8), 1521–1541 (2013)
Lototsky, S., Mikulevicius, R., Rozovskii, B.L.: Nonlinear filtering revisited: a spectral approach. SIAM J. Control Optim. 35(2), 435–461 (1997)
Liu, W., Röckner, M.: Stochastic Partial Differential Equations: An Introduction. Springer, Cham (2015)
Milstein, G.N., Tretyakov, M.V.: Numerical algorithms for forward-backward stochastic differential equations. SIAM J. Sci. Comput. 28(2), 561–582 (2006)
Nagase, N., Nisio, M.: Optimal controls for stochastic partial differential equations. SIAM J. Control Optim. 28(1), 186–213 (1990)
Nisio, M.: Stochastic Control Theory. Dynamic Programming Principle, 2nd edn. Springer, Tokyo (2015)
Pardoux, É.: Stochastic partial differential equations and filtering of diffusion processes. Stochastics 3(2), 127–167 (1979)
Pardoux, É.: Stochastic Partial Differential Equations. An introduction. Springer Briefs in Mathematics. Springer, Cham (2021)
Pardoux, É., Răşcanu, A.: Stochastic Differential Equations, Backward SDEs, Partial Differential Equations. Springer, Cham (2014)
Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes, and Martingales, vol. 1. Foundations, vol. 2. Itô calculus. Cambridge University Press, Cambridge (2000)
Rozovsky, B.L., Lototsky, S.V.: Stochastic Evolution Systems. Linear Theory and Applications to Non-linear Filtering, 2nd edn. Springer, Cham (2018)
Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Reprint of the 1997 edition. Springer, Berlin (2006)
Tang, S.J.: The maximum principle for partially observed optimal control of stochastic differential equations. SIAM J. Control Optim. 36(5), 1596–1617 (1998)
Wang, G.C., Wu, Z.: Kalman–Bucy filtering equations of forward and backward stochastic systems and applications to recursive optimal control problems. J. Math. Anal. Appl. 342(2), 1280–1296 (2008)
Wang, G.C., Wu, Z., Xiong, J.: Maximum principles for forward-backward stochastic control systems with correlated state and observation noises. SIAM J. Control Optim. 51(1), 491–524 (2013)
Wang, G.C., Wu, Z., Xiong, J.: A linear-quadratic optimal control problem of forward-backward stochastic differential equations with partial information. IEEE Trans. Autom. Control 60(11), 2904–2916 (2015)
Wang, G.C., Wu, Z., Xiong, J.: An Introduction to Optimal Control of FBSDE with Incomplete Information. Springer, Cham (2018)
Wonham, W.M.: On the separation theorem of stochastic control. SIAM J. Control 6, 312–326 (1968)
Xiong, J.: An Introduction to Stochastic Filtering Theory. Oxford University Press, Oxford (2008)
Xiong, J.: Particle Approximations to the Filtering Problem in Continuous Time. The Oxford Handbook of Nonlinear Filtering, pp. 635–655. Oxford University Press, Oxford (2011)
Xiong, J., Zhou, X.Y.: Mean-variance portfolio selection under partial information. SIAM J. Control Optim. 46(1), 156–175 (2007)
Yong, J.M., Zhou, X.Y.: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer, New York (1999)
Zhang, J.F.: Backward Stochastic Differential Equations. From Linear to Fully Nonlinear Theory. Springer, New York (2017)
Zhang, Q.: Controlled partially observed diffusions with correlated noise. Appl. Math. Optim. 22(3), 265–285 (1990)
Zhao, W.D., Chen, L.F., Peng, S.G.: A new kind of accurate numerical method for backward stochastic differential equations. SIAM J. Sci. Comput. 28(4), 1563–1581 (2006)
Zhou, X.Y.: On the existence of optimal relaxed controls of stochastic partial differential equations. SIAM J. Control Optim. 30(2), 247–261 (1992)
Acknowledgements
H.X. Wan thanks Professor J. Xiong for hosting a visit to the Department of Mathematics at SUSTech in 2021, during which this work was initiated. The authors would like to thank the referees for suggesting improvements to the original manuscript.
The research of G.C. Wang is supported partially by the NSFC under Grant Nos. 61925306, 61821004 and 11831010, the National Key R&D Program of China under Grant No. 2022YFA1006103, and the NSF of Shandong Province under Grant Nos. ZR2019ZD42 and ZR2020ZD24. The research of J. Xiong is supported partially by the NSFC under Grant No. 11831010, and the National Key R&D Program of China under Grant No. 2022YFA1006102.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Appendix A: Proof of Theorem 6
The proof proceeds in two steps. In this appendix, \(\mathbb {E}\) represents \(\mathbb {E}^{\mathbb {Q}}\).
We begin by proving it for the special case that \(\tau \) takes values only in a countable set of times \(\{t_1,t_2,\ldots \}\). We need to show that \(\mathbb {E}[\pi _{\tau }(\phi )\textbf{1}_{A}]=\mathbb {E}[\phi (\mathcal {X}_{\tau })\textbf{1}_{A}]\) for every \(A\in \mathcal {F}_{\tau }^{Y}\). Note that \(A=\bigcup _{i\ge 1} A\cap \{\tau =t_i\} \triangleq \bigcup _{i\ge 1} A_i\) implies that
Then it follows from \(A_i\in \mathcal {F}_{t_i}^{Y}\) that
Moreover,
for every Borel set B; hence \(\{ \pi _{\tau }(\phi )\in B \} \cap \{ \tau =t_j \} \in \mathcal {F}_{t_j}^{Y}\subset \mathcal {F}_{t_i}^{Y}\) for every \(j\le i\). Consequently, \(\{ \pi _{\tau }(\phi )\in B \} \cap \{ \tau \le t_i \}=\bigcup _{j\le i} \{ \pi _{\tau }(\phi )\in B \} \cap \{ \tau =t_j \}\in \mathcal {F}_{t_i}^{Y}\), so \(\pi _{\tau }(\phi )\) is \(\mathcal {F}_{\tau }^{Y}\)-measurable. This completes the proof in the discrete-time case.
We now turn to the continuous-time case. Define the stopping times \(\tau _n=([2^n\tau ]+1)/2^n\). Then \(\tau _n\downarrow \tau \), and each \(\tau _n\) takes countably many values. It follows immediately that
for every n. By the continuity of \(\pi _{t}(\phi )\), we take the limit on the left-hand side of (48) and obtain
It remains to tackle the right-hand side of (48). We claim that \(\mathcal {F}_{\tau _{n+1}}^{Y}\subset \mathcal {F}_{\tau _{n}}^{Y}\). Since \(\tau _{n+1}\le \tau _n\) and \(\tau _n\) is an \(\mathcal {F}_{t}^{Y}\)-stopping time, we have
for every \(C\in \mathcal {F}_{\tau _{n+1}}^{Y}\). Therefore, the claim follows from the definition of \(\mathcal {F}_{\tau _n}^{Y}\). By Hunt’s lemma, we conclude that
where \(\mathcal {G}=\bigcap _{n\ge 1}\mathcal {F}_{\tau _n}^{Y}\). We now have to show that \(\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]=\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {F}_{\tau }^{Y}]\). It is clear that \(\mathcal {F}_{\tau }^{Y}\subset \mathcal {G}\). Hence it remains to prove \(\mathcal {F}_{\tau }^{Y}\supset \mathcal {G}\). Note that \(\pi _{\tau }(\phi )=\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]\). We want to verify that \(\pi _{\tau }(\phi )\) is \(\mathcal {F}_{\tau }^{Y}\)-measurable. Define the stopping times \(\sigma _n=\tau _n-2^{-n}\). Then \(\sigma _n\le \tau \), and \(\sigma _n\uparrow \tau \) as \(n\rightarrow \infty \). But \(\pi _{\sigma _n}(\phi )\) is \(\mathcal {F}_{\sigma _n}^{Y}\)-measurable, so it is \(\mathcal {F}_{\tau }^{Y}\)-measurable for every n. Thus, \(\pi _{\tau }(\phi )=\lim _{n\rightarrow \infty }\pi _{\sigma _n}(\phi )\) must be \(\mathcal {F}_{\tau }^{Y}\)-measurable. We conclude that \(\mathbb {E}[\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]|\mathcal {F}_{\tau }^{Y}] =\mathbb {E}[\phi (\mathcal {X}_{\tau })|\mathcal {G}]\), i.e., \(\mathcal {F}_{\tau }^{Y}\supset \mathcal {G}\). This completes the proof. \(\square \)
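The dyadic approximations used in the proof above can be checked on a single realized value of \(\tau \). The following is a minimal sketch, not part of the original proof: the function names and the sample value 5/7 are illustrative assumptions, and exact rational arithmetic is used so that the inequalities are not blurred by rounding.

```python
from fractions import Fraction
from math import floor


def dyadic_upper(tau, n):
    """tau_n = (floor(2^n * tau) + 1) / 2^n, which is strictly greater than tau."""
    scale = 2 ** n
    return Fraction(floor(tau * scale) + 1, scale)


def dyadic_lower(tau, n):
    """sigma_n = tau_n - 2^{-n} = floor(2^n * tau) / 2^n, which is at most tau."""
    return dyadic_upper(tau, n) - Fraction(1, 2 ** n)


# Hypothetical realized value of the stopping time (an assumption for illustration).
tau = Fraction(5, 7)

uppers = [dyadic_upper(tau, n) for n in range(1, 11)]
lowers = [dyadic_lower(tau, n) for n in range(1, 11)]

for n, (low, up) in enumerate(zip(lowers, uppers), start=1):
    # sigma_n increases to tau while tau_n decreases to tau; the gap is 2^{-n}.
    print(n, float(low), float(tau), float(up))
```

Each \(\tau _n\) lies on the grid \(\{k/2^n\}\), which is why the countable-valued case applies to it, and the gap \(\tau _n-\sigma _n=2^{-n}\) shrinks to zero.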
Appendix B: Proof of Lemma 5
Let us define the process \(q_{m}^{n}(t)\) as the unique \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};V_m)\)-valued solution to the system
The well-posedness of the system (49) does not follow directly from the standard theory for SODEs, since the coefficients of that equation need not be Lipschitz. However, results such as Theorem 3.21 in Pardoux and Răşcanu [39] cover this situation. Assuming for the time being that the system (49) is well-posed, the next step is to establish a uniform bound on the family \(\{q_{m}^{n}(t)\}_{m\ge 1}\).
In this appendix, \(\mathbb {E}\) stands for \(\mathbb {E}^{\mathbb {P}}\). We now establish a sufficiently strong a priori bound
Using (49) and Itô’s formula, we obtain, for all \(1\le \ell \le m\), the identity
Summing over \(\ell \) from 1 to m, we obtain
In order to take the expectation and eliminate the stochastic integrals, we first introduce a sequence of stopping times
We observe that \(\tau _R\rightarrow \infty \), \(\mathbb {P}\)-a.s., as \(R\rightarrow \infty \). Consider the stopped process \(q_{m}^{n}(t\wedge \tau _R)\), for which we have the bound
Note that \(n\gg m\) (for instance, take \({\widetilde{n}}=n\cdot m\)); it is then straightforward to show that
Combining the resulting inequality with the coercive condition (see (9)) yields
Thus, we infer that
It remains to interchange the supremum over \([k\delta ,(k+1)\delta )\) with the expectation. It follows from the Burkholder–Davis–Gundy inequality and Hölder’s inequality that
Similarly, one deduces that
and
If we take the supremum over \(t\in [k\delta ,(k+1)\delta )\) before taking the expectation in (51), we obtain
Employing the coercive condition, we have
Hence, we ultimately establish the a priori bound (50).
According to (50) and the linear growth of \(\mathcal {A}^v\) (see (11)), one obtains that
- (i) \(\{q_{m}^{n}(t)\}_{m\ge 1}\) is uniformly bounded in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)\);
- (ii) \(\{\mathcal {A}^v q_{m}^{n}(t)\}_{m\ge 1}\) is uniformly bounded in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^{-1})\);
- (iii) \(\{\mathcal {B} q_{m}^{n}(t)\}_{m\ge 1}\) is uniformly bounded in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^0)\).
Thus, by the Banach–Alaoglu theorem, there exist weakly convergent subsequences, which we do not relabel in m, such that
- (i) \(q_{m}^{n}(t)\rightharpoonup q^n(t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)\);
- (ii) \(\mathcal {A}^v q_{m}^{n}(t)\rightharpoonup \zeta (t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^{-1})\);
- (iii) \(\mathcal {B} q_{m}^{n}(t)\rightharpoonup \eta (t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^0)\).
It remains to verify that \(\zeta (t)=\mathcal {A}^v q^n(t)\) and \(\eta (t)=\mathcal {B} q^n(t)\). Additionally, \(q_{m}^{n}(t)\overset{*}{\rightharpoonup }q^n(t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};L^{\infty })\). By the dominated convergence theorem, we may pass to the limit in (51) as \(m\rightarrow \infty \) (with \(n\rightarrow \infty \) as well)
Since the map \(\varrho \rightarrow \mathbb {E}\Vert \varrho \Vert _{H^0}^{2}\) is convex, we get that
Then we have the inequality
From the monotonicity condition (see (10)) with \(\lambda =0\), for all \(p(t)\in L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)\) and \(m\ge 1\), we derive that
Using (52) and (53) and passing to the limit in the same manner, we have
If we set \(p(t)=q^{n}(t)\) in (54), we immediately obtain \(\eta (t)=\mathcal {B}q^{n}(t)\). To show that \(\zeta (t)=\mathcal {A}^v q^{n}(t)\), let \(p(t)=q^{n}(t) - \theta w(t)\) for \(\theta >0\) and \(w(t)\in L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^1)\). Dividing both sides by \(\theta \), we see that
Letting \(\theta \rightarrow 0\) and utilizing the weak continuity of \(\mathcal {A}^v\) (see (12)), we deduce that
It follows that \(\zeta (t)=\mathcal {A}^v q^n(t)\) in \(L^2([0,T]\times \varOmega ,\mathcal {F}_{t}^{Y};H^{-1})\).
This completes the proof of the lemma. \(\square \)
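The localization step in the proof above relies on stopping times \(\tau _R\) that tend to infinity as \(R\rightarrow \infty \). The sketch below illustrates this on a simulated scalar path. It assumes \(\tau _R\) is the first time the absolute value of the path reaches the level R, with the convention that \(\tau _R=\infty \) if the level is never reached; this is the usual choice, but the definition is not visible in this text, and the Brownian path, seed, and function names are hypothetical stand-ins for \(q_{m}^{n}\).

```python
import math
import random


def simulate_path(steps, horizon, seed):
    """Simulate a discretized Brownian path on [0, horizon] (illustrative stand-in)."""
    rng = random.Random(seed)
    dt = horizon / steps
    path = [0.0]
    for _ in range(steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def first_exit_time(path, horizon, level):
    """First grid time at which |path| >= level, or infinity if that never happens."""
    dt = horizon / (len(path) - 1)
    for k, value in enumerate(path):
        if abs(value) >= level:
            return k * dt
    return math.inf


horizon = 1.0
path = simulate_path(steps=1000, horizon=horizon, seed=0)

# Assumed localizing levels R; tau_R is nondecreasing in R by construction.
levels = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
exit_times = [first_exit_time(path, horizon, R) for R in levels]

for R, tau_R in zip(levels, exit_times):
    print(f"R = {R:5.2f}  tau_R = {tau_R}")
```

Once R exceeds the running maximum of the path, the exit time is infinite and the stopped process coincides with the original one on the whole interval, which is what allows the bound for the stopped process to be transferred to the unstopped process in the limit.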
About this article
Cite this article
Wan, H., Wang, G. & Xiong, J. A branching particle system approximation for solving partially observed stochastic optimal control problems via stochastic maximum principle. Stoch PDE: Anal Comp 12, 675–735 (2024). https://doi.org/10.1007/s40072-023-00294-w
DOI: https://doi.org/10.1007/s40072-023-00294-w
Keywords
- Zakai equation
- Branching particle system
- Forward–backward stochastic differential equation
- Optimal control of partially observed system
- Maximum principle
- Stochastic gradient descent