Abstract
This paper develops a numerical method for a class of stochastic optimal control problems in which the state process is governed by an Itô-type stochastic differential equation, the control process enters both the drift and the diffusion, and the state is only partially observed. The optimal control, in feedback form, is determined from the available observational data. We call problems of this type data driven feedback control problems. The computational framework that we introduce seeks the best estimate of the optimal control as a conditional expectation given the observational information. So that the method can provide timely feedback to the controlled system from data, we develop an efficient stochastic optimization algorithm to implement the framework.
Acknowledgements
This work is partially supported by the Scientific Discovery through Advanced Computing (SciDAC) program, funded by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, through the FASTMath Institute and the CompFUSE project. The second author also acknowledges support from the U.S. National Science Foundation under Contract DMS-1720222. The third author acknowledges partial support from NSF Grant DMS-1812921.
Appendix: Derivation for the Gradient Process
In this appendix, we derive the gradient process (8) in detail. To proceed, let U be convex and let \((X^*,u^*)\) be any state-control pair that is a candidate for an optimal pair. For any \(u^M \in \mathcal{U}_{ad}[0,T]\), we let \(u^{\varepsilon ,M}=u^*+\varepsilon (u^M-u^*)\). Then the Gâteaux derivative of \(u^M\mapsto J^*(u^M)\) at \(u^*\) is given by the following:
where
with \(\mathcal{D}X_0 = 0\). For convenience of presentation, we denote \(\psi ^{*}(t) := \psi (t,X^*_t,u^*_t)\) and use subscripts to denote partial derivatives of a function. Let \((Y,Z,\zeta )\) be the adapted solution to the following adjoint backward stochastic differential equation (BSDE)
with \(h^{*}_x:=h_x(X^*_T)\), where Z is the martingale representation of Y with respect to W and \(\zeta \) is the martingale representation of Y with respect to B. Then
Substituting the above equation into the right-hand side of (46), we obtain
Here, we have used that \(u^M_t-u_t^*\) is \(\mathcal{F}_t^M\)-measurable. Hence, in the case that \(u^*\) is in the interior of U, one has
as required in (8), where Y and Z are solutions of the FBSDE system (9).
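The chain of steps described above can be sketched compactly as follows. This is only a sketch under stated assumptions: the symbols \(b\), \(\sigma \) and \(f\) for the drift, diffusion and running cost, the form of the cost functional, the scalar notation, and the independence of W and B are assumed here for illustration. The symbols \(h\), \(\mathcal{D}X\), \((Y,Z,\zeta )\) and \(\mathcal{F}^M_t\) are used as in the text.

```latex
% Assumed setting (b, \sigma, f are illustrative names; scalar case for readability):
%   dX_t = b(t,X_t,u_t)\,dt + \sigma(t,X_t,u_t)\,dW_t,
%   J(u) = \mathbb{E}\big[ \int_0^T f(t,X_t,u_t)\,dt + h(X_T) \big],
%   W and B independent Brownian motions.
\begin{align*}
% Step 1: Gateaux derivative at u^* in the direction u^M - u^*
\frac{d}{d\varepsilon} J^*(u^{\varepsilon,M})\Big|_{\varepsilon=0}
  &= \mathbb{E}\Big[\int_0^T \big( f^*_x(t)\,\mathcal{D}X_t
     + f^*_u(t)\,(u^M_t-u^*_t) \big)\,dt + h^*_x\,\mathcal{D}X_T \Big], \\
% Step 2: variational (first-order) equation for \mathcal{D}X
d\mathcal{D}X_t
  &= \big( b^*_x(t)\,\mathcal{D}X_t + b^*_u(t)\,(u^M_t-u^*_t) \big)\,dt
   + \big( \sigma^*_x(t)\,\mathcal{D}X_t + \sigma^*_u(t)\,(u^M_t-u^*_t) \big)\,dW_t,
   \qquad \mathcal{D}X_0 = 0, \\
% Step 3: adjoint BSDE
dY_t
  &= -\big( b^*_x(t)\,Y_t + \sigma^*_x(t)\,Z_t + f^*_x(t) \big)\,dt
   + Z_t\,dW_t + \zeta_t\,dB_t,
   \qquad Y_T = h^*_x, \\
% Step 4: Ito product rule for Y_t \mathcal{D}X_t, then take expectations
\mathbb{E}\big[ h^*_x\,\mathcal{D}X_T \big]
  &= \mathbb{E}\Big[\int_0^T \big( (b^*_u(t)\,Y_t + \sigma^*_u(t)\,Z_t)\,(u^M_t-u^*_t)
     - f^*_x(t)\,\mathcal{D}X_t \big)\,dt \Big], \\
% Step 5: substitute Step 4 into Step 1; condition on \mathcal{F}^M_t
\frac{d}{d\varepsilon} J^*(u^{\varepsilon,M})\Big|_{\varepsilon=0}
  &= \mathbb{E}\Big[\int_0^T \mathbb{E}\big[ b^*_u(t)\,Y_t + \sigma^*_u(t)\,Z_t
     + f^*_u(t) \,\big|\, \mathcal{F}^M_t \big]\,(u^M_t-u^*_t)\,dt \Big].
\end{align*}
```

In Step 4 the terms \(b^*_x Y_t \mathcal{D}X_t\) and \(\sigma ^*_x Z_t \mathcal{D}X_t\) cancel between the drift of \(Y_t\,d\mathcal{D}X_t + \mathcal{D}X_t\,dY_t\) and the cross-variation of the two \(dW\) terms, and the \(\zeta _t\,dB_t\) term contributes nothing under the assumed independence of W and B. In Step 5 the conditional expectation can be inserted because \(u^M_t-u_t^*\) is \(\mathcal{F}_t^M\)-measurable, and the resulting conditional expectation is the candidate for the gradient process.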
Cite this article
Archibald, R., Bao, F., Yong, J. et al. An Efficient Numerical Algorithm for Solving Data Driven Feedback Control Problems. J Sci Comput 85, 51 (2020). https://doi.org/10.1007/s10915-020-01358-y
DOI: https://doi.org/10.1007/s10915-020-01358-y