
An Efficient Numerical Algorithm for Solving Data Driven Feedback Control Problems


Abstract

The goal of this paper is to solve numerically a class of stochastic optimal control problems in which the state process is governed by an Itô-type stochastic differential equation, the control enters both the drift and the diffusion, and the state is only partially observed. The feedback-form optimal control is determined from the available observational data; we call such problems data driven feedback control problems. The computational framework we introduce seeks the best estimate of the optimal control as a conditional expectation given the observational information. To make the method practical for providing timely feedback to the controlled system from data, we develop an efficient stochastic optimization algorithm to implement the computational framework.
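As a concrete (and heavily simplified) illustration of the conditional-expectation idea, the following sketch applies feedback to a scalar linear-Gaussian toy model by averaging a state-feedback law over a bootstrap particle filter that assimilates noisy partial observations. This is not the paper's algorithm; the model, the feedback gain K, and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, n_steps, n_particles = 0.02, 200, 1000
a, sigma, obs_std, K = -0.5, 0.2, 0.1, 1.5    # toy model: dX_t = (a X_t + u_t) dt + sigma dW_t

x_true = 1.0                                   # hidden true state
particles = rng.normal(1.0, 0.2, n_particles)  # ensemble representing the filtering density
weights = np.full(n_particles, 1.0 / n_particles)

for k in range(n_steps):
    # data driven feedback: u_t = E[-K X_t | observations] ~ -K * (weighted particle mean)
    u = -K * np.sum(weights * particles)

    # propagate the hidden state and the particle ensemble under the same control
    x_true += (a * x_true + u) * dt + sigma * np.sqrt(dt) * rng.normal()
    particles += (a * particles + u) * dt + sigma * np.sqrt(dt) * rng.normal(size=n_particles)

    # assimilate a noisy observation of the state (Bayesian reweighting)
    obs = x_true + obs_std * rng.normal()
    weights *= np.exp(-0.5 * ((obs - particles) / obs_std) ** 2)
    weights /= weights.sum()

    # resample when the effective sample size degenerates
    if 1.0 / np.sum(weights ** 2) < n_particles / 2:
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
        weights = np.full(n_particles, 1.0 / n_particles)
```

The point of the sketch is only the structure of the loop: estimate the conditional distribution of the state from the observational data, then apply the conditional expectation of the feedback law.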



Acknowledgements

This work is partially supported by the Scientific Discovery through Advanced Computing (SciDAC) program, funded by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, through the FASTMath Institute and the CompFUSE project. The second author also acknowledges support from the U.S. National Science Foundation under Contract DMS-1720222. The third author acknowledges partial support from NSF Grant DMS-1812921.

Corresponding author

Correspondence to Feng Bao.


Appendix: Derivation for the Gradient Process

In this appendix we give a detailed derivation of the gradient process (8). Throughout, for a generic function \(\psi \) we write \(\psi ^{*}(t) := \psi (t,X^*_t,u^*_t)\), and a subscript on a function denotes the corresponding partial derivative. To proceed, let U be convex and let \((X^*,u^*)\) be any state-control pair, which could be an optimal pair. For any \(u^M \in \mathcal{U}_{ad}[0,T]\), set \(u^{\varepsilon ,M}=u^*+\varepsilon (u^M-u^*)\). Then the Gâteaux derivative of \(u^M\mapsto J^*(u^M)\) at \(u^*\) is given by

$$\begin{aligned} \begin{array}{ll} \displaystyle \lim _{\varepsilon \rightarrow 0}{J^*(u^{\varepsilon ,M} )-J^*(u^*)\over \varepsilon }= {\mathbb {E}} \Big [\int _0^T\big (f^{*}_x(t) \mathcal {D}X_t+f^{*}_u(t)[u^M_t-u^{*}_t]\big )dt+h^{*}_x\mathcal{D}X_T\Big ], \end{array} \end{aligned}$$
(46)

where

$$\begin{aligned} \begin{array}{ll} \displaystyle \mathcal{D}X_t=\int _0^t\Big (b^{*}_x(s)\mathcal{D}X_s+b^{*}_u(s)[u^M_s-u^*_s]\Big )ds+\int _0^t\Big (\sigma ^{*}_x(s)\mathcal{D}X_s+\sigma ^{*}_u(s)[u^M_s-u^*_s]\Big )dW_s, \end{array} \end{aligned}$$

with \(\mathcal{D}X_0 = 0\). Let \((Y,Z,\zeta )\) be the adapted solution to the following adjoint backward stochastic differential equation (BSDE):

$$\begin{aligned} \displaystyle dY_s=\Big (-b^{*}_x(s)^\top Y_s-\sigma ^{*}_x(s)^\top Z_s-f^{*}_x(s)^\top \Big )ds+Z_s dW_s+\zeta _sdB_s,\qquad Y_T=(h^{*}_x)^\top , \end{aligned}$$

with \(h^{*}_x:=h_x(X^*_T)\), where \(Z\) and \(\zeta \) are the integrands in the martingale representations of \(Y\) with respect to \(W\) and \(B\), respectively. Then

$$\begin{aligned} \begin{aligned} \displaystyle h^{*}_x \mathcal {D}X_T=&\ \langle Y_T, \mathcal {D}X_T\rangle -\langle Y_0, \mathcal {D}X_0\rangle \\ \displaystyle =&\ \int _0^T\Big (\langle -b^{*}_x(t)^\top Y_t-\sigma ^{*}_x(t)^\top Z_t-f^{*}_x(t)^\top ,\mathcal {D}X_t\rangle +\langle Y_t,b^{*}_x(t)\mathcal {D}X_t+b^{*}_u(t)[u^M_t-u^*_t]\rangle \\&\ \qquad \qquad +\langle Z_t,\sigma ^{*}_x(t)\mathcal {D}X_t+\sigma ^{*}_u(t)[u^M_t-u^*_t]\rangle \Big )dt\\&\ \displaystyle \quad +\int _0^T\Big (\langle Z_t, \mathcal {D}X_t\rangle +\langle Y_t, \sigma ^{*}_x(t)\mathcal {D}X_t+\sigma ^{*}_u(t)[u^M_t-u^{*}_t]\rangle \Big )dW_t+\int _0^T\langle \zeta _t,\mathcal{D}X_t\rangle dB_t\\ \displaystyle =&\ \int _0^T\Big (-f^{*}_x(t)\mathcal {D}X_t+\langle b^{*}_u(t)^\top Y_t+\sigma ^{*}_u(t)^\top Z_t,u^M_t-u^*_t\rangle \Big )dt\\&\ \displaystyle \quad +\int _0^T\Big (\langle Z_t, \mathcal {D}X_t\rangle +\langle Y_t,\sigma ^{*}_x(t)\mathcal {D}X_t+\sigma ^{*}_u(t)[u^M_t-u^*_t]\rangle \Big )dW_t+\int _0^T\langle \zeta _t,\mathcal{D}X_t\rangle dB_t. \end{aligned} \end{aligned}$$

Substituting the above into the right-hand side of (46) and noting that the stochastic integrals have zero expectation, we obtain

$$\begin{aligned} \begin{aligned}&\lim _{\varepsilon \rightarrow 0}{J^*(u^{\varepsilon , M})-J^*(u^*)\over \varepsilon }\\ =&\ {\mathbb {E}} \Big [\int _0^T\big (f^{*}_x(t)\mathcal {D}X_t+f^{*}_u(t)[u^M_t-u^*_t]\big )dt+h^{*}_x\mathcal{D}X_T \Big ]\\ =&\ {\mathbb {E}} \Big [\int _0^T\langle b^{*}_u(t)^\top Y_t+\sigma ^{*}_u(t)^\top Z_t+f^{*}_u(t)^\top ,u^M_t-u^*_t\rangle dt\Big ]\\ =&\ {\mathbb {E}} \Big [\int _0^T\langle {\mathbb {E}} \big [b^{*}_u(t)^\top Y_t+\sigma ^{*}_u(t)^\top Z_t+f^{*}_u(t)^\top \bigm |\mathcal{F}^M_t\big ],u^M_t-u^*_t\rangle dt\Big ]. \end{aligned} \end{aligned}$$

Here, the last equality uses the tower property of conditional expectation together with the fact that \(u^M_t-u_t^*\) is \(\mathcal{F}_t^M\)-measurable. Hence, when \(u^*_t\) lies in the interior of U, one has

$$\begin{aligned} (J^*)'_u(u^*_t)= {\mathbb {E}} \big [b^{*}_u(t)^\top Y_t+\sigma ^{*}_u(t)^\top Z_t+f^{*}_u(t)^\top \bigm |\mathcal{F}^M_t\big ], \quad t \in [0, T], \end{aligned}$$

as required in (8), where Y and Z are solutions of the FBSDE system (9).
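To illustrate how the gradient formula above can be evaluated numerically, the following sketch estimates \((J^*)'_u\) by Monte Carlo for a one-dimensional toy problem with additive noise (so that \(\sigma ^{*}_x=\sigma ^{*}_u=0\)) and with trivial observational information (an open-loop control), and then takes one plain gradient-descent step. It is a minimal sketch under these simplifying assumptions rather than the paper's implementation; the model, the regression-based conditional expectations, and all parameter values are illustrative.

```python
# Toy problem (illustrative assumptions, not from the paper):
#   dX_t = (a X_t + u_t) dt + sigma dW_t,   J(u) = E[ (1/2) \int_0^T u_t^2 dt + (1/2) X_T^2 ].
# Then b_u = 1, f_u = u_t, h_x = X_T, sigma_x = sigma_u = 0, and the adjoint BSDE reduces to
#   dY_t = -a Y_t dt + Z_t dW_t with Y_T = X_T, so the gradient formula gives
#   (J^*)'_u(u_t) = E[ Y_t | F_t^M ] + u_t   (here F_t^M is taken trivial: open loop).
import numpy as np

def gradient_of_cost(u, a=-1.0, sigma=0.3, T=1.0, n_paths=20_000, seed=0):
    """Monte Carlo estimate of the gradient process t -> (J^*)'_u(u_t) on a uniform grid,
    for a deterministic control path u (array of length n_steps)."""
    n_steps = len(u)
    dt = T / n_steps
    rng = np.random.default_rng(seed)

    # forward Euler-Maruyama simulation of the controlled state
    X = np.zeros((n_paths, n_steps + 1))
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    for k in range(n_steps):
        X[:, k + 1] = X[:, k] + (a * X[:, k] + u[k]) * dt + sigma * dW[:, k]

    # backward (implicit Euler) step for the linear adjoint BSDE:
    #   Y_k (1 - a*dt) = E[ Y_{k+1} | X_k ],
    # with the conditional expectation replaced by linear least-squares regression on X_k.
    Y = X[:, -1].copy()                         # terminal condition Y_T = h_x(X_T) = X_T
    grad = np.zeros(n_steps)
    for k in range(n_steps - 1, -1, -1):
        coeffs = np.polyfit(X[:, k], Y, deg=1)  # regression proxy for E[Y_{k+1} | X_k]
        Y = np.polyval(coeffs, X[:, k]) / (1.0 - a * dt)
        grad[k] = Y.mean() + u[k]               # E[b_u^T Y_t + f_u^T] with b_u = 1, f_u = u_t
    return grad

# usage: one gradient-descent step on the (open-loop) control
u = np.zeros(50)
u = u - 0.1 * gradient_of_cost(u)
```

Replacing the unconditional averages by conditional expectations with respect to \(\mathcal{F}^M_t\), estimated from the observational data, recovers the data driven setting treated in the paper.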

Cite this article

Archibald, R., Bao, F., Yong, J. et al. An Efficient Numerical Algorithm for Solving Data Driven Feedback Control Problems. J Sci Comput 85, 51 (2020). https://doi.org/10.1007/s10915-020-01358-y
