
An Efficient Numerical Algorithm for Solving Data Driven Feedback Control Problems


Abstract

The goal of this paper is to solve numerically a class of stochastic optimal control problems in which the state process is governed by an Itô-type stochastic differential equation, the control enters both the drift and the diffusion, and the state is only partially observed. The feedback-form optimal control is determined from the available observational data; we call such problems data driven feedback control problems. The computational framework we introduce seeks the best estimate of the optimal control as a conditional expectation given the observational information. To make the method practical for providing timely feedback to the controlled system from data, we develop an efficient stochastic optimization algorithm to implement the computational framework.
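As a concrete (and heavily simplified) illustration of the conditional-expectation idea, the following sketch applies feedback to a scalar linear-Gaussian toy model by averaging a state-feedback law over a bootstrap particle filter that assimilates noisy partial observations. This is not the paper's algorithm; the model, the feedback gain K, and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, n_steps, n_particles = 0.02, 200, 1000
a, sigma, obs_std, K = -0.5, 0.2, 0.1, 1.5    # toy model: dX_t = (a X_t + u_t) dt + sigma dW_t

x_true = 1.0                                   # hidden true state
particles = rng.normal(1.0, 0.2, n_particles)  # ensemble representing the filtering density
weights = np.full(n_particles, 1.0 / n_particles)

for k in range(n_steps):
    # data driven feedback: u_t = E[-K X_t | observations] ~ -K * (weighted particle mean)
    u = -K * np.sum(weights * particles)

    # propagate the hidden state and the particle ensemble under the same control
    x_true += (a * x_true + u) * dt + sigma * np.sqrt(dt) * rng.normal()
    particles += (a * particles + u) * dt + sigma * np.sqrt(dt) * rng.normal(size=n_particles)

    # assimilate a noisy observation of the state (Bayesian reweighting)
    obs = x_true + obs_std * rng.normal()
    weights *= np.exp(-0.5 * ((obs - particles) / obs_std) ** 2)
    weights /= weights.sum()

    # resample when the effective sample size degenerates
    if 1.0 / np.sum(weights ** 2) < n_particles / 2:
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
        weights = np.full(n_particles, 1.0 / n_particles)
```

The point of the sketch is only the structure of the loop: estimate the conditional distribution of the state from the observational data, then apply the conditional expectation of the feedback law.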



Acknowledgements

This work is partially supported by the Scientific Discovery through Advanced Computing (SciDAC) program, funded by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, through the FASTMath Institute and the CompFUSE project. The second author also acknowledges support from the U.S. National Science Foundation under Contract DMS-1720222. The third author acknowledges partial support from NSF Grant DMS-1812921.

Corresponding author

Correspondence to Feng Bao.


Appendix: Derivation for the Gradient Process

In this appendix we give a detailed derivation of the gradient process (8). Throughout, for a generic function \(\psi \) we write \(\psi ^{*}(t) := \psi (t,X^*_t,u^*_t)\), and a subscript on a function denotes the corresponding partial derivative. To proceed, let U be convex and let \((X^*,u^*)\) be any state-control pair, which could be an optimal pair. For any \(u^M \in \mathcal{U}_{ad}[0,T]\), set \(u^{\varepsilon ,M}=u^*+\varepsilon (u^M-u^*)\). Then the Gâteaux derivative of \(u^M\mapsto J^*(u^M)\) at \(u^*\) is given by

$$\begin{aligned} \begin{array}{ll} \displaystyle \lim _{\varepsilon \rightarrow 0}{J^*(u^{\varepsilon ,M} )-J^*(u^*)\over \varepsilon }= {\mathbb {E}} \Big [\int _0^T\big (f^{*}_x(t) \mathcal {D}X_t+f^{*}_u(t)[u^M_t-u^{*}_t]\big )dt+h^{*}_x\mathcal{D}X_T\Big ], \end{array} \end{aligned}$$
(46)

where

$$\begin{aligned} \begin{array}{ll} \displaystyle \mathcal{D}X_t=\int _0^t\Big (b^{*}_x(s)\mathcal{D}X_s+b^{*}_u(s)[u^M_s-u^*_s]\Big )ds+\int _0^t\Big (\sigma ^{*}_x(s)\mathcal{D}X_s+\sigma ^{*}_u(s)[u^M_s-u^*_s]\Big )dW_s, \end{array} \end{aligned}$$

with \(\mathcal{D}X_0 = 0\). Let \((Y,Z,\zeta )\) be the adapted solution to the following adjoint backward stochastic differential equation (BSDE):

$$\begin{aligned} \displaystyle dY_s=\Big (-b^{*}_x(s)^\top Y_s-\sigma ^{*}_x(s)^\top Z_s-f^{*}_x(s)^\top \Big )ds+Z_s dW_s+\zeta _sdB_s,\qquad Y_T=(h^{*}_x)^\top , \end{aligned}$$

with \(h^{*}_x:=h_x(X^*_T)\), where \(Z\) and \(\zeta \) are the integrands in the martingale representations of \(Y\) with respect to \(W\) and \(B\), respectively. Then

$$\begin{aligned} \begin{aligned} \displaystyle h^{*}_x \mathcal {D}X_T=&\ \langle Y_T, \mathcal {D}X_T\rangle -\langle Y_0, \mathcal {D}X_0\rangle \\ \displaystyle =&\ \int _0^T\Big (\langle -b^{*}_x(t)^\top Y_t-\sigma ^{*}_x(t)^\top Z_t-f^{*}_x(t)^\top ,\mathcal {D}X_t\rangle +\langle Y_t,b^{*}_x(t)\mathcal {D}X_t+b^{*}_u(t)[u^M_t-u^*_t]\rangle \\&\ \qquad \qquad +\langle Z_t,\sigma ^{*}_x(t)\mathcal {D}X_t+\sigma ^{*}_u(t)[u^M_t-u^*_t]\rangle \Big )dt\\&\ \displaystyle \quad +\int _0^T\Big (\langle Z_t, \mathcal {D}X_t\rangle +\langle Y_t, \sigma ^{*}_x(t)\mathcal {D}X_t+\sigma ^{*}_u(t)[u^M_t-u^{*}_t]\rangle \Big )dW_t+\int _0^T\langle \zeta _t,\mathcal{D}X_t\rangle dB_t\\ \displaystyle =&\ \int _0^T\Big (-f^{*}_x(t)\mathcal {D}X_t+\langle b^{*}_u(t)^\top Y_t+\sigma ^{*}_u(t)^\top Z_t,u^M_t-u^*_t\rangle \Big )dt\\&\ \displaystyle \quad +\int _0^T\Big (\langle Z_t, \mathcal {D}X_t\rangle +\langle Y_t,\sigma ^{*}_x(t)\mathcal {D}X_t+\sigma ^{*}_u(t)[u^M_t-u^*_t]\rangle \Big )dW_t+\int _0^T\langle \zeta _t,\mathcal{D}X_t\rangle dB_t. \end{aligned} \end{aligned}$$

Substituting the above into the right-hand side of (46) and noting that the stochastic integrals have zero expectation, we obtain

$$\begin{aligned} \begin{aligned}&\lim _{\varepsilon \rightarrow 0}{J^*(u^{\varepsilon , M})-J^*(u^*)\over \varepsilon }\\ =&\ {\mathbb {E}} \Big [\int _0^T\big (f^{*}_x(t)\mathcal {D}X_t+f^{*}_u(t)[u^M_t-u^*_t]\big )dt+h^{*}_x\mathcal{D}X_T \Big ]\\ =&\ {\mathbb {E}} \Big [\int _0^T\langle b^{*}_u(t)^\top Y_t+\sigma ^{*}_u(t)^\top Z_t+f^{*}_u(t)^\top ,u^M_t-u^*_t\rangle dt\Big ]\\ =&\ {\mathbb {E}} \Big [\int _0^T\langle {\mathbb {E}} \big [b^{*}_u(t)^\top Y_t+\sigma ^{*}_u(t)^\top Z_t+f^{*}_u(t)^\top \bigm |\mathcal{F}^M_t\big ],u^M_t-u^*_t\rangle dt\Big ]. \end{aligned} \end{aligned}$$

Here, the last equality uses the tower property of conditional expectation together with the fact that \(u^M_t-u_t^*\) is \(\mathcal{F}_t^M\)-measurable. Hence, when \(u^*_t\) lies in the interior of U, one has

$$\begin{aligned} (J^*)'_u(u^*_t)= {\mathbb {E}} \big [b^{*}_u(t)^\top Y_t+\sigma ^{*}_u(t)^\top Z_t+f^{*}_u(t)^\top \bigm |\mathcal{F}^M_t\big ], \quad t \in [0, T], \end{aligned}$$

as required in (8), where Y and Z are solutions of the FBSDE system (9).
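To illustrate how the gradient formula above can be evaluated numerically, the following sketch estimates \((J^*)'_u\) by Monte Carlo for a one-dimensional toy problem with additive noise (so that \(\sigma ^{*}_x=\sigma ^{*}_u=0\)) and with trivial observational information (an open-loop control), and then takes one plain gradient-descent step. It is a minimal sketch under these simplifying assumptions rather than the paper's implementation; the model, the regression-based conditional expectations, and all parameter values are illustrative.

```python
# Toy problem (illustrative assumptions, not from the paper):
#   dX_t = (a X_t + u_t) dt + sigma dW_t,   J(u) = E[ (1/2) \int_0^T u_t^2 dt + (1/2) X_T^2 ].
# Then b_u = 1, f_u = u_t, h_x = X_T, sigma_x = sigma_u = 0, and the adjoint BSDE reduces to
#   dY_t = -a Y_t dt + Z_t dW_t with Y_T = X_T, so the gradient formula gives
#   (J^*)'_u(u_t) = E[ Y_t | F_t^M ] + u_t   (here F_t^M is taken trivial: open loop).
import numpy as np

def gradient_of_cost(u, a=-1.0, sigma=0.3, T=1.0, n_paths=20_000, seed=0):
    """Monte Carlo estimate of the gradient process t -> (J^*)'_u(u_t) on a uniform grid,
    for a deterministic control path u (array of length n_steps)."""
    n_steps = len(u)
    dt = T / n_steps
    rng = np.random.default_rng(seed)

    # forward Euler-Maruyama simulation of the controlled state
    X = np.zeros((n_paths, n_steps + 1))
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    for k in range(n_steps):
        X[:, k + 1] = X[:, k] + (a * X[:, k] + u[k]) * dt + sigma * dW[:, k]

    # backward (implicit Euler) step for the linear adjoint BSDE:
    #   Y_k (1 - a*dt) = E[ Y_{k+1} | X_k ],
    # with the conditional expectation replaced by linear least-squares regression on X_k.
    Y = X[:, -1].copy()                         # terminal condition Y_T = h_x(X_T) = X_T
    grad = np.zeros(n_steps)
    for k in range(n_steps - 1, -1, -1):
        coeffs = np.polyfit(X[:, k], Y, deg=1)  # regression proxy for E[Y_{k+1} | X_k]
        Y = np.polyval(coeffs, X[:, k]) / (1.0 - a * dt)
        grad[k] = Y.mean() + u[k]               # E[b_u^T Y_t + f_u^T] with b_u = 1, f_u = u_t
    return grad

# usage: one gradient-descent step on the (open-loop) control
u = np.zeros(50)
u = u - 0.1 * gradient_of_cost(u)
```

Replacing the unconditional averages by conditional expectations with respect to \(\mathcal{F}^M_t\), estimated from the observational data, recovers the data driven setting treated in the paper.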

Cite this article

Archibald, R., Bao, F., Yong, J. et al. An Efficient Numerical Algorithm for Solving Data Driven Feedback Control Problems. J Sci Comput 85, 51 (2020). https://doi.org/10.1007/s10915-020-01358-y
