Stochastic Differential Games: A Sampling Approach via FBSDEs

Exarchos, Ioannis; Theodorou, Evangelos; Tsiotras, Panagiotis

doi:10.1007/s13235-018-0268-4

Published: 11 June 2018

Dynamic Games and Applications, Volume 9, pages 486–505 (2019)

Ioannis Exarchos¹ (ORCID: orcid.org/0000-0002-5836-4750), Evangelos Theodorou¹ & Panagiotis Tsiotras¹

Abstract

This work presents a sampling-based algorithm for solving various classes of stochastic differential games. The approach rests on formulating the game solution in terms of a decoupled pair of forward and backward stochastic differential equations (FBSDEs). Using the nonlinear version of the Feynman–Kac lemma, probabilistic representations of solutions to the nonlinear Hamilton–Jacobi–Isaacs equations that arise for each class are obtained. These representations take the form of decoupled systems of FBSDEs, which may be solved numerically.
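
For orientation, a decoupled FBSDE system of the kind the abstract refers to is commonly written as below. This is the generic textbook form, given as a sketch and not necessarily the exact formulation used in the article: the symbols \(b\), \(\varSigma\), \(g\) and \(v\) are assumptions introduced here, while \(h\) is the driver that appears in the Notes.

```latex
\begin{aligned}
% Forward SDE: independent of Y and Z, hence "decoupled"
\mathrm{d}X_s &= b(s, X_s)\,\mathrm{d}s + \varSigma(s, X_s)\,\mathrm{d}W_s, & X_t &= x, \\
% Backward SDE with driver h and terminal condition g
\mathrm{d}Y_s &= -h(s, X_s, Y_s, Z_s)\,\mathrm{d}s + Z_s^{\top}\,\mathrm{d}W_s, & Y_T &= g(X_T), \\
% Nonlinear Feynman-Kac: link to the solution v of the associated PDE
Y_s &= v(s, X_s), & Z_s &= \varSigma^{\top}(s, X_s)\,\nabla_x v(s, X_s), \\
% Associated semilinear parabolic PDE (assumed generic form)
0 &= \partial_t v + \tfrac{1}{2}\operatorname{tr}\!\big(\varSigma\varSigma^{\top}\nabla_{xx} v\big)
     + \nabla_x v^{\top} b + h\big(t, x, v, \varSigma^{\top}\nabla_x v\big), & v(T, x) &= g(x).
\end{aligned}
```

Under this sign convention, integrating the backward equation from \(t_i\) to \(t_{i+1}\) gives \(Y_{t_i} = Y_{t_{i+1}} + \int_{t_i}^{t_{i+1}} h\,\mathrm{d}s - \int_{t_i}^{t_{i+1}} Z_s^{\top}\mathrm{d}W_s\), which agrees with the discrete update quoted in the Notes once the conditional expectation removes the stochastic integral.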

Notes

A process \(H_s\) is called square-integrable if \( \mathbb {E}\big [\int _{t}^{T}H_s^2\mathrm {d}s\big ] < \infty \) for any \(T>t\).
The Isaacs condition renders the viscosity solutions of the upper and lower value functions equal, thus making the order of maximization/minimization inconsequential.
While X is a function of s and \(\omega \), we shall use \(X_s\) for notational brevity.
Here, \(Y_i^m\) denotes the quantity \(Y^m_{i+1} +\varDelta t_ih(t_{i+1},X^m_{i+1},Y^m_{i+1},Z^m_{i+1})\), which is the \(Y^m_i\) sample value before the conditional expectation operator has been applied.
Whenever the index m is omitted, the full collection of values over this index is understood.
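
The backward update described in the notes above can be sketched in code. What follows is a minimal, scalar least-squares Monte Carlo sketch and not the authors' algorithm: the function name `solve_fbsde`, the polynomial regression basis, the Euler-Maruyama forward pass, and the use of the fitted polynomial's derivative to estimate \(Z\) are all assumptions made for illustration. Only the update \(Y^m_{i+1} + \varDelta t_i\,h(t_{i+1},X^m_{i+1},Y^m_{i+1},Z^m_{i+1})\), followed by a conditional expectation, is taken from the notes.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def solve_fbsde(x0, T, n_steps, n_samples, b, sigma, h, g, dg, degree=2, seed=0):
    """Estimate Y at the initial time for a scalar decoupled FBSDE (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)

    # Forward pass: Euler-Maruyama samples X[i, m] of the forward SDE.
    X = np.empty((n_steps + 1, n_samples))
    X[0] = x0
    for i in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), n_samples)
        X[i + 1] = X[i] + b(t[i], X[i]) * dt + sigma(t[i], X[i]) * dW

    # Terminal condition: Y_N = g(X_N), Z_N = sigma * g'(X_N).
    Y = g(X[-1])
    Z = sigma(t[-1], X[-1]) * dg(X[-1])

    # Backward pass over the time grid.
    for i in range(n_steps - 1, -1, -1):
        # Sample values before the conditional expectation is applied.
        Y_pre = Y + dt * h(t[i + 1], X[i + 1], Y, Z)
        if i == 0:
            # Every sample starts at x0, so the conditional expectation is a plain mean.
            return float(np.mean(Y_pre))
        # Conditional expectation approximated by polynomial least-squares regression on X_i.
        coeffs = P.polyfit(X[i], Y_pre, degree)
        Y = P.polyval(X[i], coeffs)
        # Z estimated as sigma times the derivative of the fitted polynomial.
        Z = sigma(t[i], X[i]) * P.polyval(X[i], P.polyder(coeffs))
```

As a usage illustration under these assumptions, calling `solve_fbsde(1.0, 1.0, 20, 20000, lambda t, x: 0.0, lambda t, x: 1.0, lambda t, x, y, z: -0.1 * y, lambda x: x**2, lambda x: 2 * x)` treats a driftless unit-variance forward process with a linear discounting driver; the estimate should land near \(e^{-0.1}(x_0^2 + T)\), up to discretization and sampling error. A low-degree global polynomial basis is the simplest choice and may be inadequate for strongly nonlinear value functions.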

Acknowledgements

Funding was provided by Army Research Office (W911NF-16-1-0390) and National Science Foundation (CMMI-1662523).

Author information

Authors and Affiliations

Department of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0150, USA
Ioannis Exarchos, Evangelos Theodorou & Panagiotis Tsiotras

Corresponding author

Correspondence to Ioannis Exarchos.

About this article

Cite this article

Exarchos, I., Theodorou, E. & Tsiotras, P. Stochastic Differential Games: A Sampling Approach via FBSDEs. Dyn Games Appl 9, 486–505 (2019). https://doi.org/10.1007/s13235-018-0268-4

Issue Date: 15 June 2019
