Stochastic Differential Games: A Sampling Approach via FBSDEs

  • Ioannis Exarchos
  • Evangelos Theodorou
  • Panagiotis Tsiotras

Abstract

This work presents a sampling-based algorithm for solving various classes of stochastic differential games. The approach rests on formulating the game solution in terms of a decoupled pair of forward and backward stochastic differential equations (FBSDEs). Using the nonlinear version of the Feynman–Kac lemma, probabilistic representations are obtained for the solutions of the nonlinear Hamilton–Jacobi–Isaacs equations that arise for each class. These representations take the form of decoupled systems of FBSDEs, which may be solved numerically.
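To make the idea of solving a decoupled FBSDE by sampling concrete, the following is a minimal sketch on a toy one-dimensional problem. It is not the authors' algorithm: the dynamics (dX = dW), the terminal cost g(x) = x^2/2, the quadratic driver h(z) = -z^2/2, the polynomial regression basis, and all function names are assumptions chosen for illustration, picked because the associated PDE has a closed-form solution via the Cole–Hopf transformation. The forward SDE is simulated first, and the backward pair (Y, Z) is then estimated step by step with least-squares regression standing in for conditional expectations.

```python
import numpy as np


def solve_fbsde_lsmc(x0=1.0, T=1.0, n_steps=20, n_paths=50_000, seed=0):
    """Estimate (Y_0, Z_0) for the toy decoupled FBSDE (illustrative only):

        dX = dW,                 X_0 = x0
        dY = -h(Z) dt + Z dW,    Y_T = g(X_T)

    with g(x) = x^2 / 2 and h(z) = -z^2 / 2 (both assumed for this sketch).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    g = lambda x: 0.5 * x**2
    h = lambda z: -0.5 * z**2

    # Forward pass: sample Brownian increments and build the paths of X.
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    X = np.empty((n_paths, n_steps + 1))
    X[:, 0] = x0
    X[:, 1:] = x0 + np.cumsum(dW, axis=1)

    def cond_exp(x, target):
        # Approximate E[target | X_i = x] by least squares on {1, x, x^2}.
        basis = np.column_stack([np.ones_like(x), x, x**2])
        coef, *_ = np.linalg.lstsq(basis, target, rcond=None)
        return basis @ coef

    # Backward pass: start from the terminal condition and step back in time.
    Y = g(X[:, -1])
    for i in range(n_steps - 1, 0, -1):
        Z = cond_exp(X[:, i], Y * dW[:, i]) / dt   # Z_i ~ E[Y_{i+1} dW_i | X_i] / dt
        Y = cond_exp(X[:, i], Y + h(Z) * dt)       # Y_i ~ E[Y_{i+1} + h(Z_i) dt | X_i]

    # At t = 0 the state is deterministic, so plain sample means suffice.
    z0 = np.mean(Y * dW[:, 0]) / dt
    y0 = np.mean(Y) + h(z0) * dt
    return y0, z0


def exact_y0(x0=1.0, T=1.0):
    # Closed form for this toy problem: v(0, x0) = -log E[exp(-g(x0 + W_T))].
    return 0.5 * np.log(1.0 + T) + 0.5 * x0**2 / (1.0 + T)


if __name__ == "__main__":
    y0, z0 = solve_fbsde_lsmc()
    print("estimated Y_0:", y0, "exact:", exact_y0())
    print("estimated Z_0:", z0)
```

In this sketch Y_0 approximates the value function at the initial state and Z_0 approximates its spatial gradient (scaled by the unit diffusion), which is the quantity a feedback policy would typically be built from. The quadratic basis is adequate here only because the exact solution of this toy problem is quadratic in the state; other problems would generally need a richer basis.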

Keywords

  • Stochastic differential games
  • Forward and backward stochastic differential equations
  • Numerical methods
  • Iterative algorithms

Notes

Acknowledgements

Funding was provided by the Army Research Office (W911NF-16-1-0390) and the National Science Foundation (CMMI-1662523).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Aerospace Engineering, Georgia Institute of Technology, Atlanta, USA
