Stochastic Differential Games: A Sampling Approach via FBSDEs

  • Ioannis Exarchos
  • Evangelos Theodorou
  • Panagiotis Tsiotras
Article

Abstract

The aim of this work is to present a sampling-based algorithm designed to solve various classes of stochastic differential games. The foundation of the proposed approach lies in the formulation of the game solution in terms of a decoupled pair of forward and backward stochastic differential equations (FBSDEs). In light of the nonlinear version of the Feynman–Kac lemma, probabilistic representations of solutions to the nonlinear Hamilton–Jacobi–Isaacs equations that arise for each class are obtained. These representations are in the form of decoupled systems of FBSDEs, which may be solved numerically.
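
To make the idea of solving a decoupled FBSDE by sampling concrete, the following is a minimal sketch on a toy problem. It is not the algorithm of the article. It assumes a scalar forward SDE dX = sigma dW, a terminal condition Y_T = X_T^2, and a quadratic driver, so that Y_t = v(t, X_t) where v solves v_t + (1/2) sigma^2 v_xx - (1/2) sigma^2 v_x^2 = 0. All names (`solve_fbsde_lsmc`, `closed_form_value`), the quadratic polynomial basis, and the parameter values are hypothetical choices made for illustration. The backward pass uses an explicit Euler step with least-squares regression to approximate conditional expectations.

```python
import numpy as np


def closed_form_value(x0=1.0, sigma=0.5, T=1.0):
    """Reference value v(0, x0) = -log E[exp(-X_T^2)] for the toy problem.

    Obtained from the substitution psi = exp(-v), which turns the
    nonlinear PDE into the heat equation; X_T ~ N(x0, sigma^2 T).
    """
    s2 = sigma**2 * T
    return 0.5 * np.log(1.0 + 2.0 * s2) + x0**2 / (1.0 + 2.0 * s2)


def solve_fbsde_lsmc(x0=1.0, sigma=0.5, T=1.0, n_steps=50,
                     n_paths=50_000, seed=0):
    """Estimate (Y_0, Z_0) for the toy decoupled FBSDE

        dX = sigma dW,              X_0 = x0        (forward)
        dY = 0.5 * Z^2 dt + Z dW,   Y_T = X_T^2     (backward)

    using forward sampling followed by backward regression.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps

    # Forward pass: sample paths of X; dW[i] drives the step t_i -> t_{i+1}.
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_steps, n_paths))
    X = np.empty((n_steps + 1, n_paths))
    X[0] = x0
    X[1:] = x0 + sigma * np.cumsum(dW, axis=0)

    # Backward pass: start from the terminal condition.
    Y = X[-1] ** 2
    for i in range(n_steps - 1, 0, -1):
        # Assumed basis: 1, x, x^2 (np.vander orders columns as x^2, x, 1).
        basis = np.vander(X[i], 3)
        # Z_i ~ E[Y_{i+1} dW_i | X_i] / dt, approximated by regression.
        coef_z, *_ = np.linalg.lstsq(basis, Y * dW[i] / dt, rcond=None)
        # E[Y_{i+1} | X_i], approximated by regression on the same basis.
        coef_y, *_ = np.linalg.lstsq(basis, Y, rcond=None)
        Z = basis @ coef_z
        # Explicit Euler step of the BSDE with driver h(z) = -0.5 * z^2.
        Y = basis @ coef_y - 0.5 * Z**2 * dt

    # At t_0 the state is deterministic, so conditional expectations
    # reduce to plain sample means.
    Z0 = np.mean(Y * dW[0]) / dt
    Y0 = np.mean(Y) - 0.5 * Z0**2 * dt
    return Y0, Z0
```

For the default parameters the reference value `closed_form_value()` is approximately 0.869, and the estimate returned by `solve_fbsde_lsmc()` should fall close to it, up to time-discretization and Monte Carlo error. The quadratic basis is sufficient here only because the value function of this toy problem is itself quadratic in the state; other problems would generally need a richer basis.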

Keywords

  • Stochastic differential games
  • Forward and backward stochastic differential equations
  • Numerical methods
  • Iterative algorithms

Notes

Acknowledgements

Funding was provided by the Army Research Office (W911NF-16-1-0390) and the National Science Foundation (CMMI-1662523).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Aerospace Engineering, Georgia Institute of Technology, Atlanta, USA