Integral Policy Iteration for Zero-Sum Games with Completely Unknown Nonlinear Dynamics

  • Conference paper
Neural Information Processing (ICONIP 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8226)

Abstract

In this paper, we develop a model-free integral policy iteration algorithm that learns online the Nash equilibrium solution of two-player zero-sum differential games with completely unknown nonlinear continuous-time dynamics. The algorithm updates the value function, the control policy, and the disturbance policy simultaneously. To implement it, three neural networks approximate the game value function, the control policy, and the disturbance policy, and the least squares method is used to estimate the unknown parameters of these networks. The effectiveness of the developed scheme is demonstrated by a simulation example.
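The least-squares step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it covers only policy evaluation for fixed control and disturbance policies, using the standard integral Bellman relation V(x(t)) - V(x(t+T)) = integral of the stage cost over [t, t+T]. The scalar plant, the gains `k` and `l`, the cost weights, and the polynomial basis standing in for the critic network are all hypothetical choices made for this example. The estimator sees only sampled states and integral costs, never the plant parameters.

```python
import numpy as np

# All numbers below are hypothetical; the plant is used only to generate data.
a, b, d = -1.0, 1.0, 0.5          # dx/dt = a*x + b*u + d*w
q, r, gamma = 1.0, 1.0, 2.0       # stage cost q*x^2 + r*u^2 - gamma^2*w^2
k, l = 0.5, 0.1                   # fixed policies u = -k*x, w = l*x


def features(x):
    """Polynomial basis standing in for the critic network's hidden layer."""
    return np.array([x**2, x**4])


def rollout(x0, T=0.1, steps=1000):
    """Euler-integrate the closed loop over [0, T]; return x(T) and the integral cost."""
    dt = T / steps
    x, cost = x0, 0.0
    for _ in range(steps):
        u, w = -k * x, l * x
        cost += (q * x**2 + r * u**2 - gamma**2 * w**2) * dt
        x += (a * x + b * u + d * w) * dt
    return x, cost


# One regression row per interval: (phi(x(t)) - phi(x(t+T))) @ weights = integral cost.
rows, targets = [], []
for x0 in np.linspace(-2.0, 2.0, 9):
    xT, cost = rollout(x0)
    rows.append(features(x0) - features(xT))
    targets.append(cost)

weights, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)

# Analytic reference for this linear-quadratic toy case: V(x) = p_exact * x^2.
a_cl = a - b * k + d * l
p_exact = (q + r * k**2 - gamma**2 * l**2) / (-2.0 * a_cl)
print("estimated weights:", weights)
print("analytic x^2 coefficient:", p_exact)
```

In this toy case the value function is exactly quadratic, so the weight on `x**2` should land close to the analytic coefficient and the weight on `x**4` close to zero. A full policy iteration would alternate this evaluation step with updates of both policies, which the sketch leaves out.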




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, H., Liu, D., Wang, D. (2013). Integral Policy Iteration for Zero-Sum Games with Completely Unknown Nonlinear Dynamics. In: Lee, M., Hirose, A., Hou, ZG., Kil, R.M. (eds) Neural Information Processing. ICONIP 2013. Lecture Notes in Computer Science, vol 8226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42054-2_29

  • DOI: https://doi.org/10.1007/978-3-642-42054-2_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-42053-5

  • Online ISBN: 978-3-642-42054-2

  • eBook Packages: Computer Science, Computer Science (R0)
