Discounted Markov games: Generalized policy iteration method

  • Contributed Papers
Journal of Optimization Theory and Applications

Abstract

In this paper, we consider two-person zero-sum discounted Markov games with finite state and action spaces. We show that the Newton-Raphson or policy iteration method as presented by Pollatschek and Avi-Itzhak does not necessarily converge, contradicting a proof of Rao, Chandrasekaran, and Nair. Moreover, a set of successive approximation algorithms is presented of which Shapley's method and a total-expected-rewards version of Hoffman and Karp's method are the extreme elements.
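
As a rough illustration of the successive approximation idea named in the abstract, the following is a minimal sketch of Shapley-style value iteration for a discounted zero-sum Markov game. It is not taken from the paper. It assumes, purely for brevity, that each player has exactly two actions in every state, so that each one-stage matrix game can be solved in closed form; larger action sets would need a linear program instead. The names `value_2x2`, `shapley_iteration`, `reward`, `trans`, and `beta` are hypothetical.

```python
def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game; the row player maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # A pure saddle point exists.
        return maximin
    # No saddle point: both players mix, and the standard closed form applies.
    return (a * d - b * c) / ((a - b) + (d - c))


def shapley_iteration(reward, trans, beta, tol=1e-10, max_iter=10_000):
    """Successive approximation of the game value (assumed layout, see below).

    reward[s][a][b]   : immediate payoff to player 1 in state s
    trans[s][a][b][t] : probability of moving from s to t under actions (a, b)
    beta              : discount factor, 0 <= beta < 1
    """
    n = len(reward)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = []
        for s in range(n):
            # One-stage game: immediate payoff plus discounted continuation.
            game = [
                [
                    reward[s][a][b]
                    + beta * sum(trans[s][a][b][t] * v[t] for t in range(n))
                    for b in range(2)
                ]
                for a in range(2)
            ]
            new_v.append(value_2x2(game))
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical one-state example: repeated matching pennies.
pennies_reward = [[[1.0, -1.0], [-1.0, 1.0]]]
pennies_trans = [[[[1.0], [1.0]], [[1.0], [1.0]]]]
pennies_value = shapley_iteration(pennies_reward, pennies_trans, beta=0.9)
```

Each sweep applies a contraction with modulus `beta`, which is why this scheme converges from any starting vector; the policy iteration variant discussed in the abstract has no such guarantee.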

References

  1. Shapley, L. S., Stochastic Games, Proceedings of the National Academy of Sciences USA, Vol. 39, pp. 1095–1100, 1953.

  2. Zachrisson, L. E., Markov Games, Advances in Game Theory, Edited by M. Dresher, L. S. Shapley, and A. W. Tucker, Princeton University Press, Princeton, New Jersey, pp. 211–253, 1964.

  3. Charnes, A., and Schroeder, R. G., On Some Stochastic Tactical Anti-Submarine Games, Naval Research Logistics Quarterly, Vol. 14, pp. 291–311, 1967.

  4. MacQueen, J., A Modified Dynamic Programming Method for Markovian Decision Problems, Journal of Mathematical Analysis and Applications, Vol. 14, pp. 38–43, 1966.

  5. Porteus, E. L., Some Bounds for Discounted Sequential Decision Processes, Management Science, Vol. 18, pp. 7–11, 1971.

  6. Van der Wal, J., Discounted Markov Games; Successive Approximations and Stopping Times, International Journal of Game Theory, Vol. 6, pp. 11–22, 1977.

  7. Howard, R. A., Dynamic Programming and Markov Processes, MIT Press, Cambridge, Massachusetts, 1960.

  8. Hoffman, A. J., and Karp, R. M., On Nonterminating Stochastic Games, Management Science, Vol. 12, pp. 359–370, 1966.

  9. Pollatschek, M. A., and Avi-Itzhak, B., Algorithms for Stochastic Games with Geometrical Interpretation, Management Science, Vol. 15, pp. 399–415, 1969.

  10. Van der Wal, J., and Wessels, J., On Markov Games, Statistica Neerlandica, Vol. 30, pp. 51–71, 1976.

  11. Rao, S. S., Chandrasekaran, R., and Nair, K. P. K., Algorithms for Discounted Stochastic Games, Journal of Optimization Theory and Applications, Vol. 11, pp. 627–637, 1973.

  12. Van Nunen, J. A. E. E., A Set of Successive Approximation Methods for Discounted Markov Decision Processes, Zeitschrift für Operations Research, Vol. 30, pp. 203–208, 1976.

  13. Van Nunen, J. A. E. E., Contracting Markov Decision Processes, Mathematical Centre Tract 71, Mathematisch Centrum, Amsterdam, Holland, 1976.

  14. Wessels, J., Stopping Times and Markov Programming, Transactions of the Seventh Prague Conference and 1974 EMS, Academia, Prague, Czechoslovakia, pp. 575–585, 1977.

Additional information

Communicated by R. A. Howard

About this article

Cite this article

Van der Wal, J. Discounted Markov games: Generalized policy iteration method. J Optim Theory Appl 25, 125–138 (1978). https://doi.org/10.1007/BF00933260

  • DOI: https://doi.org/10.1007/BF00933260
