Approximation of zero-sum continuous-time Markov games under the discounted payoff criterion

  • Original Paper
  • Journal: TOP

Abstract

We study a two-person zero-sum continuous-time Markov game \(\mathcal {G}\) with denumerable state space, general action spaces, and unbounded payoff and transition rates. We consider noncooperative equilibria for the discounted payoff criterion. We are interested in numerically approximating the value and the optimal strategies of \(\mathcal {G}\). To this end, we propose a definition of a sequence of game models \(\mathcal {G}_n\) converging to \(\mathcal {G}\), which ensures that the value and the optimal strategies of \(\mathcal {G}_n\) converge to those of \(\mathcal {G}\). For numerical purposes, we construct finite-state, finite-action game models \(\mathcal {G}_n\) that can be solved explicitly, and we study the convergence rate of the value of the games. A game model based on a population system illustrates our results.
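To make the idea of an explicitly solvable finite model concrete, the following is a minimal sketch, not the construction used in the paper. It assumes a toy birth-death population truncated at state n, two actions per player, and hypothetical rates and costs (`BIRTHS`, `DEATHS`, `EFFORT_1`, `EFFORT_2` are illustrative names and values only). The discounted value is computed by uniformization followed by value iteration on the resulting Shapley operator, a standard technique chosen here for brevity; each local 2x2 matrix game is solved in closed form.

```python
BIRTHS = (1.0, 2.0)    # hypothetical per-capita birth rates, chosen by player 1
DEATHS = (1.0, 3.0)    # hypothetical per-capita death rates, chosen by player 2
EFFORT_1 = (0.0, 0.4)  # hypothetical per-capita cost of player 1's actions
EFFORT_2 = (0.0, 0.5)  # hypothetical per-capita cost of player 2's actions


def matrix_game_value_2x2(m):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))  # best pure guarantee for the row player
    minimax = min(max(a, c), max(b, d))  # best pure guarantee for the column player
    if maximin == minimax:               # pure saddle point
        return maximin
    # No saddle point: both players mix, classical closed form.
    return (a * d - b * c) / (a + d - b - c)


def rates(i, a, b, n):
    """Up and down transition rates in state i of the model truncated at n."""
    up = BIRTHS[a] * i if i < n else 0.0  # truncation: no births at state n
    down = DEATHS[b] * i
    return up, down


def reward(i, a, b):
    """Payoff rate received by player 1 (and paid by player 2)."""
    return i * (1.0 - EFFORT_1[a] + EFFORT_2[b])


def generator_term(i, a, b, v, n):
    """r(i,a,b) + sum_j q_ij(a,b) v(j) for the truncated birth-death rates."""
    up, down = rates(i, a, b, n)
    gain_up = up * (v[i + 1] - v[i]) if i < n else 0.0
    gain_down = down * (v[i - 1] - v[i]) if i > 0 else 0.0
    return reward(i, a, b) + gain_up + gain_down


def solve_truncated_game(n, alpha, tol=1e-10, max_iter=100_000):
    """Value iteration for the uniformized discounted game on states 0..n."""
    lam = n * (max(BIRTHS) + max(DEATHS))  # uniformization constant >= max exit rate
    v = [0.0] * (n + 1)
    for _ in range(max_iter):
        new_v = []
        for i in range(n + 1):
            # Local matrix game of the uniformized (discrete-time) Shapley operator.
            m = [[(generator_term(i, a, b, v, n) + lam * v[i]) / (alpha + lam)
                  for b in range(2)] for a in range(2)]
            new_v.append(matrix_game_value_2x2(m))
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


if __name__ == "__main__":
    values = solve_truncated_game(n=10, alpha=1.0)
    print([round(x, 4) for x in values])
```

The iteration is a contraction with modulus lam / (alpha + lam), so it slows down as the truncation level n grows. With more than two actions per player, the closed-form 2x2 solver would have to be replaced by a linear program for each local matrix game.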

Notes

  1. This is not the usual statement of the Portmanteau theorem. Observe, however, that the function constructed in (Billingsley 1968, Theorem 1.2) is bounded and Lipschitz continuous, and then proceed as in the proof of (Billingsley 1968, Theorem 2.1). Another reference for this result is (Bogachev 2007, Remark 8.3.1).

References

  • Billingsley P (1968) Convergence of probability measures. Wiley, New York

  • Bogachev VI (2007) Measure theory, vol II. Springer, New York

  • Bolley F (2008) Separability and completeness for the Wasserstein distance. In: Séminaire de probabilités XLI. Lecture Notes in Math. 1934, Springer, Berlin, pp 371–377

  • Chang HS, Hu JQ, Fu MC, Marcus SI (2010) Adaptive adversarial multi-armed bandit approach to two-person zero-sum Markov games. IEEE Trans Automat Control 55:463–468

  • Frenk JBG, Kassay G, Kolumbán J (2004) On equivalent results in minimax theory. Eur J Oper Res 157:46–58

  • Guo XP, Hernández-Lerma O (2003) Zero-sum games for continuous-time Markov chains with unbounded transition and average payoff rates. J Appl Probab 40:327–345

  • Guo XP, Hernández-Lerma O (2003) Drift and monotonicity conditions for continuous-time controlled Markov chains with an average criterion. IEEE Trans Automat Control 48:236–244

  • Guo XP, Hernández-Lerma O (2005) Zero-sum continuous-time Markov games with unbounded transition and discounted payoff rates. Bernoulli 11:1009–1029

  • Guo XP, Hernández-Lerma O (2009) Continuous-time Markov decision processes: theory and applications. Springer, New York

  • Guo XP, Zhang WZ (2014) Convergence of controlled models and finite-state approximation for discounted continuous-time Markov decision processes with constraints. Eur J Oper Res 238:486–496

  • Jaśkiewicz A, Nowak AS (2006) Approximation of noncooperative semi-Markov games. J Optim Theory Appl 131:115–134

  • Nowak AS, Altman E (2002) \(\epsilon \)-equilibria for stochastic games with uncountable state space and unbounded costs. SIAM J Control Optim 40:1821–1839

  • Prieto-Rumeau T, Hernández-Lerma O (2012) Discounted continuous-time controlled Markov chains: convergence of control models. J Appl Probab 49:1072–1090

  • Prieto-Rumeau T, Hernández-Lerma O (2012) Selected topics on continuous-time controlled Markov chains and Markov games. Imperial College Press, London

  • Prieto-Rumeau T, Lorenzo JM (2010) Approximating ergodic average reward continuous-time controlled Markov chains. IEEE Trans Automat Control 55:201–207

Acknowledgments

Research supported by Grant MTM2012-31393 from the Spanish Ministerio de Economía y Competitividad.

Author information

Corresponding author

Correspondence to Tomás Prieto-Rumeau.

Cite this article

Prieto-Rumeau, T., Lorenzo, J.M. Approximation of zero-sum continuous-time Markov games under the discounted payoff criterion. TOP 23, 799–836 (2015). https://doi.org/10.1007/s11750-014-0354-8

  • DOI: https://doi.org/10.1007/s11750-014-0354-8
