Abstract
We deal with a two-person zero-sum continuous-time Markov game \(\mathcal {G}\) with denumerable state space, general action spaces, and unbounded payoff and transition rates. We consider noncooperative equilibria for the discounted payoff criterion. We are interested in approximating numerically the value and the optimal strategies of \(\mathcal {G}\). To this end, we propose a definition of a sequence of game models \(\mathcal {G}_n\) converging to \(\mathcal {G}\), which ensures that the value and the optimal strategies of \(\mathcal {G}_n\) converge to those of \(\mathcal {G}\). For numerical purposes, we construct finite state and action game models \(\mathcal {G}_n\) that can be explicitly solved, and we study the convergence rate of the value of the games. A game model based on a population system illustrates our results.
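The finite approximating games \(\mathcal {G}_n\) mentioned in the abstract can be solved explicitly by standard dynamic-programming methods. As a hedged illustration (not the authors' construction), the sketch below applies Shapley's value iteration to a generic finite-state, finite-action *discrete-time* discounted zero-sum stochastic game; a continuous-time model would first be reduced to such a form, e.g. by uniformization. The function names `matrix_game_value` and `shapley_iteration`, and the data layout of `r` and `P`, are illustrative choices, not notation from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value of the zero-sum matrix game with payoff matrix A
    (row player maximizes), computed via a small linear program."""
    m, n = A.shape
    # Variables: x_1..x_m (row player's mixed strategy) and v (game value).
    # Minimize -v subject to: v - sum_i A[i,j] x_i <= 0 for each column j,
    # sum_i x_i = 1, x_i >= 0.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=np.array([1.0]),
                  bounds=bounds)
    return -res.fun

def shapley_iteration(r, P, alpha, tol=1e-8):
    """Value iteration for a finite discounted zero-sum stochastic game.

    r[s]  : payoff matrix at state s (player 1's actions x player 2's actions)
    P[s]  : transition tensor with P[s][a, b, s'] = prob. of jumping to s'
    alpha : discount factor in (0, 1)
    """
    S = len(r)
    V = np.zeros(S)
    while True:
        # One Shapley update: solve the matrix game at each state.
        V_new = np.array([matrix_game_value(r[s] + alpha * P[s] @ V)
                          for s in range(S)])
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
```

For instance, a single-state game with constant payoff 1 and discount factor \(\alpha = 0.5\) has value \(1/(1-\alpha) = 2\), which the iteration recovers as the fixed point of \(V \mapsto 1 + \alpha V\).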
Notes
This is not the usual statement of the Portmanteau theorem. Observe, however, that the function constructed in (Billingsley 1968, Theorem 1.2) is bounded and Lipschitz continuous, and then proceed as in the proof of (Billingsley 1968, Theorem 2.1). Another reference for this result is (Bogachev 2007, Remark 8.3.1).
References
Billingsley P (1968) Convergence of probability measures. Wiley, New York
Bogachev VI (2007) Measure theory, vol II. Springer, New York
Bolley F (2008) Separability and completeness for the Wasserstein distance. In: Séminaire de probabilités XLI. Lecture Notes in Math. 1934, Springer, Berlin, pp 371–377
Chang HS, Hu JQ, Fu MC, Marcus SI (2010) Adaptive adversarial multi-armed bandit approach to two-person zero-sum Markov games. IEEE Trans Automat Control 55:463–468
Frenk JBG, Kassay G, Kolumbán J (2004) On equivalent results in minimax theory. Eur J Oper Res 157:46–58
Guo XP, Hernández-Lerma O (2003) Zero-sum games for continuous-time Markov chains with unbounded transition and average payoff rates. J Appl Probab 40:327–345
Guo XP, Hernández-Lerma O (2003) Drift and monotonicity conditions for continuous-time controlled Markov chains with an average criterion. IEEE Trans Automat Control 48:236–244
Guo XP, Hernández-Lerma O (2005) Zero-sum continuous-time Markov games with unbounded transition and discounted payoff rates. Bernoulli 11:1009–1029
Guo XP, Hernández-Lerma O (2009) Continuous-time Markov decision processes: theory and applications. Springer, New York
Guo XP, Zhang WZ (2014) Convergence of controlled models and finite-state approximation for discounted continuous-time Markov decision processes with constraints. Eur J Oper Res 238:486–496
Jaśkiewicz A, Nowak AS (2006) Approximation of noncooperative semi-Markov games. J Optim Theory Appl 131:115–134
Nowak AS, Altman E (2002) \(\epsilon \)-equilibria for stochastic games with uncountable state space and unbounded costs. SIAM J Control Optim 40:1821–1839
Prieto-Rumeau T, Hernández-Lerma O (2012) Discounted continuous-time controlled Markov chains: convergence of control models. J Appl Probab 49:1072–1090
Prieto-Rumeau T, Hernández-Lerma O (2012) Selected topics on continuous-time controlled Markov chains and Markov games. Imperial College Press, London
Prieto-Rumeau T, Lorenzo JM (2010) Approximating ergodic average reward continuous-time controlled Markov chains. IEEE Trans Automat Control 55:201–207
Acknowledgments
Research supported by Grant MTM2012-31393 from the Spanish Ministerio de Economía y Competitividad.
Cite this article
Prieto-Rumeau, T., Lorenzo, J.M. Approximation of zero-sum continuous-time Markov games under the discounted payoff criterion. TOP 23, 799–836 (2015). https://doi.org/10.1007/s11750-014-0354-8