Continuity of the optimal average cost in Markov decision chains with small risk-sensitivity

Chávez-Rodríguez, Selene; Cavazos-Cadena, Rolando; Cruz-Suárez, Hugo

doi:10.1007/s00186-015-0496-y

Original Article
Published: 22 February 2015

Volume 81, pages 269–298, (2015)

Mathematical Methods of Operations Research

Selene Chávez-Rodríguez¹,
Rolando Cavazos-Cadena² &
Hugo Cruz-Suárez¹

Abstract

This note concerns discrete-time controlled Markov chains driven by a decision maker with constant risk-sensitivity \(\lambda \). Assuming that the system evolves on a denumerable state space and is endowed with a bounded cost function, the paper analyzes the continuity of the optimal average cost with respect to the risk-sensitivity parameter, a property that is readily seen to hold at each non-null value of \(\lambda \). Under standard continuity-compactness conditions, it is shown that a general form of the simultaneous Doeblin condition makes it possible to establish the continuity of the optimal average cost at \(\lambda = 0\). Explicit examples are given to show that, even if every state is positive recurrent under the action of any stationary policy, this continuity conclusion cannot be ensured under weaker recurrence requirements, such as the Lyapunov function condition.
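
The continuity question at \(\lambda = 0\) can be made concrete on a toy case. The sketch below is not taken from the paper: it assumes a finite, irreducible, uncontrolled chain (so no minimization over policies takes place) and uses the standard exponential-utility formulation, in which the risk-sensitive average cost for \(\lambda \ne 0\) equals \((1/\lambda )\log \rho (Q_\lambda )\), where \(Q_\lambda (x,y) = e^{\lambda c(x)}P(x,y)\) and \(\rho \) is the spectral radius. The matrix `P`, the cost vector `c`, and both function names are hypothetical illustrations.

```python
import numpy as np

def risk_sensitive_average_cost(P, c, lam):
    """Return (1/lam) * log(spectral radius of diag(exp(lam*c)) @ P), lam != 0."""
    # Row x of P is scaled by exp(lam * c[x]); the growth rate of Q^n is rho(Q).
    Q = np.exp(lam * c)[:, None] * P
    rho = np.max(np.abs(np.linalg.eigvals(Q)))
    return np.log(rho) / lam

def risk_neutral_average_cost(P, c):
    """Return pi @ c, where pi is the stationary distribution of P."""
    n = P.shape[0]
    # Solve pi (P - I) = 0 together with sum(pi) = 1 in the least-squares sense.
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    return float(pi @ c)

# Hypothetical three-state irreducible chain with a bounded cost.
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.5, 0.3],
              [0.4, 0.0, 0.6]])
c = np.array([0.0, 1.0, 2.0])

j0 = risk_neutral_average_cost(P, c)
for lam in (1.0, 0.1, 0.01, -0.01, -0.1, -1.0):
    # The gap to the risk-neutral value shrinks as lam approaches 0.
    print(lam, risk_sensitive_average_cost(P, c, lam) - j0)
```

In this finite setting the gap vanishes as \(\lambda \rightarrow 0\). The paper addresses denumerable state spaces with controls, where, according to the abstract, this limit behavior can fail without a Doeblin-type condition.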

Acknowledgments

The authors are grateful to the reviewers for their careful reading of the original manuscript and their helpful suggestions to improve the paper.

Author information

Authors and Affiliations

Facultad de Ciencias Físico-Matemáticas, Ave. San Claudio y Río Verde, Col. San Manuel CU, Benemérita Universidad Autónoma de Puebla, 72570, Puebla, PUE, Mexico
Selene Chávez-Rodríguez & Hugo Cruz-Suárez
Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Boulevard Antonio Narro 1923, Buenavista, 25315, Saltillo, COAH, Mexico
Rolando Cavazos-Cadena

Corresponding author

Correspondence to Rolando Cavazos-Cadena.

Additional information

This work was supported by PSFO under Grant No. 14-300-01, and by PRODEP under Grant No. 17332-UAAAN-CA-23.

About this article

Cite this article

Chávez-Rodríguez, S., Cavazos-Cadena, R. & Cruz-Suárez, H. Continuity of the optimal average cost in Markov decision chains with small risk-sensitivity. Math Meth Oper Res 81, 269–298 (2015). https://doi.org/10.1007/s00186-015-0496-y

Received: 05 October 2014
Accepted: 10 February 2015
Published: 22 February 2015
Issue Date: June 2015
DOI: https://doi.org/10.1007/s00186-015-0496-y
