Abstract
We consider the long run average or ‘ergodic’ control of a discrete time Markov process subject to a probabilistic constraint, namely a bound on the exit rate from a bounded subset of the state space. This is a natural counterpart of the probabilistic constraints commonly imposed in finite horizon control problems. Using a recent characterization by Anantharam and the first author of the risk-sensitive reward as the value of an average cost ergodic control problem, this problem is mapped to a constrained ergodic control problem that seeks to maximize an ergodic reward subject to a constraint on another ergodic reward. However, unlike classical constrained ergodic reward/cost problems, this problem has some non-classical features due to a non-standard coupling between the primary ergodic reward and the one that gets constrained. This renders the problem inaccessible to standard solution methodologies. A brief discussion of possible ways out is included.
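For orientation, a schematic version of the two formulations involved is sketched below. The notation (controlled chain $X_n$ with actions $U_n$, running reward $r$, bounded set $A$ with exit time $\tau_A$, constraint level $\delta$, and running cost $c$ in the risk-sensitive functional) is not fixed in the abstract and is used here purely for illustration. The constrained problem is, schematically,
\[
\text{maximize}\quad \liminf_{N\to\infty}\frac{1}{N}\,E\Big[\sum_{m=0}^{N-1} r(X_m,U_m)\Big]
\quad\text{subject to}\quad
\limsup_{N\to\infty}\,-\frac{1}{N}\log P\big(\tau_A > N\big)\;\le\;\delta,
\]
where $\tau_A := \min\{n \ge 0 : X_n \notin A\}$, so the constraint bounds the exponential rate at which the process exits the bounded set $A$. The exit-rate functional is of risk-sensitive type, and a variational formula of the kind established by Anantharam and Borkar expresses such a value in a Donsker–Varadhan-like form, schematically
\[
\lim_{N\to\infty}\frac{1}{N}\log E\Big[\exp\Big(\sum_{m=0}^{N-1} c(X_m)\Big)\Big]
\;=\;\sup_{\mu}\Big\{\int c\,d\mu \;-\; \mathcal{I}(\mu)\Big\},
\]
with the supremum over suitable ergodic occupation measures $\mu$ and $\mathcal{I}$ an entropy-like penalty. It is this identification that recasts the exit-rate constraint as a second ergodic reward, coupled in a non-standard way to the primary one.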
References
Anantharam, V., Borkar, V. S.: A variational formula for risk-sensitive reward. SIAM Journal on Control and Optimization. 55(2), 961-988 (2017).
Andrieu, L., Cohen, J., Vázquez-Abad, F. J.: Gradient-based simulation optimization under probability constraints. European Journal of Operational Research. 212(2), 345-351 (2011).
Borkar, V. S.: Topics in Controlled Markov Chains. Pitman Research Notes in Mathematics No. 240, Longman Scientific and Technical, Harlow, UK (1991).
Borkar, V. S.: Probability Theory: An Advanced Course. Springer Verlag, New York (1995).
Borkar, V. S.: Convex analytic methods in Markov decision processes. In: Feinberg E. A., Shwartz A. (eds.) Handbook of Markov Decision Processes, pp. 347-375. Kluwer Academic Publishers, Norwell, Mass. (2002).
Borkar, V. S.: Stochastic Approximation: A Dynamical Systems Viewpoint. Hindustan Book Agency, New Delhi, and Cambridge University Press, Cambridge, UK (2008).
Danskin, J. M.: Theory of max-min, with applications. SIAM Journal on Applied Mathematics. 14(4), 641-664 (1966).
Hernández-Lerma, O., Lasserre, J. B.: Policy iteration for average cost Markov control processes on Borel spaces. Acta Applicandae Mathematicae. 47, 125-154 (1997).
Kang, B., Filar, J. A.: Time consistent dynamic risk measures. Mathematical Methods of Operations Research. 63(1), 169-186 (2006).
Krein, M. G., Rutman, M. A.: Linear operators leaving invariant a cone in Banach spaces. Uspekhi Mat. Nauk. 3(1), 3-95 (1948).
Meyn, S. P.: The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Transactions on Automatic Control. 42(12), 1663-1680 (1997).
Milgrom, P., Segal, I.: Envelope theorems for arbitrary choice sets. Econometrica. 70(2), 583-601 (2002).
Puterman, M.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley and Sons, Hoboken, NJ (1994).
© 2019 Springer Nature Switzerland AG
Cite this chapter
Borkar, V.S., Filar, J.A. (2019). Postponing Collapse: Ergodic Control with a Probabilistic Constraint. In: Yin, G., Zhang, Q. (eds) Modeling, Stochastic Control, Optimization, and Applications. The IMA Volumes in Mathematics and its Applications, vol 164. Springer, Cham. https://doi.org/10.1007/978-3-030-25498-8_3
DOI: https://doi.org/10.1007/978-3-030-25498-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25497-1
Online ISBN: 978-3-030-25498-8
eBook Packages: Mathematics and Statistics (R0)