On maximizing probabilities for over-performing a target for Markov decision processes

  • Research Article
Optimization and Engineering

Abstract

This paper studies the duality between risk-sensitive control and large deviation control, where the latter maximizes the probability of out-performing a target, for Markov decision processes. To derive the duality, we apply a non-linear extension of the Krein-Rutman theorem to characterize the optimal risk-sensitive value and prove that a stationary, deterministic optimal policy exists. The right-hand derivative of this value function characterizes the specific targets for which the duality holds. We prove that the optimal policy for the “out-performing” probability can be approximated by the optimal policy for the risk-sensitive control problem. The range of the one-sided (right-hand and left-hand) derivatives of the optimal risk-sensitive value function plays an important role. Some essential differences between the two types of optimal control problems are also presented.
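As a rough illustration of the objects named in the abstract, the sketch below computes a risk-sensitive value for a small finite MDP by nonlinear power iteration, and then a Legendre-type dual quantity on a grid. It is not taken from the paper. It assumes the standard finite-state formulation in which the value Λ(θ) is the logarithm of the principal eigenvalue of the operator (T f)(x) = max_a exp(θ r(x,a)) Σ_y P(y|x,a) f(y), and it assumes all transition probabilities are strictly positive so that the iteration converges. The names `risk_sensitive_value`, `dual_rate`, `P`, `r`, `theta` and `c` are hypothetical.

```python
import numpy as np


def risk_sensitive_value(P, r, theta, iters=5000, tol=1e-13):
    """Nonlinear power iteration (an assumed formulation, not the paper's).

    P[a, x, y]: transition probability from x to y under action a (all > 0).
    r[x, a]:    one-step reward in state x under action a.
    Returns (Lambda, policy): the log principal eigenvalue of T and a
    stationary deterministic policy attaining the maximum in T.
    """
    n_states = P.shape[1]
    f = np.ones(n_states)
    lam = 1.0
    for _ in range(iters):
        # Q[x, a] = exp(theta * r[x, a]) * sum_y P[a, x, y] * f[y]
        Q = np.exp(theta * r) * np.einsum("axy,y->xa", P, f)
        g = Q.max(axis=1)      # apply T to f
        new_lam = g.max()      # sup-norm normalisation estimates the eigenvalue
        g = g / new_lam
        converged = abs(new_lam - lam) < tol and np.max(np.abs(g - f)) < tol
        f, lam = g, new_lam
        if converged:
            break
    return float(np.log(lam)), Q.argmax(axis=1)


def dual_rate(P, r, c, thetas):
    """Grid approximation of sup_theta [theta * c - Lambda(theta)]."""
    return max(t * c - risk_sensitive_value(P, r, t)[0] for t in thetas)


if __name__ == "__main__":
    # Hypothetical two-state, two-action example.
    P = np.array([[[0.9, 0.1], [0.4, 0.6]],
                  [[0.2, 0.8], [0.5, 0.5]]])
    r = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    value, policy = risk_sensitive_value(P, r, theta=0.7)
    print("Lambda(0.7) =", value, "policy =", policy)
    print("dual rate at c = 1.5:", dual_rate(P, r, 1.5, np.linspace(0.0, 3.0, 61)))
```

In this toy setting the returned policy is stationary and deterministic, matching the kind of optimal policy whose existence the abstract asserts. Whether the dual quantity equals the decay rate of the out-performing probability depends on the target `c`, which is the question the paper addresses through the one-sided derivatives of the value function.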

Acknowledgements

The authors are grateful to the Referee for the very careful review of the first version of this paper and for the many helpful comments and suggestions for improvement. This paper is part of the first author’s dissertation. This work is supported by the NSFC (Grant No. 11671226).

Author information

Corresponding author

Correspondence to Jinwen Chen.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, T., Dai, Y. & Chen, J. On maximizing probabilities for over-performing a target for Markov decision processes. Optim Eng (2023). https://doi.org/10.1007/s11081-023-09870-4

  • DOI: https://doi.org/10.1007/s11081-023-09870-4
