Finite optimal control for time-bounded reachability in CTMDPs and continuous-time Markov games

  • Original Article
  • Acta Informatica

Abstract

We establish the existence of optimal scheduling strategies for time-bounded reachability in continuous-time Markov decision processes, and of co-optimal strategies for continuous-time Markov games. Furthermore, we show that optimal control not only exists but also has a surprisingly simple structure: the optimal schedulers from our proofs are deterministic and timed positional, and the bounded time can be divided into a finite number of intervals, in each of which the optimal strategies are positional. That is, we demonstrate the existence of finite optimal control. Finally, we show that these pleasant properties of Markov decision processes extend to the more general class of continuous-time Markov games, and that both early and late schedulers show this behaviour.
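
The claim that the time bound splits into finitely many intervals with a positional strategy on each can be illustrated numerically. The following is a minimal sketch, not the construction from the article: it assumes a hypothetical three-state CTMDP (the states, action names, and rates are invented for illustration) and a late scheduler, and it integrates the standard Bellman-type equation for time-bounded reachability with explicit Euler steps over the remaining time, recording every point at which the maximising action changes.

```python
# Toy CTMDP (an assumption for illustration, not taken from the article).
# State 2 is the absorbing goal; RATES[state][action] = {successor: rate}.
RATES = {
    0: {"direct": {2: 1.0}, "detour": {1: 3.0}},
    1: {"go": {2: 3.0}},
}
GOAL = 2


def solve(time_bound=1.0, steps=20000):
    """Explicit Euler sweep over the remaining time, from 0 up to time_bound."""
    h = time_bound / steps
    value = {0: 0.0, 1: 0.0, GOAL: 1.0}  # reach probability with no time left
    chosen = {}    # action currently optimal in each state
    switches = []  # (remaining_time, state, old_action, new_action)
    for k in range(steps):
        gains = {}
        for s, actions in RATES.items():
            # Rate of improvement of each action given the current values.
            gain = {
                a: sum(r * (value[t] - value[s]) for t, r in succ.items())
                for a, succ in actions.items()
            }
            best = max(gain, key=gain.get)
            if s in chosen and chosen[s] != best:
                switches.append((k * h, s, chosen[s], best))
            chosen[s] = best
            gains[s] = gain[best]
        # Update all states together so every gain uses the same values.
        for s, g in gains.items():
            value[s] += h * g
    return value, switches


value, switches = solve()
print("reach probability from state 0:", round(value[0], 3))
for remaining, state, old, new in switches:
    print(f"state {state}: {old} -> {new} at remaining time {remaining:.3f}")
```

In this toy model the sweep finds a single switching point in state 0: with little time left the slow direct transition is preferable, while with more time left the two fast hops of the detour win. Solving the same equations by hand for these rates places the switch at a remaining time of ln(1.5)/2, roughly 0.203, so the time bound splits into two intervals with one positional choice on each.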

Author information

Corresponding author

Correspondence to Sven Schewe.

Additional information

This work was partly supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Center “Automatic Verification and Analysis of Complex Systems” (SFB/TR 14 AVACS), the project SpAGAT in the DFG priority programme RS3, and by the Engineering and Physical Sciences Research Council (EPSRC) through grant EP/H046623/1 “Synthesis and Verification in Markov Game Structures”.

About this article

Cite this article

Rabe, M.N., Schewe, S. Finite optimal control for time-bounded reachability in CTMDPs and continuous-time Markov games. Acta Informatica 48, 291 (2011). https://doi.org/10.1007/s00236-011-0140-0

  • DOI: https://doi.org/10.1007/s00236-011-0140-0
