
Acta Informatica 48:291

Finite optimal control for time-bounded reachability in CTMDPs and continuous-time Markov games

  • Markus N. Rabe
  • Sven Schewe
Original Article

Abstract

We establish the existence of optimal scheduling strategies for time-bounded reachability in continuous-time Markov decision processes, and of co-optimal strategies for continuous-time Markov games. Furthermore, we show that optimal control not only exists but also has a surprisingly simple structure: the optimal schedulers from our proofs are deterministic and timed positional, and the bounded time can be divided into a finite number of intervals, in which the optimal strategies are positional. That is, we demonstrate the existence of finite optimal control. Finally, we show that these pleasant properties of Markov decision processes extend to the more general class of continuous-time Markov games, and that both early and late schedulers show this behaviour.
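The finite-interval structure can be illustrated with a small numeric sketch (this is an illustration only, not the paper's construction): a backward Euler-style discretization of the Bellman equation for time-bounded reachability in a toy three-state CTMDP. All states, actions, rates, and the horizon below are invented for the example. Tracking the maximizing action over time shows it switching only finitely often, so the horizon splits into intervals on which the scheduler is positional.

```python
# Toy CTMDP: states 0 and 1 are controlled, state 2 is the absorbing goal.
# All rates and the horizon are made up for illustration.
GOAL = 2
# rates[s][a][s2] = transition rate of action a in state s to state s2.
rates = {
    0: {"fast": {1: 3.0}, "direct": {GOAL: 0.5}},
    1: {"go": {GOAL: 2.0}},
}

def value_iteration(T=1.0, steps=20000):
    """Euler discretization of v'(s, t) = max_a sum_s2 R(s,a,s2)(v(s2,t) - v(s,t)),
    where v(s, t) is the probability of reaching the goal within remaining time t."""
    dt = T / steps
    v = {0: 0.0, 1: 0.0, GOAL: 1.0}   # v(s, 0) = 1 iff s is already a goal state
    policy_trace = []                 # optimal action in state 0 as time-to-go grows
    for _ in range(steps):
        new_v = dict(v)
        for s, actions in rates.items():
            best_val, best_act = max(
                (sum(r * (v[s2] - v[s]) for s2, r in succ.items()), a)
                for a, succ in actions.items()
            )
            new_v[s] = v[s] + dt * best_val
            if s == 0:
                policy_trace.append(best_act)
        v = new_v
    return v, policy_trace

v, trace = value_iteration()
# The argmax in state 0 changes only finitely often: with little time left it
# takes the one-jump "direct" action; with more time, the faster two-jump route.
switches = [i for i in range(1, len(trace)) if trace[i] != trace[i - 1]]
print(round(v[0], 3), len(switches))
```

In this instance the optimal scheduler in state 0 uses "direct" for small remaining time and "fast" for larger remaining time, with a single switching point, matching the deterministic, timed-positional shape described in the abstract.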

Keywords

Markov decision process, switching point, discrete location, goal region, continuous location


Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Universität des Saarlandes, Saarbrücken, Germany
  2. University of Liverpool, Liverpool, UK
