Abstract
We establish the existence of optimal scheduling strategies for time-bounded reachability in continuous-time Markov decision processes, and of co-optimal strategies for continuous-time Markov games. Furthermore, we show that optimal control not only exists but also has a surprisingly simple structure: the optimal schedulers from our proofs are deterministic and timed positional, and the bounded time interval can be divided into finitely many subintervals, on each of which the optimal strategies are positional. That is, we demonstrate the existence of finite optimal control. Finally, we show that these pleasant properties of Markov decision processes extend to the more general class of continuous-time Markov games, and that they hold for both early and late schedulers.
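To make the notion of finitely many switching points concrete, the following is a minimal numerical sketch and not the construction used in the proofs. It assumes a late scheduler (the action may be revised while the process stays in a state) and approximates the maximal time-bounded reachability probability by an explicit Euler step over the remaining time. The three-state model, its rates, and the names `late_reachability` and `switch_points` are hypothetical and chosen only for illustration.

```python
import math


def late_reachability(rates, goal, horizon, steps):
    """Explicit-Euler approximation of maximal time-bounded reachability.

    rates[s][a] maps successor states to transition rates (hypothetical layout).
    Returns the approximate values at the full horizon and, for each step k,
    the greedy action per state when k * h time units remain.
    """
    h = horizon / steps
    max_exit = max(
        (sum(succ.values()) for acts in rates.values() for succ in acts.values()),
        default=0.0,
    )
    # Keeps every Euler update a convex combination, so values stay in [0, 1].
    assert h * max_exit <= 1.0, "step size too coarse for the given rates"

    # With zero remaining time, only goal states have reached the goal.
    v = {s: 1.0 if s in goal else 0.0 for s in rates}
    choices = []
    for _ in range(steps):
        new_v, greedy = {}, {}
        for s, actions in rates.items():
            if s in goal:
                new_v[s] = 1.0
                continue
            best_action, best_gain = None, -math.inf
            for a, succ in actions.items():
                # Rate of change of the value when a is used for an instant.
                gain = sum(r * (v[t] - v[s]) for t, r in succ.items())
                if gain > best_gain:
                    best_action, best_gain = a, gain
            new_v[s] = v[s] + h * best_gain
            greedy[s] = best_action
        choices.append(greedy)
        v = new_v
    return v, choices


def switch_points(choices, state, h):
    """List (remaining_time, action) pairs at which the greedy action changes."""
    points, last = [], None
    for k, greedy in enumerate(choices):
        if greedy[state] != last:
            last = greedy[state]
            points.append((k * h, last))
    return points


# Hypothetical model: from "s", go directly to the goal at rate 1,
# or take a detour through "m", where both hops have rate 3.
rates = {
    "s": {"direct": {"goal": 1.0}, "detour": {"m": 3.0}},
    "m": {"go": {"goal": 3.0}},
    "goal": {},
}
horizon, steps = 2.0, 2000
values, choices = late_reachability(rates, {"goal"}, horizon, steps)
points = switch_points(choices, "s", horizon / steps)
```

In this toy model the greedy choice in `s` is `direct` when little time remains and `detour` otherwise, so `points` contains a single change of action. For this particular model the switching point can also be derived by hand as ln(1.5)/2 time units before the deadline, which the discretisation should approximate up to the step size. On either side of that point the scheduler is positional, which is the kind of piecewise structure the abstract describes.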
Additional information
This work was partly supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Center “Automatic Verification and Analysis of Complex Systems” (SFB/TR 14 AVACS), the project SpAGAT in the DFG priority programme RS3, and by the Engineering and Physical Sciences Research Council (EPSRC) through grant EP/H046623/1 “Synthesis and Verification in Markov Game Structures”.
About this article
Cite this article
Rabe, M.N., Schewe, S. Finite optimal control for time-bounded reachability in CTMDPs and continuous-time Markov games. Acta Informatica 48, 291 (2011). https://doi.org/10.1007/s00236-011-0140-0