
Continuous-Time Markov Decisions Based on Partial Exploration

  • Pranav Ashok
  • Yuliya Butkova
  • Holger Hermanns
  • Jan Křetínský
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11138)

Abstract

We provide a framework for speeding up algorithms for time-bounded reachability analysis of continuous-time Markov decision processes. The principle is to find a small, but almost equivalent subsystem of the original system and only analyse the subsystem. Candidates for the subsystem are identified through simulations and iteratively enlarged until runs are represented in the subsystem with high enough probability. The framework is thus dual to that of abstraction refinement. We instantiate the framework in several ways with several traditional algorithms and experimentally confirm orders-of-magnitude speed ups in many cases.
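The core loop behind this idea can be pictured roughly as follows. This is a minimal illustrative sketch of the simulate-and-enlarge principle described in the abstract, not the authors' implementation: the CTMDP interface (ctmdp.initial, ctmdp.actions), the sampling scheduler, and the analyse_subsystem routine are hypothetical placeholders, and the real framework refines schedulers and bounds rather than using a fixed one.

```python
# Sketch only: simulate runs, grow a subsystem from the visited states,
# and stop once enough sampled runs stay entirely inside the subsystem.
import random

def sample_run(ctmdp, scheduler, time_bound):
    """Sample one timed run and return the set of states it visits.
    Assumes ctmdp.actions(state) yields (exit_rate, successor_distribution)
    pairs and sojourn times are exponential in the chosen exit rate."""
    state, t, visited = ctmdp.initial, 0.0, set()
    while t <= time_bound:
        visited.add(state)
        actions = ctmdp.actions(state)
        if not actions:
            break
        rate, dist = scheduler(state, actions)
        t += random.expovariate(rate)          # exponential sojourn time
        succs, probs = zip(*dist.items())
        state = random.choices(succs, weights=probs)[0]
    return visited

def partial_exploration(ctmdp, scheduler, time_bound,
                        batch=1000, coverage=0.99, analyse_subsystem=None):
    """Enlarge a subsystem from simulated runs until a fraction `coverage`
    of freshly sampled runs stays inside it, then analyse only that part."""
    subsystem = {ctmdp.initial}
    while True:
        runs = [sample_run(ctmdp, scheduler, time_bound) for _ in range(batch)]
        inside = sum(1 for r in runs if r <= subsystem)
        if inside / batch >= coverage:
            break                              # subsystem captures enough runs
        for r in runs:                         # otherwise enlarge it
            subsystem |= r
    # States outside the subsystem are collapsed (e.g. into a sink) before a
    # standard time-bounded reachability algorithm is run on the small model.
    return analyse_subsystem(ctmdp, subsystem, time_bound) if analyse_subsystem else subsystem
```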


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Technical University of Munich, Munich, Germany
  2. Saarland University, Saarbrücken, Germany
