Percentile Queries in Multi-dimensional Markov Decision Processes

  • Mickael Randour
  • Jean-François Raskin
  • Ocan Sankur
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9206)

Abstract

Markov decision processes (MDPs) with multi-dimensional weights are useful for analyzing systems with multiple, possibly conflicting objectives that require the analysis of trade-offs. In this paper, we study the complexity of percentile queries in such MDPs and give algorithms to synthesize strategies that enforce such constraints. Given a multi-dimensional weighted MDP, a quantitative payoff function f, value thresholds \(v_i\) (one per dimension), and probability thresholds \(\alpha_i\), we show how to compute a single strategy that enforces, for all dimensions i, that the probability of outcomes \(\rho\) satisfying \(f_i(\rho) \ge v_i\) is at least \(\alpha_i\). We consider classical quantitative payoffs from the literature (sup, inf, lim sup, lim inf, mean-payoff, truncated sum, discounted sum). Our work extends to the quantitative case the multi-objective model checking problem studied by Etessami et al. [16] in unweighted MDPs.
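
As a reading aid, the query described above can be written as a single formula. The notation here is assumed for illustration and does not appear in the abstract: M is the MDP, \(s_{\mathrm{init}}\) an initial state, \(\sigma\) a strategy, d the number of dimensions, and \(\mathbb{P}^{\sigma}_{M, s_{\mathrm{init}}}\) the probability measure on outcomes induced by \(\sigma\). Under these assumptions, the problem asks whether there exists a strategy \(\sigma\) such that

\[
\bigwedge_{i=1}^{d} \; \mathbb{P}^{\sigma}_{M, s_{\mathrm{init}}}\bigl[\{\rho \mid f_i(\rho) \ge v_i\}\bigr] \ge \alpha_i .
\]

For a hypothetical instance with d = 2, \(v = (10, 5)\) and \(\alpha = (0.9, 0.5)\), the same strategy must guarantee \(f_1(\rho) \ge 10\) with probability at least 0.9 and \(f_2(\rho) \ge 5\) with probability at least 0.5. In this reading each constraint is measured on its own event, so the two events need not hold on the same outcomes.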

Keywords

Short Path · Payoff Function · Markov Decision Process · Short Path Problem · Maximal Subset
These keywords were added by machine and not by the authors.

References

  1. Baier, C., Daum, M., Dubslaff, C., Klein, J., Klüppelholz, S.: Energy-utility quantiles. In: Badger, J.M., Rozier, K.Y. (eds.) NFM 2014. LNCS, vol. 8430, pp. 285–299. Springer, Heidelberg (2014)
  2. Bertsekas, D.P., Tsitsiklis, J.N.: An analysis of stochastic shortest path problems. Math. Oper. Res. 16, 580–595 (1991)
  3. Boker, U., Henzinger, T.A.: Exact and approximate determinization of discounted-sum automata. LMCS 10(1), 1–33 (2014)
  4. Boker, U., Henzinger, T.A., Otop, J.: The target discounted-sum problem. In: Proceedings of LICS. IEEE Computer Society (2015)
  5. Brázdil, T., Chen, T., Forejt, V., Novotný, P., Simaitis, A.: Solvency Markov decision processes with interest. In: Proceedings of FSTTCS, LIPIcs, vol. 24, pp. 487–499. Schloss Dagstuhl - LZI (2013)
  6. Brázdil, T., Brozek, V., Chatterjee, K., Forejt, V., Kucera, A.: Markov decision processes with multiple long-run average objectives. LMCS 10(13), 1–29 (2014)
  7. Bruyère, V., Filiot, E., Randour, M., Raskin, J.-F.: Meet your expectations with guarantees: beyond worst-case synthesis in quantitative games. In: Proceedings of STACS, LIPIcs, vol. 25, pp. 199–213. Schloss Dagstuhl - LZI (2014)
  8. Chatterjee, K., Doyen, L., Randour, M., Raskin, J.-F.: Looking at mean-payoff and total-payoff through windows. In: Van Hung, D., Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp. 118–132. Springer, Heidelberg (2013)
  9. Chatterjee, K., Forejt, V., Wojtczak, D.: Multi-objective discounted reward verification in graphs and MDPs. In: McMillan, K., Middeldorp, A., Voronkov, A. (eds.) LPAR-19 2013. LNCS, vol. 8312, pp. 228–242. Springer, Heidelberg (2013)
  10. Chatterjee, K., Henzinger, T.A.: Probabilistic systems with limsup and liminf objectives. In: Archibald, M., Brattka, V., Goranko, V., Löwe, B. (eds.) ILC 2007. LNCS, vol. 5489, pp. 32–45. Springer, Heidelberg (2009)
  11. Chatterjee, K., Komárková, Z., Kretínský, J.: Unifying two views on multiple mean-payoff objectives in Markov decision processes. In: Proceedings of LICS. IEEE Computer Society (2015)
  12. Chatterjee, K., Majumdar, R., Henzinger, T.A.: Markov decision processes with multiple objectives. In: Durand, B., Thomas, W. (eds.) STACS 2006. LNCS, vol. 3884, pp. 325–336. Springer, Heidelberg (2006)
  13. Chatterjee, K., Randour, M., Raskin, J.-F.: Strategy synthesis for multi-dimensional quantitative objectives. Acta Inform. 51(3–4), 129–163 (2014)
  14. de Alfaro, L.: Formal verification of probabilistic systems. Ph.D. thesis, Stanford University (1997)
  15. de Alfaro, L.: Computing minimum and maximum reachability times in probabilistic systems. In: Baeten, J.C.M., Mauw, S. (eds.) CONCUR 1999. LNCS, vol. 1664, pp. 66–81. Springer, Heidelberg (1999)
  16. Etessami, K., Kwiatkowska, M.Z., Vardi, M.Y., Yannakakis, M.: Multi-objective model checking of Markov decision processes. LMCS 4(4), 1–21 (2008)
  17. Filar, J.A., Krass, D., Ross, K.W.: Percentile performance criteria for limiting average Markov decision processes. IEEE Trans. Aut. Control 40(1), 2–10 (1995)
  18. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979)
  19. Goldreich, O.: On promise problems: a survey. In: Goldreich, O., Rosenberg, A.L., Selman, A.L. (eds.) Theoretical Computer Science. LNCS, vol. 3895, pp. 254–290. Springer, Heidelberg (2006)
  20. Haase, C., Kiefer, S.: The complexity of the Kth largest subset problem and related problems. CoRR, abs/1501.06729 (2015)
  21. Haase, C., Kiefer, S.: The odds of staying on budget. In: Halldórsson, M.M., Iwama, K., Kobayashi, N., Speckmann, B. (eds.) ICALP 2015. LNCS, vol. 9135, pp. 234–246. Springer, Heidelberg (2015)
  22. Ohtsubo, Y.: Optimal threshold probability in undiscounted Markov decision processes with a target set. Appl. Math. Comput. 149(2), 519–532 (2004)
  23. Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1st edn. Wiley, New York (1994)
  24. Randour, M., Raskin, J.-F., Sankur, O.: Percentile queries in multi-dimensional Markov decision processes. CoRR, abs/1410.4801 (2014)
  25. Randour, M., Raskin, J.-F., Sankur, O.: Variations on the stochastic shortest path problem. In: D’Souza, D., Lal, A., Larsen, K.G. (eds.) VMCAI 2015. LNCS, vol. 8931, pp. 1–18. Springer, Heidelberg (2015)
  26. Sakaguchi, M., Ohtsubo, Y.: Markov decision processes associated with two threshold probability criteria. J. Control Theor. Appl. 11(4), 548–557 (2013)
  27. Toda, S.: PP is as hard as the polynomial-time hierarchy. SIAM J. Comput. 20(5), 865–877 (1991)
  28. Travers, S.D.: The complexity of membership problems for circuits over sets of integers. Theor. Comput. Sci. 369(1–3), 211–229 (2006)
  29. Ummels, M., Baier, C.: Computing quantiles in Markov reward models. In: Pfenning, F. (ed.) FOSSACS 2013 (ETAPS 2013). LNCS, vol. 7794, pp. 353–368. Springer, Heidelberg (2013)
  30. White, D.J.: Minimizing a threshold probability in discounted Markov decision processes. J. Math. Anal. Appl. 173(2), 634–646 (1993)
  31. Wu, C., Lin, Y.: Minimizing risk models in Markov decision processes with policies depending on target values. J. Math. Anal. Appl. 231(1), 47–67 (1999)
  32. Xu, H., Mannor, S.: Probabilistic goal Markov decision processes. In: IJCAI, pp. 2046–2052 (2011)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Mickael Randour (1)
  • Jean-François Raskin (2)
  • Ocan Sankur (2)
  1. LSV, CNRS and ENS Cachan, Cachan, France
  2. Département d’Informatique, Université Libre de Bruxelles (U.L.B.), Brussels, Belgium
