Abstract
Markov decision processes (MDPs) with multi-dimensional weights are useful for analyzing systems with multiple, possibly conflicting objectives that require trade-off analysis. We study the complexity of percentile queries in such MDPs and give algorithms to synthesize strategies that enforce such constraints. Given a multi-dimensional weighted MDP, a quantitative payoff function f, value thresholds \(v_i\) (one per dimension), and probability thresholds \(\alpha _i\), we show how to compute a single strategy that enforces, for all dimensions i, that the probability of outcomes \(\rho \) satisfying \(f_i(\rho ) \ge v_i\) is at least \(\alpha _i\). We consider classical quantitative payoffs from the literature (sup, inf, lim sup, lim inf, mean-payoff, truncated sum, discounted sum). Our work extends the multi-objective model checking problem studied by Etessami et al. (Log Methods Comput Sci 4(4), 2008) in unweighted MDPs to the quantitative case.
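As an illustration of the constraint itself (not of the synthesis algorithms), the following minimal sketch checks a percentile query once a strategy has been fixed. It assumes the strategy induces only finitely many outcomes, each given with its probability and its payoff vector. The function name `satisfies_percentile_query`, the input format, and the numbers are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def satisfies_percentile_query(outcomes, thresholds, probabilities):
    """Check, for every dimension i, that P[f_i >= v_i] >= alpha_i.

    outcomes:      list of (probability, payoff_vector) pairs induced by one
                   fixed strategy (assumed finite; an illustrative simplification)
    thresholds:    the value thresholds v_i, one per dimension
    probabilities: the probability thresholds alpha_i, one per dimension
    """
    for i, (v_i, alpha_i) in enumerate(zip(thresholds, probabilities)):
        # Total probability mass of outcomes whose i-th payoff reaches v_i.
        mass = sum(p for p, payoff in outcomes if payoff[i] >= v_i)
        if mass < alpha_i:
            return False
    return True


# Hypothetical two-dimensional example with three outcomes.
outcomes = [
    (Fraction(1, 2), (10, 1)),
    (Fraction(3, 10), (4, 6)),
    (Fraction(1, 5), (0, 9)),
]
ok = satisfies_percentile_query(
    outcomes,
    thresholds=(4, 5),
    probabilities=(Fraction(4, 5), Fraction(1, 2)),
)
```

Note that the same strategy must meet all constraints simultaneously, while each constraint is evaluated on its own dimension: the outcomes that reach the threshold in one dimension need not be the ones that reach it in another.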
Notes
The projection of a run \((s_1,m_1),a_1,(s_2,m_2),a_2,\ldots \) in \(M_s^\sigma \) to M is simply the run \(s_{1}a_{1}s_{2}a_{2}\ldots \) in M.
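A minimal sketch of this projection on a finite prefix of a run, assuming the run is stored as a list that alternates (state, memory) pairs and actions. The representation and the name `project_run` are hypothetical and used only for illustration.

```python
def project_run(run):
    """Drop the memory component from every state of a product run.

    run: list alternating (state, memory) pairs at even positions and
         actions at odd positions (an assumed encoding of a finite prefix)
    """
    return [step[0] if i % 2 == 0 else step for i, step in enumerate(run)]


product_run = [("s1", "m1"), "a1", ("s2", "m2"), "a2", ("s3", "m1")]
projected = project_run(product_run)
```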
References
Baier C, Daum M, Dubslaff C, Klein J, Klüppelholz S (2014) Energy-utility quantiles. In: NASA formal methods, LNCS 8430, Springer, pp 285–299
Bertsekas DP, Tsitsiklis JN (1991) An analysis of stochastic shortest path problems. Math Oper Res 16:580–595
Boker U, Henzinger TA (2014) Exact and approximate determinization of discounted-sum automata. Log Methods Comput Sci 10(1)
Boker U, Henzinger TA, Otop J (2015) The target discounted-sum problem. In: IEEE Proceedings of LICS, pp 750–761
Brázdil T, Chen T, Forejt V, Novotný P, Simaitis A (2013) Solvency Markov decision processes with interest. In: Proceedings of FSTTCS, volume 24 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, pp 487–499
Brázdil T, Brozek V, Chatterjee K, Forejt V, Kucera A (2014) Markov decision processes with multiple long-run average objectives. Log Methods Comput Sci 10(13):1–29
Bruyère V, Filiot E, Randour M, Raskin J-F (2014) Meet your expectations with guarantees: beyond worst-case synthesis in quantitative games. In: Proceedings of STACS, volume 25 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, pp 199–213
Chatterjee K (2007) Concurrent games with tail objectives. Theor Comput Sci 388(1–3):181–198
Chatterjee K, Doyen L, Henzinger TA, Raskin J-F (2010) Generalized mean-payoff and energy games. In: Proceedings of FSTTCS, volume 8 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, pp 505–516
Chatterjee K, Doyen L, Randour M, Raskin J-F (2015) Looking at mean-payoff and total-payoff through windows. Inf Comput 242:25–52
Chatterjee K, Forejt V, Wojtczak D (2013) Multi-objective discounted reward verification in graphs and MDPs. In: Proceedings of LPAR, LNCS 8312, Springer, pp 228–242
Chatterjee K, Henzinger TA (2009) Probabilistic systems with limsup and liminf objectives. In: Archibald M, Brattka V, Goranko V, Löwe B (eds), Infinity in logic and computation, LNCS 5489, Springer, pp 32–45
Chatterjee K, Komárková Z, Kretínský J (2015) Unifying two views on multiple mean-payoff objectives in Markov decision processes. In: Proceedings of LICS, pp 244–256
Chatterjee K, Majumdar R, Henzinger TA (2006) Markov decision processes with multiple objectives. In: Proceedings of STACS, LNCS 3884, Springer, pp 325–336
Chatterjee K, Randour M, Raskin J-F (2014) Strategy synthesis for multi-dimensional quantitative objectives. Acta Inform 51(3–4):129–163
de Alfaro L (1997) Formal verification of probabilistic systems. Ph.D. thesis, Stanford University
de Alfaro L (1999) Computing minimum and maximum reachability times in probabilistic systems. In: Proceedings of CONCUR, LNCS 1664, Springer, pp 66–81
Etessami K, Kwiatkowska M, Vardi MY, Yannakakis M (2008) Multi-objective model checking of Markov decision processes. Log Methods Comput Sci 4(4)
Filar JA, Krass D, Ross KW (1995) Percentile performance criteria for limiting average Markov decision processes. IEEE Trans Autom Control 40(1):2–10
Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. Freeman, New York
Goldreich O (2006) On promise problems: a survey. In: Goldreich O, Rosenberg A, Selman AL (eds), Theoretical computer science, essays in memory of Shimon Even, LNCS 3895, Springer, pp 254–290
Haase C, Kiefer S (2015) The odds of staying on budget. In: Proceedings of ICALP, LNCS 9135, Springer, pp 234–246
Haase C, Kiefer S (2016) The complexity of the Kth largest subset problem and related problems. Inf Process Lett 116(2):111–115
Johnson DB, Kashdan SD (1978) Lower bounds for selection in X + Y and other multisets. J ACM 25(4):556–570
Minsky ML (1961) Recursive unsolvability of Post’s problem of “tag” and other topics in theory of Turing machines. Ann Math 74(3):437–455
Ohtsubo Y (2004) Optimal threshold probability in undiscounted Markov decision processes with a target set. Appl Math Comput 149(2):519–532
Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming, 1st edn. Wiley, New York
Randour M, Raskin J-F, Sankur O (2015) Percentile queries in multi-dimensional Markov decision processes. In: Proceedings of CAV, LNCS 9206, Springer, pp 123–139
Randour M, Raskin J-F, Sankur O (2015) Variations on the stochastic shortest path problem. In: Proceedings of VMCAI, LNCS 8931, Springer, pp 1–18
Sakaguchi M, Ohtsubo Y (2013) Markov decision processes associated with two threshold probability criteria. J Control Theory Appl 11(4):548–557
Toda S (1991) PP is as hard as the polynomial-time hierarchy. SIAM J Comput 20(5):865–877
Tracol M (2009) Fast convergence to state-action frequency polytopes for MDPs. Oper Res Lett 37(2):123–126
Travers SD (2006) The complexity of membership problems for circuits over sets of integers. Theor Comput Sci 369(1–3):211–229
Ummels M, Baier C (2013) Computing quantiles in Markov reward models. In: Proceedings of FOSSACS, LNCS 7794, Springer, pp 353–368
Vardi MY (1985) Automatic verification of probabilistic concurrent finite-state programs. In: IEEE Proceedings of FOCS, pp 327–338
White DJ (1993) Minimizing a threshold probability in discounted Markov decision processes. J Math Anal Appl 173(2):634–646
Wu C, Lin Y (1999) Minimizing risk models in Markov decision processes with policies depending on target values. J Math Anal Appl 231(1):47–67
Xu H, Mannor S (2011) Probabilistic goal Markov decision processes. In: Proceedings of IJCAI, pp 2046–2052
Additional information
M. Randour is an F.R.S.-FNRS Postdoctoral Researcher; J.-F. Raskin is supported by ERC Starting Grant (279499: inVEST). Work partly supported by European project CASSTING (FP7-ICT-601148).
About this article
Cite this article
Randour, M., Raskin, JF. & Sankur, O. Percentile queries in multi-dimensional Markov decision processes. Form Methods Syst Des 50, 207–248 (2017). https://doi.org/10.1007/s10703-016-0262-7
DOI: https://doi.org/10.1007/s10703-016-0262-7