Percentile queries in multi-dimensional Markov decision processes

Published: 05 January 2017

Volume 50, pages 207–248 (2017)
Formal Methods in System Design

Mickael Randour¹,
Jean-François Raskin¹ &
Ocan Sankur²

Abstract

Markov decision processes (MDPs) with multi-dimensional weights are useful for analyzing systems with multiple, possibly conflicting objectives that require the analysis of trade-offs. We study the complexity of percentile queries in such MDPs and give algorithms to synthesize strategies that enforce such constraints. Given a multi-dimensional weighted MDP, a quantitative payoff function f, value thresholds \(v_i\) (one per dimension), and probability thresholds \(\alpha_i\), we show how to compute a single strategy enforcing that, for every dimension i, the probability of outcomes \(\rho\) satisfying \(f_i(\rho) \ge v_i\) is at least \(\alpha_i\). We consider classical quantitative payoffs from the literature (sup, inf, lim sup, lim inf, mean-payoff, truncated sum, discounted sum). Our work extends the multi-objective model checking problem studied by Etessami et al. (Log Methods Comput Sci 4(4), 2008) in unweighted MDPs to the quantitative case.
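To make the shape of a percentile query concrete, the following is a minimal sketch that only checks a query against a given strategy; it does not synthesize one and is not the algorithm of the article. It assumes a hypothetical two-dimensional toy MDP that is acyclic with an absorbing target `goal`, a pure memoryless strategy, and the truncated-sum payoff (the sum of weights until the target is reached). The names `MDP`, `outcomes` and `satisfies_query` are illustrative.

```python
from fractions import Fraction

# Hypothetical toy MDP: state -> action -> list of (probability, successor, weight vector).
# "goal" is absorbing and has no actions; the graph is acyclic, so every run reaches it.
MDP = {
    "s0": {
        "safe": [(Fraction(1), "goal", (2, 2))],
        "risky": [(Fraction(1, 2), "s1", (4, 0)), (Fraction(1, 2), "goal", (0, 1))],
    },
    "s1": {"go": [(Fraction(1), "goal", (1, 3))]},
    "goal": {},
}


def outcomes(mdp, strategy, state, dims):
    """Yield (probability, truncated-sum vector) for every run from `state` to the target."""
    if not mdp[state]:  # target reached: the remaining run contributes no weight
        yield Fraction(1), (0,) * dims
        return
    for prob, succ, weight in mdp[state][strategy[state]]:
        for p, total in outcomes(mdp, strategy, succ, dims):
            yield prob * p, tuple(w + t for w, t in zip(weight, total))


def satisfies_query(mdp, strategy, init, thresholds, alphas):
    """Check, per dimension i, that P[f_i(run) >= v_i] >= alpha_i under `strategy`."""
    dims = len(thresholds)
    runs = list(outcomes(mdp, strategy, init, dims))
    for i in range(dims):
        # Each dimension is checked on its own: the constraints are not a joint event.
        mass = sum(p for p, total in runs if total[i] >= thresholds[i])
        if mass < alphas[i]:
            return False
    return True


safe = {"s0": "safe", "s1": "go"}
risky = {"s0": "risky", "s1": "go"}
query = dict(thresholds=(5, 1), alphas=(Fraction(1, 2), Fraction(1)))
print(satisfies_query(MDP, safe, "s0", **query))
print(satisfies_query(MDP, risky, "s0", **query))
```

In this toy instance the two strategies differ on the query with thresholds (5, 1) and probability thresholds (1/2, 1): `safe` never reaches 5 in the first dimension, while `risky` does so with probability 1/2 and always reaches 1 in the second. Exact arithmetic with `Fraction` avoids rounding issues when comparing against the probability thresholds.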

Notes

The projection of a run \((s_1,m_1),a_1,(s_2,m_2),a_2,\ldots \) in \(M_s^\sigma \) to M is simply the run \(s_{1}a_{1}s_{2}a_{2}\ldots {}\) in M.
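As a small illustration of this projection, the sketch below works on a finite prefix of a run and assumes, as the notation suggests, that states of \(M_s^\sigma\) are pairs (state of M, memory element) alternating with actions. The function name `project_run` is hypothetical.

```python
def project_run(product_run):
    """Drop memory components: [(s1, m1), a1, (s2, m2), a2, ...] -> [s1, a1, s2, a2, ...].

    Even positions hold (state, memory) pairs, odd positions hold actions.
    """
    return [item[0] if i % 2 == 0 else item for i, item in enumerate(product_run)]


prefix = [("s1", "m1"), "a1", ("s2", "m2"), "a2"]
print(project_run(prefix))
```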

References

Baier C, Daum M, Dubslaff C, Klein J, Klüppelholz S (2014) Energy-utility quantiles. In: NASA formal methods, LNCS 8430, Springer, pp 285–299
Bertsekas DP, Tsitsiklis JN (1991) An analysis of stochastic shortest path problems. Math Oper Res 16:580–595
Boker U, Henzinger TA (2014) Exact and approximate determinization of discounted-sum automata. Log Methods Comput Sci 10(1)
Boker U, Henzinger TA, Otop J (2015) The target discounted-sum problem. In: IEEE Proceedings of LICS, pp 750–761
Brázdil T, Chen T, Forejt V, Novotný P, Simaitis A (2013) Solvency Markov decision processes with interest. In: Proceedings of FSTTCS, volume 24 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, pp 487–499
Brázdil T, Brozek V, Chatterjee K, Forejt V, Kucera A (2014) Markov decision processes with multiple long-run average objectives. Log Methods Comput Sci 10(13):1–29
Bruyère V, Filiot E, Randour M, Raskin J-F (2014) Meet your expectations with guarantees: beyond worst-case synthesis in quantitative games. In: Proceedings of STACS, volume 25 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, pp 199–213
Chatterjee K (2007) Concurrent games with tail objectives. Theor Comput Sci 388(1–3):181–198
Chatterjee K, Doyen L, Henzinger TA, Raskin J-F (2010) Generalized mean-payoff and energy games. In: Proceedings of FSTTCS, volume 8 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, pp 505–516
Chatterjee K, Doyen L, Randour M, Raskin J-F (2015) Looking at mean-payoff and total-payoff through windows. Inf Comput 242:25–52
Chatterjee K, Forejt V, Wojtczak D (2013) Multi-objective discounted reward verification in graphs and MDPs. In: Proceedings of LPAR, LNCS 8312, Springer, pp 228–242
Chatterjee K, Henzinger TA (2009) Probabilistic systems with limsup and liminf objectives. In: Archibald M, Brattka V, Goranko V, Löwe B (eds), Infinity in logic and computation, LNCS 5489, Springer, pp 32–45
Chatterjee K, Komárková Z, Kretínský J (2015) Unifying two views on multiple mean-payoff objectives in Markov decision processes. In: Proceedings of LICS, pp 244–256
Chatterjee K, Majumdar R, Henzinger TA (2006) Markov decision processes with multiple objectives. In: Proceedings of STACS, LNCS 3884, Springer, pp 325–336
Chatterjee K, Randour M, Raskin J-F (2014) Strategy synthesis for multi-dimensional quantitative objectives. Acta Inform 51(3–4):129–163
de Alfaro L (1997) Formal verification of probabilistic systems. Ph.D. thesis, Stanford University
de Alfaro L (1999) Computing minimum and maximum reachability times in probabilistic systems. In: Proceedings of CONCUR, LNCS 1664, Springer, pp 66–81
Etessami K, Kwiatkowska M, Vardi MY, Yannakakis M (2008) Multi-objective model checking of Markov decision processes. Log Methods Comput Sci 4(4)
Filar JA, Krass D, Ross KW (1995) Percentile performance criteria for limiting average Markov decision processes. IEEE Trans Autom Control 40(1):2–10
Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. Freeman, New York
Goldreich O (2006) On promise problems: a survey. In: Goldreich O, Rosenberg A, Selman AL (eds), Theoretical computer science, essays in memory of Shimon Even, LNCS 3895, Springer, pp 254–290
Haase C, Kiefer S (2015) The odds of staying on budget. In: Proceedings of ICALP, LNCS 9135, Springer, pp 234–246
Haase C, Kiefer S (2016) The complexity of the Kth largest subset problem and related problems. Inf Process Lett 116(2):111–115
Johnson DB, Kashdan SD (1978) Lower bounds for selection in X + Y and other multisets. J ACM 25(4):556–570
Minsky ML (1961) Recursive unsolvability of Post’s problem of “tag” and other topics in theory of Turing machines. Ann Math 74(3):437–455
Ohtsubo Y (2004) Optimal threshold probability in undiscounted Markov decision processes with a target set. Appl Math Comput 149(2):519–532
Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming, 1st edn. Wiley, New York
Randour M, Raskin J-F, Sankur O (2015) Percentile queries in multi-dimensional Markov decision processes. In: Proceedings of CAV, LNCS 9206, Springer, pp 123–139
Randour M, Raskin J-F, Sankur O (2015) Variations on the stochastic shortest path problem. In: Proceedings of VMCAI, LNCS 8931, Springer, pp 1–18
Sakaguchi M, Ohtsubo Y (2013) Markov decision processes associated with two threshold probability criteria. J Control Theory Appl 11(4):548–557
Toda S (1991) PP is as hard as the polynomial-time hierarchy. SIAM J Comput 20(5):865–877
Tracol M (2009) Fast convergence to state-action frequency polytopes for MDPs. Oper Res Lett 37(2):123–126
Travers SD (2006) The complexity of membership problems for circuits over sets of integers. Theor Comput Sci 369(1–3):211–229
Ummels M, Baier C (2013) Computing quantiles in Markov reward models. In: Proceedings of FOSSACS, LNCS 7794, Springer, pp 353–368
Vardi MY (1985) Automatic verification of probabilistic concurrent finite-state programs. In: IEEE Proceedings of FOCS, pp 327–338
White DJ (1993) Minimizing a threshold probability in discounted Markov decision processes. J Math Anal Appl 173(2):634–646
Wu C, Lin Y (1999) Minimizing risk models in Markov decision processes with policies depending on target values. J Math Anal Appl 231(1):47–67
Xu H, Mannor S (2011) Probabilistic goal Markov decision processes. In: Proceedings of IJCAI, pp 2046–2052

Author information

Authors and Affiliations

Département d’Informatique, Université libre de Bruxelles (ULB), CP 212, Boulevard du Triomphe, 1050, Brussels, Belgium
Mickael Randour & Jean-François Raskin
CNRS, Irisa, Campus Universitaire de Beaulieu, 35042, Rennes Cedex, France
Ocan Sankur

Corresponding author

Correspondence to Mickael Randour.

Additional information

M. Randour is an F.R.S.-FNRS Postdoctoral Researcher. J.-F. Raskin is supported by ERC Starting Grant (279499: inVEST). This work was partly supported by the European project CASSTING (FP7-ICT-601148).

About this article

Cite this article

Randour, M., Raskin, JF. & Sankur, O. Percentile queries in multi-dimensional Markov decision processes. Form Methods Syst Des 50, 207–248 (2017). https://doi.org/10.1007/s10703-016-0262-7

Issue Date: June 2017
DOI: https://doi.org/10.1007/s10703-016-0262-7
