Maximizing the Conditional Expected Reward for Reaching the Goal

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10206)

Abstract

The paper addresses the problem of computing maximal conditional expected accumulated rewards until reaching a target state (briefly called maximal conditional expectations) in finite-state Markov decision processes where the condition is given as a reachability constraint. Conditional expectations of this type can, e.g., stand for the maximal expected termination time of probabilistic programs with non-determinism, under the condition that the program eventually terminates, or for the worst-case expected penalty to be paid, assuming that at least three deadlines are missed. The main results of the paper are (i) a polynomial-time algorithm to check the finiteness of maximal conditional expectations, (ii) PSPACE-completeness for the threshold problem in acyclic Markov decision processes where the task is to check whether the maximal conditional expectation exceeds a given threshold, (iii) a pseudo-polynomial-time algorithm for the threshold problem in the general (cyclic) case, and (iv) an exponential-time algorithm for computing the maximal conditional expectation and an optimal scheduler.
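To make the quantity in the abstract concrete, the following is a minimal sketch of a conditional expected accumulated reward in the simplest setting: an acyclic Markov chain with no non-determinism, so there is nothing to maximize. It does not implement any of the paper's algorithms. The chain, the state names (`s0`, `s1`, `goal`, `fail`), the transition rewards, and the function names are all hypothetical, and the condition is assumed to coincide with reaching the goal.

```python
from fractions import Fraction

# Hypothetical acyclic Markov chain (an assumption, not taken from the paper):
# state -> list of (probability, reward collected on the transition, successor).
CHAIN = {
    "s0": [(Fraction(1, 2), 1, "goal"), (Fraction(1, 2), 2, "s1")],
    "s1": [(Fraction(1, 2), 4, "goal"), (Fraction(1, 2), 0, "fail")],
}


def reach_and_partial_expectation(state):
    """Return (p, e) for `state`, where p is the probability of reaching
    "goal" and e is the expected accumulated reward restricted to the
    paths that reach "goal" (paths ending in "fail" contribute 0)."""
    if state == "goal":
        return Fraction(1), Fraction(0)
    if state == "fail":
        return Fraction(0), Fraction(0)
    p_total = Fraction(0)
    e_total = Fraction(0)
    for prob, reward, succ in CHAIN[state]:
        p_succ, e_succ = reach_and_partial_expectation(succ)
        p_total += prob * p_succ
        # The transition reward only counts on continuations that reach the goal.
        e_total += prob * (reward * p_succ + e_succ)
    return p_total, e_total


def conditional_expected_reward(state):
    """Expected accumulated reward until "goal", given that "goal" is reached."""
    p, e = reach_and_partial_expectation(state)
    if p == 0:
        raise ValueError("condition has probability zero")
    return e / p


print(conditional_expected_reward("s0"))
```

In this toy chain two paths reach the goal: the direct one (probability 1/2, reward 1) and the one through `s1` (probability 1/4, reward 6). Conditioning divides the weighted sum 1/2 · 1 + 1/4 · 6 = 2 by the reachability probability 3/4. In a Markov decision process both the numerator and the denominator depend on the scheduler, so the ratio cannot be optimized state by state in this simple way.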

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. Institute for Theoretical Computer Science, Technische Universität Dresden, Dresden, Germany