Multi-objective Robust Strategy Synthesis for Interval Markov Decision Processes

  • Ernst Moritz Hahn
  • Vahid Hashemi
  • Holger Hermanns
  • Morteza Lahijanian
  • Andrea Turrini
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10503)

Abstract

Interval Markov decision processes (IMDPs) generalise classical MDPs by having interval-valued transition probabilities. They provide a powerful modelling tool for probabilistic systems in which additional variation or uncertainty prevents exact knowledge of the transition probabilities. In this paper, we consider the problem of multi-objective robust strategy synthesis for interval MDPs, where the aim is to find a robust strategy that guarantees the satisfaction of multiple properties at the same time in the face of the transition probability uncertainty. We first show that this problem is PSPACE-hard. Then, we provide a value iteration-based decision algorithm to approximate the Pareto set of achievable points. We finally demonstrate the practical effectiveness of our proposals by applying them to several real-world case studies.
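To make the notion of robustness concrete, the following is a minimal sketch of robust value iteration for a single reachability objective on an interval MDP. It is not the multi-objective algorithm of the paper; it only illustrates the standard idea that, for each action, an adversary picks the worst distribution allowed by the intervals, and the strategy then picks the best action. The dictionary encoding, the function names and the three-state example model are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding (not from the paper):
# imdp[state][action] = list of (successor, lower bound, upper bound),
# with states numbered 0..n-1 and distinct successors per action.
Interval = Tuple[int, float, float]
IMDP = Dict[int, Dict[str, List[Interval]]]


def worst_case_expectation(succs: List[Interval], values: List[float]) -> float:
    """Minimise sum(p_s * values[s]) over all p with lower <= p_s <= upper, sum(p) = 1."""
    # Start from the lower bounds, then give the remaining probability
    # mass to the successors with the smallest value first.
    probs = {s: lo for s, lo, _ in succs}
    remaining = 1.0 - sum(probs.values())
    for s, lo, up in sorted(succs, key=lambda t: values[t[0]]):
        extra = min(up - lo, remaining)
        probs[s] += extra
        remaining -= extra
    return sum(p * values[s] for s, p in probs.items())


def robust_reachability(imdp: IMDP, targets: Set[int],
                        max_iterations: int = 1000, tol: float = 1e-10) -> List[float]:
    """Approximate the maximal worst-case probability of reaching a target state."""
    values = [1.0 if s in targets else 0.0 for s in range(len(imdp))]
    for _ in range(max_iterations):
        new_values = list(values)
        for s, actions in imdp.items():
            if s in targets or not actions:
                continue
            # Strategy maximises, uncertainty (nature) minimises.
            new_values[s] = max(worst_case_expectation(succs, values)
                                for succs in actions.values())
        delta = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values


# Illustrative model: state 1 is the target, state 2 is a sink.
example: IMDP = {
    0: {"a": [(1, 0.6, 0.9), (2, 0.1, 0.4)],
        "b": [(1, 0.7, 0.8), (2, 0.2, 0.3)]},
    1: {},
    2: {"loop": [(2, 1.0, 1.0)]},
}
values = robust_reachability(example, targets={1})
```

In this example, action "a" can reach the target with probability as high as 0.9 but guarantees only 0.6, whereas action "b" guarantees 0.7, so a robust strategy prefers "b" in state 0. A multi-objective setting as described in the abstract would additionally have to trade off several such guarantees against each other.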


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Ernst Moritz Hahn (1, 2)
  • Vahid Hashemi (1)
  • Holger Hermanns (1)
  • Morteza Lahijanian (3)
  • Andrea Turrini (2)

  1. Saarland University, Saarbrücken, Germany
  2. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China
  3. Department of Computer Science, University of Oxford, Oxford, UK
