Optimizing the Expected Mean Payoff in Energy Markov Decision Processes

  • Tomáš Brázdil
  • Antonín Kučera
  • Petr Novotný
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9938)

Abstract

Energy Markov Decision Processes (EMDPs) are finite-state Markov decision processes where each transition is assigned an integer counter update and a rational payoff. An EMDP configuration is a pair s(n), where s is a control state and n is the current counter value. The configurations are changed by performing transitions in the standard way. We consider the problem of computing a safe strategy (i.e., a strategy that keeps the counter non-negative) which maximizes the expected mean payoff.
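
To make the definitions concrete, the following is a minimal Python sketch of a toy EMDP and a simulation of two strategies on it. It is not the algorithm of the paper: it only illustrates configurations s(n), counter updates, payoffs, and the safety condition. The action-based encoding, the single-state model, and the names `Transition`, `EMDP`, `cautious`, `greedy` and `simulate` are all assumptions introduced here for illustration.

```python
import random
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Transition:
    prob: Fraction    # probability of this outcome under the chosen action
    target: str       # successor control state
    update: int       # integer counter update
    payoff: Fraction  # rational payoff


# Hypothetical toy EMDP: state -> action -> list of possible outcomes.
EMDP = {
    "s": {
        "earn": [
            Transition(Fraction(1, 2), "s", -1, Fraction(1)),
            Transition(Fraction(1, 2), "s", 0, Fraction(1)),
        ],
        "recharge": [Transition(Fraction(1), "s", 2, Fraction(0))],
    }
}


def cautious(state, counter):
    """Earn only when the counter can absorb a decrement."""
    return "earn" if counter >= 1 else "recharge"


def greedy(state, counter):
    """Always earn, ignoring the counter (unsafe in this toy model)."""
    return "earn"


def simulate(emdp, strategy, state, counter, steps, rng):
    """Run up to `steps` transitions from configuration state(counter).

    Returns (safe, mean_payoff). `safe` is False as soon as a transition
    would make the counter negative; mean_payoff averages the payoffs of
    the transitions actually performed.
    """
    total = Fraction(0)
    for taken in range(steps):
        outcomes = emdp[state][strategy(state, counter)]
        weights = [float(o.prob) for o in outcomes]
        t = rng.choices(outcomes, weights=weights)[0]
        if counter + t.update < 0:
            return False, (total / taken if taken else Fraction(0))
        state, counter = t.target, counter + t.update
        total += t.payoff
    return True, (total / steps if steps else Fraction(0))


if __name__ == "__main__":
    rng = random.Random(0)
    print(simulate(EMDP, cautious, "s", 0, 10_000, rng))
    print(simulate(EMDP, greedy, "s", 0, 10_000, rng))
```

In this toy model the `cautious` strategy is safe by construction, since it only plays `earn` when the counter is at least 1. Its empirical mean payoff should settle near 4/5: each cycle consists of one `recharge` step followed by four `earn` steps in expectation. The `greedy` strategy collects payoff 1 per step but drives the counter below zero from s(0), so it is not safe. A finite simulation only estimates the mean payoff of a fixed strategy; it does not compute an optimal one.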

Keywords

Optimal Strategy, Markov Decision Process, Safe Strategy, Finite Path, Integer Counter

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Tomáš Brázdil (1)
  • Antonín Kučera (1)
  • Petr Novotný (2)

  1. Faculty of Informatics, MU, Brno, Czech Republic
  2. IST Austria, Klosterneuburg, Austria
