Abstract
We characterize the class of nondeterministic \(\omega \)-automata that can be used for the analysis of finite Markov decision processes (MDPs). We call these automata ‘good-for-MDPs’ (GFM). We show that GFM automata are closed under classic simulation as well as under more powerful simulation relations that leverage properties of optimal control strategies for MDPs. This closure enables us to exploit state-space reduction techniques, such as those based on direct and delayed simulation, that guarantee simulation equivalence. We demonstrate the promise of GFM automata by defining a new class of automata with favorable properties—they are Büchi automata with low branching degree obtained through a simple construction—and show that going beyond limit-deterministic automata may significantly benefit reinforcement learning.
This work has been supported by the National Natural Science Foundation of China (Grant Nr. 61532019), EPSRC grants EP/M027287/1 and EP/P020909/1, and a CU Boulder RIO grant.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2020 The Author(s)
About this paper
Cite this paper
Hahn, E.M., Perez, M., Schewe, S., Somenzi, F., Trivedi, A., Wojtczak, D. (2020). Good-for-MDPs Automata for Probabilistic Analysis and Reinforcement Learning. In: Biere, A., Parker, D. (eds) Tools and Algorithms for the Construction and Analysis of Systems. TACAS 2020. Lecture Notes in Computer Science, vol 12078. Springer, Cham. https://doi.org/10.1007/978-3-030-45190-5_17
DOI: https://doi.org/10.1007/978-3-030-45190-5_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45189-9
Online ISBN: 978-3-030-45190-5
eBook Packages: Computer Science, Computer Science (R0)