Abstract
We study deep-learning-based synthesis of control for a system that interacts with a black-box environment, with the goal of minimizing the number of interaction failures. Since the current state of the environment is unavailable to the controller, the controller must operate on a limited view of the interaction history. We propose a reinforcement learning framework in which a Recurrent Neural Network (RNN) is trained to control such a system, and we experiment with various parameters: the loss function, the exploration/exploitation ratio, and the size of the lookahead. We designed examples that capture a variety of potential control difficulties, and we present experiments performed with the DyNet toolkit.
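The abstract outlines a reinforcement-learning loop in which a controller, seeing only a bounded view of the interaction history, chooses actions so as to minimize interaction failures, balancing exploration against exploitation. As a minimal illustrative stand-in (not the paper's method), the sketch below replaces the paper's RNN with a tabular Q-function keyed on a fixed-length window of past observations, and uses epsilon-greedy exploration; the environment `BlackBoxEnv`, the reward shaping, and all names are hypothetical.

```python
import random
from collections import defaultdict

class BlackBoxEnv:
    """Hypothetical black-box environment with a hidden binary state.
    The controller never reads the state directly; it only receives an
    observation after acting. An action that mismatches the hidden state
    counts as an interaction failure."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 0

    def step(self, action):
        fail = (action != self.state)          # wrong action => failure
        self.state = self.rng.choice([0, 1])   # hidden stochastic transition
        return self.state, fail                # observation of the new state

def train(steps=2000, window=2, eps=0.1, alpha=0.5, gamma=0.9, seed=1):
    """Epsilon-greedy Q-learning over a limited history window:
    the controller's 'state' is only the last `window` observations."""
    env = BlackBoxEnv(seed)
    rng = random.Random(seed + 1)
    Q = defaultdict(float)                     # Q[(history, action)]
    history = (0,) * window
    failures = 0
    for _ in range(steps):
        if rng.random() < eps:                 # explore
            action = rng.choice([0, 1])
        else:                                  # exploit current estimate
            action = max([0, 1], key=lambda a: Q[(history, a)])
        obs, fail = env.step(action)
        failures += fail
        reward = -1.0 if fail else 0.0         # penalize each failure
        new_hist = history[1:] + (obs,)
        best_next = max(Q[(new_hist, a)] for a in [0, 1])
        Q[(history, action)] += alpha * (reward + gamma * best_next
                                         - Q[(history, action)])
        history = new_hist
    return Q, failures

def evaluate(Q, steps=500, window=2, seed=7):
    """Run the learned policy greedily (no exploration); count failures."""
    env = BlackBoxEnv(seed)
    history = (0,) * window
    failures = 0
    for _ in range(steps):
        action = max([0, 1], key=lambda a: Q[(history, a)])
        obs, fail = env.step(action)
        failures += fail
        history = history[1:] + (obs,)
    return failures
```

In this toy setting the optimal policy is simply to repeat the last observation; the point of the sketch is the shape of the loop (limited history, exploration ratio `eps`, failure-driven reward), which is what the paper's RNN-based controller generalizes to environments where no such simple rule exists.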
S. Iosti and S. Bensalem—The research performed by these authors was partially funded by the H2020-ECSEL grant CPS4EU (2018-IA call, Grant Agreement number 826276).
D. Peled and K. Aharon—The research performed by these authors was partially funded by ISF grants “Runtime Measuring and Checking of Cyber Physical Systems” (ISF award 2239/15) and “Efficient Runtime Verification for Systems with Lots of Data and its Applications” (ISF award 1464/18).
Notes
- 1. DyNet is a Python package for automatic differentiation and stochastic gradient training, similar to PyTorch and TensorFlow, but also optimized for strong CPU performance.
References
Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987)
Basu, A., Bensalem, S., Peled, D.A., Sifakis, J.: Priority scheduling of distributed systems based on model checking. Formal Methods Syst. Des. 39(3), 229–245 (2011)
Bensalem, S., Bozga, M., Graf, S., Peled, D., Quinton, S.: Methods for knowledge based controlling of distributed systems. In: Bouajjani, A., Chin, W.-N. (eds.) ATVA 2010. LNCS, vol. 6252, pp. 52–66. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15643-4_6
Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
Silver, D., et al.: Mastering chess and shogi by self-play with a general reinforcement learning algorithm. CoRR, abs/1712.01815 (2017)
Neubig, G., et al.: DyNet: the dynamic neural network toolkit. CoRR, abs/1701.03980 (2017)
Gerth, R., Peled, D.A., Vardi, M.Y., Wolper, P.: Simple on-the-fly automatic verification of linear temporal logic. In: Dembinski, P., Sredniawa, M., (eds.) Protocol Specification, Testing and Verification XV, Proceedings of the Fifteenth IFIP WG6.1 International Symposium on Protocol Specification, Testing and Verification, Warsaw, Poland, June 1995. IFIP Conference Proceedings, vol. 38, pp. 3–18. Chapman & Hall (1995)
Goodfellow, I.J., Bengio, Y., Courville, A.C.: Deep learning. In: Adaptive Computation and Machine Learning. MIT Press (2016)
Hausknecht, M.J., Stone, P.: Deep recurrent Q-learning for partially observable MDPs. CoRR, abs/1507.06527 (2015)
Heess, N., Hunt, J.J., Lillicrap, T.P., Silver, D.: Memory-based control with recurrent neural networks. CoRR, abs/1512.04455 (2015)
Peled, D., Iosti, S., Bensalem, S.: Control synthesis through deep learning. In: Bartocci, E., Cleaveland, R., Grosu, R., Sokolsky, O. (eds.) From Reactive Systems to Cyber-Physical Systems - Essays Dedicated to Scott A. Smolka on the Occasion of His 65th Birthday, pp. 242–255. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-31514-6_14
Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Conference Record of the Sixteenth Annual ACM Symposium on Principles of Programming Languages, Austin, Texas, USA, 11–13 January 1989, pp. 179–190 (1989)
Pnueli, A., Rosner, R.: Distributed reactive systems are hard to synthesize. In: 31st Annual Symposium on Foundations of Computer Science, St. Louis, Missouri, USA, 22–24 October 1990, vol. II, pp. 746–757 (1990)
Safra, S.: On the complexity of omega-automata. In: 29th Annual Symposium on Foundations of Computer Science, White Plains, New York, USA, 24–26 October 1988, pp. 319–327. IEEE Computer Society (1988)
Sutton, R.S., Barto, A.G.: Reinforcement Learning - An Introduction. Adaptive Computation and Machine Learning, 2nd edn. MIT Press (2018)
Wonham, W.M., Ramadge, P.J.: Modular supervisory control of discrete-event systems. MCSS 1(1), 13–30 (1988)
Zhu, P., Li, X., Poupart, P.: On improving deep reinforcement learning for POMDPs. CoRR, abs/1704.07978 (2017)
Zielonka, W.: Infinite games on finitely coloured graphs with applications to automata on infinite trees. Theor. Comput. Sci. 200(1–2), 135–183 (1998)
© 2020 Springer Nature Switzerland AG
Cite this paper
Iosti, S., Peled, D., Aharon, K., Bensalem, S., Goldberg, Y. (2020). Synthesizing Control for a System with Black Box Environment, Based on Deep Learning. In: Margaria, T., Steffen, B. (eds) Leveraging Applications of Formal Methods, Verification and Validation: Engineering Principles. ISoLA 2020. Lecture Notes in Computer Science(), vol 12477. Springer, Cham. https://doi.org/10.1007/978-3-030-61470-6_27
Print ISBN: 978-3-030-61469-0
Online ISBN: 978-3-030-61470-6