Abstract
Model checking is one of the essential techniques for verifying software systems. It can verify properties such as reachability, where the state space is searched for a desired state. However, model checking may run into the state space explosion problem, in which not all states can be generated because resource usage grows exponentially. Although the results of recent model checking approaches are promising, there is still room for improvement in accuracy and in the number of explored states. In this paper, we propose an approach based on deep reinforcement learning with two neural networks that increases the accuracy of the generated witnesses and reduces the use of hardware resources. In this approach, an agent first explores the state space without any prior knowledge and gradually learns which actions are proper and which are improper by receiving rewards and penalties from the environment as it works toward the goal. Once the dataset has been filled with the agent's experiences, the two neural networks evaluate the quality of each action in each state, and the best action is then selected. The main challenges in the implementation are encoding the states, feature engineering, feature selection, reward engineering, handling invalid actions, and configuring the neural networks. The proposed approach has been implemented in the Groove toolset, and in most of the case studies it overcame the state space explosion problem. It also outperforms existing solutions by generating shorter witnesses and exploring fewer states. On average, the proposed approach is nearly 400% better than other approaches in exploring fewer states and 300% better in generating shorter witnesses. It is also, on average, 37% more accurate than the other approaches in finding the goal state.
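The action-selection step described above can be illustrated with a minimal sketch. It assumes a double Q-learning style update, in which one network picks the next action and the other evaluates it; the abstract mentions two neural networks but does not confirm this exact rule. The function names (`masked_argmax`, `select_action`, `double_q_target`), the discount factor, and the use of plain lists in place of network outputs are all hypothetical choices made for illustration, not the authors' implementation.

```python
import random


def masked_argmax(q_values, valid_actions):
    """Return the valid action with the highest Q-value.

    Invalid actions (here: rules assumed not to be enabled in the
    current state) are never considered.
    """
    return max(valid_actions, key=lambda a: q_values[a])


def select_action(q_values, valid_actions, epsilon, rng=random):
    """Epsilon-greedy choice restricted to the valid actions."""
    if rng.random() < epsilon:
        # Explore: pick any valid action uniformly at random.
        return rng.choice(valid_actions)
    # Exploit: pick the best valid action under the current estimates.
    return masked_argmax(q_values, valid_actions)


def double_q_target(reward, done, next_q_online, next_q_target,
                    next_valid_actions, gamma=0.99):
    """Compute a double Q-learning training target (assumed update rule).

    The online estimates choose the next action; the target estimates
    supply its value. Terminal or dead-end states contribute only the
    immediate reward.
    """
    if done or not next_valid_actions:
        return reward
    best = masked_argmax(next_q_online, next_valid_actions)
    return reward + gamma * next_q_target[best]
```

In this sketch the Q-value lists stand in for the outputs of the two networks on an encoded state. Masking before the argmax is one common way to handle invalid actions; penalizing them through the reward is another, and the abstract does not say which the authors use.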
Data availability
Enquiries about data availability should be directed to the authors.
Funding
The authors have not disclosed any funding.
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interests.
About this article
Cite this article
Mehrabi, M.J., Rafe, V. Using deep reinforcement learning to search reachability properties in systems specified through graph transformation. Soft Comput 26, 9635–9663 (2022). https://doi.org/10.1007/s00500-022-06815-4
DOI: https://doi.org/10.1007/s00500-022-06815-4