Abstract
We present an active learning algorithm for inferring extended finite state machines (EFSMs) by dynamic black-box analysis. EFSMs can be used to model both data flow and control behavior of software and hardware components. Different dialects of EFSMs are widely used in tools for model-based software development, verification, and testing. Our algorithm infers a class of EFSMs called register automata. Register automata have a finite control structure, extended with variables (registers), assignments, and guards. Our algorithm is parameterized on a particular theory, i.e., a set of operations and tests on the data domain that can be used in guards.
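To make the notion of a register automaton concrete, here is a minimal Python sketch of one: a finite set of locations, a register valuation, and transitions that carry a guard and an assignment. The class names (`Transition`, `RegisterAutomaton`), the register name `x1`, and the toy `register`/`login` example are illustrative assumptions, not the paper's formal definitions or the API of any tool. The guards use only equality tests, which corresponds to choosing a theory with equality as the only test on the data domain.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

Registers = Dict[str, int]  # register valuation, e.g. {"x1": 7}


@dataclass(frozen=True)
class Transition:
    source: str                                    # location the transition leaves
    action: str                                    # action name, e.g. "login"
    guard: Callable[[Registers, int], bool]        # test on registers and parameter
    assign: Callable[[Registers, int], Registers]  # new register valuation
    target: str                                    # location the transition enters


@dataclass
class RegisterAutomaton:
    initial: str
    accepting: FrozenSet[str]
    transitions: List[Transition]

    def accepts(self, word: List[Tuple[str, int]]) -> bool:
        """Run the automaton on a data word of (action, parameter) pairs."""
        location, regs = self.initial, {}
        for action, p in word:
            for t in self.transitions:
                if t.source == location and t.action == action and t.guard(regs, p):
                    regs = t.assign(regs, p)
                    location = t.target
                    break
            else:
                return False  # no enabled transition: reject the word
        return location in self.accepting


def keep(regs: Registers, p: int) -> Registers:
    """Assignment that leaves all registers unchanged."""
    return regs


# Hypothetical example: a login succeeds only with the registered value.
LOGIN = RegisterAutomaton(
    initial="init",
    accepting=frozenset({"logged_in"}),
    transitions=[
        Transition("init", "register", lambda r, p: True,
                   lambda r, p: {"x1": p}, "registered"),
        Transition("registered", "login", lambda r, p: p == r["x1"],
                   keep, "logged_in"),
        Transition("registered", "login", lambda r, p: p != r["x1"],
                   keep, "registered"),
    ],
)
```

In this sketch, `LOGIN.accepts([("register", 7), ("login", 7)])` holds, while the same word with `("login", 8)` is rejected, because the guard compares the parameter against the stored register rather than against a fixed constant.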
Key to our learning technique is a novel learning model based on so-called tree queries. The learning algorithm uses tree queries to infer symbolic data constraints on parameters, e.g., sequence numbers, time stamps, identifiers, or even simple arithmetic. We describe sufficient conditions that the symbolic constraints returned by a tree query must satisfy to be usable in our learning model. We also show that, under these conditions, our framework induces a generalization of the classical Nerode equivalence and canonical automata construction to the symbolic setting. We have evaluated our algorithm in a black-box scenario, where tree queries are realized through (black-box) testing. Our case studies include connection establishment in TCP and a priority queue from the Java Class Library.
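For reference, the classical Nerode equivalence mentioned above can be stated as follows for a language L over a finite alphabet Σ. The notation is the standard textbook one and is an assumption here, not the symbolic version developed in the article.

```latex
% Two words are equivalent iff no suffix distinguishes them with respect to L.
u \equiv_L v
\quad\Longleftrightarrow\quad
\forall w \in \Sigma^{*} :\; \bigl( uw \in L \;\Leftrightarrow\; vw \in L \bigr)
```

By the Myhill-Nerode theorem, L is regular exactly when this relation has finitely many equivalence classes, and those classes serve as the states of the canonical (minimal) deterministic automaton for L.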
References
Ammons G, Bodík R, Larus JR (2002) Mining specifications. In: Proc. POPL 2002, pp 4–16. ACM
Alur R, Cerný P, Madhusudan P, Nam W (2005) Synthesis of interface specifications for Java classes. In: Proc. POPL 2005, pp 98–109. ACM
Aarts F, de Ruiter J, Poll E (2013) Formal models of bank cards for free. In: Proc. ICSTW 2013, pp 461–468. IEEE
Aarts F, Heidarian F, Kuppens H, Olsen P, Vaandrager FW (2012) Automata learning through counterexample guided abstraction refinement. In: Proc. FM 2012, volume 7436 of LNCS, pp 10–27. Springer
Aarts F, Howar F, Kuppens H, Vaandrager FW (2014) Algorithms for inferring register automata—a comparison of existing approaches. In: Proc. ISoLA 2014, Part I, volume 8802 of LNCS, pp 202–219. Springer
Aarts F, Jonsson B, Uijen J, Vaandrager F (2014) Generating models of infinite-state communication protocols using regular inference with abstraction. Formal Methods Syst Design 46(1):1–41
Aarts F, Kuppens H, Tretmans J, Vaandrager FW, Verwer S (2012) Learning and testing the bounded retransmission protocol. In: Proc. ICGI 2012, volume 21 of JMLR Proceedings, pp 4–18. JMLR.org
Angluin D (1987) Learning regular sets from queries and counterexamples. Inf Comput 75(2): 87–106
Arbab F (2004) Reo: a channel-based coordination model for component composition. Math Struct Comput Sci 14(3): 329–366
Aarts F, Schmaltz J, Vaandrager FW (2010) Inference and abstraction of the biometric passport. In: Proc. ISoLA 2010, Part I, volume 6415 of LNCS, pp 673–686. Springer
Botinčan M, Babić D (2013) Sigma*: symbolic learning of input–output specifications. In: Proc. POPL 2013, pp 443–456. ACM
Ball T, Bounimova E, Cook B, Levin V, Lichtenberg J, McGarvey C, Ondrusek B, Rajamani SK, Ustuner A (2006) Thorough static analysis of device drivers. In: Proc. 2006 EuroSys Conf., pp 73–85. ACM
Bollig B, Habermehl P, Leucker M, Monmege B (2013) A fresh approach to learning register automata. In: Proc. DLT 2013, volume 7907 of LNCS, pp 118–130. Springer
Broy M, Jonsson B, Katoen J-P, Leucker M, Pretschner A (eds) (2004) Model-based testing of reactive systems, volume 3472 of LNCS. Springer, Berlin
Berg T, Jonsson B, Raffelt H (2008) Regular inference for state machines using domains with equality tests. In: Proc. FASE 2008, volume 4961 of LNCS, pp 317–331
Bertoli P, Pistore M, Traverso P (2010) Automated composition of web services via planning in asynchronous domains. Artif Intell 174(3-4): 316–361
Clarke EM, Grumberg O, Peled D (2001) Model checking. MIT Press, Cambridge
Cassel S, Howar F, Jonsson B (2015) RALib: a LearnLib extension for inferring EFSMs. In: DIFTS 2015. Available online: http://www.faculty.ece.vt.edu/chaowang/difts2015/papers/paper_5.pdf
Cassel S, Howar F, Jonsson B, Steffen B (2014) Learning extended finite state machines. In: Proc. SEFM 2014, volume 8702 of LNCS, pp 250–264. Springer
de Moura LM, Bjørner N (2008) Z3: an efficient SMT solver. In: Proc. TACAS 2008, volume 4963 of LNCS, pp 337–340. Springer
Ernst MD, Perkins JH, Guo PJ, McCamant S, Pacheco C, Tschantz MS, Xiao C (2007) The Daikon system for dynamic detection of likely invariants. Sci Comput Program 69(1-3): 35–45
Gery E, Harel D, Palachi E (2002) Rhapsody: a complete life-cycle model-based development system. In: Proc. IFM 2002, volume 2335 of LNCS, pp 1–10. Springer
Groz R, Irfan M-N, Oriat C (2012) Algorithmic improvements on regular inference of software models and perspectives for security testing. In: Proc. ISoLA 2012, Part I, volume 7609 of LNCS, pp 444–457. Springer
Giannakopoulou D, Rakamarić Z, Raman V (2012) Symbolic learning of component interfaces. In: Proc. SAS 2012, volume 7460 of LNCS, pp 248–264. Springer, Berlin, Heidelberg
Hagerer A, Hungar H, Niese O, Steffen B (2002) Model generation by moderated regular extrapolation. In: Proc. FASE 2002, volume 2306 of LNCS, pp 80–95. Springer
Howar F, Isberner M, Steffen B, Bauer O, Jonsson B (2012) Inferring semantic interfaces of data structures. In: Proc. ISoLA 2012, Part I, volume 7609 of LNCS, pp 554–571. Springer
Henzinger TA, Jhala R, Majumdar R (2005) Permissive interfaces. In: Proc. ESEC/FSE 2005, pp 31–40. ACM
Hungar H, Niese O, Steffen B (2003) Domain-specific optimization in automata learning. In: Proc. CAV 2003, volume 2725 of LNCS, pp 315–327. Springer
Howar F (2012) Active learning of interface programs. PhD thesis, Technical University of Dortmund, Germany
Howar F, Steffen B, Jonsson B, Cassel S (2012) Inferring canonical register automata. In: Proc. VMCAI 2012, volume 7148 of LNCS, pp 251–266. Springer
Howar F, Steffen B, Merten M (2011) Automata learning with automated alphabet abstraction refinement. In: Proc. VMCAI 2011, volume 6538 of LNCS, pp 263–277. Springer
Huima A (2007) Implementing Conformiq Qtronic. In: Proc. TestCom/FATES 2007, volume 4581 of LNCS, pp 1–12. Springer
Isberner M, Howar F, Steffen B (2014) Learning register automata: from languages to program structures. Mach Learn 96(1-2): 65–98
Isberner M, Howar F, Steffen B (2015) The open-source LearnLib: a framework for active automata learning. In: Kroening D, Pasareanu CS (eds) Proc. CAV 2015, volume 9206 of LNCS, pp 487–495. Springer
Jhala R, Majumdar R (2009) Software model checking. ACM Comput Surv 41(4): 21:1–21:54
Lorenzoli D, Mariani L, Pezzè M (2008) Automatic generation of software behavioral models. In: Proc. ICSE 2008, pp 501–510. ACM
Maler O, Mens I-E (2014) Learning regular languages over large alphabets. In: Proc. TACAS 2014, volume 8413 of LNCS, pp 485–499. Springer
Rivest RL, Schapire RE (1993) Inference of finite automata using homing sequences. Inf Comput 103(2): 299–347
Shu G, Lee D (2007) Testing security properties of protocol implementations—a machine learning based approach. In: Proc. ICDCS 2007, pp 25. IEEE
Utting M, Legeard B (2007) Practical model-based testing—a tools approach. Morgan Kaufmann, Burlington
Walkinshaw N, Bogdanov K, Derrick J, Paris J (2010) Increasing functional coverage by inductive testing: a case study. In: Proc. ICTSS 2010, volume 6435 of LNCS, pp 126–141. Springer
Xiao H, Sun J, Liu Y, Lin S-W, Sun C (2013) Tzuyu: learning stateful typestates. In: Proc. ASE 2013, pp 432–442. IEEE
Additional information
Dimitra Giannakopoulou, Gwen Salaün, and Michael Butler
This is an extended version of the conference paper [CHJS14] with a new intuitive introduction to our novel ideas, revised formal definitions of the paper’s main concepts, a complete proof of our generalization of the Nerode congruence, and an expanded section on benchmark examples and results.
Supported in part by the European FP7 project CONNECT (IST 231167), and by the UPMARC centre of excellence.
About this article
Cite this article
Cassel, S., Howar, F., Jonsson, B. et al. Active learning for extended finite state machines. Form Asp Comp 28, 233–263 (2016). https://doi.org/10.1007/s00165-016-0355-5
DOI: https://doi.org/10.1007/s00165-016-0355-5