Active learning for extended finite state machines

  • Original Article
Formal Aspects of Computing

Abstract

We present an active learning algorithm for inferring extended finite state machines (EFSMs) by dynamic black-box analysis. EFSMs can model both the data flow and the control behavior of software and hardware components. Different dialects of EFSMs are widely used in tools for model-based software development, verification, and testing. Our algorithm infers a class of EFSMs called register automata, which have a finite control structure extended with variables (registers), assignments, and guards. The algorithm is parameterized on a particular theory, i.e., a set of operations and tests on the data domain that can be used in guards.
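To make the notion of a register automaton concrete, the following is a minimal sketch in Python, not the formal definition used in the article. All names (`Transition`, `RegisterAutomaton`, `LOGIN`, the register `x`) are hypothetical and chosen for illustration. The example assumes a theory with equality tests only and models a toy login component: `register(p)` stores `p` in a register, and `login(p)` succeeds only if `p` equals the stored value.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

Registers = Dict[str, int]
Word = List[Tuple[str, int]]  # sequence of (action, data value) pairs


@dataclass(frozen=True)
class Transition:
    source: str
    action: str
    guard: Callable[[Registers, int], bool]        # test on registers and parameter
    assign: Callable[[Registers, int], Registers]  # new register contents
    target: str


@dataclass(frozen=True)
class RegisterAutomaton:
    initial: str
    accepting: FrozenSet[str]
    transitions: Tuple[Transition, ...]

    def accepts(self, word: Word) -> bool:
        location, registers = self.initial, {}
        for action, value in word:
            # Transitions enabled in the current location for this action and value.
            enabled = [t for t in self.transitions
                       if t.source == location and t.action == action
                       and t.guard(registers, value)]
            if not enabled:
                return False  # no enabled transition: reject the word
            step = enabled[0]
            registers = step.assign(registers, value)
            location = step.target
        return location in self.accepting


def keep(registers: Registers, value: int) -> Registers:
    """Assignment that leaves all registers unchanged."""
    return dict(registers)


# Hypothetical example: guards use only equality on the data domain.
LOGIN = RegisterAutomaton(
    initial="init",
    accepting=frozenset({"logged_in"}),
    transitions=(
        Transition("init", "register", lambda r, p: True,
                   lambda r, p: {"x": p}, "registered"),
        Transition("registered", "login", lambda r, p: p == r["x"],
                   keep, "logged_in"),
        Transition("registered", "login", lambda r, p: p != r["x"],
                   keep, "registered"),
    ),
)
```

For instance, `LOGIN.accepts([("register", 7), ("login", 7)])` evaluates to `True`, whereas replacing the second value by `8` yields `False`. A richer theory would only change which tests may appear in the guards.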

Key to our learning technique is a novel learning model based on so-called tree queries. The learning algorithm uses tree queries to infer symbolic data constraints on parameters, e.g., sequence numbers, time stamps, identifiers, or even simple arithmetic. We give sufficient conditions that the symbolic constraints returned by a tree query must satisfy to be usable in our learning model. We also show that, under these conditions, our framework induces a generalization of the classical Nerode equivalence and canonical automata construction to the symbolic setting. We have evaluated our algorithm in a black-box scenario, where tree queries are realized through (black-box) testing. Our case studies include connection establishment in TCP and a priority queue from the Java Class Library.
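The idea of answering a tree query by testing can be illustrated with a heavily simplified sketch. It is not the article's algorithm: it assumes a theory with equality only, enumerates every equality pattern explicitly, and does not merge patterns into guards. The function names `tree_query` and `login_system` are hypothetical. For a concrete prefix and a sequence of suffix actions, each suffix parameter is instantiated either with one of the distinct earlier data values or with a fresh value, and each resulting word is run as one test.

```python
from typing import Callable, Dict, List, Optional, Tuple

Word = List[Tuple[str, int]]
# One entry per suffix parameter: index of the distinct earlier value it
# equals, or None if the parameter is fresh.
Pattern = Tuple[Optional[int], ...]


def login_system(word: Word) -> bool:
    """Hypothetical system under test: login succeeds with the registered value."""
    stored, logged_in = None, False
    for action, value in word:
        if action == "register" and stored is None:
            stored = value
        elif action == "login" and stored is not None:
            logged_in = logged_in or value == stored
        else:
            return False  # action not allowed at this point
    return logged_in


def tree_query(accepts: Callable[[Word], bool], prefix: Word,
               suffix_actions: List[str]) -> Dict[Pattern, bool]:
    results: Dict[Pattern, bool] = {}

    def explore(seen: List[int], chosen: List[int], pattern: list) -> None:
        if len(chosen) == len(suffix_actions):
            # All suffix parameters are instantiated: run one test.
            word = list(prefix) + list(zip(suffix_actions, chosen))
            results[tuple(pattern)] = accepts(word)
            return
        distinct = list(dict.fromkeys(seen))  # earlier values, duplicates removed
        for index, value in enumerate(distinct):
            explore(seen + [value], chosen + [value], pattern + [index])
        fresh = max(seen, default=0) + 1      # differs from every earlier value
        explore(seen + [fresh], chosen + [fresh], pattern + [None])

    explore([value for _, value in prefix], [], [])
    return results
```

Calling `tree_query(login_system, [("register", 7)], ["login"])` returns `{(0,): True, (None,): False}`: the suffix is accepted exactly when its parameter equals the registered value, which is the kind of symbolic constraint a learner can turn into a guard.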

References

  1. Ammons G, Bodík R, Larus JR (2002) Mining specifications. In: Proc. POPL 2002, pp 4–16. ACM

  2. Alur R, Černý P, Madhusudan P, Nam W (2005) Synthesis of interface specifications for Java classes. In: Proc. POPL 2005, pp 98–109. ACM

  3. Aarts F, de Ruiter J, Poll E (2013) Formal models of bank cards for free. In: Proc. ICSTW 2013, pp 461–468. IEEE

  4. Aarts F, Heidarian F, Kuppens H, Olsen P, Vaandrager FW (2012) Automata learning through counterexample guided abstraction refinement. In: Proc. FM 2012, volume 7436 of LNCS, pp 10–27. Springer

  5. Aarts F, Howar F, Kuppens H, Vaandrager FW (2014) Algorithms for inferring register automata—a comparison of existing approaches. In: Proc. ISoLA 2014, Part I, volume 8802 of LNCS, pp 202–219. Springer

  6. Aarts F, Jonsson B, Uijen J, Vaandrager F (2014) Generating models of infinite-state communication protocols using regular inference with abstraction. Formal Methods Syst Design 46(1): 1–41

  7. Aarts F, Kuppens H, Tretmans J, Vaandrager FW, Verwer S (2012) Learning and testing the bounded retransmission protocol. In: Proc. ICGI 2012, volume 21 of JMLR Proceedings, pp 4–18. JMLR.org

  8. Angluin D (1987) Learning regular sets from queries and counterexamples. Inf Comput 75(2): 87–106

  9. Arbab F (2004) Reo: a channel-based coordination model for component composition. Math Struct Comput Sci 14(3): 329–366

  10. Aarts F, Schmaltz J, Vaandrager FW (2010) Inference and abstraction of the biometric passport. In: Proc. ISoLA 2010, Part I, volume 6415 of LNCS, pp 673–686. Springer

  11. Botinčan M, Babić D (2013) Sigma*: symbolic learning of input–output specifications. In: Proc. POPL 2013, pp 443–456. ACM

  12. Ball T, Bounimova E, Cook B, Levin V, Lichtenberg J, McGarvey C, Ondrusek B, Rajamani SK, Ustuner A (2006) Thorough static analysis of device drivers. In: Proc. 2006 EuroSys Conf., pp 73–85. ACM

  13. Bollig B, Habermehl P, Leucker M, Monmege B (2013) A fresh approach to learning register automata. In: Proc. DLT 2013, volume 7907 of LNCS, pp 118–130. Springer

  14. Broy M, Jonsson B, Katoen J-P, Leucker M, Pretschner A (eds) (2004) Model-based testing of reactive systems, volume 3472 of LNCS. Springer, Berlin

  15. Berg T, Jonsson B, Raffelt H (2008) Regular inference for state machines using domains with equality tests. In: Proc. FASE, volume 4961 of LNCS, pp 317–331

  16. Bertoli P, Pistore M, Traverso P (2010) Automated composition of web services via planning in asynchronous domains. Artif Intell 174(3-4): 316–361

  17. Clarke EM, Grumberg O, Peled D (2001) Model checking. MIT Press, Cambridge

  18. Cassel S, Howar F, Jonsson B (2015) RALib: a LearnLib extension for inferring EFSMs. In: DIFTS 2015. Available online: http://www.faculty.ece.vt.edu/chaowang/difts2015/papers/paper_5.pdf

  19. Cassel S, Howar F, Jonsson B, Steffen B (2014) Learning extended finite state machines. In: Proc. SEFM 2014, volume 8702 of LNCS, pp 250–264. Springer

  20. de Moura LM, Bjørner N (2008) Z3: an efficient SMT solver. In: Proc. TACAS 2008, volume 4963 of LNCS, pp 337–340. Springer

  21. Ernst MD, Perkins JH, Guo PJ, McCamant S, Pacheco C, Tschantz MS, Xiao C (2007) The Daikon system for dynamic detection of likely invariants. Sci Comput Program 69(1-3): 35–45

  22. Gery E, Harel D, Palachi E (2002) Rhapsody: a complete life-cycle model-based development system. In: Proc. IFM 2002, volume 2335 of LNCS, pp 1–10. Springer

  23. Groz R, Irfan M-N, Oriat C (2012) Algorithmic improvements on regular inference of software models and perspectives for security testing. In: Proc. ISoLA 2012, Part I, volume 7609 of LNCS, pp 444–457. Springer

  24. Giannakopoulou D, Rakamarić Z, Raman V (2012) Symbolic learning of component interfaces. In: Proc. SAS 2012, volume 7460 of LNCS, pp 248–264. Springer, Berlin, Heidelberg

  25. Hagerer A, Hungar H, Niese O, Steffen B (2002) Model generation by moderated regular extrapolation. In: Proc. FASE 2002, volume 2306 of LNCS, pp 80–95. Springer

  26. Howar F, Isberner M, Steffen B, Bauer O, Jonsson B (2012) Inferring semantic interfaces of data structures. In: Proc. ISoLA 2012, Part I, volume 7609 of LNCS, pp 554–571. Springer

  27. Henzinger TA, Jhala R, Majumdar R (2005) Permissive interfaces. In: Proc. ESEC/FSE 2005, pp 31–40. ACM

  28. Hungar H, Niese O, Steffen B (2003) Domain-specific optimization in automata learning. In: Proc. CAV 2003, volume 2725 of LNCS, pp 315–327. Springer

  29. Howar F (2012) Active learning of interface programs. PhD thesis, Technical University of Dortmund, Germany

  30. Howar F, Steffen B, Jonsson B, Cassel S (2012) Inferring canonical register automata. In: Proc. VMCAI 2012, volume 7148 of LNCS, pp 251–266. Springer

  31. Howar F, Steffen B, Merten M (2011) Automata learning with automated alphabet abstraction refinement. In: Proc. VMCAI 2011, volume 6538 of LNCS, pp 263–277. Springer

  32. Huima A (2007) Implementing Conformiq Qtronic. In: Proc. TestCom/FATES 2007, volume 4581 of LNCS, pp 1–12. Springer

  33. Isberner M, Howar F, Steffen B (2014) Learning register automata: from languages to program structures. Mach Learn 96(1-2): 65–98

  34. Isberner M, Howar F, Steffen B (2015) The open-source LearnLib: a framework for active automata learning. In: Kroening D, Pasareanu CS (eds) Proc. CAV 2015, volume 9206 of LNCS, pp 487–495. Springer

  35. Jhala R, Majumdar R (2009) Software model checking. ACM Comput Surv 41(4): 21:1–21:54

  36. Lorenzoli D, Mariani L, Pezzè M (2008) Automatic generation of software behavioral models. In: Proc. ICSE 2008, pp 501–510. ACM

  37. Maler O, Mens I-E (2014) Learning regular languages over large alphabets. In: Proc. TACAS 2014, volume 8413 of LNCS, pp 485–499. Springer

  38. Rivest RL, Schapire RE (1993) Inference of finite automata using homing sequences. Inf Comput 103(2): 299–347

  39. Shu G, Lee D (2007) Testing security properties of protocol implementations—a machine learning based approach. In: Proc. ICDCS 2007, pp 25. IEEE

  40. Utting M, Legeard B (2007) Practical model-based testing—a tools approach. Morgan Kaufmann, Burlington

  41. Walkinshaw N, Bogdanov K, Derrick J, Paris J (2010) Increasing functional coverage by inductive testing: a case study. In: Proc. ICTSS 2010, volume 6435 of LNCS, pp 126–141. Springer

  42. Xiao H, Sun J, Liu Y, Lin S-W, Sun C (2013) Tzuyu: learning stateful typestates. In: Proc. ASE 2013, pp 432–442. IEEE

Author information

Corresponding author

Correspondence to Sofia Cassel.

Additional information

Dimitra Giannakopoulou, Gwen Salaün, and Michael Butler

This is an extended version of the conference paper [CHJS14] with a new intuitive introduction to our novel ideas, revised formal definitions of the paper’s main concepts, a complete proof of our generalization of the Nerode congruence, and an expanded section on benchmark examples and results.

Supported in part by the European FP7 project CONNECT (IST 231167), and by the UPMARC centre of excellence.

About this article

Cite this article

Cassel, S., Howar, F., Jonsson, B. et al. Active learning for extended finite state machines. Form Asp Comp 28, 233–263 (2016). https://doi.org/10.1007/s00165-016-0355-5

  • DOI: https://doi.org/10.1007/s00165-016-0355-5
