Skip to main content
Log in

Software model synthesis using satisfiability solvers

  • Published:
Empirical Software Engineering Aims and scope Submit manuscript


We introduce a novel approach for synthesis of software models based on identifying deterministic finite state automata. Our approach consists of three important contributions. First, we argue that in order to model software, one should focus mainly on observed executions (positive data), and use the randomly generated failures (negative data) only for testing consistency. We present a new greedy heuristic for this purpose, and show how to integrate it in the state-of-the-art evidence-driven state-merging (EDSM) algorithm. Second, we apply the enhanced EDSM algorithm to iteratively reduce the size of the problem. Yet during each iteration, the evidence is divided over states and hence the effectiveness of this algorithm is decreased. We propose—when EDSM becomes too weak—to tackle the reduced identification problem using satisfiability solvers. Third, in case the amount of positive data is small, we solve the identification problem several times by randomizing the greedy heuristic and combine the solutions using a voting scheme. The interaction between these contributions appeared crucial to solve hard software models synthesis benchmarks. Our implementation, called DFASAT, won the StaMinA competition.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others


  1. The machine we used for the competition contained 20Gb of RAM.

  2. Available from

  3. See


  • Aarts F, Vaandrager F (2010) Learning I/O automata. In: CONCUR. Lecture notes in computer science, vol 6269. Springer, New York, pp 71–85

    Google Scholar 

  • Abela J, Coste F, Spina S (2004) Mutually compatible and incompatible merges for the search of the smallest consistent DFA. In: ICGI. LNCS, vol 3264. Springer, New York, pp 28–39

    Google Scholar 

  • Ammons G, Bodik R, Larus JR (2002) Mining specifications. In: Proceedings of the symposium on principles of programming languages. ACM, pp 4–16

  • Antunes J, Neves N, Verissimo P (2011) Reverse engineering of protocols from network traces. In: Working conference on reverse engineering, pp 169–178

  • Bertolino A, Inverardi P, Pelliccione P, Tivoli M (2009) Automatic synthesis of behavior protocols for composable web-services. In: Proceedings of the joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering. ACM, pp 141–150

  • Biere A (2008) PicoSAT essentials. JSAT 4(2–4):75–97.

  • Biere A, Cimatti A, Clarke EM, Zhu Y (1999) Symbolic model checking without BDDs. In: TACAS ’99. Springer, London, UK, pp 193–207

    Google Scholar 

  • Biere A, Heule MJH, van Maaren H, Walsh T (eds) (2009) Handbook of satisfiability, frontiers in artificial intelligence and applications, handbook of satisfiability, vol 185. IOS Press, Amsterdam, The Netherlands

    Google Scholar 

  • Biermann AW, Feldman JA (1972) On the synthesis of finite-state machines from samples of their behavior. IEEE Trans Comput 21(6):592–597. doi:10.1109/TC.1972.5009015

    Article  MathSciNet  MATH  Google Scholar 

  • Broy M, Jonsson B, Katoen JP, Leucker M, Pretschner A (eds) (2005) Model-based testing of reactive systems. Lecture notes in computer science, vol 3472. Springer, New York

    Google Scholar 

  • Bugalho M, Oliveira AL (2005) Inference of regular languages using state merging algorithms with search. Pattern Recognit 38:1457–1467

    Article  MATH  Google Scholar 

  • Castro J, Gavaldà R (2008) Towards feasible PAC-learning of probabilistic deterministic finite automata. In: ICGI, pp 163–174

  • Clark A, Thollard F (2004) PAC-learnability of probabilistic deterministic finite state automata. J Mach Learn Res 5:473–497

    Google Scholar 

  • Clarke E (1997) Model checking. In: Foundations of software technology and theoretical computer science. LNCS, vol 1346. Springer, New York, pp 54–56

    Chapter  Google Scholar 

  • Cook JE, Wolf AL (1998) Discovering models of software processes from event-based data. ACM Trans Softw Eng Methodol 7:215–249

    Article  Google Scholar 

  • Coste F, Nicolas J (1997) Regular inference as a graph coloring problem. In: Workshop on grammatical inference, automata induction, and language acquisition (ICML’97)

  • Cui W, Kannan J, Wang HJ (2007) Discoverer: automatic protocol reverse engineering from network traces. In: Proceedings of 16th USENIX security symposium, p 14

  • Dallmeier V, Lindig C, Wasylkowski A, Zeller A (2006) Mining object behavior with ADABU. In: Proceedings of the 2006 international workshop on dynamic systems analysis, ACM, WODA ’06, pp 17–24

  • de la Higuera C (2010) Grammatical inference: learning automata and grammars. Cambridge University Press, New York, NY, USA

    MATH  Google Scholar 

  • Denis F, Lemay A, Terlutte A (2000) Learning regular languages using non deterministic finite automata. In: ICGI, pp 39–50

  • Dietterich T (2000) Ensemble methods in machine learning. In: Multiple classifier systems. LNCS, vol 1857. Springer, pp 1–15

  • Dupont P, Lambeau B, Damas C, van Lamsweerde A (2008) The QSM algorithm and its application to software behavior model induction. Appl Artif Intell 22(1–2):77–115

    Article  Google Scholar 

  • Endrullis J, Waldmann J, Zantema H (2008) Matrix interpretations for proving termination of term rewriting. J Autom Reason 40(2–3):195–220. 10.1007/s10817-007-9087-9

    Article  MathSciNet  MATH  Google Scholar 

  • Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. W. H. Freeman & Co., New York, NY, USA

    MATH  Google Scholar 

  • Gold EM (1978) Complexity of automaton identification from given data. Inf Control 37(3):302–320

    Article  MathSciNet  MATH  Google Scholar 

  • Grinchtein O, Jonsson B, Petterson P (2006a) Inference of event-recording automata using timed decision trees. In: CONCUR. LNCS, vol 4137. Springer, pp 435–449

  • Grinchtein O, Leucker M, Piterman N (2006b) Inferring network invariants automatically. In: Automated reasoning. LNCS, vol 4130. Springer, pp 483–497

  • Heule MJH, Verwer S (2010) Exact DFA identification using SAT solvers. In: ICGI. LNCS, vol 6339. Springer, pp 66–79

  • Higuera Cdl, Janodet JC (2004) Inference of omega-languages from prefixes. Theor Comp Sci 313(2):295–312

    Article  MATH  Google Scholar 

  • Jain S, Osherson D, Royer JS, Sharma A (1999) Systems that learn. MIT Press, Cambridge, MA

    Google Scholar 

  • Jarvisalo M, Biere A, Heule MJH (2010) Blocked clause elimination. In: TACAS 2010. LNCS, vol 6015. Springer, pp 129–144

  • Juillé H, Pollack JB (1998) A sampling-based heuristic for tree search applied to grammar induction. In: Innovative applications of artificial intelligence. American Association for Artificial Intelligence, Menlo Park, CA, USA, AAAI ’98, pp 776–783.

  • Kearns MJ, Vazirani UV (1994) An introduction to computational learning theory. MIT Press, Cambridge, MA

    Google Scholar 

  • Kullmann O (1999) On a generalization of extended resolution. Discrete Appl Math 96–97(1):149–176. doi:10.1016/S0166-218X(99)00037-2

    Article  MathSciNet  Google Scholar 

  • Lang KJ (1999) Faster algorithms for finding minimal consistent DFAs. Tech. rep., NEC Research Institute

  • Lang KJ, Pearlmutter BA, Price RA (1998) Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm. In: ICGI. LNCS. Springer, vol 1433

  • Mariani L, Pastore F, Pezze M (2011) Dynamic analysis for diagnosing integration faults. IEEE Trans Softw Eng 37:486–508

    Article  Google Scholar 

  • Marques-Silva JP, Glass T (1999) Combinational equivalence checking using satisfiability and recursive learning. In: Proceedings of DATE ’99. ACM, New York, NY, USA, p 33. doi:10.1145/307418.307477

    Chapter  Google Scholar 

  • Oliveira AL, Marques-Silva JP (1998) Efficient search techniques for the inference of minimum sized finite state machines. String processing and information retrieval, pp 81–89

  • Oncina J, Garcia P (1992) Inferring regular languages in polynomial update time. In: Pattern recognition and image analysis, series in machine perception and artificial intelligence, vol 1. World Scientific, Singapore, pp 49–61

    Chapter  Google Scholar 

  • Pitt L, Warmuth M (1989) The minimum consistent DFA problem cannot be approximated within and polynomial. In: Annual ACM symposium on theory of computing. ACM, pp 421–432

  • Raffelt H, Steffen B, Berg T, Margaria T (2009) Learnlib: a framework for extrapolating behavioral models. Int J Softw Tools Technol Transf 11:393–407

    Article  Google Scholar 

  • Sakallah KA (2009) Symmetry and satisfiability, chap 10, pp 289–338. Vol 185 of Biere et al. (2009)

  • Sudkamp TA (2006) Languages and machines: an introduction to the theory of computer science, 3rd edn. Addison-Wesley, Reading, MA

    Google Scholar 

  • van der Aalst WMP (2011) Process mining—discovery, conformance and enhancement of business processes. Springer, New York

    MATH  Google Scholar 

  • Verwer S (2010) Efficient identification of timed automata: theory and practice. PhD thesis, Delft University of Technology

  • Walkinshaw N, Bogdanov K, Holcombe M, Salahuddin S (2007) Reverse engineering state machines by interactive grammar inference. In: Proceedings of the 14th working conference on reverse engineering. IEEE, pp 209–218

  • Walkinshaw N, Bogdanov K, Damas C, Lambeau B, Dupont P (2010) A framework for the competitive evaluation of model inference techniques. In: Proceedings of the first international workshop on model inference in testing, MIIT ’10. ACM, New York, NY, USA, pp 1–9

    Chapter  Google Scholar 

  • Walsh T (2000) SAT v CSP. In: CP ’02: proceedings of the 6th international conference on principles and practice of constraint programming. Springer, London, UK, pp 441–456

    Google Scholar 

  • Yokomori T (1993) Learning non-deterministic finite automata from queries and counterexamples. In: Machine intelligence. University Press, pp 196–189

Download references

Author information

Authors and Affiliations


Corresponding author

Correspondence to Sicco Verwer.

Additional information

Editor: Andreas Zeller

The first author is supported by the Austrian Science Foundation (FWF) NFN Grant S11408-N23 (RiSE). The second author is supported by STW project 11763 (ITALIA) and the Research Foundation - Flanders (FWO-Vlaanderen) project G.0682.11 (Declarative experimentation).

Rights and permissions

Reprints and permissions

About this article

Cite this article

Heule, M.J.H., Verwer, S. Software model synthesis using satisfiability solvers. Empir Software Eng 18, 825–856 (2013).

Download citation

  • Published:

  • Issue Date:

  • DOI: