Empirical Software Engineering

, Volume 18, Issue 4, pp 825–856 | Cite as

Software model synthesis using satisfiability solvers

  • Marijn J. H. Heule
  • Sicco VerwerEmail author


We introduce a novel approach for synthesis of software models based on identifying deterministic finite state automata. Our approach consists of three important contributions. First, we argue that in order to model software, one should focus mainly on observed executions (positive data), and use the randomly generated failures (negative data) only for testing consistency. We present a new greedy heuristic for this purpose, and show how to integrate it in the state-of-the-art evidence-driven state-merging (EDSM) algorithm. Second, we apply the enhanced EDSM algorithm to iteratively reduce the size of the problem. Yet during each iteration, the evidence is divided over states and hence the effectiveness of this algorithm is decreased. We propose—when EDSM becomes too weak—to tackle the reduced identification problem using satisfiability solvers. Third, in case the amount of positive data is small, we solve the identification problem several times by randomizing the greedy heuristic and combine the solutions using a voting scheme. The interaction between these contributions appeared crucial to solve hard software models synthesis benchmarks. Our implementation, called DFASAT, won the StaMinA competition.


Software model synthesis Model inference Automaton identification Learning Satisfiability State machines 


  1. Aarts F, Vaandrager F (2010) Learning I/O automata. In: CONCUR. Lecture notes in computer science, vol 6269. Springer, New York, pp 71–85Google Scholar
  2. Abela J, Coste F, Spina S (2004) Mutually compatible and incompatible merges for the search of the smallest consistent DFA. In: ICGI. LNCS, vol 3264. Springer, New York, pp 28–39Google Scholar
  3. Ammons G, Bodik R, Larus JR (2002) Mining specifications. In: Proceedings of the symposium on principles of programming languages. ACM, pp 4–16Google Scholar
  4. Antunes J, Neves N, Verissimo P (2011) Reverse engineering of protocols from network traces. In: Working conference on reverse engineering, pp 169–178Google Scholar
  5. Bertolino A, Inverardi P, Pelliccione P, Tivoli M (2009) Automatic synthesis of behavior protocols for composable web-services. In: Proceedings of the joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering. ACM, pp 141–150Google Scholar
  6. Biere A (2008) PicoSAT essentials. JSAT 4(2–4):75–97.
  7. Biere A, Cimatti A, Clarke EM, Zhu Y (1999) Symbolic model checking without BDDs. In: TACAS ’99. Springer, London, UK, pp 193–207Google Scholar
  8. Biere A, Heule MJH, van Maaren H, Walsh T (eds) (2009) Handbook of satisfiability, frontiers in artificial intelligence and applications, handbook of satisfiability, vol 185. IOS Press, Amsterdam, The NetherlandsGoogle Scholar
  9. Biermann AW, Feldman JA (1972) On the synthesis of finite-state machines from samples of their behavior. IEEE Trans Comput 21(6):592–597. doi: 10.1109/TC.1972.5009015 MathSciNetzbMATHCrossRefGoogle Scholar
  10. Broy M, Jonsson B, Katoen JP, Leucker M, Pretschner A (eds) (2005) Model-based testing of reactive systems. Lecture notes in computer science, vol 3472. Springer, New YorkGoogle Scholar
  11. Bugalho M, Oliveira AL (2005) Inference of regular languages using state merging algorithms with search. Pattern Recognit 38:1457–1467zbMATHCrossRefGoogle Scholar
  12. Castro J, Gavaldà R (2008) Towards feasible PAC-learning of probabilistic deterministic finite automata. In: ICGI, pp 163–174Google Scholar
  13. Clark A, Thollard F (2004) PAC-learnability of probabilistic deterministic finite state automata. J Mach Learn Res 5:473–497Google Scholar
  14. Clarke E (1997) Model checking. In: Foundations of software technology and theoretical computer science. LNCS, vol 1346. Springer, New York, pp 54–56CrossRefGoogle Scholar
  15. Cook JE, Wolf AL (1998) Discovering models of software processes from event-based data. ACM Trans Softw Eng Methodol 7:215–249CrossRefGoogle Scholar
  16. Coste F, Nicolas J (1997) Regular inference as a graph coloring problem. In: Workshop on grammatical inference, automata induction, and language acquisition (ICML’97)Google Scholar
  17. Cui W, Kannan J, Wang HJ (2007) Discoverer: automatic protocol reverse engineering from network traces. In: Proceedings of 16th USENIX security symposium, p 14Google Scholar
  18. Dallmeier V, Lindig C, Wasylkowski A, Zeller A (2006) Mining object behavior with ADABU. In: Proceedings of the 2006 international workshop on dynamic systems analysis, ACM, WODA ’06, pp 17–24Google Scholar
  19. de la Higuera C (2010) Grammatical inference: learning automata and grammars. Cambridge University Press, New York, NY, USAzbMATHGoogle Scholar
  20. Denis F, Lemay A, Terlutte A (2000) Learning regular languages using non deterministic finite automata. In: ICGI, pp 39–50Google Scholar
  21. Dietterich T (2000) Ensemble methods in machine learning. In: Multiple classifier systems. LNCS, vol 1857. Springer, pp 1–15Google Scholar
  22. Dupont P, Lambeau B, Damas C, van Lamsweerde A (2008) The QSM algorithm and its application to software behavior model induction. Appl Artif Intell 22(1–2):77–115CrossRefGoogle Scholar
  23. Endrullis J, Waldmann J, Zantema H (2008) Matrix interpretations for proving termination of term rewriting. J Autom Reason 40(2–3):195–220. 10.1007/s10817-007-9087-9 MathSciNetzbMATHCrossRefGoogle Scholar
  24. Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. W. H. Freeman & Co., New York, NY, USAzbMATHGoogle Scholar
  25. Gold EM (1978) Complexity of automaton identification from given data. Inf Control 37(3):302–320MathSciNetzbMATHCrossRefGoogle Scholar
  26. Grinchtein O, Jonsson B, Petterson P (2006a) Inference of event-recording automata using timed decision trees. In: CONCUR. LNCS, vol 4137. Springer, pp 435–449Google Scholar
  27. Grinchtein O, Leucker M, Piterman N (2006b) Inferring network invariants automatically. In: Automated reasoning. LNCS, vol 4130. Springer, pp 483–497Google Scholar
  28. Heule MJH, Verwer S (2010) Exact DFA identification using SAT solvers. In: ICGI. LNCS, vol 6339. Springer, pp 66–79Google Scholar
  29. Higuera Cdl, Janodet JC (2004) Inference of omega-languages from prefixes. Theor Comp Sci 313(2):295–312zbMATHCrossRefGoogle Scholar
  30. Jain S, Osherson D, Royer JS, Sharma A (1999) Systems that learn. MIT Press, Cambridge, MAGoogle Scholar
  31. Jarvisalo M, Biere A, Heule MJH (2010) Blocked clause elimination. In: TACAS 2010. LNCS, vol 6015. Springer, pp 129–144Google Scholar
  32. Juillé H, Pollack JB (1998) A sampling-based heuristic for tree search applied to grammar induction. In: Innovative applications of artificial intelligence. American Association for Artificial Intelligence, Menlo Park, CA, USA, AAAI ’98, pp 776–783.
  33. Kearns MJ, Vazirani UV (1994) An introduction to computational learning theory. MIT Press, Cambridge, MAGoogle Scholar
  34. Kullmann O (1999) On a generalization of extended resolution. Discrete Appl Math 96–97(1):149–176. doi: 10.1016/S0166-218X(99)00037-2 MathSciNetCrossRefGoogle Scholar
  35. Lang KJ (1999) Faster algorithms for finding minimal consistent DFAs. Tech. rep., NEC Research InstituteGoogle Scholar
  36. Lang KJ, Pearlmutter BA, Price RA (1998) Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm. In: ICGI. LNCS. Springer, vol 1433Google Scholar
  37. Mariani L, Pastore F, Pezze M (2011) Dynamic analysis for diagnosing integration faults. IEEE Trans Softw Eng 37:486–508CrossRefGoogle Scholar
  38. Marques-Silva JP, Glass T (1999) Combinational equivalence checking using satisfiability and recursive learning. In: Proceedings of DATE ’99. ACM, New York, NY, USA, p 33. doi:10.1145/307418.307477 CrossRefGoogle Scholar
  39. Oliveira AL, Marques-Silva JP (1998) Efficient search techniques for the inference of minimum sized finite state machines. String processing and information retrieval, pp 81–89Google Scholar
  40. Oncina J, Garcia P (1992) Inferring regular languages in polynomial update time. In: Pattern recognition and image analysis, series in machine perception and artificial intelligence, vol 1. World Scientific, Singapore, pp 49–61CrossRefGoogle Scholar
  41. Pitt L, Warmuth M (1989) The minimum consistent DFA problem cannot be approximated within and polynomial. In: Annual ACM symposium on theory of computing. ACM, pp 421–432Google Scholar
  42. Raffelt H, Steffen B, Berg T, Margaria T (2009) Learnlib: a framework for extrapolating behavioral models. Int J Softw Tools Technol Transf 11:393–407CrossRefGoogle Scholar
  43. Sakallah KA (2009) Symmetry and satisfiability, chap 10, pp 289–338. Vol 185 of Biere et al. (2009)Google Scholar
  44. Sudkamp TA (2006) Languages and machines: an introduction to the theory of computer science, 3rd edn. Addison-Wesley, Reading, MAGoogle Scholar
  45. van der Aalst WMP (2011) Process mining—discovery, conformance and enhancement of business processes. Springer, New YorkzbMATHGoogle Scholar
  46. Verwer S (2010) Efficient identification of timed automata: theory and practice. PhD thesis, Delft University of TechnologyGoogle Scholar
  47. Walkinshaw N, Bogdanov K, Holcombe M, Salahuddin S (2007) Reverse engineering state machines by interactive grammar inference. In: Proceedings of the 14th working conference on reverse engineering. IEEE, pp 209–218Google Scholar
  48. Walkinshaw N, Bogdanov K, Damas C, Lambeau B, Dupont P (2010) A framework for the competitive evaluation of model inference techniques. In: Proceedings of the first international workshop on model inference in testing, MIIT ’10. ACM, New York, NY, USA, pp 1–9CrossRefGoogle Scholar
  49. Walsh T (2000) SAT v CSP. In: CP ’02: proceedings of the 6th international conference on principles and practice of constraint programming. Springer, London, UK, pp 441–456Google Scholar
  50. Yokomori T (1993) Learning non-deterministic finite automata from queries and counterexamples. In: Machine intelligence. University Press, pp 196–189Google Scholar

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. 1.Johannes Kepler UniversityLinzAustria
  2. 2.Delft University of TechnologyDelftThe Netherlands
  3. 3.Radboud Universiteit NijmegenNijmegenThe Netherlands
  4. 4.Katholieke Universiteit LeuvenLeuvenBelgium

Personalised recommendations