Empirical Software Engineering, Volume 22, Issue 2, pp 928–961

Generating valid grammar-based test inputs by means of genetic programming and annotated grammars

  • Fitsum Meshesha Kifetew
  • Roberto Tiella
  • Paolo Tonella


Automated generation of system-level tests for grammar-based systems requires the generation of complex and highly structured inputs, which must typically satisfy some formal grammar. In our previous work, we showed that genetic programming combined with probabilities learned from corpora gives significantly better results than the baseline (random) strategy. In this work, we extend that approach by introducing grammar annotations as an alternative to learned probabilities, to be used when finding and preparing the corpus required for learning is not affordable. Experiments carried out on six grammar-based systems of varying levels of complexity show that grammar annotations produce a higher number of valid sentences and achieve levels of coverage and fault detection similar to those of learned probabilities.


Grammar-based testing; Genetic programming; Grammar annotations



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Fitsum Meshesha Kifetew (1)
  • Roberto Tiella (1)
  • Paolo Tonella (1)

  1. Fondazione Bruno Kessler–IRST, Trento, Italy
