On the Danger of Coverage Directed Test Case Generation

  • Matt Staats
  • Gregory Gay
  • Michael Whalen
  • Mats Heimdahl
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7212)


In the avionics domain, the use of structural coverage criteria is legally required in determining test suite adequacy. With the success of automated test generation tools, it is tempting to use these criteria as the basis for test generation. To more firmly establish the effectiveness of such approaches, we have generated and evaluated test suites to satisfy two coverage criteria using counterexample-based test generation and a random generation approach, contrasted against purely random test suites of equal size.
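The counterexample-based approach works by encoding each coverage obligation as a "trap property" claiming the obligation can never be satisfied; the model checker refutes the claim with a counterexample trace, which doubles as a test case reaching the obligation. A minimal sketch of the idea, with a hand-written transition table and breadth-first search standing in for an actual model checker (all names here are illustrative, not the paper's tooling):

```python
from collections import deque

# Toy synchronous system: a three-mode controller driven by 'up'/'down'.
TRANSITIONS = {
    ("off", "up"): "standby",
    ("standby", "up"): "active",
    ("standby", "down"): "off",
    ("active", "down"): "standby",
}

def step(state, inp):
    # Inputs with no listed transition leave the mode unchanged.
    return TRANSITIONS.get((state, inp), state)

def generate_test(goal, init="off", inputs=("up", "down"), bound=8):
    """Stand-in for counterexample-based generation: the trap property
    'mode `goal` is never reached' is refuted by finding a path to it;
    that path (the counterexample trace) is the generated test case."""
    queue = deque([(init, [])])
    seen = {init}
    while queue:
        state, trace = queue.popleft()
        if state == goal:
            return trace          # input sequence driving the system to `goal`
        if len(trace) < bound:
            for inp in inputs:
                nxt = step(state, inp)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, trace + [inp]))
    return None                   # obligation unreachable within the bound

# One generated test per state-coverage obligation:
suite = {goal: generate_test(goal) for goal in ("off", "standby", "active")}
```

A suite built this way satisfies the coverage criterion by construction, which is precisely why its fault-finding ability, rather than its coverage score, is the quantity the study measures.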

Our results yield two key conclusions. First, coverage criteria satisfaction alone is a poor indication of test suite effectiveness. Second, the use of structural coverage as a supplement, not a target, for test generation can have a positive impact. These observations point to the dangers inherent in the increasing automation of testing for critical systems and to the need for more research into how coverage criteria, generation approach, and system structure jointly influence test effectiveness.
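The first conclusion can be made concrete with a small hypothetical example (not drawn from the paper's case studies): a two-test suite can exercise both branches of a faulty decision and still miss a boundary fault, while a test the coverage criterion never demanded reveals it.

```python
def altitude_alarm(alt, threshold):
    """Faulty implementation: the alarm should fire only when
    alt < threshold, but the boundary case is handled wrongly."""
    if alt <= threshold:          # bug: should be alt < threshold
        return "ALARM"
    return "OK"

def intended(alt, threshold):
    # Oracle encoding the intended behavior.
    return "ALARM" if alt < threshold else "OK"

# This suite takes both branches of the decision (full branch coverage),
# yet every verdict agrees with the oracle, so the fault goes undetected.
branch_suite = [(50, 100), (150, 100)]
undetected = all(altitude_alarm(a, t) == intended(a, t) for a, t in branch_suite)

# A test on the boundary, which branch coverage never demanded, finds it.
reveals_fault = altitude_alarm(100, 100) != intended(100, 100)
```

Full branch coverage is achieved here with zero fault detection, which is the sense in which criterion satisfaction alone is a weak adequacy signal.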





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Matt Staats (1)
  • Gregory Gay (2)
  • Michael Whalen (2)
  • Mats Heimdahl (2)

  1. Korea Advanced Institute of Science & Technology, Daejeon, Republic of Korea
  2. University of Minnesota, Minneapolis, USA
