Formal Verification of Developer Tests: A Research Agenda Inspired by Mutation Testing

  • Serge Demeyer
  • Ali Parsai
  • Sten Vercammen
  • Brent van Bladel
  • Mehrdad Abdi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12477)


With the current emphasis on DevOps, automated software tests become a necessary ingredient for continuously evolving, high-quality software systems. This implies that the test code takes a significant portion of the complete code base—test to code ratios ranging from 3:1 to 2:1 are quite common.

We argue that “testware” provides interesting opportunities for formal verification, especially because the system under test may serve as an oracle to focus the analysis. As an example, we describe five common problems (mainly from the subfield of mutation testing) and how formal verification may contribute to each. We deduce a research agenda as an open invitation for fellow researchers to investigate the peculiarities of formally verifying testware.


Keywords: Testware, Formal verification, Mutation testing



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Universiteit Antwerpen, Antwerp, Belgium
  2. Flanders Make vzw, Kortrijk, Belgium
