Unit Testing Tool Competitions – Lessons Learned

  • Sebastian Bauersfeld
  • Tanja E. J. Vos
  • Kiran Lakhotia
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8432)


This paper reports on the two rounds of the Java Unit Testing Tool Competition that ran in the context of the Search Based Software Testing (SBST) workshop at ICST 2013 and the first Future Internet Testing (FITTEST) workshop at ICTSS 2013. It describes the main objectives of the benchmark, the Java classes selected for both competitions, the data that was collected, the tools used for data collection, the protocol followed to execute the benchmark, and how the final benchmark scores for each participating tool were calculated. Finally, we discuss the challenges encountered during the events, what we learned, and how we plan to improve our framework for future competitions.
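The abstract notes that a final benchmark score was computed for each participating tool from the collected measurements. As a rough illustration only, and not the competition's actual formula, a score of this kind might weight coverage and mutation-kill results against the time a tool spends; all weights and metric names below are assumptions for the sketch:

```python
def benchmark_score(line_cov, branch_cov, mutation_score,
                    prep_time_s, exec_time_s,
                    w_line=1.0, w_branch=2.0, w_mut=4.0,
                    time_penalty_per_min=0.1):
    """Hypothetical score for one tool on one benchmark class.

    Coverage and mutation values are fractions in [0, 1]; times are in
    seconds. The weights are illustrative, not the competition's.
    """
    # Reward effectiveness: covered lines/branches and killed mutants.
    effectiveness = (w_line * line_cov
                     + w_branch * branch_cov
                     + w_mut * mutation_score)
    # Penalise total tool time (preparation plus test generation/execution).
    penalty = time_penalty_per_min * (prep_time_s + exec_time_s) / 60.0
    return effectiveness - penalty

# Example: 80% line and 60% branch coverage, 50% of mutants killed,
# three minutes of total tool time.
score = benchmark_score(0.8, 0.6, 0.5, 60, 120)
```

Summing such per-class scores over all benchmark classes would then yield a single ranking value per tool, which is the general shape of scoring the paper describes.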


Keywords: Benchmark · Mutation testing · Automated unit testing



This work is funded by the European Project FITTEST (Future Internet Testing), contract number ICT-257574. We would also like to thank Arthur Baars for his initial efforts in setting up the benchmark architecture.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sebastian Bauersfeld (1)
  • Tanja E. J. Vos (1)
  • Kiran Lakhotia (2)
  1. Centro de Métodos de Producción de Software (ProS), Universidad Politécnica de Valencia, Valencia, Spain
  2. University College London, London, UK
