Empirical Software Engineering, Volume 22, Issue 6, pp 2763–2805

An industry experiment on the effects of test-driven development on external quality and productivity

  • Ayse Tosun
  • Oscar Dieste
  • Davide Fucci
  • Sira Vegas
  • Burak Turhan
  • Hakan Erdogmus
  • Adrian Santos
  • Markku Oivo
  • Kimmo Toro
  • Janne Jarvinen
  • Natalia Juristo


Existing empirical studies on test-driven development (TDD) report different conclusions about its effects on quality and productivity. Very few of those studies are experiments conducted with software professionals in industry. We aim to analyse the effects of TDD on the external quality of the work done and the productivity of developers in an industrial setting. We conducted an experiment with 24 professionals from three different sites of a software organization. We chose a repeated-measures design, and asked subjects to apply TDD and incremental test-last development (ITLD) in two simple tasks and a realistic application close to real-life complexity. To analyse our findings, we applied a repeated-measures general linear model procedure and a linear mixed effects procedure. We did not observe a statistically significant difference in the quality of the work done by subjects between the two treatments. We observed that the subjects are more productive when they apply TDD rather than ITLD on a simple task, but productivity drops significantly when TDD is applied to a complex brownfield task. Thus, task complexity significantly obscured the effect of TDD. Further evidence is necessary to conclude whether TDD is better or worse than ITLD in terms of external quality and productivity in an industrial setting. We found that experimental factors such as the selection of tasks could dominate the findings in TDD studies.


Keywords: Industry experiment; Test-driven development; External quality; Productivity



This research has been partly funded by the Spanish Ministry of Science and Innovation project TIN2011-23216, the Distinguished Professor Program of Tekes, and the Academy of Finland (Grant Decision No. 260871). We would like to thank Dr. Lucas Layman for his contributions in designing one of the tasks used in this experiment. We would also like to sincerely thank F-Secure Corporation and the software professionals who attended our training/experiment.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Ayse Tosun (1, corresponding author)
  • Oscar Dieste (2)
  • Davide Fucci (3)
  • Sira Vegas (2)
  • Burak Turhan (3)
  • Hakan Erdogmus (4)
  • Adrian Santos (3)
  • Markku Oivo (3)
  • Kimmo Toro (5)
  • Janne Jarvinen (5)
  • Natalia Juristo (2, 3)

  1. Faculty of Computer Engineering and Informatics, Istanbul Technical University, Istanbul, Turkey
  2. Escuela Tecnica Superior de Ingenieros Informaticos, UPM, Madrid, Spain
  3. Department of Information Processing Science, University of Oulu, Oulu, Finland
  4. Carnegie Mellon University, Moffett Field, USA
  5. F-Secure Corporation, Helsinki, Finland
