Empirical Software Engineering, Volume 22, Issue 5, pp 2457–2542

Empirical evaluation of the effects of experience on code quality and programmer productivity: an exploratory study

  • Oscar Dieste
  • Alejandrina M. Aranda
  • Fernando Uyaguari
  • Burak Turhan
  • Ayse Tosun
  • Davide Fucci
  • Markku Oivo
  • Natalia Juristo



There is a widespread belief in both SE and other branches of science that experience helps professionals to improve their performance. However, cases have been reported where experience not only does not have a positive influence but sometimes even degrades the performance of professionals.


Determine whether years of experience influence programmer performance.


We analysed 10 quasi-experiments conducted both in academia, with graduate and postgraduate students, and in industry, with professionals. The experimental task was to apply iterative test-last development (ITLD) to two experimental problems; we then measured external code quality and programmer productivity.


Programming experience gained in industry does not appear to have any effect whatsoever on quality and productivity. Overall programming experience gained in academia does tend to have a positive influence on programmer performance. These two findings may be related to the fact that, as opposed to deliberate practice, routine practice does not appear to lead to improved performance. Experience in the use of productivity tools, such as testing frameworks and IDEs, also has positive effects.


Years of experience are a poor predictor of programmer performance. Academic background and specialized knowledge of task-related aspects appear to be rather good predictors.


Experience, Industry, Academy, Programming, Iterative test-last development, External quality, Productivity, Performance



We would like to acknowledge Dr. Hakan Erdogmus, who contributed to the design of one of the tasks used in the study (BSK) and the corresponding test cases. We also wish to acknowledge Mr. Timo Raty for his participation in the creation of the code templates for C++ and for the training given in one of the quasi-experiments. Finally, we wish to acknowledge Mr. Adrian Santos for his support in the collection of the subjects’ data.



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Oscar Dieste (1), email author
  • Alejandrina M. Aranda (1)
  • Fernando Uyaguari (1)
  • Burak Turhan (2)
  • Ayse Tosun (3)
  • Davide Fucci (2)
  • Markku Oivo (2)
  • Natalia Juristo (1, 2)

  1. Escuela Técnica Superior de Ingenieros en Informática, Universidad Politécnica de Madrid, Boadilla del Monte, Spain
  2. Department of Information Processing Science, University of Oulu, Oulu, Finland
  3. Faculty of Computer & Informatics, Istanbul Technical University, Istanbul, Turkey
