Empirical Software Engineering, Volume 19, Issue 5, pp 1335–1382

An empirical study on the impact of static typing on software maintainability

  • Stefan Hanenberg
  • Sebastian Kleinschmager
  • Romain Robbes
  • Éric Tanter
  • Andreas Stefik


Static type systems play an essential role in contemporary programming languages. Despite their importance, whether static type systems affect human software development capabilities remains an open question. One frequently mentioned argument in favor of static type systems is that they improve the maintainability of software systems, an often-used claim for which there is little empirical evidence. This paper describes an experiment that tests whether static type systems improve the maintainability of software systems, in terms of understanding undocumented code, fixing type errors, and fixing semantic errors. The results provide rigorous empirical evidence that static types are indeed beneficial to these activities, except when fixing semantic errors. We further conduct an exploratory analysis of the data in order to understand possible reasons for the effect of type systems on the three kinds of tasks used in this experiment. From the exploratory analysis, we conclude that developers using a dynamic type system tend to look at different files more frequently when doing programming tasks, which is a potential reason for the observed differences in time.


Keywords: Type systems, Programming languages, Empirical studies, Software engineering



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Stefan Hanenberg (1)
  • Sebastian Kleinschmager (1)
  • Romain Robbes (2)
  • Éric Tanter (2)
  • Andreas Stefik (3)

  1. Department for Computer Science and BIS, University of Duisburg-Essen, Essen, Germany
  2. PLEIAD Laboratory, Computer Science Department (DCC), University of Chile, Santiago de Chile, Chile
  3. Department of Computer Science, University of Nevada, Las Vegas, USA
