Shorter identifier names take longer to comprehend

  • Johannes C. Hofmeister
  • Janet Siegmund
  • Daniel V. Holt

Abstract

Developers spend the majority of their time reading code, a process in which identifier names play a key role. Although many identifier naming styles exist, they often lack an empirical basis and it is not clear whether short or long identifier names facilitate comprehension. In this paper, we investigate the effect of different identifier naming styles (single letters, abbreviations, and words) on program comprehension. We conducted an experimental study with 72 professional C# developers who had to locate defects in source code snippets. We used a within-subjects design, such that each developer worked with all three versions of identifier naming styles, and we measured the time it took them to find a defect. We found that word identifiers led to a 19% increase in speed to find defects compared to meaningless single letters and abbreviations, but we did not find a difference between letters and abbreviations. The results of our study suggest that code is more difficult to comprehend when it contains only letters and abbreviations as identifier names. Words as identifier names facilitate program comprehension and may help to save costs and improve software quality.
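To make the three naming styles concrete, the following is a minimal sketch of one small routine written three times, once per style. It is a hypothetical illustration only: the study used C# snippets, and neither the function nor any of its names are taken from the study's materials.

```python
# Hypothetical illustration of the three identifier styles compared in the
# study. The routine and all names are assumptions, not the study's snippets.

def s(a):
    # Single-letter style: names carry no meaning.
    t = 0
    for v in a:
        t += v
    return t / len(a)


def calc_avg(vals):
    # Abbreviation style: names are shortened words.
    tot = 0
    for val in vals:
        tot += val
    return tot / len(vals)


def calculate_average(values):
    # Word style: names are full words.
    total = 0
    for value in values:
        total += value
    return total / len(values)
```

All three functions behave identically; only the identifiers differ, which is the variable the experiment manipulated.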

Keywords

Identifier names, Program comprehension, Professional C# developers, Psychology, Defect detection, Software quality

Notes

Acknowledgements

This work has been supported by the DFG grant SI 2045/2-1. Janet Siegmund’s work is further funded by the Bavarian State Ministry of Education, Science and the Arts in the framework of the Centre Digitisation.Bavaria (ZD.B).

Compliance with Ethical Standards

This study was performed in accordance with the ethical standards of the Department of Psychology, Heidelberg University, Germany.

Conflict of interests

The authors declare that they have no conflict of interest.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Johannes C. Hofmeister (1)
  • Janet Siegmund (1)
  • Daniel V. Holt (2)

  1. University of Passau, Passau, Germany
  2. Heidelberg University, Heidelberg, Germany
