Abstract
Developers spend the majority of their time reading code, a process in which identifier names play a key role. Although many identifier naming styles exist, they often lack an empirical basis and it is not clear whether short or long identifier names facilitate comprehension. In this paper, we investigate the effect of different identifier naming styles (single letters, abbreviations, and words) on program comprehension. We conducted an experimental study with 72 professional C# developers who had to locate defects in source code snippets. We used a within-subjects design, such that each developer worked with all three versions of identifier naming styles, and we measured the time it took them to find a defect. We found that word identifiers led to a 19% increase in speed to find defects compared to meaningless single letters and abbreviations, but we did not find a difference between letters and abbreviations. The results of our study suggest that code is more difficult to comprehend when it contains only letters and abbreviations as identifier names. Words as identifier names facilitate program comprehension and may help to save costs and improve software quality.
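To make the three naming styles concrete, the sketch below writes the same small routine three times: once with single letters, once with abbreviations, and once with full words. It is a hypothetical illustration only. The function and variable names are invented here, the snippets are not taken from the study's materials, and Python is used for brevity although the participants worked with C#.

```python
# Hypothetical illustration; NOT one of the code snippets used in the study.

def f(a):
    # Style 1: single-letter identifiers
    s = 0
    for x in a:
        s += x
    return s / len(a)


def calc_avg(vals):
    # Style 2: abbreviated identifiers
    tot = 0
    for val in vals:
        tot += val
    return tot / len(vals)


def calculate_average(values):
    # Style 3: full-word identifiers
    total = 0
    for value in values:
        total += value
    return total / len(values)
```

All three functions behave identically; only the identifier names differ, which is the variable the experiment manipulated.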
Notes
The IQR is defined as Q3 − Q1, where the fastest 25% of response times lie below Q1 (first quartile) and the slowest 25% lie above Q3 (third quartile).
The t-values of these two tests are by chance identical when rounded to two decimal places. The standardized effect sizes differ due to the correction for correlated observations.
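As a minimal sketch of the IQR definition in the first note, the following function computes Q3 − Q1 for a list of response times. The function name and the sample values are hypothetical, and the quartile method ("inclusive") is an assumption; the notes do not state which quantile method was used, and different methods can give slightly different quartiles.

```python
import statistics


def interquartile_range(response_times):
    """Return Q3 - Q1 for a sequence of response times.

    Assumption: quartiles use the 'inclusive' method of
    statistics.quantiles; the original notes do not specify a method.
    """
    q1, _median, q3 = statistics.quantiles(
        response_times, n=4, method="inclusive"
    )
    return q3 - q1


# Hypothetical response times in seconds (not data from the study).
sample_times = [10, 20, 30, 40, 50]
print(interquartile_range(sample_times))
```

Because the IQR depends only on the middle half of the data, it is less affected by a few extreme response times than the full range would be.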
Acknowledgements
This work has been supported by the DFG grant SI 2045/2-1. Janet Siegmund’s work is further funded by the Bavarian State Ministry of Education, Science and the Arts in the framework of the Centre Digitisation.Bavaria (ZD.B).
Ethics declarations
This study was performed in accordance with the ethical standards of the Department of Psychology, Heidelberg University, Germany.
Conflict of interests
The authors declare that they have no conflict of interest.
Cite this article
Hofmeister, J.C., Siegmund, J. & Holt, D.V. Shorter identifier names take longer to comprehend. Empir Software Eng 24, 417–443 (2019). https://doi.org/10.1007/s10664-018-9621-x