Shorter identifier names take longer to comprehend

Hofmeister, Johannes C.; Siegmund, Janet; Holt, Daniel V.

doi:10.1007/s10664-018-9621-x

Published: 26 April 2018

Empirical Software Engineering, Volume 24, pages 417–443 (2019)

Johannes C. Hofmeister¹,
Janet Siegmund¹ &
Daniel V. Holt²

Abstract

Developers spend the majority of their time reading code, a process in which identifier names play a key role. Although many identifier naming styles exist, they often lack an empirical basis and it is not clear whether short or long identifier names facilitate comprehension. In this paper, we investigate the effect of different identifier naming styles (single letters, abbreviations, and words) on program comprehension. We conducted an experimental study with 72 professional C# developers who had to locate defects in source code snippets. We used a within-subjects design, such that each developer worked with all three versions of identifier naming styles, and we measured the time it took them to find a defect. We found that word identifiers led to a 19% increase in speed to find defects compared to meaningless single letters and abbreviations, but we did not find a difference between letters and abbreviations. The results of our study suggest that code is more difficult to comprehend when it contains only letters and abbreviations as identifier names. Words as identifier names facilitate program comprehension and may help to save costs and improve software quality.
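To make the three naming styles concrete, here is a minimal hypothetical sketch. It is not taken from the study's materials, which used C# snippets; the function and variable names are invented for illustration. The same logic is written once with single letters, once with abbreviations, and once with full words.

```python
# Hypothetical illustration only: these functions are NOT from the study.
# All three compute the same result; only the identifier style differs.

def f(a, b):
    # Single-letter identifiers
    t = 0
    for x in a:
        if x > b:
            t += x
    return t


def sum_abv_thr(vals, thr):
    # Abbreviated identifiers
    tot = 0
    for val in vals:
        if val > thr:
            tot += val
    return tot


def sum_above_threshold(values, threshold):
    # Word identifiers
    total = 0
    for value in values:
        if value > threshold:
            total += value
    return total
```

Each function sums the elements that exceed a limit. Only the word version states that intent in its names.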

Notes

http://brains-on-code.org/
Miller (1994) originally argued for a capacity limit of about 7 ± 2 items, while newer research shows that core working memory capacity is more likely limited to 3 to 5 items (Cowan 2001).
The IQR is defined as Q3 − Q1, where the shortest 25% of response times lie below Q1 (first quartile) and the longest 25% lie above Q3 (third quartile).
The t-values of these two tests are by chance identical when rounded to two decimal places. The standardized effect sizes differ due to the correction for correlated observations.
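The IQR definition in the notes above can be sketched in a few lines of Python. This is an illustrative sketch only. The sample times are made up, and the "inclusive" quartile method is an assumption, since the notes do not say which quartile convention the authors used.

```python
import statistics


def interquartile_range(response_times):
    """Return Q3 - Q1 for a list of response times.

    Assumption: quartiles use the 'inclusive' method of
    statistics.quantiles; other conventions give slightly
    different values on small samples.
    """
    q1, _median, q3 = statistics.quantiles(
        response_times, n=4, method="inclusive"
    )
    return q3 - q1


# Made-up response times in seconds, not data from the study.
example_times = [10, 20, 30, 40, 50]
example_iqr = interquartile_range(example_times)
```

With these made-up values, Q1 is 20 and Q3 is 40, so the IQR is 20.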

References

Anquetil N, Lethbridge T (1998) Assessing the relevance of identifier names in a legacy software system. In: Conf. centre for advanced studies on collaborative research, CASCON ’98. IBM Press, Toronto, pp 1–10
Baddeley AD, Thomson N, Buchanan M (1975) Word length and the structure of short-term memory. J Verbal Learn Verbal Behav 14(6):575–589. https://doi.org/10.1016/S0022-5371(75)80045-4
Bakeman R (2005) Recommended effect size statistics for repeated measures designs. Behav Res Methods 37(3):379–384. https://doi.org/10.3758/BF03192707
Balota DA, Chumbley JI (1985) The locus of word-frequency effects in the pronunciation task: lexical access and/or production? J Mem Lang 24(1):89–106. https://doi.org/10.1016/0749-596X(85)90017-8
Binkley D, Davis M, Lawrie D, Morrell C (2009) To CamelCase or under_score. In: Proc. Int’l conf. program comprehension (ICPC), pp 158–167. https://doi.org/10.1109/ICPC.2009.5090039
Brooks R (1983) Towards a theory of the comprehension of computer programs. Int J Man-Mach Stud 18(6):543–554. https://doi.org/10.1016/S0020-7373(83)80031-5
Buse RPL, Weimer WR (2010) Learning a metric for code readability. IEEE Trans Softw Eng (TSE) 36(4):546–558. https://doi.org/10.1109/TSE.2009.70
Ceccato M, Di Penta M, Falcarin P, Ricca F, Torchiano M, Tonella P (2014) A family of experiments to assess the effectiveness and efficiency of source code obfuscation techniques. Empir Softw Eng 19:1040–1074
Cohen J (1988) Statistical power analysis for the behavioral sciences. Erlbaum, Hillsdale
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychol Rev 82(6):407–428. https://doi.org/10.1037/0033-295X.82.6.407
Coltheart M, Rastle K, Perry C, Langdon R, Ziegler J (2001) DRC: a dual route cascaded model of visual word recognition and reading aloud. Psychol Rev 108(1):204–256
Cowan N (2001) The magical number 4 in short-term memory: a reconsideration of mental storage capacity. Behav Brain Sci 24(1):87–185
Deissenboeck F, Pizka M (2006) Concise and consistent naming. Softw Qual Control 14(3):261–282. https://doi.org/10.1007/s11219-006-9219-1
Hofmeister J, Siegmund J, Holt DV (2017) Shorter identifier names take longer to comprehend. In: 2017 IEEE 24th International conference on software analysis, evolution and reengineering (SANER), pp 217–227. https://doi.org/10.1109/SANER.2017.7884623
Jansen AR, Blackwell AF, Marriott K (2003) A tool for tracking visual attention: the restricted focus viewer. Behav Res Methods Instrum Comput 35(1):57–69
Lawrie D, Morrell C, Feild H, Binkley D (2006) What’s in a name? A study of identifiers. In: Proc. Int’l conf. program comprehension (ICPC), pp 3–12. https://doi.org/10.1109/ICPC.2006.51
Lawrie D, Morrell C, Feild H, Binkley D (2007) Effective identifier names for comprehension and memory. Innov Syst Softw Eng 3(4):303–318. https://doi.org/10.1007/s11334-007-0031-2
Leonhart R (2009) Lehrbuch Statistik Einstieg und Vertiefung, 2nd edn. Hans Huber, Hogrefe AG, Bern
Miller GA (1994) The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol Rev 101(2):343–352
MSDN (2016) Class naming guidelines [online]. available: https://msdn.microsoft.com/en-us/library/4xhs4564(v=vs.71).aspx
Posnett D, Hindle A, Devanbu P (2011) A simpler model of software readability, ACM, New York
Ratcliff R (1993) Methods for dealing with reaction time outliers. Psychol Bull 114(3):510–532
Scalabrino S, Linares-Vásquez M, Poshyvanyk D, Oliveto R (2016) Improving code readability models with textual features. In: Proc. Int’l conf. program comprehension (ICPC), pp 1–10. https://doi.org/10.1109/ICPC.2016.7503707
Sharif B, Maletic JI (2010) An eye tracking study on camelcase and under_score identifier styles. In: Proc. Int’l conf. program comprehension (ICPC). IEEE Computer Society, Washington, DC, pp 196–205
Sneed H (1996) Object-oriented COBOL Recycling. In: Proceedings of the Third working conference on reverse engineering, 1996, pp 169–178. https://doi.org/10.1109/WCRE.1996.558901
Soloway E, Ehrlich K (1984) Empirical studies of programming knowledge. IEEE Trans Softw Eng SE 10(5):595–609. https://doi.org/10.1109/TSE.1984.5010283
Tichy WF (1998) Should computer scientists experiment more? In: IEEE Computer
Weekes BS (1997) Differential effects of number of letters on word and nonword naming latency. Q J Exper Psychol Sec A 50(2):439–456. https://doi.org/10.1080/713755710
Whelan R (2008) Effective analysis of reaction time data. Psychol Record 58(3):475

Acknowledgements

This work has been supported by the DFG grant SI 2045/2-1. Janet Siegmund’s work is further funded by the Bavarian State Ministry of Education, Science and the Arts in the framework of the Centre Digitisation.Bavaria (ZD.B).

Author information

Authors and Affiliations

University of Passau, Innstrasse 33, 94032, Passau, Germany
Johannes C. Hofmeister & Janet Siegmund
Heidelberg University, Hauptstrasse 47-51, 69117, Heidelberg, Germany
Daniel V. Holt

Corresponding author

Correspondence to Johannes C. Hofmeister.

Ethics declarations

This study was performed in accordance with the ethical standards of the Department of Psychology, Heidelberg University, Germany.

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

Communicated by: Andrian Marcus and Gabriele Bavota

This article extends a previous conference paper presented at the 24th International Conference on Software Analysis, Evolution and Reengineering (Hofmeister et al. 2017). See the end of Section 1 for details.

About this article

Cite this article

Hofmeister, J.C., Siegmund, J. & Holt, D.V. Shorter identifier names take longer to comprehend. Empir Software Eng 24, 417–443 (2019). https://doi.org/10.1007/s10664-018-9621-x

Issue Date: 15 February 2019
