Empirical Software Engineering, Volume 18, Issue 2, pp 219–276

The impact of identifier style on effort and comprehension

  • Dave Binkley
  • Marcia Davis
  • Dawn Lawrie
  • Jonathan I. Maletic
  • Christopher Morrell
  • Bonita Sharif

Abstract

A family of studies investigating the impact of program identifier style on human comprehension is presented. Two popular identifier styles are examined, namely camel case and underscore. The underlying hypothesis is that identifier style affects the speed and accuracy of comprehending source code. To investigate this hypothesis, five studies were designed and conducted. The first study, which investigates how well humans read identifiers in the two different styles, focuses on low-level readability issues. The remaining four studies build on the first to focus on the semantic implications of identifier style. The studies involve 150 participants with varied demographics from two different universities. A range of experimental methods is used in the studies, including timed testing, read aloud, and eye tracking. These methods produce a broad set of measurements, and appropriate statistical methods, such as regression models and Generalized Linear Mixed Models (GLMMs), are applied to analyze the results. Unexpectedly, the results demonstrate that the tasks of reading and comprehending source code are fundamentally different from those of reading and comprehending natural language. Furthermore, as the task becomes more similar to reading prose, the results become more similar to those of work on reading natural language text. For more “source focused” tasks, experienced software developers appear to be less affected by identifier style; however, beginners benefit from the use of camel casing with respect to accuracy and effort.

Keywords

Program comprehension; Text recognition; Coding standards; Identifier names; Memory; Identifier styles; Eye-tracking study; Code readability

Notes

Acknowledgements

Special thanks to all the participants, as this work would not have been possible without your time. Our thanks to David Robbins for assisting in the use of the Tobii eye tracker and to Matt Hearn for helping in the preparation and administration of the studies. Finally, thanks to our three reviewers for their thorough and well-considered reviews.


Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Dave Binkley (1)
  • Marcia Davis (3)
  • Dawn Lawrie (1)
  • Jonathan I. Maletic (4)
  • Christopher Morrell (2)
  • Bonita Sharif (5)

  1. Department of Computer Science, Loyola University Maryland, Baltimore, USA
  2. Department of Mathematics and Statistics, Loyola University Maryland, Baltimore, USA
  3. Center for Social Organization of Schools, Johns Hopkins University, Baltimore, USA
  4. Department of Computer Science, Kent State University, Kent, USA
  5. Department of Computer Science and Information Systems, Youngstown State University, Youngstown, USA
