Associating working memory capacity and code change ordering with code review performance

  • Tobias Baum
  • Kurt Schneider
  • Alberto Bacchelli

Abstract

Change-based code review is a software quality assurance technique that is widely used in practice. Better understanding what influences code review performance, and finding ways to improve it, can therefore have a large impact. In this study, we examine the association of working memory capacity and cognitive load with code review performance, and we test the predictions of a recent theory that certain orderings of code change parts improve code review efficiency. We performed a confirmatory experiment with 50 participants, mostly professional software developers. The participants reviewed one small and two larger code changes from an open source software system into which we had seeded additional defects. We measured their efficiency and effectiveness in defect detection, their working memory capacity, and several potential confounding factors. We find a moderate association between working memory capacity and the effectiveness of finding delocalized defects, influenced by other factors, whereas the association with other defect types is almost non-existent. We also confirm that review effectiveness is significantly higher for small code changes. We cannot reliably conclude whether the order in which code change parts are presented influences review efficiency. Public preprint [https://doi.org/10.5281/zenodo.2001923]; data and materials [https://doi.org/10.6084/m9.figshare.5808609].
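
To make the two performance measures concrete, the sketch below computes review effectiveness (the share of seeded defects a reviewer found) and efficiency (defects found per hour of review time) and checks their raw association with a working memory span score. This is a minimal illustration in Python, not the paper's actual analysis pipeline: the record layout, variable names, and example numbers are hypothetical, and the study accounts for confounding factors rather than relying on a plain rank correlation.

    # Hypothetical records, one per (participant, review): seeded defects found,
    # defects seeded, review duration in minutes, and working memory span score.
    from scipy.stats import spearmanr

    reviews = [
        {"found": 3, "seeded": 9, "minutes": 35, "wm_span": 42},
        {"found": 5, "seeded": 9, "minutes": 50, "wm_span": 55},
        {"found": 2, "seeded": 9, "minutes": 28, "wm_span": 31},
    ]

    # Effectiveness: share of the seeded defects that the reviewer detected.
    effectiveness = [r["found"] / r["seeded"] for r in reviews]

    # Efficiency: defects detected per hour of review time.
    efficiency = [r["found"] / (r["minutes"] / 60) for r in reviews]

    # Raw associations between working memory span and both measures; purely
    # illustrative, since this ignores the confounders the study controls for.
    wm_scores = [r["wm_span"] for r in reviews]
    rho_eff, p_eff = spearmanr(wm_scores, effectiveness)
    rho_spd, p_spd = spearmanr(wm_scores, efficiency)
    print(f"WM vs. effectiveness: rho = {rho_eff:.2f}, p = {p_eff:.3f}")
    print(f"WM vs. efficiency:    rho = {rho_spd:.2f}, p = {p_spd:.3f}")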

Keywords

Change-based code review · Working memory · Individual differences · Code ordering · Cognitive support · Cognitive load

Acknowledgements

We thank all participants and all pre-testers for the time and effort they donated. We furthermore thank Sylvie Gasnier and Günter Faber for advice on the statistical procedures and Javad Ghofrani for help with double-checking the defect coding. We thank Bettina von Helversen from the psychology department at the University of Zurich for advice on the parts related to the theory of cognitive load. Bacchelli gratefully acknowledges the support of the Swiss National Science Foundation through the SNF Project No. PP00P2_170529.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Fachgebiet Software Engineering, Leibniz University Hannover, Hannover, Germany
  2. ZEST, University of Zurich, Zurich, Switzerland
