Confounding parameters on program comprehension: a literature survey

Empirical Software Engineering

Abstract

Program comprehension is an important human factor in software engineering. To measure and evaluate program comprehension, researchers typically conduct experiments. However, designing experiments requires considerable effort, because confounding parameters need to be controlled for. Our aim is to support researchers in identifying relevant confounding parameters and selecting appropriate techniques to control their influence. To this end, we conducted a literature survey of 13 journals and conferences over a time span of 10 years. As a result, we created a catalog of 39 confounding parameters, including an overview of measurement and control techniques. With the catalog, we give experimenters a tool to design reliable and valid experiments.

Notes

  1. ICPC was a workshop until 2005.

  2. ESEM originated in 2007 from the merger of the International Symposium on Empirical Software Engineering (ISESE) and the International Software Metrics Symposium (METRICS).

  3. VLHCC was called Human-Centric Computing Languages and Environments until 2003.

  4. CHASE first took place in 2008.

  5. http://www.infosun.fim.uni-passau.de/spl/janet/confounding/index.php

  6. In Section 5, we discuss techniques and parameters in detail. Here, we give only an overview.

  7. Some argue that intelligence is learned rather than inborn, in which case we could also classify it as individual knowledge. However, since our classification only aims at providing a better overview, we do not enter this discussion.

  8. The magical number seven is the subject of controversial discussion (Baddeley 2001).

  9. The specific contents of courses depend on the country and specific university.

  10. For examples of all identified parameters for specific experiments, see the first author’s PhD thesis (Siegmund 2012).

References

  • Anderson M (2001) Permutation tests for univariate or multivariate analysis of variance and regression. Can J Fish Aquat Sci 58(3):626–639

  • Anderson T, Finn J (1996) The new statistical analysis of data. Springer, New York

  • Baddeley A (2001) Is working memory still working? Am Psychol 56(11):851–864

  • Beckwith L, Burnett M, Wiedenbeck S, Cook C, Sorte S, Hastings M (2005) Effectiveness of end-user debugging software features: are there gender issues? In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 869–878. ACM Press

  • Bergersen G, Gustafsson JE (2011) Programming skill, knowledge, and working memory among professional software developers from an investment theory perspective. J Individ Differ 32(4):201–209

  • Bettenburg N, Hassan A (2010) Studying the impact of social structures on software quality. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 124–133. IEEE CS

  • Biffl S, Halling M (2003) Investigating the defect detection effectiveness and cost benefit of nominal inspection teams. IEEE Trans Softw Eng 29(5):385–397

  • Binkley D, Lawrie D, Maex S, Morrell C (2008) Impact of limited memory resources. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 83–92. IEEE Computer Society

  • Boehm B (1981) Software engineering economics. Prentice Hall, Englewood Cliffs

  • Briand LC, Labiche Y, Di Penta M, Yan-Bondoc HD (2005) An experimental investigation of formality in UML-based development. IEEE Trans Softw Eng 31(10):833–849

  • Brooks R (1978) Using a behavioral theory of program comprehension in software engineering. In: Proc. Int’l Conf. Software Engineering (ICSE), pp. 196–201. IEEE CS

  • Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46

  • Cook T, Campbell D (1979) Quasi-experimentation: design & analysis issues for field settings. Houghton Mifflin, Boston

  • Corbett A, Anderson J (2001) Locus of feedback control in computer-based tutoring: Impact on learning rate, achievement and attitudes. In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 245–252. ACM Press

  • Druin A, Foss E, Hutchinson H, Golub E, Hatley L (2010) Children’s roles using keyword search interfaces at home. In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 413–422. ACM Press

  • Dybå T, Kampenes VB, Sjøberg D (2006) A systematic review of statistical power in software engineering experiments. Inf Softw Technol 48(8):745–755

  • Dzidek W, Arisholm E, Briand L (2008) A realistic empirical evaluation of the costs and benefits of UML in software maintenance. IEEE Trans Softw Eng 34(3):407–432

  • Ellis B, Stylos J, Myers B (2007) The factory pattern in API design: a usability evaluation. In: Proc. Int’l Conf. Software Engineering (ICSE), pp. 302–312. IEEE CS

  • Ericsson K, Simon H (1980) Verbal reports as data. Psychol Rev 87(2):215–251

  • Feigenspan J (2009) Empirical comparison of FOSD approaches regarding program comprehension—a feasibility study. Master’s thesis, University of Magdeburg

  • Feigenspan J, Siegmund N, Fruth J (2011) On the role of program comprehension in embedded systems. In: Proc. Workshop Software Reengineering (WSR), pp. 34–35. http://wwwiti.cs.uni-magdeburg.de/iti_db/publikationen/ps/auto/FeSiFr11

  • Feigenspan J, Kästner C, Liebig J, Apel S, Hanenberg S (2012) Measuring programming experience. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 73–82. IEEE CS

  • Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675–701

  • Fry Z, Weimer W (2010) A human study of fault localization accuracy. In: Proc. Int’l Conf. Software Maintenance (ICSM), pp. 1–10. IEEE CS

  • Goldstein B (2002) Sensation and perception, 5th edn. Cengage Learning Services

  • Gong L, Lai J (2001) Shall we mix synthetic speech and human speech? Impact on users’ performance, perception, and attitude. In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 158–165. ACM Press

  • Goodwin J (1999) Research in psychology: methods and design, 2nd edn. Wiley

  • Grigoreanu V, Cao J, Kulesza T, Bogart C, Rector K, Burnett M, Wiedenbeck S (2008) Can feature design reduce the gender gap in end-user software development environments? In: Proc. Symposium Visual Languages and Human-Centric Computing (VLHCC), pp. 149–156. IEEE CS

  • Güleşir G, Berg K, Bergmans L, Akşit M (2009) Experimental evaluation of a tool for the verification and transformation of source code in event-driven systems. Empir Softw Eng 14(6):720–777

  • Hu W, Lee H, Zhang Q, Liu T, Geng L, Seghier M, Shakeshaft C, Twomey T, Green D, Yang Y, Price C (2010) Developmental dyslexia in Chinese and English populations: dissociating the effect of dyslexia from language differences. Brain 133(6):1694–1706

  • Ishihara S (1972) Test for colour-blindness. Kanehara Shuppan Co., Tokyo

  • Jablonski P, Hou D (2010) Aiding software maintenance with copy-and-paste clone-awareness. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 170–179. IEEE CS

  • Jäger A, Süß HM, Beauducel A (1997) Berliner Intelligenzstruktur-Test. Hogrefe, Göttingen

  • Jedlitschka A, Ciolkowski M, Pfahl D (2008) Reporting experiments in software engineering. In: Guide to advanced empirical software engineering, pp. 201–228. Springer

  • Jensen E (1998) Teaching with the brain in mind. Atlantic Books, London

  • Juristo N, Moreno A (2001) Basics of software engineering experimentation. Kluwer, Boston

  • Kampenes V, Dybå T, Hannay J, Sjøberg D (2009) A systematic review of quasi-experiments in software engineering. Inf Softw Technol 51(1):71–82

  • Ko A, Uttl B (2003) Individual differences in program comprehension strategies in unfamiliar programming systems. In: Proc. Int’l Workshop Program Comprehension (IWPC), pp. 175–184. IEEE CS

  • Ko A, Myers B, Coblenz M, Aung H (2006) An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks. IEEE Trans Softw Eng 32(12):971–987

  • McConnell S (2011) What does 10× mean? Measuring variations in programmer productivity. In: Making Software, pp. 567–574. O’Reilly & Associates, Inc

  • McQuiggan SW, Rowe JP, Lester JC (2008) The effects of empathetic virtual characters on presence in narrative-centered learning environments. In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 1511–1520. ACM Press

  • Miller G (1956) The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol Rev 63(2):81–97

  • Mook D (1996) Motivation: the organization of action, 2nd edn. W.W. Norton & Co., New York

  • Neumann Jv (1945) First draft of a report on the EDVAC

  • Oberauer K, Süß HM, Schulze R, Wilhelm O, Wittmann W (2000) Working memory capacity—facets of a cognitive ability construct. Personal Individ Differ 29(6):1017–1045

  • Oezbek C, Prechelt L (2007) JTourBus: simplifying program understanding by documentation that provides tours through the source code. In: Proc. Int’l Conf. Software Maintenance (ICSM), pp. 64–73. IEEE CS

  • Pennington N (1987) Stimulus structures and mental representations in expert comprehension of computer programs. Cogn Psychol 19(3):295–341

  • Raven J (1936) Mental tests used in genetic studies: the performances of related individuals in tests mainly educative and mainly reproductive. Master’s thesis, University of London

  • Roethlisberger F (1939) Management and the worker. Harvard University Press, Cambridge

  • Rosenthal R, Jacobson L (1966) Teachers’ expectancies: determinants of pupils’ IQ gains. Psychol Rep 19(1):115–118

  • Sackman H, Erikson W, Grant E (1968) Exploratory experimental studies comparing online and offline programming performance. Commun ACM 11(1):3–11

  • Schlaug G (2001) The brain of musicians. A model for functional and structural adaptation. Ann N Y Acad Sci 930:281–299

  • Shadish W, Cook T, Campbell D (2002) Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin Company, Boston

  • Shaft T, Vessey I (1995) The relevance of application domain knowledge: the case of computer program comprehension. Inf Syst Res 6(3):286–299

  • Sharafi Z, Soh Z, Guéhéneuc YG, Antoniol G (2012) Women and men–different but equal: on the impact of identifier style on source code reading. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 27–36. IEEE CS

  • Sharif B, Maletic J (2009) An empirical study on the comprehension of stereotyped UML class diagram layouts. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 268–272. IEEE CS

  • Sharif B, Maletic J (2010) An eye tracking study on camel case and underscore identifier styles. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 196–205. IEEE CS

  • Shneiderman B, Mayer R (1979) Syntactic/semantic interactions in programmer behavior: a model and experimental results. Int J Parallel Prog 8(3):219–238

  • Siegmund J (2012) Framework for measuring program comprehension. Ph.D. thesis, School of Computer Science, University of Magdeburg

  • Sjøberg D, Hannay J, Hansen O, Kampenes VB, Karahasanovic A, Liborg NK, Rekdal A (2005) A survey of controlled experiments in software engineering. IEEE Trans Softw Eng 31(9):733–753

  • Soloway E, Ehrlich K (1984) Empirical studies of programming knowledge. IEEE Trans Softw Eng 10(5):595–609

  • Standish T (1984) An essay on software reuse. IEEE Trans Softw Eng 10(5):494–497

  • Tiarks R (2011) What programmers really do: an observational study. In: Proc. Workshop Software Reengineering (WSR), pp. 36–37

  • Torchiano M (2004) Empirical assessment of UML static object diagrams. In: Proc. Int’l Workshop Program Comprehension (IWPC), pp. 226–230. IEEE CS

  • Vitharana P, Ramamurthy K (2003) Computer-mediated group support, anonymity, and the software inspection process: an empirical investigation. IEEE Trans Softw Eng 29(2):167–180

  • von Mayrhauser A, Vans M (1995) Program comprehension during software maintenance and evolution. Computer 28(8):44–55

  • von Mayrhauser A, Vans M, Howe A (1997) Program understanding behaviour during enhancement of large-scale software. J Softw Maint Res Pract 9(5):299–327

  • Wechsler D (1950) The measurement of adult intelligence, 3rd edn. American Psychological Association, Washington, DC

  • Wohlin C, Runeson P, Höst M, Ohlsson M, Regnell B, Wesslén A (2000) Experimentation in software engineering: an introduction. Kluwer Academic Publishers, Boston

  • Wundt W (1874) Grundzüge der Physiologischen Psychologie. Engelmann, Leipzig

Acknowledgments

Thanks to Norbert Siegmund and Christian Kästner for helpful discussions. Thanks to all reviewers for their constructive feedback. Thanks to Raimund Dachselt for his encouragement to write this article. Thanks to Andreas Meister for his support in selecting relevant papers.

Author information

Corresponding author

Correspondence to Janet Siegmund.

Additional information

Communicated by: Andrian Marcus

Janet Siegmund published previous work as Janet Feigenspan.

Appendix

In Tables 10 and 11, we summarize how each parameter was measured in the literature.

Table 10 Measurement techniques of individual confounding parameters
Table 11 Measurement techniques of experimental confounding parameters

The checklist in Table 12 can help researchers to control the influence of confounding parameters. Researchers can document how they measured and controlled for a parameter.

Table 12 Checklist of confounding parameters

About this article

Cite this article

Siegmund, J., Schumann, J. Confounding parameters on program comprehension: a literature survey. Empir Software Eng 20, 1159–1192 (2015). https://doi.org/10.1007/s10664-014-9318-8

  • DOI: https://doi.org/10.1007/s10664-014-9318-8
