Characterizing the transfer of program comprehension in onboarding: an information-push perspective

Yates, Rebecca; Power, Norah; Buckley, Jim

doi:10.1007/s10664-019-09741-6

Published: 26 July 2019

Empirical Software Engineering, Volume 25, pages 940–995 (2020)

A Correction to this article was published on 24 February 2021

This article has been updated

Abstract

Many software developers struggle to understand code written by others, leading to increased maintenance costs. Research on program comprehension to date has primarily focused on individual developers attempting to understand code. However, software developers also work together to share and transfer understanding of their codebases. This is common during the onboarding process, when a new developer has joined a project or a company. The work reported here uses a Grounded Theory approach to explore the different types of information passed from experts to newcomers during onboarding, and the perceived value of these types. The theory is grounded in field-study data collected during twelve in-situ onboarding sessions, across eight organizations, with a design based on two pilot studies that were carried out in advance. The field-study data was supplemented and validated with interviews and questionnaires. The resulting theory describes four views through which the experts represent their code to the newcomers, revealing several interesting aspects of expert-led program comprehension. In particular, it provides evidence that extends current thinking on the temporal aspect of code: experts discuss changes that have been made to the codebase, changes that are currently being made to it (including temporary fixes), and changes intended for it in the future. In addition, the findings emphasize a rationale-based view of the codebase, which makes explicit the system’s functional and non-functional requirements and their impact on the system’s design. The newcomers valued this information highly. Structural and Algorithmic views, which are already firmly established in the program comprehension literature, were also noted in these onboarding sessions.

Change history

24 February 2021
A Correction to this paper has been published: https://doi.org/10.1007/s10664-020-09923-7

Acknowledgments

This work is supported by Science Foundation Ireland grants 03/CE2/I303_1, 04/CE2/I303_1 and 10/CE/I1855 to Lero, the Irish Software Engineering Research Centre (www.lero.ie).

Author information

Authors and Affiliations

Lero, University of Limerick, Limerick, Ireland
Rebecca Yates, Norah Power & Jim Buckley

Corresponding author

Correspondence to Rebecca Yates.

Additional information

Communicated by: Emerson Murphy-Hill

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised due to a retrospective Open Access order.

Appendices

Appendix 1

Appendix 2

The following questions comprise the set of standard questions used in the follow-up interviews of newcomers. Like the background questionnaire, the standard questions evolved slightly in response to the direction of the analysis, and the version presented here is the final version. The interviews took a semi-structured format, so additional questions were introduced in response to the participant’s answers and any unusual features of each session.

1. Please give the overall purpose of the software that was being discussed in the session.
2. Give an overview of what you discussed in that session.
3. When did you first see the code?
4. When did you start modifying the code?
5. Please highlight some things you learned in the session that proved useful when you were modifying the code.
6. With hindsight, what extra session content would have been useful?
7. How else could the session have improved for you?
8. If you had to explain this code to another developer joining the project, what would you do?
9. [After explaining the concept of ‘the driver’] Do you think it is better for the expert or the newcomer to drive? Why?
10. How does the relative experience of the newcomer and expert affect the session? Why?
11. [At the end of the interview] Do you have any other comments about anything we’ve discussed?

Appendix 3

About this article

Cite this article

Yates, R., Power, N. & Buckley, J. Characterizing the transfer of program comprehension in onboarding: an information-push perspective. Empir Software Eng 25, 940–995 (2020). https://doi.org/10.1007/s10664-019-09741-6

Issue Date: January 2020