Abstract
Documentation is an important mechanism for disseminating software architecture knowledge. Software project teams can employ vastly different formats for documenting software architecture, from unstructured narratives to standardized documents. We explored to what extent the documentation format matters to newcomers who join a software project and attempt to understand its architecture. We conducted a controlled questionnaire-based study in which we asked 65 participants to answer software architecture understanding questions using one of two randomly assigned documentation formats: narrative essays or structured documents. We analyzed the factors associated with answer quality using a Bayesian ordered categorical regression and observed no significant association between the format of architecture documentation and performance on architecture understanding tasks. Instead, prior exposure to the source code of the system was the dominant factor associated with answer quality. We also observed that answers to questions requiring applying and creating activities were statistically significantly associated with the use of the system’s source code to answer the question, whereas neither the document format nor the level of familiarity with the system was. Subjective sentiment about the two documentation formats was comparable: although more participants agreed that the structured document was easier to navigate and to use for writing code, this relation was not statistically significant. We conclude that, in the limited experimental context studied, our results contradict the hypothesis that the format of architectural documentation matters. We surface two more important factors related to the effective use of software architecture documentation: prior familiarity with the source code, and the type of architectural information sought.
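To make the term "ordered categorical regression" concrete, the following minimal Python sketch shows how a cumulative-logit model turns a linear predictor into probabilities over ordered answer-quality categories. It is an illustration only, not the analysis performed in the study: the cutpoints, the coefficient values, and the function names are hypothetical, and no Bayesian inference is performed here.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def ordered_category_probs(eta: float, cutpoints: list[float]) -> list[float]:
    """Return P(Y = k) for each ordered category under a cumulative-logit model.

    eta is the linear predictor; cutpoints must be ascending and define
    len(cutpoints) + 1 ordered categories.
    """
    # Cumulative probabilities P(Y <= k); the last category always reaches 1.
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)
        previous = cum
    return probs


# Hypothetical values chosen for illustration only (not estimates from the study).
cutpoints = [-1.0, 0.5, 2.0]   # four ordered answer-quality categories
beta_format = 0.0              # assumed: no effect of documentation format
beta_familiarity = 1.2         # assumed: positive effect of code familiarity


def linear_predictor(structured_format: int, familiar: int) -> float:
    """Combine two binary predictors into a single linear predictor."""
    return beta_format * structured_format + beta_familiarity * familiar


probs_unfamiliar = ordered_category_probs(linear_predictor(1, 0), cutpoints)
probs_familiar = ordered_category_probs(linear_predictor(1, 1), cutpoints)
print(probs_unfamiliar)
print(probs_familiar)
```

With these assumed coefficients, familiarity shifts probability mass toward the higher-quality categories while the format indicator leaves the distribution unchanged. In a Bayesian analysis the coefficients and cutpoints would instead be given priors and estimated from the data.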
Data Availability Statement
Replication data for this paper, including the experimental artifacts and analysis scripts, are available at Zenodo, as part of this record: https://zenodo.org/record/7872729
Notes
The last two questions exhibited limited completion rates because of time constraints. The table only includes the four questions we analyzed.
The uneven split is a consequence of the fact that we could not determine in advance who would consent to participate in the study.
Acknowledgements
We thank Omar Elazhary for his feedback on the study and assistance with the exercise, and Jin Guo and Eirini Kalliamvakou for help with data collection. We are also grateful to Mathieu Nassif, Deeksha Arya, and the anonymous reviewers for their constructive feedback. This work was funded by NSERC.
Ethics declarations
Competing interests
The authors declare they have no financial or non-financial interests. The authors declare no conflict of interest with the suggested reviewers for this scientific article.
Ethical responsibilities
The authors of this paper acquired approval from the relevant institutional ethics committee to run the experiments and have received explicit consent from the participants for the usage and publication of anonymised results.
Additional information
Communicated by: Bara Buhnova.
About this article
Cite this article
Ernst, N.A., Robillard, M.P. A study of documentation for software architecture. Empir Software Eng 28, 122 (2023). https://doi.org/10.1007/s10664-023-10347-2
DOI: https://doi.org/10.1007/s10664-023-10347-2