A study of documentation for software architecture

Empirical Software Engineering

Abstract

Documentation is an important mechanism for disseminating software architecture knowledge. Software project teams can employ vastly different formats for documenting software architecture, from unstructured narratives to standardized documents. We explored to what extent the documentation format matters to newcomers who join a software project and attempt to understand its architecture. We conducted a controlled questionnaire-based study in which we asked 65 participants to answer software architecture understanding questions using one of two randomly assigned documentation formats: narrative essays or structured documents. We analyzed the factors associated with answer quality using a Bayesian ordered categorical regression and observed no significant association between the format of architecture documentation and performance on architecture understanding tasks. Instead, prior exposure to the source code of the system was the dominant factor associated with answer quality. We also observed that answers to questions that require applying and creating activities were statistically significantly associated with the use of the system’s source code to answer the question, whereas the document format and the level of familiarity with the system were not. Subjective sentiment about the two documentation formats was comparable: although more participants agreed that the structured document was easier to navigate and use for writing code, this relation was not statistically significant. We conclude that, in the limited experimental context studied, our results contradict the hypothesis that the format of architectural documentation matters. Instead, we surface two factors that appear to matter more for the effective use of software architecture documentation: prior familiarity with the source code, and the type of architectural information sought.
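To make the analysis method named in the abstract more concrete, the following is a minimal sketch of the likelihood behind an ordered categorical (cumulative logit) regression, the general family of models used for graded outcomes such as answer quality. It is not the authors' model or code: the function names, the cutpoints, and the predictor values are hypothetical, and a full Bayesian analysis would additionally place priors on the parameters and estimate them with a sampler.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(eta: float, cutpoints: list[float]) -> list[float]:
    """Return P(Y = k) for each ordered category k = 0..K-1.

    eta is the linear predictor (for example, an effect of documentation
    format plus an effect of prior code exposure); cutpoints holds the K-1
    strictly increasing thresholds separating the K categories.
    """
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    # Cumulative probabilities P(Y <= k); the top category closes at 1.0.
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        # Each category's probability is the difference of adjacent cumulatives.
        probs.append(cum - previous)
        previous = cum
    return probs


# Hypothetical values: four answer-quality grades, three cutpoints.
cutpoints = [-1.0, 0.5, 2.0]
baseline = ordered_logit_probs(0.0, cutpoints)
shifted = ordered_logit_probs(1.0, cutpoints)  # a positive predictor effect
```

In this sketch, a larger linear predictor moves probability mass toward the higher grades, which is how a factor such as prior exposure to the source code would show up as an association with answer quality.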

Data Availability Statement

Replication data for this paper, including the experimental artifacts and analysis scripts, are available at Zenodo, as part of this record: https://zenodo.org/record/7872729

Notes

  1. The last two questions exhibited limited completion rates because of time constraints. The table only includes the four questions we analyzed.

  2. The uneven split is a consequence of the fact that we could not determine in advance who would consent to participate in the study.

Acknowledgements

We thank Omar Elazhary for his feedback on the study and assistance with the exercise, and Jin Guo and Eirini Kalliamvakou for help with data collection. We are also grateful to Mathieu Nassif, Deeksha Arya, and the anonymous reviewers for their constructive feedback. This work was funded by NSERC.

Author information

Corresponding author

Correspondence to Neil A. Ernst.

Ethics declarations

Competing interests

The authors declare they have no financial or non-financial interests. The authors declare no conflict of interest with the suggested reviewers for this scientific article.

Ethical responsibilities

The authors of this paper acquired approval from the relevant institutional ethics committee to run the experiments and have received explicit consent from the participants for the usage and publication of anonymised results.

Additional information

Communicated by: Bara Buhnova.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ernst, N.A., Robillard, M.P. A study of documentation for software architecture. Empir Software Eng 28, 122 (2023). https://doi.org/10.1007/s10664-023-10347-2

  • DOI: https://doi.org/10.1007/s10664-023-10347-2
