Open Science in Software Engineering

Open Access


Open science describes the movement of making any research artifact available to the public and includes, but is not limited to, open access, open data, and open source. While open science is becoming generally accepted as a norm in other scientific disciplines, in software engineering we are still struggling to adapt open science to the particularities of our discipline, which makes progress in our scientific community cumbersome. In this chapter, we reflect upon the essentials of open science for software engineering, including what open science is, why we should engage in it, and how we should do it. We draw in particular on our experiences as conference chairs implementing open science initiatives and as researchers actively engaging in open science to critically discuss challenges and pitfalls and to address more advanced topics, such as how and under which conditions to share preprints, which infrastructure and licence model to choose, and how to do it within the limitations of different reviewing models, such as double-blind reviewing. Our hope is to help establish a common ground and to contribute to making open science a norm in software engineering as well.



We want to thank all the members of the empirical software engineering research community who are actively supporting the open science movement and its adoption in the software engineering community. To name just a few: Robert Feldt and Tom Zimmermann, editors in chief of the Empirical Software Engineering Journal, are committed to supporting the implementation of a new Reproducibility and Open Science initiative, the first one to implement an open data initiative following a holistic process including a badge system. The steering committee of the International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE) supported the implementation of an open science initiative from 2016 on. Markku Oivo, general chair of the International Symposium on Empirical Software Engineering and Measurement (ESEM) 2018, has actively supported the adoption of the CHASE open science initiative, with a focus on data sharing, for the major empirical software engineering conference so that we could pave the road for a long-term change in that community. Sebastian Uchitel, general chair of the International Conference on Software Engineering (ICSE) 2017, further supported an initiative to foster the sharing of preprints, and Natalia Juristo, general chair of ICSE 2021, actively supports the adoption of the broader ESEM open science initiative at our major general software engineering conference. Finally, we want to thank Per Runeson, Klaas-Jan Stol, and Breno de França for their elaborate comments on earlier versions of this manuscript.



Copyright information

© The Author(s) 2020

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Technical University of Munich, Munich, Germany
  2. Blekinge Institute of Technology, Karlskrona, Sweden
  3. fortiss GmbH, Munich, Germany
  4. University of Stuttgart, Stuttgart, Germany
  5. Ludwig-Maximilians-University Munich, Munich, Germany
