Abstract
Open science describes the movement of making any research artifact available to the public and includes, but is not limited to, open access, open data, and open source. While open science is becoming generally accepted as a norm in other scientific disciplines, in software engineering we are still struggling to adapt open science to the particularities of our discipline, which makes progress in our scientific community cumbersome. In this chapter, we reflect upon the essentials of open science for software engineering, including what open science is, why we should engage in it, and how we should do it. We draw in particular on our experience as conference chairs implementing open science initiatives and as researchers actively engaging in open science to critically discuss challenges and pitfalls and to address more advanced topics, such as how and under which conditions to share preprints, which infrastructure and licence model to use, and how to do so within the limitations of different reviewing models, such as double-blind reviewing. Our hope is to help establish a common ground and to contribute to making open science a norm in software engineering as well.
Acknowledgements
We want to thank all the members of the empirical software engineering research community who are actively supporting the open science movement and its adoption in the software engineering community. Just to name a few: Robert Feldt and Tom Zimmermann, editors in chief of the Empirical Software Engineering Journal, are committed to supporting the implementation of a new Reproducibility and Open Science initiative, the first one to implement an open data initiative following a holistic process including a badge system. The steering committee of the International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE) supported the implementation of an open science initiative from 2016 on. Markku Oivo, general chair of the International Symposium on Empirical Software Engineering and Measurement (ESEM) 2018, has actively supported the adoption of the CHASE open science initiative, with a focus on data sharing, for the major Empirical Software Engineering conference so that we could pave the road for a long-term change in that community. Sebastian Uchitel, general chair of the International Conference on Software Engineering (ICSE) 2017, further supported an initiative to foster sharing of preprints, and Natalia Juristo, general chair of ICSE 2021, actively supports the adoption of the broader ESEM open science initiative at our major general software engineering conference. Finally, we want to thank Per Runeson, Klaas-Jan Stol, and Breno de França for their elaborate comments on earlier versions of this manuscript.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2020 The Author(s)
About this chapter
Cite this chapter
Mendez, D., Graziotin, D., Wagner, S., Seibold, H. (2020). Open Science in Software Engineering. In: Felderer, M., Travassos, G. (eds) Contemporary Empirical Methods in Software Engineering. Springer, Cham. https://doi.org/10.1007/978-3-030-32489-6_17
DOI: https://doi.org/10.1007/978-3-030-32489-6_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32488-9
Online ISBN: 978-3-030-32489-6
eBook Packages: Computer Science, Computer Science (R0)