Unreproducible builds: time to fix, causes, and correlation with external ecosystem factors

Bajaj, Rahul; Fernandes, Eduardo; Adams, Bram; Hassan, Ahmed E.

doi:10.1007/s10664-023-10399-4

Published: 29 November 2023

Empirical Software Engineering, volume 29, article number 11 (2024)

Rahul Bajaj¹ (ORCID: orcid.org/0000-0003-3367-6907), Eduardo Fernandes¹, Bram Adams¹ & Ahmed E. Hassan¹


Abstract

Context

A build is reproducible if, given the same source code, build instructions, and build environment (i.e., installed build dependencies), compiling a software project repeatedly generates the same build artifacts. Reproducible builds are essential for identifying the tampering behind supply chain attacks, yet most research on reproducible builds treats build reproducibility as a project-specific issue. In contrast, modern software projects are part of a larger ecosystem and depend on dozens of other projects, which raises the question of to what extent a project’s build reproducibility is the responsibility of that project or is imposed on it from outside.
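The definition above can be checked mechanically by building twice and comparing the resulting artifacts byte for byte. The following is a minimal sketch of such a check, not the tooling used in the study; the function names and the directory layout (one output directory per build) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def artifact_digests(build_dir):
    """Map each file's path (relative to build_dir) to the SHA-256 of its contents."""
    root = Path(build_dir)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_artifacts(first_build, second_build):
    """Return the sorted paths that are missing from one build or differ in content."""
    first = artifact_digests(first_build)
    second = artifact_digests(second_build)
    return sorted(
        name
        for name in first.keys() | second.keys()
        if first.get(name) != second.get(name)
    )


def is_reproducible(first_build, second_build):
    """Two builds count as reproducible here only if every artifact is bit-identical."""
    return not differing_artifacts(first_build, second_build)
```

Calling `differing_artifacts("build-1", "build-2")` on two hypothetical output directories lists the files worth inspecting first; an empty list means the two builds produced identical artifacts.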

Objective

This empirical study analyzes reproducible and unreproducible builds in Linux distributions to systematically investigate the process of making builds reproducible in open-source distributions. Our study targets builds performed on 11,528 Arch Linux packages and 597,066 Debian packages.

Method

We compute the likelihood of unreproducible packages becoming reproducible (and vice versa) and identify the root causes behind unreproducible builds. Finally, we compute the correlation between the reproducibility status of packages and three ecosystem factors (i.e., factors outside the control of a given package).
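The abstract does not name the estimator behind the "likelihood of becoming reproducible". One common choice for this kind of time-to-event data is the Kaplan-Meier estimator, which is assumed in the sketch below. Each package contributes the number of days it was observed as unreproducible and a flag saying whether a fix was actually seen (otherwise the observation is censored). The function names and input shape are hypothetical.

```python
def kaplan_meier(durations, fixed):
    """Estimate the probability that a package is still unreproducible over time.

    durations: days each package was observed in the unreproducible state.
    fixed:     True if the package became reproducible at that time,
               False if observation ended first (censored).
    Returns a list of (day, probability still unreproducible) at each fix time.
    """
    at_risk = len(durations)
    still_unreproducible = 1.0
    curve = []
    for day in sorted(set(durations)):
        fixes = sum(1 for d, f in zip(durations, fixed) if d == day and f)
        leaving = sum(1 for d in durations if d == day)
        if fixes:
            still_unreproducible *= 1 - fixes / at_risk
            curve.append((day, still_unreproducible))
        # Fixed and censored packages both drop out of the risk set.
        at_risk -= leaving
    return curve


def median_time_to_fix(curve):
    """First day at which at most half of the packages remain unreproducible."""
    for day, probability in curve:
        if probability <= 0.5:
            return day
    return None  # the curve never drops to 50% within the observation window
```

Comparing the median returned for two distributions is one way to express a statement such as "a median of N days sooner", under the assumption that both curves reach the 50% mark.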

Results

Arch Linux packages become reproducible a median of 30 days sooner than Debian packages, while Debian packages remain reproducible for a median of 68 days longer once fixed. We identified a taxonomy of 16 root causes of unreproducible builds and found that the build reproducibility status of a package differs statistically significantly across hardware architectures (strong effect size). The status also differs between versions of a package for different distributions and depends on the build reproducibility of a package’s build dependencies, albeit with weaker effect sizes.
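To make "statistically significantly different (strong effect size)" concrete, the sketch below computes a chi-squared statistic and Cramér's V for a contingency table of reproducibility status against an ecosystem factor such as architecture. The abstract does not state which test or effect-size measure was used, so this pairing is an assumption, and any counts passed in are hypothetical.

```python
import math


def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table given as a list of rows.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def cramers_v(table):
    """Effect size in [0, 1]: 0 means no association, 1 means perfect association."""
    total = sum(sum(row) for row in table)
    smaller_dimension = min(len(table), len(table[0]))
    return math.sqrt(chi_squared(table) / (total * (smaller_dimension - 1)))
```

With rows for architectures and columns for reproducible and unreproducible package counts, a larger `cramers_v` value indicates a stronger association between the factor and reproducibility status. The threshold at which an effect counts as "strong" depends on the convention adopted and is not fixed by this sketch.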

Conclusions

The ecosystem a project belongs to plays an important role in the project’s build reproducibility. Since ecosystem factors are outside a developer’s control, future work on (fixing) unreproducible builds should consider these ecosystem influences.


Data Availability

The datasets generated and analyzed during the study are available from the corresponding author in a GitHub repository. https://github.com/SAILResearch/replication-21-rahul_bajaj-reproducible_builds-code


Author information

Authors and Affiliations

School of Computing, Queen’s University, Kingston, ON, Canada
Rahul Bajaj, Eduardo Fernandes, Bram Adams & Ahmed E. Hassan

Corresponding author

Correspondence to Rahul Bajaj.

Ethics declarations

Conflict of Interest

All authors declare that there is no conflict of interest.

Additional information

Communicated by: Philipp Leitner.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bajaj, R., Fernandes, E., Adams, B. et al. Unreproducible builds: time to fix, causes, and correlation with external ecosystem factors. Empir Software Eng 29, 11 (2024). https://doi.org/10.1007/s10664-023-10399-4

Accepted: 21 September 2023
Published: 29 November 2023
DOI: https://doi.org/10.1007/s10664-023-10399-4
