
Assessing the exposure of software changes

The DiPiDi approach

Published in Empirical Software Engineering

Abstract

Changing a software application with many build-time configuration settings may introduce unexpected side effects. For example, a change intended to be specific to a platform (e.g., Windows) or product configuration (e.g., community editions) might impact other platforms or configurations. Moreover, a change intended to apply to a set of platforms or configurations may be unintentionally limited to a subset. Indeed, understanding the exposure of source code changes is an important risk mitigation step in change-based development approaches. In this paper, we present DiPiDi, a new approach to assess the exposure of source code changes under different build-time configuration settings by statically analyzing build specifications. To evaluate our approach, we produce a prototype implementation of DiPiDi for the CMake build system. We measure the effectiveness and efficiency of developers when performing five tasks in which they must identify the deliverable(s) and conditions under which a source code change will propagate. We assign participants to three groups: without explicit tool support, supported by existing impact analysis tools, and supported by DiPiDi. While our study does not have the statistical power to make generalized quantitative claims, we manually analyze the full distribution of our study’s results and show that DiPiDi provides a net benefit for its users. Through our experimental evaluation, we show that DiPiDi yields an average improvement of 36 percentage points in F1 score when identifying impacted deliverables and a reduction of 0.62 units of distance when ranking impacted patches. Furthermore, DiPiDi reduces average task time by 42% for our participants when compared to a competing impact analysis approach. DiPiDi’s improvements to both effectiveness and efficiency are especially prevalent in complex programs with many compile-time configurations.


Notes

  1. https://github.com/etlegacy/etlegacy

  2. This study has been reviewed and received ethics clearance through the University of Waterloo Research Ethics Committee (ORE# 43727).

  3. https://github.com/software-rebels/cmake-inspector

  4. https://www.philipzucker.com/z3-rise4fun/strategies.html

  5. https://frama-c.com/fc-plugins/frama-clang.html

  6. https://github.com/software-rebels/cmake-inspector/blob/master/UNDGraph.py

  7. https://github.com/software-rebels/dipidi-experiment-ui

  8. https://github.com/software-rebels/dipidi-participants-ui

  9. https://linux.die.net/man/1/kdecmake


Author information


Corresponding author

Correspondence to Mehran Meidani.

Additional information

Communicated by: David Lo, Tegawendé Bissyande

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Registered Reports

Appendices

Appendix A: Demographic Questions

  1. How much experience do you have in programming?

     • None

     • Less than a year

     • One to two years

     • Two to five years

     • Five years or more

  2. How familiar are you with CMake?

     • None

     • Tried it at least once

     • Used it in professional development

  3. How often do you currently program?

     • Never

     • Sometimes

     • More than once per week

Appendix B: Post Study Questionnaire

  1. If you used any other tool(s) (CLI/IDE), please name it here:

  2. If we provided a tool for you to use, how useful was it?

  3. How do you feel? (1=Very Tired, 2=Tired, 3=Neutral, 4=Energetic, 5=Very Energetic)

  4. How difficult were the tasks? (1=Easy, 2=Average, 3=Hard)

  5. How much experience did you have with the projects provided to you?

  6. Did you encounter any problems during the experiment?

  7. Any feedback about the experiment?

  8. Can we contact you for a follow-up interview?

Appendix C: Task A

You will be provided with the names of changed files and a set of build specifications. Your task is to list the impacted deliverables (targets). Deliverables are defined in CMake files (CMakeLists.txt or .cmake files) using the add_library or add_executable commands. You can find these files in the project repository. These commands take a target name and a list of files that impact the target. Some files may be excluded under different configurations. For example, a code file related to the ARM processor may not be included in the deliverable for Intel CPUs. Read more at https://cmake.org/cmake/help/latest/manual/cmake-buildsystem.7.html#binary-targets. The experiment UI provides text inputs for you to list those deliverables.
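For illustration, a deliverable with configuration-dependent sources might be declared as follows (the target and file names here are hypothetical, not taken from the study project):

```cmake
# Hypothetical example: a library target whose source list
# depends on the target processor.
set(SOURCES src/core.c src/loop.c)

if(CMAKE_SYSTEM_PROCESSOR MATCHES "arm")
  # This file reaches the deliverable only in ARM builds.
  list(APPEND SOURCES src/atomic_arm.c)
else()
  list(APPEND SOURCES src/atomic_x86.c)
endif()

# The deliverable (target) named "mylib" is impacted by every
# file in SOURCES under the active configuration.
add_library(mylib ${SOURCES})
```

Under this sketch, a change to src/atomic_arm.c would impact the mylib deliverable only under ARM configurations.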

Follow the steps below to prepare for the task. Once you have completed the steps, click Ready and the task will begin.

  1. Access the DiPiDi tool at ...

  2. Clone the repository from https://github.com/libuv/libuv

Given the following commit ID and the build-time configuration, please find the impacted targets (deliverables). There may be more input fields than necessary to complete the task.

  1. Change Commit ID: cdced3a3ad1b3e4287f92c9d434b543a9e509938

  2. Build Configuration: APPLE==False

Input1: ... Input2: ...

Appendix D: Task B - (Impacted Deliverables)

You will be shown three commits and a set of build specifications, which are the conditions passed to the build system and may change the build process. These conditions are defined using the option or if commands. Read more at https://cmake.org/cmake/help/latest/command/if.html. We ask you to rank the commits listed in the experiment UI by the number of impacted deliverables, from most to least impact (1=Most Impact, 3=Least Impact).
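As a hypothetical sketch (the option and target names below are invented for illustration), such build-time conditions typically look like this:

```cmake
# Hypothetical example: a user-visible build option that gates
# whether an extra deliverable is built at all.
option(BUILD_TESTS "Build the test executable" OFF)

add_library(core src/core.c)

if(BUILD_TESTS)
  # This deliverable exists only when BUILD_TESTS is ON, so
  # changes to test/main.c propagate to no target otherwise.
  add_executable(core_tests test/main.c)
  target_link_libraries(core_tests core)
endif()
```

In this sketch, a commit touching test/main.c impacts the core_tests deliverable only when BUILD_TESTS is ON, while a commit touching src/core.c impacts core under every configuration.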

Follow the steps below to prepare for the task. Once you have completed the steps, click Ready and the task will begin.

  1. Access the DiPiDi tool at ...

  2. Clone the repository from https://github.com/libuv/libuv

Given the following build-time configurations, please rank the commits based on the given criteria.

  1. Build Configurations: MAKE_SYSTEM_NAME==APPLE

  2. Criteria: Impacted Deliverables

  1. e89abc80ea43065a726ade191b810af53ec6158a: ...

  2. 953f901dd2330a9979838cd43ff04eacde71b25a: ...

  3. e43eb667b5e0cace1eef4b6f5898de83cde262c6: ...

Appendix E: Task B - (Impacted Variants)

You will be shown three commits and a set of build specifications, which are the conditions passed to the build system and may change the build process. These conditions are defined using the option or if commands. Read more at https://cmake.org/cmake/help/latest/command/if.html. We ask you to rank the commits listed in the experiment UI by the number of impacted application variants (e.g., the number of affected operating systems), from most to least impact (1=Most Impact, 3=Least Impact).

Follow the steps below to prepare for the task. Once you have completed the steps, click Ready and the task will begin.

  1. Access the DiPiDi tool at ...

  2. Clone the repository from https://github.com/libuv/libuv

Given the following build-time configurations, please rank the commits based on the given criteria.

  1. Build Configurations: LIBUV_BUILD_TESTS==False

  2. Criteria: Impacted Application Variants (Operating Systems)

  1. e89abc80ea43065a726ade191b810af53ec6158a: ...

  2. 953f901dd2330a9979838cd43ff04eacde71b25a: ...

  3. e43eb667b5e0cace1eef4b6f5898de83cde262c6: ...

Appendix F: Task C - (Identify Commits that Affect Deliverables)

You will be shown three commits and asked to identify the commits that affect a specified set of deliverables.

Follow the steps below to prepare for the task. Once you have completed the steps, click Ready and the task will begin.

  1. Access the DiPiDi tool at ...

  2. Clone the repository from https://github.com/libuv/libuv

Identify the commits which affect these deliverables: [‘uv’]

  1. e89abc80ea43065a726ade191b810af53ec6158a: ?

  2. 953f901dd2330a9979838cd43ff04eacde71b25a: ?

  3. e43eb667b5e0cace1eef4b6f5898de83cde262c6: ?

Appendix G: Task C - (Identify Commits that Affect a Variant)

You will be shown three commits and asked to identify the commits that affect a specific variant of the software.

Follow the steps below to prepare for the task. Once you have completed the steps, click Ready and the task will begin.

  1. Access the DiPiDi tool at ...

  2. Clone the repository from https://github.com/libuv/libuv

Identify the commits which affect this variant: BSD Operating System

  1. e89abc80ea43065a726ade191b810af53ec6158a: ?

  2. 953f901dd2330a9979838cd43ff04eacde71b25a: ?

  3. e43eb667b5e0cace1eef4b6f5898de83cde262c6: ?

Appendix H: Task C - (Configuration Setting)

You will be shown three commits and asked to identify the configuration settings under which the changes will affect at least one target. The build configurations may exclude or include a file in the build process for a specific target using conditional commands in the CMake files. Read more at https://cmake.org/cmake/help/latest/command/if.html on the CMake website.
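As a sketch (the file, target, and variable names are hypothetical), a conditional command can gate whether a changed file reaches any target:

```cmake
# Hypothetical example: a source file included only under
# a specific configuration setting.
add_library(net src/net_common.c)

if(WIN32)
  # src/net_win.c reaches the "net" target only on Windows,
  # so a change to it propagates only under the WIN32 setting.
  target_sources(net PRIVATE src/net_win.c)
endif()
```

Here, WIN32 would be the configuration setting under which a change to src/net_win.c propagates, whereas a change to src/net_common.c propagates under all settings ("ALL").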

Follow the steps below to prepare for the task. Once you have completed the steps, click Ready and the task will begin.

  1. Access the DiPiDi tool at ...

  2. Clone the repository from https://github.com/libuv/libuv

For each of the given commits, identify at least one configuration setting under which the change will propagate to at least one deliverable (target). If the change propagates irrespective of the configuration settings, enter the term “ALL”.

  1. e89abc80ea43065a726ade191b810af53ec6158a: ...

  2. 953f901dd2330a9979838cd43ff04eacde71b25a: ...

  3. e43eb667b5e0cace1eef4b6f5898de83cde262c6: ...

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Meidani, M., Lamothe, M. & McIntosh, S. Assessing the exposure of software changes. Empir Software Eng 28, 41 (2023). https://doi.org/10.1007/s10664-022-10270-y
