Using acceptance tests to predict merge conflict risk

Rocha, Thaís; Borba, Paulo

doi:10.1007/s10664-022-10266-8

Published: 09 January 2023

Volume 28, article number 27, (2023)

Empirical Software Engineering

Abstract

Resolving merge conflicts can be time-consuming and can introduce defects, compromising development productivity and system quality. Developers might reduce these adverse impacts by avoiding concurrent programming tasks that are likely to change the same files and cause merge conflicts. As manually predicting such risk is hard, we propose the TAITIr tool, which approximates the set of files changed by a task (the task interface) and reports conflict risk whenever two task interfaces intersect. TAITIr takes as input the acceptance tests related to the tasks to predict file changes, deriving test-based task interfaces. To assess TAITIr’s conflict risk predictions, we measure precision and recall for 6,360 task pairs from 19 Rails projects on GitHub. Our results confirm that an intersection between task interfaces is associated with a higher probability of merge conflict risk. A minimal intersection predicts conflict risk with 0.59 precision and 0.98 recall. We observe that the larger the intersection, the higher the number of files changed by both tasks. Developers might therefore use the intersection size between interfaces as a degree of conflict risk between tasks, and choose which task to work on accordingly. We also find that TAITIr’s predictions outperform predictions based on the files changed by similar past tasks. Our analysis derives several other results by varying our notion of an interface along two dimensions: the parts of the test code considered for computing interfaces, and the kinds of files abstracted by the interfaces.
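As a rough illustration of the idea summarized in the abstract, the sketch below flags a task pair as risky when the two task interfaces share at least one file, and then scores those predictions with precision and recall. It is not the tool's implementation: the function names, the threshold parameter, the sample interfaces, and the ground-truth set (pairs whose tasks actually changed a common file) are all assumptions made for the example.

```python
from itertools import combinations


def conflict_risk(interface_a, interface_b, min_intersection=1):
    """Return the shared files and whether they reach the risk threshold."""
    shared = set(interface_a) & set(interface_b)
    return shared, len(shared) >= min_intersection


def precision_recall(predicted, actual):
    """Compute precision and recall over sets of task pairs."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical test-based task interfaces: task id -> files its tests exercise.
interfaces = {
    "T1": {"app/models/user.rb", "app/controllers/users_controller.rb"},
    "T2": {"app/models/user.rb", "app/views/users/show.html.erb"},
    "T3": {"lib/exporter.rb"},
}

# Hypothetical ground truth: pairs whose tasks really changed a common file.
actual_risky = {frozenset({"T1", "T2"})}

predicted_risky = set()
for a, b in combinations(sorted(interfaces), 2):
    shared, risky = conflict_risk(interfaces[a], interfaces[b])
    if risky:
        predicted_risky.add(frozenset({a, b}))

precision, recall = precision_recall(predicted_risky, actual_risky)
print(precision, recall)
```

Raising `min_intersection` in this sketch corresponds to treating the intersection size as a degree of risk rather than a yes/no signal.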

Data Availability

The datasets generated and/or analyzed during the current study are available at https://thaisabr.github.io/conflict-risk-prediction-study-site/.

Notes

The paper focuses on textual merge conflicts caused by parallel changes in a file hunk. This way, in the rest of the text, references to merge conflicts mean textual merge conflicts.
https://github.com/thaisabr/TestInterfaceEvaluation
The range refers to the results related to the prediction of file changes of two samples (the smallest and the largest, respectively) when using TestI with no filtering strategy.
https://github.com/thaisabr/TAITIr
A programming task is an activity performed by a developer that results in code creation or editing, such as developing a new feature, fixing a bug, or refactoring. Since a repository is used to integrate code contributions, we extracted tasks from merge scenarios in this study. From a merge scenario, we determine a triple formed by the left and right commits to be merged and a base commit that is a common ancestor of left and right. This way, a task is the set of all reachable commits between the left/right commit and the base commit.
https://rubyonrails.org/
https://cucumber.io/
The project is part of our sample, but task T₁₇₅ is not because it does not satisfy the selection criteria explained in Section 4 related to TextI. In sum, there are no tasks older than T₁₇₅ in the sample from project allourideas/allourideas.org, resulting in an empty TextI. We present the example using fictitious developers. We found the conflict occurrence by merging the tasks, given that they were extracted from the same merge scenario, that is, the same base and merge commits.
As a matter of clarity, we slightly simplify the Cucumber test, and we omit some parts of it that are not relevant to our explanation.
Available in our online Appendix (Rocha and Borba 2019).
When searching for GitHub projects, we avoided projects created earlier than 2010. But we also selected projects referenced by the Cucumber site, which includes projects created before 2010. Also, the creation date refers to the date a repository was created on GitHub, which does not necessarily reflect the date of the first commit. The project might have been first created on another code hosting platform and moved to GitHub.
https://cucumber.io/docs/community/projects-using-cucumber
For simplicity, we assume a single most recent common ancestor. With so-called criss-cross merge situations in Git, there could be more than one.
Note that every project has at least the oldest task, for which there are no similar past tasks.
We first searched for merge commits performed until June 2019. Only after consolidating our task pair sample did we collect detailed information about the selected projects.
We restrict the set of changed files to Ruby and .html files (and common variations) in the app or lib folders.
As a matter of brevity, we omit variants of HTML files, such as .haml and .erb files.
For simplicity, we round the values, but they are not identical in the third decimal.
As a matter of brevity, we present detailed results in our online Appendix.
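Two of the notes above lend themselves to a small sketch: the definition of a task as the commits reachable from the left or right commit but not from the base, and the restriction of changed files to Ruby and HTML-like files in the app or lib folders. The code below is only an illustration under stated assumptions: the commit graph is a hypothetical in-memory dictionary rather than a real Git repository, the extension list is a guess at the "common variations", and treating app and lib as top-level folders is also an assumption.

```python
def ancestors(commit, parents):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def task_commits(tip, base, parents):
    """Commits reachable from the left or right tip but not from the base."""
    return ancestors(tip, parents) - ancestors(base, parents)


# Assumed extensions for "Ruby and .html files (and common variations)".
SCOPED_EXTENSIONS = (".rb", ".html", ".erb", ".haml")


def in_scope(path):
    """Keep only Ruby/HTML-like files whose top-level folder is app or lib."""
    top_folder = path.split("/", 1)[0]
    return top_folder in ("app", "lib") and path.endswith(SCOPED_EXTENSIONS)


# Hypothetical merge scenario: A <- B (base), B <- L1 <- L2 (left), B <- R1 (right).
parents = {"A": [], "B": ["A"], "L1": ["B"], "L2": ["L1"], "R1": ["B"]}
left_task = task_commits("L2", "B", parents)
right_task = task_commits("R1", "B", parents)
print(sorted(left_task), sorted(right_task))

# Hypothetical changed files for one task, before and after filtering.
changed = ["app/models/user.rb", "app/views/users/show.html.erb",
           "spec/models/user_spec.rb", "lib/tasks/logo.png"]
scoped = [f for f in changed if in_scope(f)]
print(scoped)
```

As in the notes, the sketch assumes a single most recent common ancestor; criss-cross merges with several candidate bases are not handled.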

References

Accioly P, Borba P, Cavalcanti G (2017) Understanding semi-structured merge conflict characteristics in open-source java projects. Empirical Software Engineering https://doi.org/10.1007/s10664-017-9586-1
Adams B, McIntosh S (2016) Modern release engineering in a nutshell – why researchers should care. In: 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), vol 5, pp 78–90
Ahmed I, Brindescu C, Mannan UA, Jensen C, Sarma A (2017) An empirical examination of the relationship between code smells and merge conflicts. In: 2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pp 58–67
Apel S, Liebig J, Brandl B, Lengauer C, Kästner C (2011) Semistructured merge: Rethinking merge in revision control systems. In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ESEC/FSE ’11, https://doi.org/10.1145/2025113.2025141. ACM, New York, pp 190–200
Apel S, Leßenich O, Lengauer C (2012) Structured merge with auto-tuning: Balancing precision and performance. In: Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering. ASE 2012, https://doi.org/10.1145/2351676.2351694. ACM, New York, pp 120–129
Bacchelli A, Bird C (2013) Expectations, outcomes, and challenges of modern code review. In: Proceedings of the 2013 International Conference on Software Engineering, IEEE Press, pp 712–721
Bailey M, Lin KI, Sherrell L (2012) Clustering source code files to predict change propagation during software maintenance. In: Proceedings of the 50th Annual Southeast Regional Conference. ACM-SE ’12, https://doi.org/10.1145/2184512.2184538. ACM, New York, pp 106–111
Bass L, Weber I, Zhu L (2016) DevOps: A Software Architect’s Perspective. Addison-Wesley Professional
Berry DM (2017) Evaluation of tools for hairy requirements engineering and software engineering tasks. Tech. rep., University of Waterloo, https://cs.uwaterloo.ca/~dberry/FTP_SITE/tech.reports/EvalPaper.pdf. Accessed: Jan 2021
Biehl JT, Czerwinski M, Smith G, Robertson GG (2007) Fastdash: A visual dashboard for fostering awareness in software teams. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, New York, pp 1313–1322. CHI ’07, https://doi.org/10.1145/1240624.1240823
Borges H, Hora A, Valente MT (2016) Understanding the factors that impact the popularity of github repositories. In: 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp 334–344, DOI https://doi.org/10.1109/ICSME.2016.31
Borici A, Blincoe K, Schröter A, Valetto G, Damian D (2012) Proxiscientia: Toward real-time visualization of task and developer dependencies in collaborating software development teams. In: 2012 5th International Workshop on Co-operative and Human Aspects of Software Engineering (CHASE), pp 5–11, DOI https://doi.org/10.1109/CHASE.2012.6223024
Brun Y, Holmes R, Ernst MD, Notkin D (2013) Early detection of collaboration conflicts and risks. IEEE Trans Softw Eng 39(10):1358–1375. https://doi.org/10.1109/TSE.2013.28
Cavalcanti G, Borba P, Accioly P (2017) Evaluating and improving semistructured merge. Proc ACM Program Lang 1(OOPSLA):59:1–59:27. https://doi.org/10.1145/3133883
Cubranic D, Murphy GC, Singer J, Booth KS (2005) Hipikat: A project memory for software development. IEEE Trans Softw Eng 31(6):446–465. https://doi.org/10.1109/TSE.2005.71
Denninger O (2012) Recommending relevant code artifacts for change requests using multiple predictors. In: Proceedings of the Third International Workshop on Recommendation Systems for Software Engineering. RSSE ’12. IEEE Press, Piscataway, pp 78–79, http://dl.acm.org/citation.cfm?id=2666719.2666737
Dewan P, Hegde R (2007) Semi-synchronous conflict detection and resolution in asynchronous software development. In: ECSCW 2007, Springer, pp 159–178
Dias K, Borba P, Barreto M (2020) Understanding predictive factors for merge conflicts. Inf Softw Technol 121:106256
Dias M, Bacchelli A, Gousios G, Cassou D, Ducasse S (2015) Untangling fine-grained code changes. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), IEEE, pp 341–350
Fowler M (2010) Feature toggle. https://martinfowler.com/bliki/FeatureToggle.html. Accessed: Jan 2021
Fowler M (2020) Feature Branch. https://martinfowler.com/bliki/FeatureBranch.html. Accessed: Jan 2021
Giger E, Pinzger M, Gall HC (2012) Can we predict types of code changes? an empirical analysis. In: Mining Software Repositories (MSR), 2012 9th IEEE Working Conference on, pp 217–226
Grinter RE (1997) Supporting articulation work using software configuration management systems. Comput Supported Coop Work 5(4):447–465
Guimarães ML, Silva AR (2012) Improving early detection of software merge conflicts. In: Proceedings of the 34th International Conference on Software Engineering. IEEE Press, Piscataway, pp 342–352. ICSE ’12, http://dl.acm.org/citation.cfm?id=2337223.2337264
Henderson F (2017) Software engineering at Google. https://arxiv.org/abs/1702.01715, Accessed: Jan 2021
Hodgson P (2017a) Feature branching vs. feature flags: What’s the right tool for the job? Tech. rep., DevOps Blog. https://devops.com/feature-branching-vs-feature-flags-whats-right-tool-job/, Accessed: Jan 2021
Hodgson P (2017b) Feature toggles (aka Feature Flags). https://martinfowler.com/articles/feature-toggles.html. Accessed: Jan 2021
Kasi BK, Sarma A (2013) Cassandra: Proactive conflict minimization through optimized task scheduling. In: Proceedings of the 2013 International Conference on Software Engineering, IEEE Press, ICSE ’13, pp 732–741
Kersten M, Murphy GC (2006) Using task context to improve programmer productivity. In: Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Association for Computing Machinery, pp 1–11, DOI https://doi.org/10.1145/1181775.1181777
Leßenich O, Siegmund J, Apel S, Kästner C, Hunsen C (2018) Indicators for merge conflicts in the wild: survey and empirical study. Autom Softw Eng 25(2):279–313
Nagappan M, Zimmermann T, Bird C (2013) Diversity in software engineering research. In: Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, ACM, ESEC/FSE 2013, pp 466–476. http://doi.acm.org/10.1145/2491411.2491415
Potvin R, Levenberg J (2016) Why google stores billions of lines of code in a single repository. Commun ACM 59:78–87. http://dl.acm.org/citation.cfm?id=2854146
Rocha T, Borba P (2019) Online Appendix. https://thaisabr.github.io/conflict-risk-prediction-study-site/, Accessed: Jan 2021
Rocha T, Borba P, Santos JP (2019) Using acceptance tests to predict files changed by programming tasks. J Syst Softw 154:176–195
Salton G, McGill MJ (1986) Introduction to Modern Information Retrieval. McGraw-Hill, Inc., New York
Sarma A, Redmiles D, van der Hoek A (2012) Palantír: Early detection of development conflicts arising from parallel code changes. IEEE Trans Softw Eng 38(4):889–908
Smart J (2014) BDD in Action: Behavior-Driven Development for the Whole Software Lifecycle. Manning Publications Company, https://books.google.com.br/books?id=2BGxngEACAAJ
de Souza CRB, Redmiles D, Dourish P (2003) “breaking the code”, moving between private and public work in collaborative software development. In: Proceedings of the 2003 International ACM SIGGROUP Conference on Supporting Group Work, ACM, GROUP ’03, pp 105–114
Stray V, Sjøberg DI, Dybå T (2016) The daily stand-up meeting: A grounded theory study. J Syst Softw 114:101–124. https://doi.org/10.1016/j.jss.2016.01.004. https://www.sciencedirect.com/science/article/pii/S0164121216000066
Thompson C, Murphy G (2014) Recommending a starting point for a programming task: An initial investigation. 4th International Workshop on Recommendation Systems for Software Engineering, RSSE 2014 - Proceedings, https://doi.org/10.1145/2593822.2593824
Ying ATT, Murphy GC, Ng R, Chu-Carroll MC (2004) Predicting source code changes by mining change history. IEEE Trans Softw Eng 30(9):574–586
Zampetti F, Di Sorbo A, Visaggio CA, Canfora G, Di Penta M (2020) Demystifying the adoption of behavior-driven development in open source projects. Inf Softw Technol 123:106311. https://doi.org/10.1016/j.infsof.2020.106311. https://www.sciencedirect.com/science/article/pii/S095058492030063X
Zimmermann T, Weisgerber P, Diehl S, Zeller A (2004) Mining version histories to guide software changes. In: Proceedings of the 26th International Conference on Software Engineering. IEEE Computer Society, Washington, pp 563–572. ICSE ’04, http://dl.acm.org/citation.cfm?id=998675.999460

Acknowledgements

For partially supporting this work, we would like to thank INES (National Software Engineering Institute) and the Brazilian research funding agencies CNPq (grant 309741/2013-0), FACEPE (grants IBPG-0546-1.03/15 and APQ/0388-1.03/14), and CAPES.

Funding

Partial financial support was received from INES (National Software Engineering Institute) and the Brazilian research funding agencies CNPq (grant 309741/2013-0), FACEPE (grants IBPG-0546-1.03/15 and APQ/0388-1.03/14), and CAPES.

Author information

Authors and Affiliations

Federal University of Agreste of Pernambuco, Garanhuns, Brazil
Thaís Rocha
Informatics Center, Federal University of Pernambuco, Recife, Brazil
Paulo Borba

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Thaís Rocha. The first draft of the manuscript was written by Thaís Rocha and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Thaís Rocha.

Ethics declarations

Conflict of Interests

The authors have no competing interests to declare that are relevant to the content of this article. The authors have no relevant financial or non-financial interests to disclose. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript. The authors have no financial or proprietary interests in any material discussed in this article.

Additional information

Communicated by: Bram Adams

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rocha, T., Borba, P. Using acceptance tests to predict merge conflict risk. Empir Software Eng 28, 27 (2023). https://doi.org/10.1007/s10664-022-10266-8

Accepted: 30 November 2022
Published: 09 January 2023
DOI: https://doi.org/10.1007/s10664-022-10266-8
