Understanding semi-structured merge conflict characteristics in open-source Java projects

Empirical Software Engineering

Abstract

Empirical studies show that merge conflicts occur frequently and impair developers’ productivity, since merging conflicting contributions can be a demanding and tedious task. However, the structure of the changes that lead to conflicts has not yet been studied. Understanding the underlying structure of conflicts, and the syntactic language elements involved, might shed light on how to better avoid merge conflicts. To this end, in this paper we derive a catalog of conflict patterns expressed in terms of the structure of code changes that lead to merge conflicts. We focus on conflicts reported by a semistructured merge tool that exploits knowledge about the underlying syntax of the artifacts. This way, we avoid analyzing a large number of spurious conflicts often reported by typical line-based merge tools. To assess the occurrence of such patterns in different systems, we conduct an empirical study reproducing 70,047 merges from 123 GitHub Java projects. Our results show that most semistructured merge conflicts in our sample happen because developers independently edit the same or consecutive lines of the same method. However, the probability of creating a merge conflict is approximately the same when editing methods, class fields, and modifier lists. Furthermore, we noticed that most conflicting merge scenarios, and most merge conflicts, involve more than two developers. We also found that copying and pasting pieces of code, or even entire files, across different repositories is a common practice and a cause of conflicts. Finally, we discuss how our results reveal the need for new research studies and suggest potential improvements to tools supporting collaborative software development.

Notes

  1. https://goo.gl/9BXCmn

  2. http://www.opentripplanner.org/

  3. https://grizzly.java.net/

  4. http://www.gnu.org/software/rcs/

  5. https://github.com/search/advanced

  6. http://wiki.apache.org/cassandra/HowToContribute

  7. https://goo.gl/XQyygC

Acknowledgements

We would like to thank the FACEPE (grants APQ 0388-1.03/14 and IBPG 0716-1.03/12), CNPq (grant 309741/2013-0), and CAPES funding agencies for partially supporting this work. We also thank our Software Productivity Group colleagues and the anonymous reviewers, who greatly contributed to improving this work.

Author information

Corresponding author

Correspondence to Paola Accioly.

Additional information

Communicated by: Romain Robbes

About this article

Cite this article

Accioly, P., Borba, P. & Cavalcanti, G. Understanding semi-structured merge conflict characteristics in open-source Java projects. Empir Software Eng 23, 2051–2085 (2018). https://doi.org/10.1007/s10664-017-9586-1

  • DOI: https://doi.org/10.1007/s10664-017-9586-1
