On the relation between Github communication activity and merge conflicts

Empirical Software Engineering

Abstract

Version control systems assist developers in managing concurrent changes to a common code base by tracking all code contributions over time. A notorious problem is that merge conflicts may occur when code contributions are integrated, and resolving them is a time-consuming and error-prone task. There is a popular belief that communication and collaboration success are mutually dependent, and hence that intensive communication activity helps to avoid merge conflicts. In practice, however, the role of communication activity in the occurrence or avoidance of merge conflicts has not been thoroughly investigated. To better understand this relation, we analyzed the history of 30 popular open-source projects involving 19 thousand merge scenarios. Methodologically, we used a bivariate analysis (Spearman’s rank correlation) and a multivariate analysis (principal component analysis and partial correlations) to quantify the correlation between GitHub communication activity and merge conflicts. In the bivariate analysis, we found a weak positive correlation between GitHub communication activity and the number of merge conflicts. In the multivariate analysis, however, the positive correlation disappeared, which does not support the intuition that GitHub communication helps to avoid merge conflicts. Interestingly, we found that the strength of this relationship depends on the merge scenarios’ characteristics, such as the number of lines of code changed. Puzzled by these unexpected results, we investigated each covariate, which provided justifications for our findings. The main conclusion from our study is that GitHub communication activity itself does not support the emergence or avoidance of merge conflicts, even when considering only communication associated with the merge scenario’s code changes or only communication among its developers.
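To make the two kinds of analysis named in the abstract concrete, the following is a minimal sketch of Spearman's rank correlation and of a first-order partial correlation that controls for a single covariate. It is an illustration only and not the authors' analysis pipeline: the function names and the toy data are hypothetical, a single covariate stands in for the full covariate set, and the principal component analysis step is omitted.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, z):
    """Rank correlation of x and y after controlling for one covariate z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy data, one entry per merge scenario (not from the study).
comments = [1, 2, 3, 4, 5]        # communication activity
conflicts = [2, 1, 4, 3, 5]       # number of merge conflicts
changed_lines = [1, 2, 3, 5, 4]   # covariate: lines of code changed

print("bivariate rho:", spearman(comments, conflicts))
print("partial rho:  ", partial_spearman(comments, conflicts, changed_lines))
```

Comparing the two printed values shows how much of a bivariate association remains once a covariate such as the number of changed lines is held fixed, which is the kind of comparison the abstract describes.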

Notes

  1. https://developer.github.com/v3/

  2. https://github.com/kubernetes/kubernetes

  3. https://github.com/moby/moby

  4. github.com/getlantern/lantern; commit 86be2a8

  5. github.com/toddmotto/public-apis; commit 0870841

  6. github.com/ReactiveX/RxJava; commit 25ebda

  7. github.com/getlantern/lantern; commits 9d0bbbb and 6b6b534

Acknowledgements

This work was partially supported by CNPq (grant 290136/2015-6) and by the Bavarian State Ministry of Education, Science and the Arts in the framework of the Centre Digitisation.Bavaria (ZD.B).

Author information

Corresponding author

Correspondence to Gustavo Vale.

Additional information

Communicated by: Jeffrey C. Carver

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Vale, G., Schmid, A., Santos, A.R. et al. On the relation between Github communication activity and merge conflicts. Empir Software Eng 25, 402–433 (2020). https://doi.org/10.1007/s10664-019-09774-x

  • DOI: https://doi.org/10.1007/s10664-019-09774-x
