Abstract
Version control systems assist developers in managing concurrent changes to a common code base by tracking all code contributions over time. A notorious problem is that merge conflicts may occur when code contributions are integrated, and resolving them is a time-consuming and error-prone task. There is a popular belief that communication and collaboration success are mutually dependent, and hence that intensive communication activity helps to avoid merge conflicts. In practice, however, the role of communication activity in the occurrence or avoidance of merge conflicts has not been thoroughly investigated. To better understand this relation, we analyzed the history of 30 popular open-source projects involving 19 thousand merge scenarios. Methodologically, we used a bivariate analysis (Spearman’s rank correlation) and a multivariate analysis (principal component analysis and partial correlations) to quantify the correlation between communication activity and merge conflicts. In the bivariate analysis, we found a weak positive correlation between GitHub communication activity and the number of merge conflicts. In the multivariate analysis, however, the positive correlation disappeared, which does not support the intuition that GitHub communication helps to avoid merge conflicts. Interestingly, we found that the strength of this relationship depends on the characteristics of the merge scenarios, such as the number of lines of code changed. Puzzled by these unexpected results, we investigated each covariate, which provided justifications for our findings. The main conclusion from our study is that GitHub communication activity by itself does not support the emergence or avoidance of merge conflicts, even when only the communication associated with a merge scenario’s code changes, or only the communication among its developers, is considered.
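The difference between the bivariate and the multivariate view can be illustrated with a minimal sketch. It is not the study's analysis pipeline: it controls for a single covariate only, uses the first-order partial correlation formula on ranks, and all names (`comments`, `conflicts`, `changed_loc`) and all numbers are made-up assumptions chosen for illustration.

```python
from math import sqrt

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def partial_spearman(x, y, z):
    """Rank correlation of x and y after controlling for one covariate z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical toy data, one entry per merge scenario (not from the study).
changed_loc = [10, 20, 30, 40, 50, 60, 70, 80]  # covariate: lines changed
comments = [2, 1, 4, 3, 5, 6, 7, 8]             # communication activity
conflicts = [1, 2, 3, 4, 6, 5, 8, 7]            # number of merge conflicts

print(f"{spearman(comments, conflicts):.3f}")                       # 0.905
print(f"{partial_spearman(comments, conflicts, changed_loc):.3f}")  # -0.024
```

In this toy data, both variables grow with the size of the change, so the bivariate correlation is high, but it nearly vanishes once the covariate is held fixed. The effect is deliberately exaggerated and says nothing about the magnitudes reported in the study.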
Notes
github.com/getlantern/lantern; commit 86be2a8
github.com/toddmotto/public-apis; commit 0870841
github.com/ReactiveX/RxJava; commit 25ebda
github.com/getlantern/lantern; commits 9d0bbbb and 6b6b534
Acknowledgements
This work was partially supported by CNPq (grant 290136/2015-6) and by the Bavarian State Ministry of Education, Science and the Arts in the framework of the Centre Digitisation.Bavaria (ZD.B).
Additional information
Communicated by: Jeffrey C. Carver
Cite this article
Vale, G., Schmid, A., Santos, A.R. et al. On the relation between Github communication activity and merge conflicts. Empir Software Eng 25, 402–433 (2020). https://doi.org/10.1007/s10664-019-09774-x
DOI: https://doi.org/10.1007/s10664-019-09774-x