Empirical Software Engineering, Volume 22, Issue 4, pp 1795–1830

Tracing distributed collaborative development in Apache Software Foundation projects

  • Mohammad Gharehyazie
  • Vladimir Filkov


Developing and maintaining large software systems typically requires that developers collaborate on many tasks. During such collaborations, when multiple people work on the same chunk of code at the same time, they communicate with each other and employ safeguards in various ways. Recent studies have considered group co-development in OSS projects and found that it is an essential part of many projects. However, those studies were limited to groups of size two, i.e., pairs of developers. Here we go further and characterize co-development in larger groups. We develop an effective methodology for capturing distributed collaboration beyond groups of size two, based on synchronized commit activities among multiple developers, and apply it to data from 26 OSS projects from the Apache Software Foundation. We find that distributed collaboration is prevalent, but not as frequent as expected. We also find that developers behave differently in distributed collaborative groups than when programming alone, e.g., high developer focus on specific code packages is associated with lower team participation, while packages with higher ownership get less attention from groups than from individuals. Finally, we show that developers' productivity effort is more often lower while they co-develop in groups. To verify our results we use both quantitative and qualitative methods, including a developer survey. We conclude that these methods and results can be used to understand the effects of the collaborative dynamic in OSS teams on the software engineering process. Our code, along with our datasets and survey, is available at


Keywords: Teamwork, Distributed collaborative development, OSS



The authors would like to thank the members of our DECAL research group and Prof. Qi Xuan for the valuable discussion about the ideas and technical details presented in this paper. We also thank Dr. Bogdan Vasilescu for his contributions in designing the survey and for his insightful comments and feedback on this work, and Mehrdad Afshari for his help in improving the paper. We are thankful to the anonymous reviewers, whose comments helped us make this paper better. Both authors gratefully acknowledge support from the Air Force Office of Scientific Research, award FA955-11-1-0246.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Computer Science Department, University of California, Davis, Davis, USA
  2. AICT Innovation Center, Sharif University of Technology, Tehran, Iran
