Developer initiation and social interactions in OSS: A case study of the Apache Software Foundation

Empirical Software Engineering

Abstract

Maintaining a productive and collaborative team of developers is essential to Open Source Software (OSS) success, and hinges upon the trust inherent among the team. Whether a project participant is initiated as a committer is a function of both his technical contributions and his social interactions with other project participants. One’s online social footprint is arguably easier to ascertain and gather than one’s technical contributions; e.g., gathering patch submission information requires mining multiple sources with different formats, and then merging the aliases from these sources. In contrast to prior work, where patch submission was found to be an essential ingredient to achieving committer status, here we investigate the extent to which the likelihood of achieving that status can be modeled solely as a social network phenomenon. For 6 different Apache Software Foundation OSS projects we compile and integrate a set of social measures of the communications network among OSS project participants and a set of technical measures, i.e., OSS developers’ patch submission activities. We use these sets to predict whether a project participant will become a committer, and to characterize their socialization patterns around the time of becoming a committer. We find that the social network metrics, in particular the amount of two-way communication a person participates in, are the more significant predictors of one’s likelihood of becoming a committer. Further, we find that this is true to the extent that other predictors, e.g., patch submission info, need not be included in the models. In addition, we show that future committers are easy to identify with great fidelity when using the first three months of data of their social activities. Moreover, the first month of their social links alone is a very useful predictor, coming within 10% of the three-month data’s predictions.
Interestingly, we find that on average, for each project, one’s level of socialization ramps up before the time of becoming a committer. After obtaining committer status, their social behavior is more individualized, falling into a few distinct modes of behavior. In a significant number of projects, immediately after the initiation there is a notable social cooling-off period. Finally, we find that it is easier to become a committer earlier in the project’s life cycle than it is later as the project matures. These results should provide insight on the social nature of gaining trust and advancing in status in distributed projects.
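
As an illustration of what a "two-way communication" measure might look like, the following minimal Python sketch counts, for each participant, the distinct peers with whom messages were exchanged in both directions. This is an assumption for illustration only: the abstract does not define the metric, and the function name `two_way_partners`, the `(sender, recipient)` input format, and the toy data are all hypothetical.

```python
from collections import defaultdict


def two_way_partners(messages):
    """Count, per participant, the distinct peers with reciprocated contact.

    `messages` is an iterable of (sender, recipient) pairs, e.g. reply
    relationships extracted from a mailing list (hypothetical format).
    """
    # Map each sender to the set of people they have written to.
    sent = defaultdict(set)
    for sender, recipient in messages:
        if sender != recipient:  # ignore self-replies
            sent[sender].add(recipient)

    # A link is two-way if the peer has also written back to the person.
    counts = {}
    for person in list(sent):
        counts[person] = sum(
            1 for peer in sent[person] if person in sent.get(peer, ())
        )
    return counts


# Hypothetical toy data, not taken from the study.
messages = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("dave", "alice"),
    ("alice", "dave"),
]
reciprocal = two_way_partners(messages)
```

A count like this could serve as one input feature to a model of committer status; participants who never sent a message (here, `carol`) do not appear in the result.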

Notes

  1. https://www.apache.org/foundation/how-it-works.html#pmc

  2. http://community.apache.org/contributors/

  3. http://www.apache.org/foundation/how-it-works.html#roles

  4. http://community.apache.org/contributors/

  5. http://community.apache.org/contributors/

  6. Issue trackers also capture communication between committers and developers. We did not use them because the mailing lists contained a large enough communication sample that was not obviously biased in any way.

  7. http://www.apache.org/foundation/faq.html

Acknowledgements

All authors gratefully acknowledge support from the Air Force Office of Scientific Research, award FA955-11-1-0246. Vasilescu gratefully acknowledges support from the Dutch Science Foundation (NWO), grant NWO 600.065.120.10N235. Part of this research was carried out during Vasilescu’s visits at UC Davis.

Author information

Corresponding author

Correspondence to Vladimir Filkov.

Additional information

Communicated by: Yann-Gaël Guéhéneuc and Tom Mens

About this article

Cite this article

Gharehyazie, M., Posnett, D., Vasilescu, B. et al. Developer initiation and social interactions in OSS: A case study of the Apache Software Foundation. Empir Software Eng 20, 1318–1353 (2015). https://doi.org/10.1007/s10664-014-9332-x

  • DOI: https://doi.org/10.1007/s10664-014-9332-x
