Abstract
Software ecosystems can be viewed as socio-technical networks consisting of technical components (software packages) and social components (communities of developers) that maintain the technical components. Ecosystems evolve over time through socio-technical changes that may greatly impact the ecosystem’s sustainability. Social changes like developer turnover may lead to technical degradation. This motivates the need to identify the factors leading to developer abandonment, in order to automate the process of identifying developers with high abandonment risk. This paper compares such factors for two software package ecosystems, RubyGems and npm. We analyse the evolution of their packages hosted on GitHub, considering development activity in terms of commits, and social interaction with other developers in terms of comments associated with commits, issues or pull requests. We analyse this socio-technical activity for more than 30k and 60k developers for RubyGems and npm, respectively. We use survival analysis to identify which factors coincide with a lower survival probability. Our results reveal that developers with a higher probability of abandoning an ecosystem: do not engage in discussions with other developers; do not have strong social and technical activity intensity; communicate or commit less frequently; and do not participate in both technical and social activities for long periods of time. Such observations could be used to automate the identification of developers with a high probability of abandoning the ecosystem and, as such, reduce the risks associated with knowledge loss.
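To make the notion of survival probability concrete, the following minimal sketch computes a Kaplan-Meier survival curve in plain Python. It is an illustration only: the abstract does not state which estimator is used, so the choice of Kaplan-Meier is an assumption, and the function name `kaplan_meier` as well as the toy durations are hypothetical and not taken from the RubyGems or npm data.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Estimate S(t) at every time where at least one abandonment is observed.

    durations: time each developer stayed active (e.g. in months).
    observed:  True if the developer abandoned, False if still active
               at the end of the observation period (right-censored).
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)      # abandonments plus censored developers
    at_risk = len(durations)        # developers still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the share surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data, not drawn from the studied ecosystems.
durations = [2, 3, 3, 5, 8, 8, 12, 12]
observed = [True, True, False, True, True, False, False, False]

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Computing one such curve per group of developers (for instance, those who take part in discussions and those who do not) and comparing the curves is one way to see which factors coincide with a lower survival probability.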
Notes
https://rubygems.org/ for RubyGems.
https://www.npmjs.com/ for npm.
We use the 2016-09-05 dump of the GHTorrent dataset.
Acknowledgements
This research was carried out in the context of FNRS crédit de recherche J.0023.16 entitled “Analysis of Software Project Survival” and the bilateral collaborative research program FRQ-FNRS 30440672 entitled “Towards an Interdisciplinary Socio-Technical Methodology and Analysis of Software Ecosystem Health”.
Cite this article
Constantinou, E., Mens, T. An empirical comparison of developer retention in the RubyGems and npm software ecosystems. Innovations Syst Softw Eng 13, 101–115 (2017). https://doi.org/10.1007/s11334-017-0303-4
DOI: https://doi.org/10.1007/s11334-017-0303-4