Abstract
Open source software development platforms are natural laboratories for studying the diffusion of innovation across human populations, enabling us to better understand what motivates people to adopt new ideas. For example, GitHub, a software repository and collaborative development tool built on the Git distributed version control system, provides a social environment where developers adopt one another's ideas, techniques, and methodologies. This paper proposes and evaluates a popularity-based model of the diffusion of innovation on GitHub. GitHub supports forking, a mechanism for creating personal copies of other developers' repositories, and fork events can be used to measure the propagation of code between developers. We examine the effects of repository popularity on two aspects of knowledge transfer: innovation adoption and sociality, both measured on a dataset of GitHub fork events.
Notes
1. Data was retrieved on Jan 16, 2018.
Acknowledgements
This work was partially supported by grant FA8650-18-C-7823 from the Defense Advanced Research Projects Agency (DARPA). The views and opinions contained in this article are those of the authors and should not be construed as official or as reflecting the views of the University of Central Florida, DARPA, or the U.S. Department of Defense.
Appendix
Our code for generating the innovation network from GitHub events can be found at https://github.com/aalrubaye/GithubDataAnalysis. Here are the algorithms for initializing and updating the network.
Algorithm 1: Constructing the model connections (repo–actor, repo–repo, and actor–followers) based on the ground truth dataset GD that includes repositories with fork events R, actors A, and their follower set F
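The body of Algorithm 1 is not reproduced here, so the following is only a minimal sketch of how the three connection types named in its caption might be built. It is not the authors' implementation. The event field names (`actor`, `repo`, `forkee`), the function name `build_network`, and the reading of a repo–repo connection as a link from a forked repository to its new copy are all assumptions made for illustration.

```python
def build_network(fork_events, followers):
    """Build three edge sets from fork events and a follower map.

    fork_events: iterable of dicts with the hypothetical keys
        'actor'  (user who forked),
        'repo'   (repository that was forked),
        'forkee' (the new personal copy).
    followers: dict mapping an actor to an iterable of that actor's followers.
    """
    repo_actor = set()      # (forked repo, actor who forked it)
    repo_repo = set()       # (forked repo, resulting copy); assumed meaning
    actor_follower = set()  # (actor, follower of that actor)

    for event in fork_events:
        repo_actor.add((event["repo"], event["actor"]))
        repo_repo.add((event["repo"], event["forkee"]))

    # Attach followers only for actors that appear in at least one fork event.
    for _, actor in repo_actor:
        for follower in followers.get(actor, ()):
            actor_follower.add((actor, follower))

    return {
        "repo_actor": repo_actor,
        "repo_repo": repo_repo,
        "actor_follower": actor_follower,
    }


# Toy data, invented for this example only.
fork_events = [
    {"actor": "alice", "repo": "octo/lib", "forkee": "alice/lib"},
    {"actor": "bob", "repo": "octo/lib", "forkee": "bob/lib"},
]
followers = {"alice": ["carol"], "bob": []}

network = build_network(fork_events, followers)
```

Plain sets of tuples keep the sketch dependency-free; a graph library would be a natural substitute if node attributes or traversal were needed.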
Algorithm 2: Model update procedure for introducing fork events that have occurred since the last time point PM
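The body of Algorithm 2 is likewise not reproduced here. One way such an update step could look, assuming each fork event carries a timestamp and the model remembers the last time point it has processed, is sketched below. The function name `update_network`, the field names, and the edge-set layout are hypothetical and are not taken from the authors' code.

```python
def update_network(network, fork_events, last_time):
    """Add edges for fork events newer than last_time.

    network: dict with the edge sets 'repo_actor' and 'repo_repo'.
    fork_events: iterable of dicts with the hypothetical keys
        'actor', 'repo', 'forkee', and 'created_at'.
    last_time: timestamp of the last processed time point.

    Timestamps are assumed to be UTC ISO 8601 strings in one uniform
    format, so plain string comparison orders them chronologically.
    Returns the new last time point.
    """
    newest = last_time
    for event in fork_events:
        if event["created_at"] <= last_time:
            continue  # already reflected in the model
        network["repo_actor"].add((event["repo"], event["actor"]))
        network["repo_repo"].add((event["repo"], event["forkee"]))
        newest = max(newest, event["created_at"])
    return newest


# Toy data, invented for this example only.
network = {"repo_actor": set(), "repo_repo": set()}
events = [
    {"actor": "alice", "repo": "octo/lib", "forkee": "alice/lib",
     "created_at": "2018-01-10T12:00:00Z"},
    {"actor": "bob", "repo": "octo/lib", "forkee": "bob/lib",
     "created_at": "2018-01-15T08:30:00Z"},
]

# Only the second event is newer than the last processed time point.
last_time = update_network(network, events, "2018-01-12T00:00:00Z")
```

Returning the newest timestamp lets the caller pass it back in on the next call, so each event is introduced exactly once.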
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Al-Rubaye, A., Sukthankar, G. (2020). A Popularity-Based Model of the Diffusion of Innovation on GitHub. In: Carmichael, T., Yang, Z. (eds) Proceedings of the 2018 Conference of the Computational Social Science Society of the Americas. CSSSA 2018. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-030-35902-7_11
DOI: https://doi.org/10.1007/978-3-030-35902-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35901-0
Online ISBN: 978-3-030-35902-7
eBook Packages: Mathematics and Statistics (R0)