A Popularity-Based Model of the Diffusion of Innovation on GitHub

  • Conference paper
Proceedings of the 2018 Conference of the Computational Social Science Society of the Americas (CSSSA 2018)

Part of the book series: Springer Proceedings in Complexity (SPCOM)

Abstract

Open source software development platforms are natural laboratories for studying the diffusion of innovation across human populations, enabling us to better understand what motivates people to adopt new ideas. For example, GitHub, a software repository and collaborative development tool built on the Git distributed version control system, provides a social environment where ideas, techniques, and new methodologies are adopted by other software developers. This paper proposes and evaluates a popularity-based model of the diffusion of innovation on GitHub. GitHub supports a mechanism, forking, for creating personal copies of other software repositories that can be used to measure the propagation of code between developers. We examine the effects of repository popularity on two aspects of knowledge transfer, innovation adoption and sociality, measured on a dataset of GitHub fork events.

Notes

  1. Data was retrieved on Jan 16, 2018.

Acknowledgements

This work was partially supported by grant FA8650-18-C-7823 from the Defense Advanced Research Projects Agency (DARPA). The views and opinions contained in this article are those of the authors and should not be construed as official or as reflecting the views of the University of Central Florida, DARPA, or the U.S. Department of Defense.

Author information

Corresponding author

Correspondence to Gita Sukthankar.

Appendix

Our code for generating the innovation network from GitHub events can be found at https://github.com/aalrubaye/GithubDataAnalysis. Here are the algorithms for initializing and updating the network.

Algorithm 1: Constructing the model connections (repo–actor, repo–repo, and actor–followers) based on the ground truth dataset GD that includes repositories with fork events R, actors A, and their follower set F
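
The body of Algorithm 1 is not reproduced here, so the following is only a minimal Python sketch of how the three connection types named in the caption might be built. The function name `build_network`, the input shapes, and in particular the rule that links two repositories when the same actor forked both are assumptions made for illustration; the paper's own definition of a repo-repo connection may differ.

```python
from collections import defaultdict
from itertools import combinations


def build_network(fork_events, followers):
    """Build three edge sets from fork events and follower lists.

    fork_events: iterable of (actor, repo) pairs, one per fork event.
    followers:   dict mapping an actor to an iterable of their followers.
    All names and shapes here are hypothetical, not the authors' API.
    """
    repo_actor = set()            # (repo, actor): actor forked repo
    forked_by = defaultdict(set)  # actor -> repos that actor forked

    for actor, repo in fork_events:
        repo_actor.add((repo, actor))
        forked_by[actor].add(repo)

    # Assumption: two repos are linked when the same actor forked both.
    repo_repo = set()
    for repos in forked_by.values():
        repo_repo.update(combinations(sorted(repos), 2))

    # Keep follower links only for actors that appear in a fork event.
    actor_follower = {
        (actor, f) for actor in forked_by for f in followers.get(actor, ())
    }
    return repo_actor, repo_repo, actor_follower
```

Sets are used so that repeated fork events by the same actor on the same repository do not create duplicate edges.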

Algorithm 2: Model update procedure for introducing fork events that have occurred since the last time point PM
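
The body of Algorithm 2 is likewise not reproduced here. As a rough illustration of an incremental update, the sketch below adds only those fork events that are newer than the last processed time point and tracks a per-repository fork count as a stand-in for popularity. The function name `update_network`, the `(timestamp, actor, repo)` tuple layout, and the use of fork count as the popularity measure are assumptions, not the authors' procedure.

```python
from collections import Counter


def update_network(repo_actor, popularity, events, last_time):
    """Add fork events newer than last_time; return the new time point.

    repo_actor: set of (repo, actor) edges, modified in place.
    popularity: Counter of fork counts per repo, modified in place.
    events:     iterable of (timestamp, actor, repo) tuples.
    All names and shapes here are hypothetical.
    """
    newest = last_time
    for timestamp, actor, repo in sorted(events):
        if timestamp <= last_time:
            continue  # already reflected in the model
        repo_actor.add((repo, actor))
        popularity[repo] += 1
        newest = max(newest, timestamp)
    return newest
```

Returning the newest timestamp lets the caller pass it back in as `last_time` on the next call, so replaying the same event list does not count any fork twice.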

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Al-Rubaye, A., Sukthankar, G. (2020). A Popularity-Based Model of the Diffusion of Innovation on GitHub. In: Carmichael, T., Yang, Z. (eds) Proceedings of the 2018 Conference of the Computational Social Science Society of the Americas. CSSSA 2018. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-030-35902-7_11
