Whom are you going to call? determinants of @-mentions in Github discussions

Abstract

Open Source Software (OSS) project success relies on crowd contributions. When an issue arises in pull-request based systems, @-mentions are used to call on people to task; previous studies have shown that @-mentions in discussions are associated with faster issue resolution. In most projects there may be many developers who could technically handle a variety of tasks. But OSS supports dynamic teams distributed across a wide variety of social and geographic backgrounds, as well as levels of involvement. It is, then, important to know whom to call on, i.e., who can be relied on or trusted with important task-related duties, and why. In this paper, we sought to understand which observable socio-technical attributes of developers can be used to build good models of their being @-mentioned in future GitHub issue and pull request discussions. We built overall and project-specific predictive models of future @-mentions, in order to capture the determinants of @-mentions in each of two hundred GitHub projects, and to understand if and how those determinants differ between projects. We found that visibility, expertise, and productivity are associated with an increase in @-mentions, while responsiveness is not, in the presence of a number of control variables. We also found that, though project-specific differences exist, the overall model can be used for cross-project prediction, indicating its GitHub-wide utility.

Notes

  1. E.g., developers of upstream libraries rarely respond in the downstream project.

  2. Developers were asked about communication methods, not explicitly the @-mention.

  3. As described in Section 4.2, a reply @-mention is directed towards someone already in the discussion; a call @-mention is directed towards someone not yet in the discussion. In our data, there is indeed a very high correlation between reply @-mentions and discussion length (0.812); however, there is a relatively low correlation between call @-mentions and discussion length (0.283). As our focus is on call @-mentions, correlation between reply @-mentions and discussion length is not a threat.

  4. https://github.com/PyGithub/PyGithub

  5. PyGithub did not properly handle some null responses from GitHub’s API.

  6. Note that pull requests are a subset of issues.

  7. Though we do use outdegree in our model as well.

  8. E.g., standard algorithms require a full adjacency matrix to be in memory at once; memory will be exhausted for networks of our size.

  9. This measure is originally called d by Blüthgen et al., but we use δ here to reserve d to represent developers.

  10. We do not use \(\mathcal{MAF}\); we use an analogous form for our social networks.

  11. https://help.github.com/articles/closing-issues-using-keywords/

  12. We use issues fixed before closing as proxy for bugs; a higher value need not imply lack of aptitude, but it indicates a change in expected coding behavior and expertise.

  13. \(\mathcal{ISS_{\kappa}}\) is not used for the zero component; it is undefined when call mentions are 0.

  14. https://developers.google.com/web/fundamentals/performance/prpl-pattern/

  15. We could not perform this in-depth study for discussions not in English.
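
The call/reply distinction drawn in note 3 can be illustrated with a minimal Python sketch. This is not the authors' extraction pipeline. The function name `classify_mentions`, the simplified handle pattern, and the rule that someone is "in the discussion" once they have posted a comment are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Simplified handle pattern (an assumption, not GitHub's exact rule).
# The lookbehind skips e-mail addresses such as user@example.com.
MENTION_RE = re.compile(r"(?<![\w@])@([A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?)")


def classify_mentions(
    comments: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """Label every @-mention in one discussion as 'call' or 'reply'.

    `comments` holds (author, body) pairs in chronological order.
    Assumption: a developer counts as "already in the discussion"
    once they have authored at least one earlier comment.
    """
    participants = set()
    labelled = []
    for author, body in comments:
        author = author.lower()  # handles are compared case-insensitively
        for target in MENTION_RE.findall(body):
            target = target.lower()
            if target == author:
                continue  # ignore self-mentions
            kind = "reply" if target in participants else "call"
            labelled.append((author, target, kind))
        participants.add(author)
    return labelled


# Hypothetical three-comment discussion.
example = [
    ("alice", "Build fails on Windows. @bob can you take a look?"),
    ("bob", "@alice which version? Also cc @carol"),
    ("carol", "Thanks @bob, looking now."),
]
for mention in classify_mentions(example):
    print(mention)
```

Under these assumptions, only the mentions labelled `call` would feed a count of future call @-mentions per developer. A different participation rule, such as treating a previously mentioned developer as already present, would change the labels.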

References

  1. Ackerman AF, Fowler PJ, Ebenau RG (1984) Software inspections and the industrial production of software. In: Proceedings of a symposium on Software validation: inspection-testing-verification-alternatives. Elsevier North-Holland, Inc, pp 13–40

  2. Allison P (2012) When can you safely ignore multicollinearity? https://statisticalhorizons.com/multicollinearity

  3. Bandura A (1973) Aggression: A social learning analysis. Prentice-Hall

  4. Bandura A, Walters RH (1977) Social learning theory

  5. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the royal statistical society. Series B (Methodological) pp 289–300

  6. Bird C, Gourley A, Devanbu P, Swaminathan A, Hsu G (2007) Open borders? immigration in open source projects. In: The Fourth international workshop on mining software repositories

  7. Bird C, Rigby PC, Barr ET, Hamilton DJ, German DM, Devanbu P (2009) The promises and perils of mining git. In: 6th IEEE international working conference on mining software repositories, 2009. MSR’09, pp 1–10. IEEE

  8. Blüthgen N, Menzel F, Blüthgen N (2006) Measuring specialization in species interaction networks. BMC Ecol 6(1):9

  9. Brenkert GG (1998) Trust, business and business ethics: an introduction. Bus Ethics Q 8(2):195–203

  10. Brockner J (1996) Understanding the interaction between procedural and distributive justice: The role of trust

  11. Burke M, Marlow C, Lento T (2009) Feed me: motivating newcomer contribution in social network sites. In: Proceedings of the SIGCHI conference on human factors in computing systems, pp 945–954. ACM

  12. Burke M, Marlow C, Lento T (2010) Social network activity and social well-being. In: Proceedings of the SIGCHI conference on human factors in computing systems, pp 1909–1912. ACM

  13. Calefato F, Lanubile F, Novielli N (2017) A preliminary analysis on the effects of propensity to trust in distributed software development. In: 2017 IEEE 12th international conference on global software engineering (ICGSE), pp 56–60. IEEE

  14. Cameron AC, Trivedi PK (2013) Regression analysis of count data, vol 53. Cambridge University Press, Cambridge

  15. Casalnuovo C, Vasilescu B, Devanbu P, Filkov V (2015) Developer onboarding in github: the role of prior social links and language experience. In: Proceedings of the 2015 10th joint meeting on foundations of software engineering, pp 817–828. ACM

  16. Chow SC, Shao J, Wang H, Lokhnygina Y (2017) Sample size calculations in clinical research. Chapman and Hall/CRC

  17. Cohen J (1988) Statistical power analysis for the behavioural sciences

  18. Cohen J, Cohen P, West SG, Aiken LS (2013) Applied multiple regression/correlation analysis for the behavioral sciences. Routledge

  19. da Costa DA, McIntosh S, Shang W, Kulesza U, Coelho R, Hassan AE (2017) A framework for evaluating the results of the szz approach for identifying bug-introducing changes. IEEE Trans Softw Eng 43(7):641–657

  20. Dabbish L, Stuart C, Tsay J, Herbsleb J (2012) Social coding in github: transparency and collaboration in an open software repository. In: Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work, pp 1277–1286. ACM

  21. Dourish P, Chalmers M (1994) Running out of space: Models of information navigation. In: Short paper presented at HCI, vol 94, pp 23–26

  22. Ducheneaut N (2005) Socialization in an open source software community: A socio-technical analysis. Computer Supported Cooperative Work (CSCW) 14(4):323–368

  23. Faraway JJ (2014) Linear models with R. CRC Press

  24. Gallivan MJ (2001) Striking a balance between trust and control in a virtual organization: a content analysis of open source software case studies. Inf Syst J 11(4):277–304

  25. Gharehyazie M, Posnett D, Filkov V (2013) Social activities rival patch submission for prediction of developer initiation in oss projects. In: 2013 29th IEEE international conference on software maintenance (ICSM), pp 340–349. IEEE

  26. Gharehyazie M, Posnett D, Vasilescu B, Filkov V (2015) Developer initiation and social interactions in oss: A case study of the apache software foundation. Empir Softw Eng 20(5):1318–1353

  27. Good IJ (1953) The population frequencies of species and the estimation of population parameters. Biometrika 40(3-4):237–264

  28. Handy C (1995) Trust and the virtual organization. Harv Bus Rev 73(3):40–51

  29. Hossain L, Zhu D (2009) Social networks and coordination performance of distributed software development teams. J High Technol Managem Res 20(1):52–61

  30. Husted BW (1998) The ethical limits of trust in business relations. Bus Ethics Q 8(2):233–248

  31. Ibrahim WM, Bettenburg N, Shihab E, Adams B, Hassan AE (2010) Should I contribute to this discussion? In: 2010 7th IEEE working conference on mining software repositories (MSR), pp 181–190. IEEE

  32. Inglehart R (1999) Trust, well-being and democracy. Democracy and trust, p 88

  33. Jarvenpaa SL, Knoll K, Leidner DE (1998) Is anybody out there? antecedents of trust in global virtual teams. J Manag Inf Syst 14(4):29–64

  34. Jones TM, Bowie NE (1998) Moral hazards on the road to the “virtual” corporation. Bus Ethics Q 8(2):273–292

  35. Kalliamvakou E, Damian D, Blincoe K, Singer L, German DM (2015) Open source-style collaborative development practices in commercial projects using github. In: Proceedings of the 37th international conference on software engineering-volume 1, pp 574–585. IEEE Press

  36. Kavaler D, Sirovica S, Hellendoorn V, Aranovich R, Filkov V (2017) Perceived language complexity in github issue discussions and their effect on issue resolution. In: Proceedings of the 32nd IEEE/ACM international conference on automated software engineering, pp 72–83. IEEE Press

  37. Kim S, Zimmermann T, Pan K, James Jr E, et al. (2006) Automatic identification of bug-introducing changes. In: ASE’06. 21st IEEE/ACM international conference on automated software engineering, 2006, pp 81–90. IEEE

  38. Kramer RM, Tyler TR (1996) Trust in organizations: Frontiers of theory and research. Sage

  39. Lee MJ, Ferwerda B, Choi J, Hahn J, Moon JY, Kim J (2013) Github developers use rockstars to overcome overflow of news. In: CHI’13 extended abstracts on human factors in computing systems, pp 133–138. ACM

  40. Matter D, Kuhn A, Nierstrasz O (2009) Assigning bug reports using a vocabulary-based expertise model of developers. In: MSR’09. 6th IEEE international working conference on mining software repositories, 2009, pp 131–140. IEEE

  41. McDonald N, Goggins S (2013) Performance and participation in open source software on github. In: CHI’13 extended abstracts on human factors in computing systems, pp 139–144. ACM

  42. McKnight DH, Choudhury V, Kacmar C (2002) Developing and validating trust measures for e-commerce: An integrative typology. Inf Syst Res 13(3):334–359

  43. Mockus A, Herbsleb JD (2002) Expertise browser: a quantitative approach to identifying expertise. In: Proceedings of the 24th international conference on software engineering, 2002. ICSE 2002, pp 503–512. IEEE

  44. Murphy G, Cubranic D (2004) Automatic bug triage using text categorization. In: Proceedings of the 16th international conference on software engineering & knowledge engineering. Citeseer

  45. Newton K (2001) Trust, social capital, civil society, and democracy. Int Polit Sci Rev 22(2):201–214

  46. Oeldorf-Hirsch A, Sundar SS (2015) Posting, commenting, and tagging: Effects of sharing news stories on facebook. Comput Hum Behav 44:240–249

  47. O’Leary M, Orlikowski W, Yates J (2002) Distributed work over the centuries: Trust and control in the hudson’s bay company, 1670-1826. Distributed work, pp 27–54

  48. Posnett D, D’Souza R, Devanbu P, Filkov V (2013) Dual ecological measures of focus in software development. In: Proceedings of the 2013 international conference on software engineering, pp 452–461. IEEE Press

  49. Qiu L, Lin H, Leung AKY (2013) Cultural differences and switching of in-group sharing behavior between an american (facebook) and a chinese (renren) social networking site. J Cross-Cult Psychol 44(1):106–121

  50. Robert LP, Dennis AR, Hung YTC (2009) Individual swift trust and knowledge-based trust in face-to-face and virtual team members. J Manag Inf Syst 26(2):241–279

  51. Rodríguez G (2013) Models for count data with overdispersion

  52. Rodríguez-Pérez G, Zaidman A, Serebrenik A, Robles G, González-Barahona JM (2018) What if a bug has a different origin? making sense of bugs without an explicit bug introducing change. In: Proceedings of the 12th ACM/IEEE international symposium on empirical software engineering and measurement, p 52. ACM

  53. Saavedra R, Earley PC, Van Dyne L (1993) Complex interdependence in task-performing groups. J Appl Psychol 78(1):61

  54. Sato Y, Arita S (2004) Impact of globalization on social mobility in Japan and korea: Focusing on middle classes in fluid societies. Int J Jpn Sociol 13(1):36–52

  55. Śliwerski J, Zimmermann T, Zeller A (2005) When do changes induce fixes? In: ACM sigsoft software engineering notes, vol 30, pp 1–5. ACM

  56. Steinmacher I, Conte T, Gerosa MA, Redmiles D (2015) Social barriers faced by newcomers placing their first contribution in open source software projects. In: Proceedings of the 18th ACM conference on Computer supported cooperative work & social computing, pp 1379–1392. ACM

  57. Steinmacher I, Conte TU, Treude C, Gerosa MA (2016) Overcoming open source project entry barriers with a portal for newcomers. In: International conference on software engineering

  58. Stolcke A, Ries K, Coccaro N, Shriberg E, Bates R, Jurafsky D, Taylor P, Martin R, Van Ess-Dykema C, Meteer M (2000) Dialogue act modeling for automatic tagging and recognition of conversational speech. Comput Linguist 26(3):339–373

  59. Tsay J, Dabbish L, Herbsleb J (2014) Let’s talk about it: evaluating contributions through discussion in github. In: Proceedings of the 22nd ACM SIGSOFT international symposium on foundations of software engineering, pp 144–154. ACM

  60. Vandekerckhove J, Matzke D, Wagenmakers EJ (2015) Model comparison and the principle of parsimony. In: The Oxford handbook of computational and mathematical psychology, vol 300. Oxford Library of Psychology

  61. Vuong QH (1989) Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica: Journal of the Econometric Society pp 307–333

  62. Yu Y, Wang H, Yin G, Wang T (2016) Reviewer recommendation for pull-requests in github: What can we learn from code review and bug assignment? Inf Softw Technol 74:204–218

  63. Yu Y, Yin G, Wang H, Wang T (2014) Exploring the patterns of social behavior in github. In: Proceedings of the 1st international workshop on crowd-based software development methods and technologies, pp 31–36. ACM

  64. Zhang Y, Wang H, Yin G, Wang T, Yu Y (2015) Exploring the use of @-mention to assist software development in github. In: Proceedings of the 7th Asia-pacific symposium on internetware, pp 83–92. ACM

  65. Zhang Y, Wang H, Yin G, Wang T, Yu Y (2017) Social media in github: the role of @-mention in assisting software development. Science China Information Sciences 60(3):032102

  66. Zucker LG (1986) Production of trust: Institutional sources of economic structure, 1840–1920. Research in organizational behavior

Author information

Corresponding author

Correspondence to David Kavaler.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by: Filippo Lanubile

About this article

Cite this article

Kavaler, D., Devanbu, P. & Filkov, V. Whom are you going to call? determinants of @-mentions in Github discussions. Empir Software Eng 24, 3904–3932 (2019). https://doi.org/10.1007/s10664-019-09728-3

Keywords

  • Github
  • @-mention
  • Mention
  • Tagging
  • Social tagging
  • Issue