Network Analysis of Software Repositories: Identifying Subject Matter Experts
A software developer joining a large software project faces a steep learning curve before making real contributions. One challenge is finding the subject matter experts who can answer questions about a specific area of the software or review changes to it. This is especially difficult in large projects with many modules and many authors. In this paper, we describe a method for modeling a software project as a network using information mined from the project’s version control repository, and we demonstrate how network analysis techniques can identify the key authors and subject matter experts. We investigate metrics that can be gathered using network analysis, such as which groups of authors typically work together and how closely knit the developers on a project are. We analyze several specific projects to demonstrate the applicability of these techniques, and several hundred projects to show general trends.
Keywords: Bipartite Graph, Software Project, Subject Matter Expert, Eigenvector Centrality, Anonymous User
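The abstract and keywords suggest an author-file bipartite graph analyzed with eigenvector centrality, but this excerpt does not give the exact construction. The following is therefore only a minimal sketch under stated assumptions: the commit records, author names, and file paths are invented, the author network is obtained by linking two authors once per file they have both changed, and centrality is computed by a simple shifted power iteration rather than by any particular library.

```python
from collections import defaultdict
from itertools import combinations
import math

# Hypothetical (author, file) records, as might be mined from a version
# control log. All names and paths are invented for illustration.
COMMITS = [
    ("alice", "core/parser.c"), ("alice", "core/lexer.c"),
    ("alice", "net/http.c"), ("alice", "net/tls.c"),
    ("bob", "core/parser.c"), ("bob", "core/lexer.c"),
    ("bob", "net/socket.c"),
    ("carol", "net/http.c"), ("carol", "net/tls.c"),
    ("carol", "net/socket.c"),
    ("dave", "docs/README"),
]


def build_bipartite(commits):
    """Bipartite graph stored as: file path -> set of authors who changed it."""
    authors_by_file = defaultdict(set)
    for author, path in commits:
        authors_by_file[path].add(author)
    return authors_by_file


def project_to_authors(authors_by_file):
    """One-mode projection: edge weight = number of files both authors changed."""
    weights = defaultdict(lambda: defaultdict(int))
    for authors in authors_by_file.values():
        for a, b in combinations(sorted(authors), 2):
            weights[a][b] += 1
            weights[b][a] += 1
    return weights


def eigenvector_centrality(weights, nodes, iterations=200, tol=1e-9):
    """Power iteration on A + I (same eigenvectors as A, avoids oscillation)."""
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {
            n: score[n] + sum(w * score[m] for m, w in weights.get(n, {}).items())
            for n in nodes
        }
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


authors = sorted({author for author, _ in COMMITS})
author_network = project_to_authors(build_bipartite(COMMITS))
scores = eigenvector_centrality(author_network, authors)
ranking = sorted(authors, key=lambda a: scores[a], reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

In this toy data, alice shares files with both other active authors and ranks first, while dave, who shares no files with anyone, receives a score close to zero. A real analysis would likely also need to handle details this sketch ignores, such as anonymous or shared accounts and files touched by very many authors.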