Network Analysis of Software Repositories: Identifying Subject Matter Experts

  • Andrew Dittrich
  • Mehmet Hadi Gunes
  • Sergiu Dascalu
Part of the Studies in Computational Intelligence book series (SCI, volume 424)


A software developer joining a large software project faces a steep learning curve before being able to make real contributions. One challenge is finding the subject matter experts who can answer questions about a specific area of the software or review changes. This is especially difficult in large projects with many modules and a large number of authors. In this paper, we describe a method to model a software project as a network using information mined from the project’s version control repository, and demonstrate how network analysis techniques can be used to identify the key authors and subject matter experts. We investigate metrics that can be gathered using network analysis, such as which groups of authors typically work together and how closely knit the developers on a project are. We analyze several specific projects to demonstrate the applicability of these techniques, and several hundred projects to show general trends.
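
As a rough illustration of the kind of modeling the abstract describes, the sketch below builds a bipartite author-file relation from a commit log, projects it onto authors, and ranks authors by eigenvector centrality. It is a minimal sketch under stated assumptions, not the authors' implementation: the commit data, the function names, and the choice of edge weight (number of files two authors have both modified) are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical commit log as (author, files touched) pairs, invented for illustration.
COMMITS = [
    ("alice", ["core/io.c", "core/net.c"]),
    ("bob", ["core/net.c"]),
    ("carol", ["ui/view.c"]),
    ("alice", ["ui/view.c"]),
    ("bob", ["core/io.c"]),
]

def author_file_edges(commits):
    """Bipartite relation: the set of files each author has modified."""
    files_by_author = defaultdict(set)
    for author, files in commits:
        files_by_author[author].update(files)
    return files_by_author

def project_to_authors(files_by_author):
    """One-mode projection: link two authors, weighted by shared file count (assumed weighting)."""
    weights = defaultdict(dict)
    for a, b in combinations(sorted(files_by_author), 2):
        shared = len(files_by_author[a] & files_by_author[b])
        if shared:
            weights[a][b] = shared
            weights[b][a] = shared
    return weights

def eigenvector_centrality(weights, nodes, iterations=100):
    """Power iteration on (A + I); the identity shift avoids oscillation on bipartite-like graphs."""
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {
            n: score[n] + sum(w * score[m] for m, w in weights.get(n, {}).items())
            for n in nodes
        }
        norm = max(new.values())
        score = {n: v / norm for n, v in new.items()}  # rescale so the top author scores 1.0
    return score

files_by_author = author_file_edges(COMMITS)
author_graph = project_to_authors(files_by_author)
scores = eigenvector_centrality(author_graph, sorted(files_by_author))

# Print authors from most to least central.
for author, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{author}: {value:.3f}")
```

In this toy data, the author who shares files with everyone else ends up with the highest score. A real analysis would read the author-file pairs from the repository history and would likely rely on a graph library rather than a hand-written power iteration.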


Keywords: Bipartite Graph, Software Project, Subject Matter Expert, Eigenvector Centrality, Anonymous User
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Andrew Dittrich (1)
  • Mehmet Hadi Gunes (1)
  • Sergiu Dascalu (1)
  1. University of Nevada, Reno, USA
