SCSMiner: mining social coding sites for software developer recommendation with relevance propagation


With the advent of social coding sites, software development has entered a new era of collaborative work. Social coding sites (e.g., GitHub) integrate social networking and distributed version control in a unified platform to facilitate collaborative development worldwide. One unique characteristic of such sites is that the past development experience of developers recorded on them conveys implicit metrics of the developers' programming capability and expertise, which can be applied in many areas, such as software developer recruitment for IT corporations. Motivated by this intuition, we aim to develop a framework that effectively locates developers with the right coding skills. To achieve this goal, we devise a generative probabilistic expert ranking model, incorporate consistency among projects as a graph regularization to enhance the expert ranking, and introduce a relevance propagation perspective to illustrate the model. For evaluation, StackOverflow is leveraged to complement the ground truth on expertise. Finally, we implement and demonstrate a prototype system, SCSMiner, which provides an expert search service based on a real-world dataset crawled from GitHub.
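The abstract does not state the model's equations, so the following is only a generic sketch of the relevance propagation idea, not the authors' formulation. It assumes each project has an initial query relevance score, that projects are linked by a hypothetical similarity graph, and that a developer's score is the contribution-weighted sum of the smoothed project scores. The function names, the damping parameter `alpha`, and the data layout are all illustrative assumptions.

```python
def propagate_relevance(initial, similarity, alpha=0.5, iterations=50):
    """Smooth project relevance scores over a project similarity graph.

    initial    -- list of initial relevance scores, one per project (assumed given)
    similarity -- square matrix of non-negative project-to-project weights (assumed)
    alpha      -- share of a project's score taken from its neighbours (assumed)
    """
    n = len(initial)
    # Row-normalise the graph so each project averages over its neighbours.
    normalised = []
    for row in similarity:
        total = sum(row)
        normalised.append([w / total if total > 0 else 0.0 for w in row])

    scores = list(initial)
    for _ in range(iterations):
        # Each update mixes the original evidence with the neighbours' current scores.
        scores = [
            (1 - alpha) * initial[i]
            + alpha * sum(normalised[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores


def rank_developers(contributions, project_scores):
    """Rank developers by contribution-weighted project relevance.

    contributions -- {developer: {project_index: contribution_weight}} (assumed layout)
    """
    totals = {
        developer: sum(weight * project_scores[p] for p, weight in projects.items())
        for developer, projects in contributions.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

In this sketch, a project with no direct match to the query can still gain relevance from similar projects, which is one way to read the consistency among projects described above. Isolated projects keep only a damped share of their initial score.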



This work was done during Yao Wan's visit to the University of Technology Sydney. This research was partially supported by the Natural Science Foundation of China under grants No. 61379119 and No. 61672453, and by the Australian Research Council Linkage Project (LP140100937). We would like to thank Lishui Zhou, who helped us a great deal with the implementation of the SCSMiner demo. We would also like to thank Jie Liang, Tianhan Xia and Junqing Luan for sharing their crawler source code with us; their demo system githuber.info also gave us some inspiration.

