Volume 96, Issue 3, pp 683–697

Author name disambiguation in scientific collaboration and mobility cases

  • Jiang Wu
  • Xiu-Hao Ding


Scientists generally collaborate with one another and sometimes change their affiliations, which leads to scientific mobility. This paper proposes a recursive reinforced name disambiguation method that integrates both coauthorship and affiliation information and is aimed especially at cases of scientific collaboration and mobility. The proposed method is evaluated using a dataset from the Thomson Reuters Scientific “Web of Science”. The probabilities of recall and precision of the algorithm are then analyzed. To understand the effect of name ambiguity on the h-index and g-index, their distributions before and after name disambiguation are also calculated and presented. Evaluation experiments show that using only the affiliation information achieves better performance than using only the coauthorship information; however, the proposed method, which integrates both coauthorship and affiliation information, controls the bias caused by name ambiguity to a greater extent.
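To make concrete why name ambiguity matters for the h-index and g-index, the following minimal sketch computes both indices using their standard definitions (h: the largest h such that h papers have at least h citations each; g: the largest g such that the top g papers together have at least g² citations). The citation counts and the names `author_a`, `author_b`, and `merged` are hypothetical illustrations and are not taken from the paper's dataset; the g-index variant here caps g at the number of papers.

```python
from typing import List


def h_index(citations: List[int]) -> int:
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations: List[int]) -> int:
    """Largest g such that the top g papers total at least g*g citations.

    This variant caps g at the number of papers in the list.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for two distinct people who share one name.
author_a = [10, 8, 5, 4, 3]
author_b = [25, 8, 5, 3, 3]

# Without disambiguation, both publication lists are attributed to one "author".
merged = author_a + author_b

for label, record in [("A", author_a), ("B", author_b), ("merged", merged)]:
    print(label, "h =", h_index(record), "g =", g_index(record))
```

With these made-up counts, the separate records give h = 4 and h = 3 (g = 5 for each), while the merged record gives h = 5 and g = 8, so an undisambiguated name would overstate both indices for either person.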


Author disambiguation; Scientific collaboration; Scientific mobility; Coauthorship; Affiliation



This work was supported in part by the ISTIC-THOMSON Joint Scientometrics Lab Fund (Grant No. IT2012004) and in part by the China National Natural Science Fund (Grant No. 71101059).



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2013

Authors and Affiliations

  1. School of Information Management, Wuhan University, Wuhan, China
  2. School of Management, Huazhong University of Science and Technology, Wuhan, China
