Volume 97, Issue 3, pp 627–637

Analysis of bibliometric indicators for individual scholars in a large data set

  • Filippo Radicchi
  • Claudio Castellano


Citation numbers and other quantities derived from bibliographic databases are becoming standard tools for assessing the productivity and impact of research activities. Though widely used, their statistical properties are still not well established. This is especially true of bibliometric indicators aimed at the evaluation of individual scholars, because large-scale data sets are typically difficult to retrieve. Here, we take advantage of a recently introduced large bibliographic data set, Google Scholar Citations, which collects the entire publication record of individual scholars. We analyze the scientific profiles of more than 30,000 researchers and study the relation between the h-index, the number of publications, and the number of citations of individual scientists. While the number of publications of a scientist has a rather weak relation with his/her h-index, we find that the h-index of a scientist is strongly correlated with the number of citations that she/he has received, so that the number of citations can effectively be used as a proxy for the h-index. Allowing the h-index to depend on both the number of citations and the number of publications yields only a minor improvement.
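For readers unfamiliar with the three quantities compared in the abstract, the following minimal sketch computes them for a single scholar. It assumes the standard definition of the h-index (the largest h such that h papers have at least h citations each). The function names and the example citation counts are hypothetical and are not taken from the paper's data set.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most cited paper first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


def profile_summary(citations):
    """Summarize a profile by the three indicators discussed in the abstract."""
    return {
        "publications": len(citations),
        "citations": sum(citations),
        "h_index": h_index(citations),
    }


# Hypothetical per-paper citation counts for one scholar (illustrative only).
example_profile = [25, 8, 5, 3, 3, 1, 0]
summary = profile_summary(example_profile)
print(summary)
```

With one such summary per scholar, the relation between h-index, citations, and publications could then be examined across a whole collection of profiles.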


Keywords: Statistical analysis, Citations, h-Index



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2013

Authors and Affiliations

  1. Departament d’Enginyeria Quimica, Universitat Rovira i Virgili, Tarragona, Spain
  2. Istituto dei Sistemi Complessi (ISC-CNR), Roma, Italy
  3. Dipartimento di Fisica, Sapienza Università di Roma, Roma, Italy
