
A preliminary test of Google Scholar as a source for citation data: a longitudinal study of Nobel prize winners

Abstract

Most governmental research assessment exercises do not use citation data for the Social Sciences and Humanities as Web of Science or Scopus coverage in these disciplines is considered to be insufficient. We therefore assess to what extent Google Scholar can be used as an alternative source of citation data. In order to provide a credible alternative, Google Scholar needs to be stable over time, display comprehensive coverage, and provide non-biased comparisons across disciplines. This article assesses these conditions through a longitudinal study of 20 Nobel Prize winners in Chemistry, Economics, Medicine and Physics. Our results indicate that Google Scholar displays considerable stability over time. However, coverage for disciplines that have traditionally been poorly represented in Google Scholar (Chemistry and Physics) is increasing rapidly. Google Scholar’s coverage is also comprehensive; all of the 800 most cited publications by our Nobelists can be located in Google Scholar, although in four cases there are some problems with the results. Finally, we argue that Google Scholar might provide a less biased comparison across disciplines than the Web of Science. The use of Google Scholar might therefore redress the traditionally disadvantaged position of the Social Sciences in citation analysis.


Fig. 1, Fig. 2 (figures not included in this preview)

Notes

  1.

    Humanities are not included in the Essential Science Indicators. We acknowledge that the high ranking for Australia in the Social Sciences/Economics & Business is partly caused by its “English-language” advantage. ISI has a bias towards English-language journals and local language journals are more common in the Social Sciences than in the Sciences and Medicine. However, this does not invalidate the argument as ISI listed journals are generally perceived to be top journals in their fields; many Australian journals are not ISI listed either. Hence Australia still punches above its weight. Moreover, although one can quibble with the number of papers measure, the citations per paper measure shows that the Social Sciences perform on par with the Sciences and Medicine.

  2.

    As small sample sizes can easily distort the citations per paper metric, we ignored countries that had published fewer than 100 articles a year.
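The filtering rule described in this note can be sketched in a few lines of Python. The country names and figures below are invented for illustration only; they are not data from the article:

```python
# Hypothetical per-country data: (articles per year, total citations).
data = {
    "Country A": (5000, 60000),
    "Country B": (450, 7200),
    "Country C": (80, 4000),   # fewer than 100 articles/year: excluded
}

MIN_ARTICLES = 100  # threshold used to avoid small-sample distortion

# Citations per paper, computed only for countries above the threshold.
cpp = {
    country: citations / articles
    for country, (articles, citations) in data.items()
    if articles >= MIN_ARTICLES
}
print(cpp)  # Country C is dropped despite its high citations-per-paper ratio
```

Note that Country C would have topped the ranking (50 citations per paper) had it not been excluded, which is exactly the distortion the threshold guards against.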

  3.

    This was clearly illustrated when we attempted to do a rough assessment of how much a Cited Reference Search would add to our results. For Chemist EJ Corey, more than 2200 of his nearly 3300 entries were to non-ISI listed publications, but 90% of these were stray citations to three journals that are in fact ISI listed. For Physicist Geim, stray citations made up a third of his records for less than 1% of his citation count.

  4.

    Economist Paul Krugman for instance had 567 entries in the Science databases, even though all of these publications were books or publications in Economic journals.

  5.

    We are fully aware that a Cited Reference Search would provide a higher citation count and h-index for Economics. However, it still provides a very incomplete citation record for these Nobelists as it only measures citations in ISI listed journals and it only reports citations to non-ISI listed publications for the first author. For the two Economists where we could reliably search in the Cited Reference Search, it provided 8–10 times as many items as the General Search, but as most of these were stray citations, the Cited Reference Search found only 10–20% of the additional Google Scholar citation count. Moreover, our analysis of the Nobelists' top-20 journal articles shows that even for ISI-listed publications Google Scholar provides 3–5 times as many citations.
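For reference, the h-index mentioned in this note is the largest number h such that h of an author's publications have each received at least h citations. A minimal sketch (the citation counts are made up for illustration):

```python
def h_index(citations):
    """h-index: largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank        # this paper still has >= rank citations
        else:
            break           # all later papers have fewer citations
    return h

print(h_index([10, 8, 5, 4, 3]))  # 4: four papers have at least 4 citations each
```

Because the same formula is applied to whichever citation counts a database reports, an author's h-index can differ markedly between the Web of Science and Google Scholar.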

  6.

    Citations for the other Nobelists in our sample also increased by a similar magnitude (between 0.3 and 1.4%, with most around 0.6–0.7%). This suggests that there was no further expansion of coverage in these 4 weeks and that any increase in citations was caused by the natural increase in citations over time. This provides additional evidence of the stability of Google Scholar.


Author information

Correspondence to Anne-Wil Harzing.


About this article

Cite this article

Harzing, A. A preliminary test of Google Scholar as a source for citation data: a longitudinal study of Nobel prize winners. Scientometrics 94, 1057–1075 (2013). https://doi.org/10.1007/s11192-012-0777-7


Keywords

  • Google Scholar
  • Web of Science
  • Social sciences
  • Citation analysis