Scientometrics, Volume 98, Issue 2, pp 1547–1565

The expansion of Google Scholar versus Web of Science: a longitudinal study

  • Joost C. F. de Winter
  • Amir A. Zadpoor
  • Dimitra Dodou
Article

Abstract

Web of Science (WoS) and Google Scholar (GS) are prominent citation services with distinct indexing mechanisms. Comprehensive knowledge about the growth patterns of these two citation services is lacking. We analyzed the development of citation counts in WoS and GS for two classic articles and 56 articles from diverse research fields, making a distinction between retroactive growth (i.e., the relative difference between citation counts up to mid-2005 measured in mid-2005 and citation counts up to mid-2005 measured in April 2013) and actual growth (i.e., the relative difference between citation counts up to mid-2005 measured in April 2013 and citation counts up to April 2013 measured in April 2013). One of the classic articles was used for a citation-by-citation analysis. Results showed that GS has substantially grown in a retroactive manner (median of 170 % across articles), especially for articles that initially had low citation counts in GS as compared to WoS. Retroactive growth of WoS was small, with a median of 2 % across articles. Actual growth percentages were moderately higher for GS than for WoS (medians of 54 vs. 41 %). The citation-by-citation analysis showed that the percentage of citations being unique in WoS was lower for more recent citations (6.8 % for citations from 1995 and later vs. 41 % for citations from before 1995), whereas the opposite was noted for GS (57 vs. 33 %). It is concluded that, since its inception, GS has shown substantial expansion, and that the majority of recent works indexed in WoS are now also retrievable via GS. A discussion is provided on quantity versus quality of citations, threats to WoS, weaknesses of GS, and implications for literature research and research evaluation.
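
The two growth measures defined in the abstract can be illustrated with a minimal Python sketch. It assumes that "relative difference" means the change divided by the earlier measurement, expressed as a percentage; the abstract does not state the denominator, so this is an assumption. The function names and all citation counts below are hypothetical and are not data from the study.

```python
from statistics import median


def relative_growth_pct(earlier, later):
    """Percentage change from `earlier` to `later` (assumed definition)."""
    if earlier <= 0:
        raise ValueError("earlier count must be positive")
    return (later - earlier) * 100.0 / earlier


def growth_components(upto2005_in_2005, upto2005_in_2013, upto2013_in_2013):
    """Return (retroactive growth %, actual growth %) for one article.

    Retroactive growth compares the same citation window (up to mid-2005)
    measured at two moments: mid-2005 and April 2013.
    Actual growth compares two citation windows (up to mid-2005 and up to
    April 2013), both measured in April 2013.
    """
    retroactive = relative_growth_pct(upto2005_in_2005, upto2005_in_2013)
    actual = relative_growth_pct(upto2005_in_2013, upto2013_in_2013)
    return retroactive, actual


# Made-up counts for three hypothetical articles:
# (up to mid-2005 measured mid-2005,
#  up to mid-2005 measured April 2013,
#  up to April 2013 measured April 2013)
articles = [(50, 120, 180), (40, 44, 66), (10, 30, 36)]

results = [growth_components(*counts) for counts in articles]
median_retro = median(r for r, _ in results)
median_actual = median(a for _, a in results)
print(median_retro, median_actual)
```

Separating the two measures in this way keeps database back-filling (older citations added later) apart from the accumulation of genuinely new citations, which is the distinction the abstract draws.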

Keywords

Automatic indexing; Citation classic; Citation Index; Historic trend; Most highly cited paper; Strengths and weaknesses

Supplementary material

11192_2013_1089_MOESM1_ESM.txt (3 kb)
Analysis 1 (TXT 3 kb)
11192_2013_1089_MOESM2_ESM.txt (5 kb)
Analysis 2 (TXT 5 kb)
11192_2013_1089_MOESM3_ESM.xlsx (33 kb)
Analysis 3 (XLSX 33 kb)
11192_2013_1089_MOESM4_ESM.txt (9 kb)
Analysis 4 (TXT 8 kb)
11192_2013_1089_MOESM5_ESM.xlsx (478 kb)
Supplementary material 5 (XLSX 478 kb)


Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2013

Authors and Affiliations

  • Joost C. F. de Winter (1)
  • Amir A. Zadpoor (1)
  • Dimitra Dodou (1)
  1. Delft University of Technology, Delft, The Netherlands
