Google Scholar as a Citation Database for Non-bibliometric Areas: The EVA Project Results

  • Chapter
The Evaluation of Research in Social Sciences and Humanities

Abstract

In this chapter, we present the EVA (Extraction, Validation, and Analysis) project and its results on the use of Google Scholar as a web database for calculating citation indexes in non-bibliometric scientific areas, such as the social sciences and humanities. The results are presented through a case study on the publication records retrieved from Google Scholar for a dataset of Italian academic researchers belonging to non-bibliometric scientific areas. An evaluation of these results against the Elsevier Scopus citation database is also discussed.

Notes

  1. The complete name of the area is Antiquities, Philological-Literary and Historical-Artistic Sciences. It has been shortened only for readability. The name of the Italian researcher is irrelevant to the clarity of the example and is omitted.

  2. In this chapter, we call publication p the bibliographic record returned by Google Scholar in response to a query.

  3. A detailed description of normalization techniques and related NLP functions is provided in Manning et al. (2008).

  4. The edit distance between two title strings (S1 and S2) is the minimum number of single-character operations (insertion, substitution, or deletion) needed to transform S1 into S2.
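
The edit distance defined in the last note can be illustrated with a minimal sketch. It uses the standard dynamic-programming formulation of the Levenshtein distance; the function name `edit_distance` is hypothetical, and nothing here is taken from the EVA project's own implementation, which this page does not show.

```python
def edit_distance(s1: str, s2: str) -> int:
    """Minimum number of single-character insertions, substitutions,
    or deletions needed to transform s1 into s2 (Levenshtein distance)."""
    # previous[j] = distance between the prefix of s1 processed so far
    # (initially empty) and the first j characters of s2.
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        # Turning the first i characters of s1 into "" takes i deletions.
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1 from s1
                current[j - 1] + 1,      # insert c2 into s1
                previous[j - 1] + cost,  # substitute c1 with c2 (or match)
            ))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))  # prints 3
```

Only two rows of the table are kept in memory, so space grows with the length of the second title rather than with the product of both lengths. In practice, titles would presumably be normalized (for example, lowercased) before being compared; that step is an assumption and is not shown here.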

References

  • Aguillo, I. F. (2012). Is Google Scholar useful for bibliometrics? A webometric analysis. Scientometrics, 91(2), 343–351.

  • Archambault, É., & Larivière, V. (2010). The limits of bibliometrics for the analysis of the social sciences and humanities literature. World Social Science Report, 251–254.

  • Archambault, É., Vignola-Gagné, É., Côté, G., Larivière, V., & Gingras, Y. (2006). Benchmarking scientific output in the social sciences and humanities: The limits of existing databases. Scientometrics, 68(3), 329–342.

  • Biolcati-Rinaldi, F., Ferrara, A., Pinotti, L., & Salini, S. (2012). Lesson learned by Unimival researchers during the comparative bibliometric analysis project (ABC). Rassegna Italiana di Valutazione, XVI(52), 81–100. ISSN 1826-0713.

  • Bornmann, L., Mutz, R., Neuhaus, C., & Daniel, H. D. (2008). Citation counts for research evaluation: Standards of good practice for analyzing bibliometric data and presenting and interpreting results. Ethics in Science and Environmental Politics, 8(1), 93–102.

  • Burrell, Q. L. (2006). Measuring concentration within and co-concentration between informetric distributions: An empirical study. Scientometrics, 68(3), 441–456. https://doi.org/10.1007/s11192-006-0122-0.

  • Checchi, D., De Fraja, G., & Verzillo, S. (2014). Publish or Perish: An Analysis of the Academic Job Market in Italy. Tech. Rep. Discussion Paper 10084, CEPR Discussion Paper.

  • D’Angelo, C. A., Giuffrida, C., & Abramo, G. (2011). A heuristic approach to author name disambiguation in bibliometrics databases for large-scale research assessments. Journal of the American Society for Information Science and Technology, 62(2), 257–269.

  • Daniel, H. D., & Fisch, R. (1990). Research performance evaluation in the German university sector. Scientometrics, 19(5), 349–361. https://doi.org/10.1007/BF02020698.

  • Dumais, S. T. (2004). Latent semantic analysis. Annual Review of Information Science and Technology, 38(1).

  • Falagas, M. E., Pitsouni, E. I., Malietzis, G. A., & Pappas, G. (2008). Comparison of PubMed, Scopus, Web of Science, and Google Scholar: Strengths and weaknesses. The FASEB Journal, 22(2), 338–342.

  • Ferrara, A., & Salini, S. (2012). Ten challenges in modeling bibliographic data for bibliometric analysis. Scientometrics, 93(3), 765–785.

  • Ferreira, A. A., Veloso, A., Gonçalves, M. A., & Laender, A. H. (2010). Effective self-training author name disambiguation in scholarly digital libraries. In Proceedings of the 10th annual ACM joint conference on digital libraries (pp. 39–48). Aarhus.

  • Garfield, E. (1980). Is information retrieval in the arts and humanities inherently different from that in science? The effect that ISI's citation index for the arts and humanities is expected to have on future scholarship. The Library Quarterly, 40–57.

  • Han, H., Giles, L., Zha, H., Li, C., & Tsioutsiouliklis, K. (2004). Two supervised learning approaches for name disambiguation in author citations. In Proceedings of the 4th joint ACM/IEEE conference on digital libraries (JCDL 2004) (pp. 296–305). Tucson.

  • Han, H., Xu, W., Zha, H., & Giles, L. (2005a). A hierarchical naive Bayes mixture model for name disambiguation in author citations. In Proceedings of the ACM symposium on applied computing (pp. 1065–1069). Santa Fe.

  • Han, H., Zha, H., & Giles, L. (2005b). Name disambiguation in author citations using a K-way spectral clustering method. In Proceedings of the 5th joint ACM/IEEE conference on digital libraries (JCDL 2005) (pp. 334–343). Denver.

  • Kousha, K., & Thelwall, M. (2007). Google Scholar citations and Google Web/URL citations: A multi-discipline exploratory analysis. Journal of the American Society for Information Science and Technology, 58(7), 1055–1065.

  • Linmans, A. J. (2010). Why with bibliometrics the humanities does not need to be the weakest link. Scientometrics, 83(2), 337–354.

  • Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to information retrieval (Vol. 1). Cambridge: Cambridge University Press.

  • Nederhof, A. J. (2006). Bibliometric monitoring of research performance in the social sciences and the humanities: A review. Scientometrics, 66(1), 81–100.

  • On, B. W., & Lee, D. (2007). Scalable name disambiguation using multi-level graph partition. In Proceedings of the SIAM International conference on data mining (pp. 575–580). Minneapolis, Minnesota.

  • Prins, A. A., Costas, R., van Leeuwen, T. N., & Wouters, P. F. (2016). Using Google Scholar in research evaluation of humanities and social science programs: A comparison with Web of Science data. Research Evaluation, 25(3), 264–270.

  • Smalheiser, N. R., & Torvik, V. I. (2009). Author name disambiguation. Annual Review of Information Science and Technology, 43(1), 1–43.

  • Tang, J., Fong, A. C. M., Wang, B., & Zhang, J. (2012). A unified probabilistic framework for name disambiguation in digital library. IEEE Transactions on Knowledge and Data Engineering, 24(6), 975–987.

  • Torvik, V. I., Weeber, M., Swanson, D. R., & Smalheiser, N. R. (2005). A probabilistic similarity metric for Medline records: A model for author name disambiguation. Journal of the American Society for Information Science and Technology, 56(2), 140–158.

  • Yang, K. H., Peng, H. T., Jiang, J. Y., Lee, H. M., & Ho, J. M. (2008). Author name disambiguation for citations using topic and web correlation. In Proceedings of the 12th European conference on digital libraries (pp. 185–196). Aarhus, Denmark.

Author information

Corresponding author

Correspondence to Alfio Ferrara.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Ferrara, A., Montanelli, S., & Verzillo, S. (2018). Google Scholar as a Citation Database for Non-bibliometric Areas: The EVA Project Results. In: Bonaccorsi, A. (ed.) The Evaluation of Research in Social Sciences and Humanities. Springer, Cham. https://doi.org/10.1007/978-3-319-68554-0_12

  • DOI: https://doi.org/10.1007/978-3-319-68554-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68553-3

  • Online ISBN: 978-3-319-68554-0

  • eBook Packages: Social Sciences, Social Sciences (R0)
