Gsm_hdata: a bibliometric tool to analyze data from Google Scholar Metrics

Mobile Networks and Applications

Abstract

Google Scholar Metrics is regarded as an alternative source for analyzing the impact of academic publications. Its free access, user-friendly interface, and wide coverage have contributed to its growing popularity. However, it has several limitations that make it difficult to use in large-scale research: it offers no advanced search or data extraction tools, lacks bibliometric standardization, and does not identify self-citations. To address these limitations, Gsm_hdata was developed as a bibliometric tool for searching, extracting, and analyzing citation data from the h5-index of Google Scholar Metrics. The tool consists of four resources that automatically search for indexed publications and extract their data, identify journal self-citations and calculate the corresponding indicators, and allow yearly data updates. Initial tests indicate that the tool is efficient and accurate, although some manual cleaning and verification steps are still necessary. There is room for improvement, including a certification system for the publication search engine to avoid errors with homonymous or similarly titled publications. With technical improvements and more robust testing, Gsm_hdata has the potential to become a relevant tool for bibliometric studies. Beyond empirical research, the tool can be used by funding agencies that adopt the h5-index as a journal impact indicator.
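To make the indicators mentioned in the abstract concrete, the following is a minimal Python sketch of how an h5-index and a journal self-citation rate could be computed from per-article citation counts. It is not the Gsm_hdata implementation, which is not described here. The function names, the variable names, and the citation counts are all hypothetical, and the sketch assumes that the counts already refer to articles published within the journal's five-year window.

```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Return the largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def self_citation_rate(total: Iterable[int], self_cites: Iterable[int]) -> float:
    """Return the share of all citations that are journal self-citations."""
    total_sum = sum(total)
    return sum(self_cites) / total_sum if total_sum else 0.0


# Hypothetical citation counts for one journal's articles in a five-year window.
total_citations = [25, 18, 12, 9, 6, 4, 1]
# Hypothetical number of those citations that come from the same journal.
journal_self_citations = [5, 10, 2, 5, 1, 0, 0]

# Citations remaining once journal self-citations are removed, per article.
external_citations = [t - s for t, s in zip(total_citations, journal_self_citations)]

h5 = h_index(total_citations)
h5_without_self = h_index(external_citations)
rate = self_citation_rate(total_citations, journal_self_citations)

print(f"h5-index: {h5}")
print(f"h5-index without journal self-citations: {h5_without_self}")
print(f"journal self-citation rate: {rate:.1%}")
```

With these illustrative numbers, removing journal self-citations lowers the index by one, which is the kind of difference a self-citation indicator is meant to expose.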

Data Availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Acknowledgements

Adilson Luiz Pinto would like to thank the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), processes 402042/2022-0 and 200937/2022-7 (Post-Doctorate Abroad, protocol 7763808095369450).

Author information

Corresponding author

Correspondence to Fábio Lorensi do Canto.

Ethics declarations

Conflict of Interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

do Canto, F.L., Pinto, A.L., Dias, T.M.R. et al. Gsm_hdata: a bibliometric tool to analyze data from google scholar metrics. Mobile Netw Appl (2023). https://doi.org/10.1007/s11036-023-02258-9

  • DOI: https://doi.org/10.1007/s11036-023-02258-9
