Accuracy of author names in bibliographic data sources: an Italian case study

Scientometrics

Abstract

We investigate how accurately author names are reported in bibliographic records excerpted from four prominent sources: WoS, Scopus, PubMed, and CrossRef. As a case study, we take 44,549 publications stored in the internal database of Sapienza University of Rome, one of the largest universities in Europe. While our results indicate generally good accuracy for all the bibliographic data sources considered, we highlight a number of issues that undermine accuracy for certain classes of author names, including compound names and names with diacritics, which are features common to Italian and other Western languages.

Acknowledgements

First of all, we wish to thank the anonymous reviewers for their useful suggestions. We are also indebted to Irene Bongioanni, Adriano Fazzone, Emanuele Fusco, and all members of Sapienza University of Rome who contributed to the analysis and cleaning of the bibliographic records stored in our local repository. Marco Schaerf was partially supported by H2020 Project Second Hands under grant agreement No. 643950.

Author information

Corresponding author

Correspondence to Camil Demetrescu.

About this article

Cite this article

Demetrescu, C., Ribichini, A. & Schaerf, M. Accuracy of author names in bibliographic data sources: an Italian case study. Scientometrics 117, 1777–1791 (2018). https://doi.org/10.1007/s11192-018-2945-x

  • DOI: https://doi.org/10.1007/s11192-018-2945-x
