Scientometrics, Volume 117, Issue 3, pp 1777–1791

Accuracy of author names in bibliographic data sources: an Italian case study

  • Camil Demetrescu
  • Andrea Ribichini
  • Marco Schaerf

Abstract

We investigate the accuracy with which author names are reported in bibliographic records drawn from four prominent sources: WoS, Scopus, PubMed, and CrossRef. As a case study, we take 44,549 publications stored in the internal database of Sapienza University of Rome, one of the largest universities in Europe. While our results indicate generally good accuracy for all the bibliographic data sources considered, we highlight a number of issues that undermine accuracy for certain classes of author names, including compound names and names with diacritics, which are common in Italian and in other Western languages.
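To make the kinds of discrepancies mentioned above concrete, the following Python sketch is one possible way to compare a locally curated author name with the form reported by a bibliographic source and to label mismatches caused only by lost diacritics or by the formatting of compound surnames. It is an illustration under our own assumptions, not the matching procedure used in the study; the function names and example strings are hypothetical.

```python
# Illustrative sketch only (not the authors' actual pipeline): classify how a
# bibliographic source's rendering of an author name differs from the name
# recorded in a local institutional database.

import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks, e.g. 'Pérez' -> 'Perez'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(name: str) -> str:
    """Lowercase, drop diacritics, and collapse separators in compound names."""
    name = strip_diacritics(name).lower()
    return " ".join(name.replace("-", " ").split())


def classify_mismatch(local_name: str, source_name: str) -> str:
    """Label the discrepancy between the local name and the source's version."""
    if local_name == source_name:
        return "exact match"
    if strip_diacritics(local_name) == strip_diacritics(source_name):
        return "diacritics lost or altered"
    if normalize(local_name) == normalize(source_name):
        return "compound-name formatting differs"
    return "other discrepancy"


if __name__ == "__main__":
    # Hypothetical examples, not taken from the data set analysed in the paper.
    print(classify_mismatch("D'Angelo", "D'Angelo"))                 # exact match
    print(classify_mismatch("Pérez", "Perez"))                       # diacritics lost or altered
    print(classify_mismatch("De Santis-Rossi", "de santis rossi"))   # compound-name formatting differs
```

A real comparison pipeline would also need to handle initials versus full given names and transliteration variants; the sketch above covers only the two error classes singled out in the abstract.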

Keywords

Author names · Accuracy · Scopus · WoS · CrossRef · PubMed

Acknowledgements

First of all, we wish to thank the anonymous reviewers for their useful suggestions. We are also indebted to Irene Bongioanni, Adriano Fazzone, Emanuele Fusco, and all members of Sapienza University of Rome who contributed to the analysis and cleaning of the bibliographic records stored in our local repository. Marco Schaerf was partially supported by the H2020 project Second Hands under Grant Agreement No. 643950.

Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2018

Authors and Affiliations

  1. Department of Computer, Control, and Management Engineering “Antonio Ruberti”, Sapienza University of Rome, Rome, Italy
  2. Department of Physics, Sapienza University of Rome, Rome, Italy
  3. Institute of Information Technologies and Telecommunications, North Caucasus Federal University, Stavropol, Russian Federation
