Volume 113, Issue 1, pp 219–236

Document type assignment accuracy in the journal citation index data of Web of Science

  • Paul Donner


This article reports the results of a study of the correctness of document type assignments in the commercial citation index database Web of Science (SCIE, SSCI and AHCI collections). The document type assigned to each publication record is compared with the type given on the official journal website or in the publication full text, for a random sample of 791 Web of Science records across four document type categories as defined by WoS: articles, letters, reviews and others. The proportion of incorrect assignments across document types and its influence on document-specific normalized citation scores are analysed. Document type data is found to be correct in 94% of records. Further analyses show that, among records that the data source assigns to the same document type, correctly and incorrectly assigned records differ in their average page counts and reference counts.


Citation normalization; Document type; Data accuracy; Bibliometric data; Citation impact; Web of Science; Scopus; Data quality



This study was supported by the German Federal Ministry of Education and Research (BMBF) Grant 01PQ13001, project “Kompetenzzentrum Bibliometrie”. I want to thank Anastasiia Tcypina for help with data collection and Nees Jan van Eck for discussion of the manuscript.



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2017

Authors and Affiliations

  1. Deutsches Zentrum für Wissenschafts- und Hochschulforschung, Berlin, Germany
