Traditionally, Web of Science and Scopus have been the two most widely used databases for bibliometric analyses. During the last few years, however, new scholarly databases such as Dimensions have emerged. Several previous studies have compared databases, either directly through article coverage or through citation counts across databases. This article presents a comparative analysis of the journal coverage of three databases (Web of Science, Scopus and Dimensions), with the objective of describing, understanding and visualizing the differences among them. The most recent master journal lists of the three databases are used for the analysis. The results indicate that the databases have significantly different journal coverage, with Web of Science being the most selective and Dimensions the most exhaustive. About 99.11% and 96.61% of the journals indexed in Web of Science are also indexed in Scopus and Dimensions, respectively. Scopus has 96.42% of its indexed journals also covered by Dimensions. Dimensions has the most exhaustive journal coverage, with 82.22% more journals than Web of Science and 48.17% more journals than Scopus. The article also analyses the research output of 20 selected countries for the 2010–2018 period, as indexed in the three databases, and identifies database-induced variations in research output volume, rank, global share and subject-area composition for different countries. Clearly visible variations are found in the research output of different countries across the three databases, along with differential coverage of subject areas. The analysis provides an informative and practically useful picture of the journal coverage of the Web of Science, Scopus and Dimensions databases.
This appendix describes the detailed steps of pre-processing and matching applied to the master journal lists from the three databases.
On detailed inspection, the three master journal lists were found to contain some duplicate and incomplete entries. Further, since the Dimensions database also indexes preprints and conference proceedings, its comprehensive journal list contained some entries referring to preprint servers or conference proceedings. The journal lists were therefore pre-processed to remove duplicate and incomplete entries as well as entries for preprint servers and conference proceedings. The pre-processing steps were as follows:
Pre-processing step 1: In the first step, we analysed journal entries on two keys: ISSN and e-ISSN. In each journal list, entries with both these fields null were removed first. Thereafter, entries in which both the ISSN and e-ISSN fields were duplicated were removed. At the end of this step, we were left with 13,610 entries in the Web of Science journal list (out of 14,737), 39,851 entries in the Scopus journal list (out of 40,385), and 74,705 entries in the Dimensions journal list (out of 77,471).
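This step can be sketched as follows (an illustrative Python sketch, not the authors' code; the field names `issn` and `eissn` are assumptions about how the exported lists are represented):

```python
def preprocess_step1(entries):
    """Pre-processing step 1: drop entries with both identifier fields
    null, then drop exact duplicates on the (ISSN, e-ISSN) pair,
    keeping the first occurrence."""
    seen = set()
    kept = []
    for entry in entries:
        issn, eissn = entry.get("issn"), entry.get("eissn")
        if issn is None and eissn is None:
            continue  # incomplete entry: both identifier fields null
        key = (issn, eissn)
        if key in seen:
            continue  # duplicate (ISSN, e-ISSN) pair
        seen.add(key)
        kept.append(entry)
    return kept
```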
Pre-processing step 2: In the second step, we analysed inconsistent entries in which the same ISSN or e-ISSN value occurred in different journal entries. Such entries (with repeated ISSN or e-ISSN values) were identified and removed. The Web of Science journal list had no such entries. In Scopus, 93 such duplicate occurrences were found and removed, leaving 39,758 entries. In Dimensions, 112 such entries were found and removed, leaving 74,593 entries.
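A sketch of this step (illustrative only; the paper does not specify which copy of a repeated identifier was kept, so keeping the first occurrence is an assumption):

```python
def preprocess_step2(entries):
    """Pre-processing step 2: drop entries whose ISSN or e-ISSN value
    already appeared in an earlier entry (one plausible reading of
    the removal of repeated-identifier entries)."""
    seen_issn, seen_eissn = set(), set()
    kept = []
    for entry in entries:
        issn, eissn = entry.get("issn"), entry.get("eissn")
        if (issn is not None and issn in seen_issn) or \
           (eissn is not None and eissn in seen_eissn):
            continue  # identifier repeated across entries: inconsistent
        if issn is not None:
            seen_issn.add(issn)
        if eissn is not None:
            seen_eissn.add(eissn)
        kept.append(entry)
    return kept
```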
Pre-processing step 3: In the third step, the journal lists were checked for entries referring to non-journal publication sources. The Dimensions journal list was found to include some entries for preprint servers and conference proceedings. Accordingly, the journal list entries were scanned for occurrences of certain keywords, such as preprint, preprints, preprint-server, symposium, conference and congress. A total of 7 entries for preprint sources and 617 entries for conferences were found in the Dimensions list and removed. The resulting Dimensions journal list contained 73,966 journal entries.
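The keyword-based filter can be sketched as follows (an illustrative sketch, assuming the scan was applied to the title field; substring matching on lower-cased titles is an assumption):

```python
# Keywords indicating a non-journal source, as listed in the text.
NON_JOURNAL_KEYWORDS = ("preprint", "preprint-server",
                        "symposium", "conference", "congress")

def preprocess_step3(entries):
    """Pre-processing step 3: drop entries whose title contains a
    non-journal keyword (preprint servers, conference proceedings).
    Note that matching "preprint" as a substring also covers
    "preprints" and "preprint-server"."""
    return [entry for entry in entries
            if not any(kw in entry["title"].lower()
                       for kw in NON_JOURNAL_KEYWORDS)]
```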
Thus, the pre-processed journal list of Web of Science contained 13,610 journal entries; Scopus pre-processed list had 39,758 journal entries, and Dimensions pre-processed list contained 73,966 journal entries.
After the pre-processing steps, a systematic matching process was used to identify overlapping and unique journal records across the databases. We used a step-by-step matching process that applied simple matching in the initial steps and a more restrictive matching strategy in later steps, when the remaining journal lists had become smaller. We first performed ISSN/e-ISSN-based record matching and later used title-based and title-text-similarity-based matching. The matching steps, along with the intermediate numbers of matching journal entries at each stage, are described below. The matching steps used an exclusion criterion: records that yielded a match in one step were excluded from all subsequent matching computations.
Matching step 1: The first matching step involved computing matches on the ISSN and e-ISSN fields. Records were first matched on ISSN and the remaining ones thereafter on e-ISSN. For this purpose, each journal list was partitioned into two sets: entries with a non-null ISSN value (hereafter the ISSN set) and entries with a non-null e-ISSN value (hereafter the e-ISSN set). The ISSN set comprised 13,584 journal entries in Web of Science, 37,780 in Scopus, and 60,538 in Dimensions. The e-ISSN set comprised 12,827 journal entries in Web of Science, 14,203 in Scopus, and 53,505 in Dimensions. The two sets shared common journal entries. To avoid duplicate matching, we removed from the e-ISSN set all records already included in the ISSN set. The modified e-ISSN set thus comprised 17 records in Web of Science, 1,978 in Scopus, and 13,428 in Dimensions.
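The partitioning into the ISSN set and the modified e-ISSN set can be sketched as follows (illustrative; a record belongs to the ISSN set whenever its ISSN is non-null, so the modified e-ISSN set reduces to records with an e-ISSN but no ISSN):

```python
def partition_sets(entries):
    """Split a journal list into the ISSN set (non-null ISSN) and the
    modified e-ISSN set (non-null e-ISSN, excluding records already
    present in the ISSN set)."""
    issn_set = [e for e in entries if e.get("issn") is not None]
    modified_eissn_set = [e for e in entries
                          if e.get("eissn") is not None
                          and e.get("issn") is None]
    return issn_set, modified_eissn_set
```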
The subsequent matching on ISSN followed by e-ISSN was done as follows:

(a) The entries in the ISSN sets were matched on their ISSN field across all database pairs. This resulted in 12,744 matching records between Web of Science and Scopus, 11,305 between Web of Science and Dimensions, and 23,579 between Scopus and Dimensions.

(b) Next, the journal entries in the modified e-ISSN sets of the three databases were matched on their e-ISSN values. This resulted in 12 matching records between Web of Science and Scopus, 1,084 between Web of Science and Dimensions, and 8 between Scopus and Dimensions.

(c) The records in the ISSN sets left unmatched after step (a) were then matched against the modified e-ISSN sets on the e-ISSN values. This resulted in 413 matching records between Web of Science and Scopus, 648 between Web of Science and Dimensions, and 43 between Scopus and Dimensions.

(d) The remaining ISSN sets were then compared to find matches on e-ISSN. Web of Science and Scopus had 164 matching e-ISSNs, Web of Science and Dimensions had 763, and Scopus and Dimensions had 12,246.

(e) Similarly, the modified e-ISSN sets were compared with the remaining ISSN sets. Web of Science and Scopus had 1 matching e-ISSN, Web of Science and Dimensions had 3, and Scopus and Dimensions had 239.

(f) In the last step, we performed cross matches on the remaining journal entries in the ISSN and e-ISSN sets taken together: the ISSN field of one list was matched against the e-ISSN field of the other, and vice versa. This addressed the manual observation that in some records the ISSN and e-ISSN numbers were interchanged between database lists. The cross matching resulted in 120 matching records between Web of Science and Scopus, 259 between Web of Science and Dimensions, and 999 between Scopus and Dimensions.
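Each of the pairwise matching steps above follows the same pattern: match two lists on a chosen pair of identifier fields and remove matched records from both remainders (the exclusion criterion). A minimal sketch of such a helper (illustrative, not the authors' code; record layout is assumed):

```python
def match_on_key(list_a, list_b, key_a, key_b):
    """Match records of list_a against records of list_b, comparing
    field key_a of list_a with field key_b of list_b. Matched records
    are excluded from both remainders and returned as pairs."""
    index_b = {}
    for rec in list_b:
        val = rec.get(key_b)
        if val is not None:
            index_b.setdefault(val, rec)  # first occurrence wins
    matches, rest_a, matched_b_ids = [], [], set()
    for rec in list_a:
        val = rec.get(key_a)
        if (val is not None and val in index_b
                and id(index_b[val]) not in matched_b_ids):
            matches.append((rec, index_b[val]))
            matched_b_ids.add(id(index_b[val]))
        else:
            rest_a.append(rec)
    rest_b = [rec for rec in list_b if id(rec) not in matched_b_ids]
    return matches, rest_a, rest_b
```

Each step then corresponds to one call, e.g. `match_on_key(wos_issn_set, dim_issn_set, "issn", "issn")` for ISSN matching or `match_on_key(wos_issn_set, dim_eissn_set, "issn", "eissn")` for the cross matches, with the returned remainders feeding the next step.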
Matching step 2: After matching records on the ISSN and e-ISSN fields, we tried to match the remaining records on the journal title field. First, an exact match was performed on the title fields. Thereafter, an inexact match based on cosine similarity was performed to handle records in which the same journal is spelled or written differently in the three lists. Such cases included journals written with '&' in one list and 'and' in another, as well as records where one database lists three parts of a journal (say parts A, B and C) as separate entries while another has a single entry covering all three parts. The matching was done as follows:
The records remaining after the first matching step in Web of Science and Scopus were matched on the title field. An exact match was performed first, which yielded 42 matching records between Web of Science and Scopus. However, 12 of these title-matched records had different publisher information in the two databases; they were discarded, leaving 30 matching records on the title field. For Web of Science and Dimensions, 180 records matched on title, of which only 144 had the same publisher. For Scopus and Dimensions, there were 188 title matches, of which 120 records had the same publisher.
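The exact-title match with publisher agreement can be sketched as follows (illustrative; the `title` and `publisher` field names and the case-insensitive comparison are assumptions):

```python
def exact_title_match(list_a, list_b):
    """Exact title match between two journal lists, retained only
    when the publisher field also agrees in both records."""
    index_b = {e["title"].strip().lower(): e for e in list_b}
    matches = []
    for entry in list_a:
        other = index_b.get(entry["title"].strip().lower())
        if other is not None and entry["publisher"] == other["publisher"]:
            matches.append((entry, other))
    return matches
```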
In the second step of title matching, an inexact match was performed between the title fields of the remaining records by computing the cosine similarity between them, with a similarity of 0.9 or higher taken as an indication of a match between two titles. This step resulted in 5 matching records between Web of Science and Scopus, all with the same publisher name, so all 5 were retained. Web of Science and Dimensions had 19 records with the same publisher out of 22 matches, and Scopus and Dimensions had 26 records with the same publisher out of 56 matches.
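A minimal sketch of the cosine-similarity comparison (the paper does not specify the vectorisation; word-frequency vectors over lower-cased tokens are an assumption made here for illustration):

```python
import math
from collections import Counter

def cosine_similarity(title_a, title_b):
    """Cosine similarity between word-frequency vectors of two
    journal titles (tokenised by whitespace, lower-cased)."""
    va = Counter(title_a.lower().split())
    vb = Counter(title_b.lower().split())
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def titles_match(title_a, title_b, threshold=0.9):
    """Titles are treated as a match at similarity 0.9 or higher."""
    return cosine_similarity(title_a, title_b) >= threshold
```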
The pre-processing and matching steps were executed as above to identify overlapping and unique journal entries across the three databases.
Singh, V.K., Singh, P., Karmakar, M. et al. The journal coverage of Web of Science, Scopus and Dimensions: A comparative analysis. Scientometrics 126, 5113–5142 (2021). https://doi.org/10.1007/s11192-021-03948-5