In recent months the COVID-19 pandemic (caused by the coronavirus SARS-CoV-2) has spread throughout the world. In parallel, extensive scholarly research regarding various aspects of the pandemic has been published. In this work, we analyse the changes in biomedical publishing patterns due to the pandemic. We study the changes in the volume of publications in both peer reviewed journals and preprint servers, average time to acceptance of papers submitted to biomedical journals, international (co-)authorship of these papers (expressed by diversity and volume), and the possible association between journal metrics and said changes. We study these possible changes using two approaches: a short-term analysis through which changes during the first six months of the outbreak are examined for both COVID-19 related papers and non-COVID-19 related papers; and a longitudinal approach through which changes are examined in comparison to the previous four years. Our results show that the pandemic has so far had a tremendous effect on all examined accounts of scholarly publications: a sharp increase in publication volume has been witnessed and it can be almost entirely attributed to the pandemic; a significantly faster mean time to acceptance for COVID-19 papers is apparent, and it has (partially) come at the expense of non-COVID-19 papers; and a significant reduction in international collaboration for COVID-19 papers has also been identified. As the pandemic continues to spread, these changes may cause a slowdown in research in non-COVID-19 biomedical fields and bring about a lower rate of international collaboration.
The year 2020 began with the extremely fast spread of the COVID-19 pandemic, which has already reached multiple peaks in recent months (World Health Organization 2020; Moore et al. 2020). Countries such as the US, India, Brazil and many others are struggling to flatten the curve. The pandemic has impacted almost every aspect of life, ranging from the economy to tourism, political affairs, the arts and sports; thus, there is a global effort to find ways to understand and cope with it. These efforts lie, in great part, in the hands of the scientific research community. As such, scholarly research and its publication patterns have also been greatly impacted by the current crisis.
The volume of COVID-19 related publications, especially in the biomedical fields, has increased sharply since January 2020. However, apart from this increase in volume, other changes in scholarly research are also taking place. Many journals and publication databases now allow free access to COVID-19 related articles and data (e.g., the Elsevier Coronavirus Research Hub, https://coronavirus.elsevier.com/, and The Lancet, https://www.thelancet.com/coronavirus). Data sets of these articles, such as CORD-19 (https://www.semanticscholar.org/cord19), have also been curated for the creation of analysis tools to aid in the fight against this disease.
While it is clear that scholarly publication patterns have changed dramatically due to the pandemic, it remains unclear how these changes are manifested in a few key aspects. We focus on four such aspects by setting the following research questions:
How has the volume of scholarly literature in preprint servers and journals changed due to the pandemic? Specifically, we hypothesize that the focus on COVID-19 caused a reduction in volume of publications of other, non-COVID-19 papers in the same venues.
How are COVID-19 publications in journals related to their associated metrics? Specifically, do journals with higher scientometric scores publish more COVID-19 related papers than journals with lower scores?
How quickly are COVID-19 and non-COVID-19 papers accepted for publication? The peer review process of journals is usually slow, but currently there is a need for a fast turnaround, especially for COVID-19 papers. Specifically, we hypothesise that in order to cope with this need for fast turnaround, the time until the acceptance of COVID-19 papers has been reduced significantly from “normal” acceptance time and that the time until the acceptance of non-COVID-19 papers has lengthened in order to facilitate that.
How has international publication and (co-)authorship changed? We hypothesise that the pandemic has caused a significant increase in international collaboration. We set out to analyse international collaboration along two axes: (1) the diversity of countries which collaborate with one another; and (2) the number of internationally co-authored COVID-19 publications.
To address these questions we employ both a short-term analysis technique, focusing on the first six months of 2020 (the first six months of the outbreak), and a longitudinal analysis technique through which we compare the publication patterns across the last five years (2016–2020). Our study employs a set of statistical tests in order to ascertain statistically significant changes. These tests are conducted at both the short-term and longitudinal levels. At the short-term level, these tests indicate if any statistically significant differences exist when comparing COVID-19 papers to non-COVID-19 ones. At the longitudinal level, these tests indicate if any statistically significant differences exist when comparing papers published prior to the pandemic to those published during the pandemic. We focus on two main types of venues for research publication: peer reviewed journals and preprint servers. Preprint servers are becoming widely used in other fields of research, such as Computer Science and Physics, but up until the pandemic the usage of such publication venues in the biomedical fields was limited (Desjardins-Proulx et al. 2013; Maslove 2018).
Understanding changes in publication patterns during the pandemic is valuable due to the possible implications. As the pandemic does not seem to be coming to a stop, these changes, for good and for bad, may have prolonged effects that should be considered by journal editors, recruiting and promotion committees, funding agencies and others.
This paper is organized as follows: "Background and related work" section presents the background and related work in scientometric analysis of pandemic related research. In "Methodology" section we describe the data and tools used in this study. "Results" section presents the results to the research questions we posed. We conclude the paper with a discussion in "Conclusion and discussion" section.
Background and related work
Numerous scientometric studies have examined how publication patterns vary during or following a pandemic (see Zhang et al. (2020) and references therein). These studies commonly focus on one or a few aspects of scientometrics such as growth of publications in various databases, research funding agencies’ countries, average time to acceptance and international collaboration patterns. In these works, two standard techniques are often used: a short-term technique in which publication pattern changes are analysed during a pandemic and a longitudinal technique in which the publication pattern analysis is focused on a collection of papers related to viral pandemics, written over a long period, usually several years.
Recent studies on the COVID-19 pandemic follow these two techniques as well. Adopting the short-term analysis technique, Da Silva et al. (2020) have examined publication volumes of COVID-19 papers and identified top journals, countries and authors. Similarly, Costa et al. (2020) performed keyword analysis and identified the most productive countries, institutions, authors and journals; Lou et al. (2020) observed publication types, journals and publication countries; and Gianola et al. (2020) identified that during the first five months of the pandemic most of the COVID-19 scientific literature consisted of short reports, opinions and perspectives. Chahrour et al. (2020) focused on the international distribution of COVID-19 publications compared with the number of COVID-19 cases in the respective countries, again adopting a short-term analysis approach. Common to the above studies is the focus on COVID-19 related publications. These studies do not consider the possible changes in publication patterns of papers unrelated to the COVID-19 pandemic which were published during the pandemic. One exception is Homolak et al. (2020), who focus on both COVID-19 related papers and non-COVID-19 related papers in their short-term analysis, observing time to publication, authorship and affiliation counts.
Adopting a longitudinal approach, Kun (2020) observed the extremely short time to acceptance for COVID-19 papers. The author focused on the first three months of 2020 for COVID-19 papers and compared them to papers on other coronaviruses. Ahmad and Batcha (2020) have taken a different approach, focusing solely on COVID-19 papers yet examining them over the years 2011–2020. Similarly, Tao et al. (2020); Mao et al. (2020); Zhai et al. (2020); Malik et al. (2021) studied the same for the years 2000–2020. These studies examined the publications’ countries of origin, collaboration networks, authors, keywords and additional publication characteristics. Malik et al. (2021) found that the number of nCoV related research papers has spiked several times in the last two decades, in the wake of the SARS and MERS outbreaks. In the same vein, Kagan et al. (2020) analysed publications related to multiple nCoV viruses and compared those to influenza and additional viruses, and Zhang et al. (2020) did a comparative bibliometric study of multiple outbreaks and performed a preliminary analysis of the COVID-19 outbreak. Other studies have performed both longitudinal and short-term analyses of nCoV papers in which they examined international collaboration (Lee and Haupt 2020; Cai et al. 2021). Their studies show that countries affected more by the virus, as well as those with higher GDP, tended to publish more in international collaborations, and that team sizes for COVID-19 papers dropped during the first months of the pandemic, as did the number of papers published in international collaborations.
Our study further compares preprint servers and peer-reviewed journals as possible dissemination venues for COVID-19 research output. Prior research by Krumholz et al. (2020) and Johansson et al. (2018) has shown an increase in the usage of preprint servers in previous pandemics. Recently, evidence was provided to support their findings in the current pandemic as well (Fraser et al. 2020; Fry et al. 2020). Torres-Salinas (2020) has also identified this growth, observing eight different repositories. They further observed the total growth in volume, showing that the number of COVID-19 papers produced doubles every 15 days. The work by Vasconcelos et al. (2021) showed an overall growth of preprint papers in repositories across multiple fields and modeled this growth. Our study complements the above in several respects: (1) By providing both short-term and longitudinal statistical analysis, our study gives a wider perspective in which the influence of the pandemic on current research can be observed; (2) We focus on both COVID-19 and non-COVID-19 papers, providing an assessment of how the effects of the pandemic differ with respect to these two types of papers; (3) Our study analyzes time to acceptance for COVID-19 and non-COVID-19 publications as well as international collaboration by the diversity of the countries that are collaborating, aspects which have been minimally researched in pandemics in general and during the COVID-19 pandemic in particular; (4) Our study examines two types of venues, namely preprint servers and scholarly journals; and (5) To the best of our knowledge, we provide the most extensive scientometric-based research on COVID-19 publications to date.
The data for this research was obtained from four main sources:
Elsevier’s ScienceDirect (footnote 1) and Scopus (footnote 2). ScienceDirect is a full-text scientific database which is part of SciVerse. Scopus is an abstract and citation database of peer-reviewed literature. Using both the ScienceDirect and Scopus APIs we extracted data on COVID-19 related journal articles.
medRxiv (footnote 3; pronounced “med-archive”) is a free online archive and distribution server for complete but unpublished manuscripts (preprints) in the medical, clinical, and related health sciences. The server was founded by Cold Spring Harbor Laboratory (CSHL), a not-for-profit research and educational institution, Yale University, and BMJ (mostly referred to as the British Medical Journal), a global healthcare knowledge provider.
bioRxiv (footnote 4; pronounced “bio-archive”) is a free online archive and distribution service for unpublished preprints in the life sciences. It is operated by CSHL.
arXiv (footnote 5) is an open archive for scholarly preprints in various fields. It is maintained and operated by Cornell University.
In addition to the above described datasets, we have also extracted supplementary data from PubMed which is a web-portal of the medical database MEDLINE; and ScimagoJR - a publicly available portal which includes journals’ and countries’ scientific indicators developed from the information contained in the Scopus database, created by SCImago research group (González-Pereira et al. 2010). The selection of ScienceDirect and Scopus was due to their wide indexing of journals as well as their API for data extraction. medRxiv and bioRxiv were selected due to their specialization in the biomedical research fields. Similarly, the arXiv repository was chosen due to its high usage across multiple fields. As the arXiv server is used for many fields of research and our study focused on biomedical papers, the data extraction from the arXiv server was limited only to the “quantitative biology” field.
To understand the publication behaviour in the first months of the pandemic, we analysed the data which was extracted from the sources described in "Sources" section. The search in ScienceDirect was done using the search query “COVID-19” OR Coronavirus OR “Corona virus” OR Coronaviruses OR “2019-nCoV” everywhere in the document. The biomedical journals with the highest numbers of COVID-19 related publications were selected for further analysis. For each journal, data for all COVID-19 and non-COVID-19 related papers was downloaded separately. In the same manner, data for all papers in each year of our analysis (2016–2020) was downloaded and split up by months according to the online availability date of the paper. Additional data for each paper was extracted via its DOI from PubMed Entrez and through the Scopus API. This included the dates the paper was received by the journal, accepted for publication and available online, the authors’ affiliations and countries, the journals’ URLs and the associated scientometrics. We focus on the “Scimago Journal Rank” (SJR) (González-Pereira et al. 2010) metric as our data was collected from Scopus. Duplicate papers, identified by their DOIs, as well as papers with missing data (DOI, authors or countries) were removed automatically. Additional papers which were removed were those with inaccurate dates or insufficient date information as described in "Analysis approach" section. The total number of papers excluded from our analysis was 347 out of 7419. The average yearly percentage of papers removed from analysis was \(3.76\%\) (for the years 2016–2020, inclusive). Records for the examined papers were analysed according to various attributes including publishing journal, authors, publishing countries and dates.
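As an illustration of the automatic removal step described above, the following minimal sketch drops records with a missing DOI, authors or countries and keeps one record per DOI. The record layout (a dictionary with `doi`, `authors` and `countries` keys), the function name and the example records are assumptions made for this sketch; they are not taken from the scripts used in the study.

```python
# Hypothetical field names; the actual scripts may store records differently.
REQUIRED_FIELDS = ("doi", "authors", "countries")


def clean_records(records):
    """Drop records with missing required fields, then keep the first record per DOI."""
    seen_dois = set()
    cleaned = []
    for record in records:
        # A field counts as missing if it is absent, None, or empty.
        if any(not record.get(field) for field in REQUIRED_FIELDS):
            continue
        # DOIs are case-insensitive, so normalise before comparing.
        doi = record["doi"].strip().lower()
        if doi in seen_dois:
            continue
        seen_dois.add(doi)
        cleaned.append(record)
    return cleaned


# Made-up example records, for illustration only.
records = [
    {"doi": "10.1000/a1", "authors": ["A"], "countries": ["IT"]},
    {"doi": "10.1000/A1", "authors": ["A"], "countries": ["IT"]},  # duplicate DOI
    {"doi": "10.1000/b2", "authors": [], "countries": ["US"]},     # no authors
    {"doi": None, "authors": ["B"], "countries": ["CN"]},          # no DOI
    {"doi": "10.1000/c3", "authors": ["C"], "countries": ["US", "CN"]},
]
cleaned = clean_records(records)
print(len(cleaned), "of", len(records), "records kept")
```

The date-based exclusions are a separate step and are not covered by this sketch.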
In order to retrieve and analyse COVID-19 data from bioRxiv, medRxiv and arXiv, we queried the archive servers with the search query: “COVID-19” OR Coronavirus OR “Corona virus” OR Coronaviruses OR “2019-nCoV”. This query was executed separately for each of the first six months in 2020. To perform the longitudinal aspect of our analysis we further queried the repositories for all papers published in each of the first six months of 2016–2020 separately. The retrieved results were downloaded and automatically analysed using designated scripts written by the authors.
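The search query above is a disjunction of five terms. As a rough local approximation of it, the sketch below flags a text as COVID-19 related if it contains any of those terms. Case-insensitive substring matching is an assumption; the search engines of the queried servers may tokenise and match differently, and the function name is hypothetical.

```python
import re

# The five terms of the search query quoted in the text.
QUERY_TERMS = ["COVID-19", "Coronavirus", "Corona virus", "Coronaviruses", "2019-nCoV"]

# Assumption: case-insensitive substring matching approximates the server-side search.
_PATTERN = re.compile(
    "|".join(re.escape(term) for term in QUERY_TERMS),
    flags=re.IGNORECASE,
)


def is_covid_related(text):
    """Return True if any query term occurs anywhere in the text."""
    return _PATTERN.search(text) is not None


print(is_covid_related("Clinical features of patients infected with 2019-nCoV"))
print(is_covid_related("Statin therapy and cardiovascular outcomes"))
```

Applied to titles, abstracts or full texts, such a predicate would split a downloaded collection into the COVID-19 and non-COVID-19 groups that are compared throughout the analysis.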
A subset of the results was manually tested to ensure both accuracy of the data and the scripts. A random subset of several dozen articles was chosen and manually examined by the authors. The relevant dates, authors, countries and additional data were compared against the data automatically extracted by the scripts to ensure no mismatch. No discrepancies were found in this manual verification.
We analyse four main aspects of our data as pertaining to our posed research questions: (1) Publication growth that has occurred during the pandemic and, specifically, the venues which have contributed to this growth; (2) Impact of journals’ scientometric indicators on publication behaviour; (3) Changes in the time to acceptance of peer reviewed papers; and (4) Changes in authors’ countries of affiliation and international collaborations. We define the time to acceptance as the period between the “date received” and the “date accepted” or the “date online” for a paper, whichever is earlier. Data entries for which the “date received”, or both the “date accepted” and the “date online”, were missing or corrupted were omitted from our analysis, as were inaccurate data entries for which the “date received” was later than (or the same as) the “date online” or the “date accepted”. The cleaning methodology is detailed in "Retrieval process" section. To conduct our collaboration analysis we define international collaboration as papers authored by two or more authors affiliated with institutions in different countries. We examine two facets of international collaboration:
Diversity of collaboration, i.e., the number of countries with which each country has collaborated over a given time period. For example, a country which has published papers with 10 other countries is more internationally diversified in its collaborations than a country which has published with 5 other countries, irrespective of the number of papers published.
Volume of collaboration, i.e., the number of publications which each country has published in collaboration with other countries over a given time period. For example, a country which has published 10 papers with one other country is more internationally productive than a country which has published 5 papers, even if each of those papers is published with a different country.
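The two facets defined above can be computed per country from the set of affiliation countries of each paper. The following is a minimal sketch under the assumption that each paper is represented simply by the set of countries of its authors; the function name and the example data are hypothetical.

```python
from collections import defaultdict


def collaboration_stats(papers):
    """Compute (diversity, volume) per country.

    papers: iterable of sets of country names, one set per paper.
    diversity[c]: number of distinct countries c has co-authored with.
    volume[c]: number of internationally co-authored papers involving c.
    """
    partners = defaultdict(set)
    volume = defaultdict(int)
    for countries in papers:
        countries = set(countries)
        if len(countries) < 2:
            continue  # single-country paper: not an international collaboration
        for country in countries:
            partners[country] |= countries - {country}
            volume[country] += 1
    diversity = {country: len(others) for country, others in partners.items()}
    return diversity, dict(volume)


# Made-up example: four papers, one of them with a single country.
papers = [
    {"US", "CN"},
    {"US", "CN"},
    {"US", "IT", "BR"},
    {"IT"},
]
diversity, volume = collaboration_stats(papers)
print(diversity)
print(volume)
```

In this example CN has a higher volume than IT (two papers versus one) but a lower diversity (one partner versus two), which is the distinction the two definitions are meant to capture.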
The four aspects are analysed and reported for both COVID-19 papers as well as “standard” non-COVID-19 papers published during the pandemic. This comparative approach allows us to identify the effects of the pandemic on the examined aspects both for pandemic-related research as well as the standard biomedical research published during that time. The above aspects are further analysed for pre-pandemic papers published in the years 2016-2019 and compared against the COVID-19 papers and non-COVID-19 papers published in 2020. Our analysis was performed separately for COVID-19 papers and for non-COVID-19 papers in each of the first six months of 2020 and repeated for each of the first six months of each of the previous four years in our research. This part of the study allows us to identify the possible effects of the pandemic on the examined aspects in a longitudinal view. Some of the following analyses, especially regarding time to acceptance, authors’ country of affiliation and international collaboration, require a large volume of publications. Thus, for these analyses we selected a subset of biomedical journals with the highest number of COVID-19 paper publications. These are shown in Table 1. In order to select these journals we performed the queries described in "Retrieval process" section and ordered them by their number of COVID-19 related publications. From the journals with the highest numbers of COVID-19 publications we selected journals in biomedical fields according to their associated categories in Scopus.
All data and code are available at www.github.com/shirAviv/covid-19-scientific-papers.
Analysing the publication growth in the first six months of 2020 shows that not only has there been a huge surge of COVID-19 related publications, as one could expect, but also that these publications are disseminated across multiple venue types. As discussed before, this work focuses on two venue types: preprint servers and peer reviewed journals.
Publication growth in preprint servers
Figure 1 shows that preprint servers are considered a legitimate and even valuable source of dissemination at this time. Interestingly, the publication growth was observed for both COVID-19 publications as well as for non-COVID-19 related papers in all of the preprint servers we analysed. Furthermore, the growth in COVID-19 publications increased in a similar fashion to the international spread of the pandemic, although some decline was apparent in June in the medRxiv preprint server.
The sharpest increase in publications was observed in medRxiv, which was created as a preprint server in mid-2019. While the total number of papers published in it has increased from 200 in January 2020 to nearly 1800 in June 2020 (a factor of 9), the number of COVID-19 related papers increased from 40 in January 2020 to 1350 in June 2020 (a factor of 34). COVID-19 papers showed an increase in percentage from 19.5 to 75% of total papers published on the server during these months.
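The growth factors and shares quoted above follow directly from the monthly counts. The short calculation below reproduces them from the rounded figures given in the text; since the June total is stated only as "nearly 1800", the results are approximate, which presumably also explains the small gap between the 20% computed here for January and the reported 19.5%.

```python
# Rounded medRxiv counts as quoted in the text ("nearly 1800" is approximate).
total_jan, total_jun = 200, 1800
covid_jan, covid_jun = 40, 1350

total_growth = total_jun / total_jan      # growth factor of all papers
covid_growth = covid_jun / covid_jan      # growth factor of COVID-19 papers
share_jan = 100 * covid_jan / total_jan   # COVID-19 share in January, in percent
share_jun = 100 * covid_jun / total_jun   # COVID-19 share in June, in percent

print(f"total growth: x{total_growth:.1f}, COVID-19 growth: x{covid_growth:.1f}")
print(f"COVID-19 share: {share_jan:.1f}% (January) -> {share_jun:.1f}% (June)")
```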
Focusing on the arXiv preprint server, which is well known for publications in Physics and Computer Science, reveals that the number of publications in biology (the quantitative-biology field in the arXiv, q-bio for short) was low over the previous four years with a slow, almost flat, increase. However, from January 2020 until June 2020, the total number of publications in the q-bio field almost doubled and the number of COVID-19 related papers increased by a factor of 8. The percentage of COVID-19 papers out of the total papers published in the q-bio field of the server increased from 20% in February to 35% in June (see footnote 6).
Publication growth in journals
Turning to the analysis of peer reviewed journal publications, we observe the growth of COVID-19 related papers in this venue. Table 1 focuses on the top COVID-19 publishing journals in which, especially in April, May and June 2020, COVID-19 papers comprised a substantial percentage of papers published in these journals. Journals in the table are ordered according to their SJR score from 2019.
We further analysed the growth from a longitudinal aspect, observing the first six months of the years 2016–2020. Similar to Fig. 1 which presents the growth in preprint papers, Fig. 2 shows the COVID-19 publication growth for journals in Table 1 as compared with publication growth over the last five years. While we can see a large surge in the total number of publications as compared to previous years, COVID-19 publications seem to account for virtually the entire growth. Specifically, the number of non-COVID-19 related publications follows the same pattern of previous years.
Turning to the journals’ SJR metric, the results show that the growth in COVID-19 related publications is correlated with the journals’ SJR. In low ranking journals, hardly any COVID-19 papers were published. In contrast, the vast majority of COVID-19 papers were published in the highest ranked journals. Specifically, using the SJR score as a sorting criterion, 31% of the COVID-19 papers we analysed were published in top ranking journals (\(\sim\)20% of the analysed journals) with SJR scores between 7.516 and 14.55. Only 9.8% of the COVID-19 papers were published in bottom ranking journals (\(\sim\)20% of analysed journals) with SJR scores between 0.103 and 0.11. The number of articles used in our analysis was aggregated in each journal over the first six months of the pandemic. Table 2 shows the top ranking journals and the lowest ranking journals with the number of COVID-19 articles they published. Excluding two low ranked journals with an exceptionally high number of COVID-19 publications, only 1.8% of the COVID-19 papers were published in bottom ranking journals (\(\sim\)20% of the analysed journals) with SJR scores lower than 0.25.
The Pearson correlation between the total number of COVID-19 publications and the SJR of the associated journal is moderately positive with \(r=0.57\), and statistically significant at \(p=0.0053\). We further calculated the Pearson correlation between the percentage of COVID-19 papers out of total papers, summed over the first six months of 2020, for each journal and the SJR score of that journal. The correlation is positive with \(r=0.177\) but not statistically significant (\(p=0.43\)). This result could be due to the large number of papers published in most high ranking journals irrespective of COVID-19.
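For reference, the Pearson coefficient used above can be computed as in the following minimal sketch, together with the \(t\) statistic commonly used to test the null hypothesis of zero correlation (with \(n-2\) degrees of freedom). The SJR scores and paper counts in the example are made-up values for illustration and are not the study's data; the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom (|r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical journals: SJR score and number of COVID-19 papers (not real data).
sjr_scores = [0.5, 1.2, 2.8, 4.1, 7.5, 14.5]
covid_counts = [3, 10, 8, 25, 40, 38]

r = pearson_r(sjr_scores, covid_counts)
t = correlation_t_statistic(r, len(sjr_scores))
print(f"r = {r:.3f}, t = {t:.3f}, df = {len(sjr_scores) - 2}")
```

The p-value is then obtained from the Student \(t\) distribution, which is not part of the Python standard library; a statistics package would be needed for that last step.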
Time to acceptance
Following the results reported in "Publication growth in journals" section, we now turn to investigate how this publication growth has affected the time to acceptance for the papers. Naturally, time to acceptance is only applicable to journal publications and not for preprint ones. For each of the journals analysed in Table 1, we calculated the time to acceptance in each of the six examined months for the years 2016-2020 by calculating mean and standard deviation (SD) of the time to acceptance.
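The calculation described above can be sketched as follows, using the definition of time to acceptance and the exclusion rules given in "Analysis approach" section. The function name, the tuple layout of the example records and the use of the sample standard deviation are assumptions made for this sketch.

```python
from datetime import date
from statistics import mean, stdev


def time_to_acceptance(received, accepted=None, online=None):
    """Days from 'date received' to the earlier of 'date accepted' and 'date online'.

    Returns None for entries that would be omitted: a missing 'date received',
    both end dates missing, or 'date received' not strictly earlier than an end date.
    """
    if received is None:
        return None
    end_dates = [d for d in (accepted, online) if d is not None]
    if not end_dates:
        return None
    if any(received >= d for d in end_dates):
        return None  # inaccurate entry
    return (min(end_dates) - received).days


# Made-up (received, accepted, online) triples for one journal and month.
papers = [
    (date(2020, 2, 1), date(2020, 2, 11), date(2020, 2, 15)),  # 10 days
    (date(2020, 2, 3), None, date(2020, 2, 23)),               # 20 days
    (date(2020, 2, 10), date(2020, 2, 5), date(2020, 2, 20)),  # omitted: bad dates
    (None, date(2020, 2, 5), None),                            # omitted: no 'received'
]
days = [d for d in (time_to_acceptance(*p) for p in papers) if d is not None]
print(f"mean = {mean(days):.1f} days, SD = {stdev(days):.2f} days, n = {len(days)}")
```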
Figure 3 displays the mean time to acceptance for COVID-19 and non-COVID-19 publications in each of the first six months in 2020 alongside the mean time to acceptance of all papers published in the same months in 2016-2019, inclusive. The journals in the figure are ordered by their rank. Observing the mean time to acceptance for COVID-19 papers in all journals, starting from the month of February 2020 onward, we see that it is extremely short, both when compared to non-COVID-19 papers from the same month and onward and when compared to papers from the previous years.
We performed a series of one-tailed t-tests comparing mean time to acceptance of COVID-19 papers to non-COVID-19 papers. The test was performed for each of the months February-June of 2020 (see footnote 7) as well as for the average time to acceptance for aggregated papers in the months January-June of 2020. Values are shown in Table 3. As can be seen, all of these tests revealed a statistically significant difference in mean time to acceptance with \(p<0.05\). In February 2020, for example, the average time to acceptance of non-COVID-19 papers was almost 10 times longer than that of COVID-19 papers. Across the examined period, non-COVID-19 papers experienced an average time to acceptance of 91.3 days compared to 19.3 days for COVID-19 papers (a factor of 4.7).
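A one-tailed two-sample comparison of this kind can be sketched as below. The text does not state which variant of the t-test was used; Welch's unequal-variance version is assumed here, and the sample values are made up for illustration. The alternative hypothesis is that the mean time to acceptance of COVID-19 papers is smaller, so a strongly negative \(t\) supports it.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's t-test comparing mean(sample_a) with mean(sample_b)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical times to acceptance in days (not the study's data).
covid_days = [8, 12, 15, 20, 25, 30]
non_covid_days = [60, 75, 88, 95, 110, 130, 140]

t, df = welch_t(covid_days, non_covid_days)
print(f"t = {t:.2f}, df = {df:.1f}")
# The one-tailed p-value is the lower tail of the Student t distribution at t.
# With SciPy >= 1.6 available, scipy.stats.ttest_ind(covid_days, non_covid_days,
# equal_var=False, alternative="less") would return it directly.
```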
Additional support for the above phenomenon can be seen in Table 4. Taking a longitudinal approach, we compare the mean time to acceptance of papers from our examined journals for each pair of consecutive years. As can be seen in the table, up to and including 2019, no significant changes were observed. Comparing 2019 and 2020, we see that papers published in 2020 experienced a longer time to acceptance, both when considering the entire publication set and when focusing on non-COVID-19 papers alone.
Due to our previous finding that high ranked journals yield a higher number of COVID-19 papers, we expected this to affect the time to acceptance as well. Two contradicting hypotheses can be assumed: (1) The high volume of papers submitted to high ranked journals yields a longer acceptance time; and (2) The high volume of accepted papers to high ranked journals implies a faster review process and a shorter acceptance time.
In order to examine these hypotheses, we calculated the Pearson correlation between the SJR score of a journal and the mean acceptance time of COVID-19 related papers for that journal. The correlation was found to be weakly negative, as would be expected from the second hypothesis, but not statistically significant (\(r=-0.266\), \(p=0.43\)).
The results are also depicted in Fig. 4, which displays the mean time to acceptance and the Standard Error of the Mean (SEM) averaged over all analysed journals from Table 1 in the first six months of each of the years 2016–2020. The figure also presents the mean and the SEM for COVID-19 and non-COVID-19 related publications in 2020. We observe an apparent trend for COVID-19 publications which first declines sharply from January to February, mainly attributed to the relatively low number of COVID-19 papers in January and thus the high SEM in January, and then rises moderately from February onwards, mainly due to the increase in the number of COVID-19 papers. However, despite the observed trend, COVID-19 publications seem to “enjoy” a shorter time to acceptance compared both to non-COVID-19 publications in 2020 and to publications in 2016–2019. As we speculated before, our data and analyses show an association between non-COVID-19 papers and a longer time to acceptance. However, additional analysis is needed in order to further understand the impact of the short time to acceptance for COVID-19 related papers, both in relation to non-COVID-19 publications at the time of the pandemic and to post-pandemic publications in general.
Top publishing countries and international collaboration
In this section we focus on the source countries of the papers under analysis. We first analyse the countries with the highest number of publications and then proceed to international collaboration observing two facets as detailed in "Analysis approach" section. As before, we analyse these trends in the first six months of 2020 (for both COVID-19 and non-COVID-19 related papers) and the first six months of the previous four years.
Top publishing countries
We focus on the countries with the highest number of COVID-19 related publications and compare them to the top publishing countries of non-COVID-19 papers in the first six months of 2020, as well as to the historical data from the previous four years. In Table 5 we report the comparison between COVID-19 and non-COVID-19 publications for the first six months of 2020. A longitudinal perspective over the last five years is presented in Fig. 5.
As can be seen from Fig. 5, while the top publishing countries have remained almost the same over the last five years, a consistent growth in the number of papers is evident. This is consistent with our previous findings, showing the total growth of publications over the years and most significantly in 2020. It is interesting to note that despite the fact that the same countries are top publishing countries throughout the years regardless of the pandemic, Italy is the anomaly as it has never ranked in the top five before but during the pandemic has played a major part in COVID-19 research, as can be seen from its ranking as 5th in COVID-19 related publications. A similar pattern is displayed by Brazil and Hong Kong. Both are in the top 10 publishing countries for COVID-19 papers but with an average world ranking of 14 and 34 in the SJR country ranking, respectively. This can be explained by the large outbreak of the pandemic in these three countries during the examined months.
Diversity of international collaborations
Recall from "Analysis approach" section that we measure diversity as the number of countries with which each country has collaborated over a given time period. We first analyse the impact of the COVID-19 pandemic on international collaborations at the time of the pandemic and then proceed to analyse how international collaboration has changed in diversity over the last five years.
Figure 6 displays a comparison of COVID-19 and non-COVID-19 collaborations in 2020. For each country we observed the number of countries with which it had collaborated for COVID-19 related papers and for non-COVID-19 related papers separately. This analysis was conducted separately for each of the months February-June of 2020 (see footnote 8) and also for the first six months of 2020 as a whole. The mean number of collaborating countries was calculated and a t-test was performed for each of these periods. The results are shown in Table 6. For all periods tested, except for the month of May, the mean number of collaborating countries in non-COVID-19 papers was statistically significantly greater than the mean number of collaborating countries for COVID-19 papers, with \(p<0.05\).
Following the short-term analysis, we conducted a longitudinal one observing the years 2016–2020. Figure 7 presents the international collaboration diversity in this perspective. For each country we observed the aggregated number of countries with which it had collaborated in the first six months of each year (repeated countries were removed). Similar to the trend we saw for top publishing countries, the top collaborating countries have remained almost the same over the last five years and a consistent growth in the number of countries with which each of these top countries collaborated is evident. We performed pairwise t-tests for each two consecutive years in our study. The test included all countries which collaborated with at least one other country in the first six months of the years 2016–2020. Results are shown in Table 7. The results show that for all examined periods, the number of collaborating countries is steadily increasing. The differences are statistically significant from the 2017–2018 period onward, \(p<0.05\). In addition, we performed t-tests comparing international collaboration on 2019 papers with that on 2020 COVID-19 papers, as well as on 2019 papers with that on 2020 non-COVID-19 papers. While both showed a statistically significant difference (\(p<0.05\) in both cases), the direction is reversed: the mean number of collaborating countries for 2020 COVID-19 papers is smaller than that in 2019, whereas the mean number of collaborating countries for 2020 non-COVID-19 papers is larger than that in 2019.
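The year-over-year comparison described above could be sketched as follows. Treating the "pairwise" tests as paired t-tests over the same set of countries is an assumption, since the text does not specify the variant; the function name `paired_t` and the yearly values are likewise hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return the paired t statistic and degrees of freedom for matched samples."""
    diffs = [b - a for a, b in zip(before, after)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Hypothetical diversity values for the same four countries in each year.
yearly = {
    2018: [10, 20, 15, 8],
    2019: [12, 23, 15, 11],
    2020: [15, 27, 19, 13],
}
years = sorted(yearly)
# One test per pair of consecutive years, e.g. (2018, 2019) and (2019, 2020).
results = {
    (y0, y1): paired_t(yearly[y0], yearly[y1])
    for y0, y1 in zip(years, years[1:])
}
```

A positive statistic for a pair of years indicates that, on average, countries collaborated with more partners in the later year.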
An increase in collaboration diversity over the years has been demonstrated in previous works as well (Leydesdorff and Wagner 2008; Wagner and Leydesdorff 2005). The results presented here complement these findings by showing that the year 2020 has experienced the largest increase in the number of collaborating countries. However, the findings from the comparison of 2019 to the 2020 COVID-19 papers along with the findings when comparing COVID-19 and non-COVID-19 collaborations (as shown in Table 6) show that collaboration in COVID-19 papers is, surprisingly, low. Specifically, it is lower when compared to non-COVID-19 papers, lower when compared to collaboration in the past and lower than what we would expect during a pandemic, where collaboration is of increased significance.
Volume of international collaboration
Recall from the "Analysis approach" section that we measure "volume" as the number of publications which each country has published in collaboration with other countries over a given time period. For this analysis we performed two statistical tests for both the short-term COVID-19 pandemic months and for the long-term 2016–2020 period: \(\chi ^2\) test (Pearson 1900) and t-test.
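The volume measure described above can be sketched in a few lines. As before, the per-paper list-of-countries input, the function name `collaboration_volume` and the toy records are illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter


def collaboration_volume(papers):
    """Count, per country, the papers it co-authored with at least one other country."""
    volume = Counter()
    for countries in papers:
        unique = set(countries)
        if len(unique) >= 2:        # only multi-country papers count
            volume.update(unique)   # each participating country gets +1
    return volume


# Toy records, illustrative only (not real data).
toy_papers = [
    ["US", "UK"],
    ["US", "China", "Italy"],
    ["US", "UK"],
    ["Brazil"],  # single-country paper, not counted
]
volume = collaboration_volume(toy_papers)
```

Unlike diversity, a repeated partnership raises the count here, since every collaborative paper is tallied.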
Figure 8 displays the countries with the highest number of papers written in international collaboration for COVID-19 papers compared with non-COVID-19 papers in the first six months of 2020. Figure 9 displays the countries with the highest number of papers written in international collaboration over the last five years. We can observe two interesting collaboration patterns from these figures. The first is that the number of papers written in collaboration has continually increased over the last five years, for the top collaborating countries. This is consistent with our findings of growth in total publications over the last five years, as can be seen in Fig. 2. The second finding that can be observed is that although the US and the UK have remained the top two collaborating countries, when observing collaboration for COVID-19 publications, China is extremely collaborative and Italy is in the top 5 collaborating countries. This can be explained by the pandemic originating in China and its wide spread in Italy.
For the \(\chi ^2\) test we define two extreme cases: papers written with no collaboration at all, and papers written in any form of collaboration, meaning at least two countries collaborated in authorship of that paper. Thus we compare single country authored papers to multi-country authored papers. The test was performed for each of the months February–June of 2020 (footnote 9) separately and as a sum over the first six months of 2020. In each time period we examined how many papers were authored by a single country (all authors from the same country) and how many papers were authored in collaboration with other countries. This was done for COVID-19 papers and for non-COVID-19 papers. The results are shown in Table 8. Based on the \(\chi ^2\) statistic and p values we can conclude that for the months of April, May and June 2020 and for the total over the six months, the type of paper (i.e., COVID-19 or non-COVID-19 related) is significantly associated with the authorship by a single country or multiple countries. Specifically, authorship by a single country is indicative of COVID-19 related papers while co-authorship by multiple countries is indicative of non-COVID-19 related papers.
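The single-country versus multi-country comparison described above amounts to a test of independence on a 2×2 contingency table. The following is a minimal sketch of Pearson's \(\chi ^2\) statistic for such a table, without a continuity correction (whether the authors applied one is not stated). The function name `chi_square_2x2` and the counts are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns the statistic and its p value (1 degree of freedom).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x/2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (not real data).
# Rows: COVID-19, non-COVID-19; columns: single-country, multi-country.
toy_table = [[300, 100], [500, 300]]
chi2, p = chi_square_2x2(toy_table)
```

A small p value in this sketch would indicate that paper type and authorship type are associated; the direction of the association is read from the observed versus expected counts in each cell.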
Additional \(\chi ^2\) tests were performed to analyse the longitudinal aspect of our study. Specifically, has the pandemic affected the number of papers internationally co-authored compared to previous years? In this test we measured the mean number of papers authored by single countries and the mean number of papers authored by multiple countries in the first six months of each two consecutive years in our study, comparing 2016–2017, 2017–2018, 2018–2019, 2019–2020. This can be seen in Table 9. Our results for the longitudinal \(\chi ^2\) test are statistically significant at \(p<0.05\) in two specific cases: when comparing international co-authorship for 2019 papers to 2020 COVID-19 papers and when comparing authorship for 2019 papers to 2020 non-COVID-19 papers. Taken jointly, the results from the short-term and the longitudinal \(\chi ^2\) tests indicate that for COVID-19 related papers, the volume of papers co-authored by multiple countries is low, both in comparison to non-COVID-19 related papers and to previous years’ international collaboration behaviour.
In order to better understand the findings from our \(\chi ^2\) tests, we performed a series of short-term analysis t-tests and a similar series of long-term analysis t-tests. In the short-term analysis we measured, for every country, the number of papers written in international collaboration. This was performed in each of the months February–June of 2020 (footnote 10) and for the total first six months of 2020 for COVID-19 related papers and for non-COVID-19 related ones. The mean number of collaborative papers was calculated and a t-test was performed for each of these periods. The results are displayed in Table 10 and show that for the examined periods of February, March and June and the total first six months of 2020, the mean number of COVID-19 collaborative papers is statistically significantly smaller than for non-COVID-19, with \(p<0.05\).
Following the short-term analysis we conducted a longitudinal one, observing the years 2016–2020. For every country, we observed the total number of papers written in international collaboration in the first six months of each year. We performed pairwise t-tests for each two consecutive years in our study. From Table 11 we can observe that a statistically significant difference exists only for the period 2019–2020. The mean number of internationally collaborated papers in 2019 is statistically significantly smaller than the mean number of internationally collaborated papers in 2020, with \(p<0.05\). However, this difference is not statistically significant when comparing 2019 collaborated papers separately to 2020 COVID-19 papers and to 2020 non-COVID-19 papers. These findings indicate that when international collaboration is measured by the volume of collaborated papers, the COVID-19 pandemic does not seem to affect the increase in international collaboration.
Conclusion and discussion
In this study we have analysed how the COVID-19 pandemic affected the publication patterns in biomedical literature. We employed two types of analyses to address each of our research questions: a short-term analysis and a longitudinal analysis of preprint servers and peer reviewed journals.
Our analysis showed a significant increase in published papers both in peer reviewed journals and in preprint servers compared to previous years. The new MedRxiv preprint server especially stands out with an exceptionally large increase in publications and we expect this preprint server to continue this pattern post-pandemic. Notably, while the increase in publication in preprint servers has occurred for both COVID-19 and non-COVID-19 related papers, this is not the case for the journals we have analysed. In these journals virtually the entire growth was due to COVID-19 papers while, on average, the volume of non-COVID-19 papers has remained similar to previous years. Our results also showed that highly ranked journals publish more COVID-19 papers than low ranked journals. It is further apparent that journals responded quickly to the pandemic by lowering the time to acceptance for COVID-19 papers. Unfortunately, this seems to have come at a cost for non-COVID-19 papers, whose time to acceptance was longer than that observed in previous years (and obviously longer than that of COVID-19 papers). While our analysis provides strong evidence in support of that conclusion, one cannot definitively rule out other “hidden” contributors which are outside the scope of this work.
Taken jointly, the non-increasing volume and longer time to acceptance of non-COVID-19 papers may lead to a slow down in non-COVID-19 related research and publication, at the very least in journal publications. In future work, we intend to extend our analysis into these findings in order to further understand if a “slow down” can be observed. While such research may not be “urgent”, it is, presumably, of no less importance than the current crisis. On the other hand, we do observe an increase in the use of preprint servers irrespective of COVID-19 publications. This may indicate that the community is recognizing the above phenomena and adjusting their publication behaviour accordingly.
Turning our focus to the countries authoring these papers, we observed the top publishing countries and international collaboration patterns. We observed that while the US and the UK have remained the top publishing and collaborating countries, Italy, Brazil and Hong Kong have produced a significant amount of COVID-19 related papers as well, disproportionate to their lower ranking with respect to the number of non-COVID-19 papers authored in these countries over the previous four years. This observation could be explained by the major impact this pandemic has had on these countries. Our results further showed consistent growth over the last five years in international collaboration when examining both collaboration diversity and volume. During the pandemic, we observed that the volume of COVID-19 related papers written in collaboration had increased compared to non-COVID-19 papers. However, contrary to our original hypothesis, international collaboration diversity in COVID-19 papers was lower than in non-COVID-19 papers and lower than in previous years. We observed that most COVID-19 papers were authored by a single country or only very few countries. This can be explained by the complexity of conducting international studies during the pandemic. However, this may also suggest that countries are doing a significant amount of COVID-19 research nationally and knowledge is officially shared only after the research has been published. This phenomenon is obviously undesirable, especially at a time of international crisis.
We recognize that the current study is limited by the amount, quality and diversity of the data used. In the context of this work, the number of COVID-19 related papers was relatively low, especially in the first three months of 2020. This could skew our findings with respect to both time to acceptance and collaboration analysis. In addition, we chose to focus on biomedical publications alone. Observing additional fields could lead to a broader trend which may not necessarily align with our results. Additional potential limitations to our study relate to our selection of data sources. We have selected only three public repositories to focus on due to the large increase of papers in these archives as well as their relevance to the selected fields. Similarly, we focused only on journals indexed by Scopus. A wider perspective on the matter could be obtained by including additional repositories as well as data from other indexing sources such as Web of Science and Microsoft Academic. While this work is, to the best of our knowledge, the most extensive one on all four accounts when discussing scholarly publications during the COVID-19 outbreak, a larger analysis may reveal additional or other trends which were not captured here.
We plan to extend this work further in several directions: First, we plan to apply more advanced analysis techniques and statistical methods such as time series and unsupervised learning methods (Madsen 2007; Han et al. 2011) and mixed effects modeling (Laird and Ware 1982) to our data. This could assist in identifying additional publication trends which were not revealed in the current study. Second, we wish to investigate the long-term effects of this pandemic on scholarly research, both in the biomedical literature as well as in other fields. Such research would complement this work by analysing (hopefully) post-pandemic changes in citation, collaboration, time to acceptance patterns and additional scholarly publication trends.
No COVID-19 related papers were published in the arXiv q-bio field in January.
Data for January was insufficient to perform this test.
Data for January was insufficient to perform this test.
As before, data for January was insufficient to perform this test.
Data for January was insufficient to perform this test.
Ahmad, M., & Batcha, M.S. (2020). Identifying and mapping the global research output on coronavirus disease: A scientometric study. Library Philosophy and Practice (e-journal) 4125.
Cai, X., Fry, C. V., & Wagner, C. S. (2021). International collaboration during the COVID-19 crisis: Autumn 2020 developments. Scientometrics, 126(4), 3683–3692.
Chahrour, M., Assi, S., Bejjani, M., Nasrallah, A. A., Salhab, H., Fares, M. Y., & Khachfe, H. H. (2020). A bibliometric analysis of COVID-19 research activity: A call for increased output. Cureus. https://doi.org/10.7759/cureus.7357.
Costa, I. C. P., Sampaio, R. S., Souza, F. A. C. D., Dias, T. K. C., Costa, B. H. S., & Chaves, E. D. C. L. (2020). Scientific production in online journals about the new coronavirus (Covid-19): Bibliometric research. Texto & Contexto - Enfermagem, 29. https://doi.org/10.1590/1980-265x-tce-2020-0235.
Da Silva, J. A. T., Tsigaris, P., & Erfanmanesh, M. (2020). Publishing volumes in major databases related to Covid-19. Scientometrics, 126(1), 831–842.
Desjardins-Proulx, P., White, E. P., Adamson, J. J., Ram, K., Poisot, T., & Gravel, D. (2013). The case for open preprints in biology. PLoS Biology, 11(5), e1001563.
Fraser, N., Brierley, L., Dey, G., Polka, J. K., Pálfy, M., & Coates, J. A. (2020). Preprinting a pandemic: The role of preprints in the COVID-19 pandemic. BioRxiv, 2020.05.22.111294.
Fry, C. V., Cai, X., Zhang, Y., & Wagner, C. S. (2020). Consolidation in a crisis: Patterns of international collaboration in early covid-19 research. PloS One, 15(7), e0236307.
Gianola, S., Jesus, T. S., Bargeri, S., & Castellini, G. (2020). Characteristics of academic publications, preprints, and registered clinical trials on the covid-19 pandemic. PloS One, 15(10), e0240123.
González-Pereira, B., Guerrero-Bote, V. P., & Moya-Anegón, F. (2010). A new approach to the metric of journals’ scientific prestige: The SJR indicator. Journal of Informetrics, 4(3), 379–391.
Han, J., Pei, J., & Kamber, M. (2011). Data mining: Concepts and techniques. Amsterdam: Elsevier.
Homolak, J., Kodvanj, I., & Virag, D. (2020). Preliminary analysis of covid-19 academic information patterns: a call for open science in the times of closed borders. Scientometrics, 124(3), 2687–2701.
Johansson, M. A., Reich, N. G., Meyers, L. A., & Lipsitch, M. (2018). Preprints: An underutilized mechanism to accelerate outbreak science. PLoS Medicine, 15(4), e1002549.
Kagan, D., Moran-Gilad, J., & Fire, M. (2020). Scientometric trends for coronaviruses and other emerging viral infections. GigaScience, 9(8). https://doi.org/10.1093/gigascience/giaa085.
Krumholz, H.M., Bloom, T., & Ross, J.S. (2020). Preprints can fill a void in times of rapidly changing science. STAT 31.
Kun, Á. (2020). Time to acceptance of 3 days for papers about covid-19. Publications, 8(2), 30.
Laird, N. M., & Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, 38(4), 963. https://doi.org/10.2307/2529876.
Lee, J. J., & Haupt, J. P. (2020). Scientific globalism during a global crisis: research collaboration and open access publications on COVID-19. Higher Education, 81(5), 949–966. https://doi.org/10.1007/s10734-020-00589-0.
Leydesdorff, L., & Wagner, C. S. (2008). International collaboration in science and the formation of a core group. Journal of Informetrics, 2(4), 317–325.
Lou, J., Tian, S. J., Niu, S. M., Kang, X. Q., Lian, H. X., Zhang, L. X., & Zhang, J. J. (2020). Coronavirus disease 2019: A bibliometric analysis and review. Eur Rev Med Pharmacol Sci, 24(6), 3411–21.
Madsen, H. (2007). Time series analysis. Florida: CRC Press.
Malik, A. A., Butt, N. S., Bashir, M. A., & Gilani, S. A. (2021). A scientometric analysis on coronaviruses research (1900–2020): Time for a continuous, cooperative and global approach. Journal of Infection and Public Health, 14(3), 311–319. https://doi.org/10.1016/j.jiph.2020.12.008.
Mao, X., Guo, L., Fu, P., & Xiang, C. (2020). The status and trends of coronavirus research: A global bibliometric and visualized analysis. Medicine, 99(22), e20137.
Maslove, D. M. (2018). Medical preprints: A debate worth having. JAMA, 319(5), 443–444.
Moore, K. A., Lipsitch, M., Barry, J. M., & Osterholm, M. T. (2020). COVID‐19: the CIDRAP viewpoint: part 1: the future of the COVID-19 pandemic: lessons learned from pandemic influenza. In Center for Infectious Disease Research and Policy, University of Minnesota April 30th.
Pearson, K. (1900). X. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 50(302), 157–175.
Tao, Z., Zhou, S., Yao, R., Wen, K., Da, W., Meng, Y., Yang, K., Liu, H., & Tao, L. (2020). COVID-19 will stimulate a new coronavirus research breakthrough: a 20-year bibliometric analysis. Annals of Translational Medicine, 8(8), 528. https://doi.org/10.21037/atm.2020.04.26.
Torres-Salinas, D. (2020). Daily growth rate of scientific production on covid-19. analysis in databases and open access repositories. arXiv preprint arXiv:2004.06721.
Vasconcelos, G. L., Cordeiro, L. P., Duarte-Filho, G. C., & Brum, A. A. (2021). Modeling the epidemic growth of preprints on COVID-19 and SARS-CoV-2. Frontiers in Physics, 9. https://doi.org/10.3389/fphy.2021.603502.
Wagner, C. S., & Leydesdorff, L. (2005). Network structure, self-organization, and the growth of international collaboration in science. Research Policy, 34(10), 1608–1618.
World Health Organization. (2020). Coronavirus disease 2019 (covid-19): situation report, 101.
Zhai, F., Zhai, Y., Cong, C., Song, T., Xiang, R., Feng, T., et al. (2020). Research progress of coronavirus based on bibliometric analysis. International Journal of Environmental Research and Public Health, 17(11), 3766.
Zhang, L., Zhao, W., Sun, B., Huang, Y., & Glänzel, W. (2020). How scientific research reacts to international public health emergencies: A global analysis of response patterns. Scientometrics, 124(1), 747–773. https://doi.org/10.1007/s11192-020-03531-4.
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Aviv-Reuven, S., Rosenfeld, A. Publication patterns’ changes due to the COVID-19 pandemic: a longitudinal and short-term scientometric analysis. Scientometrics 126, 6761–6784 (2021). https://doi.org/10.1007/s11192-021-04059-x