Background

Randomised controlled trials (RCTs) are considered one of the simplest and most powerful tools for assessing the safety and effectiveness of treatment interventions [1,2,3]. When appropriately designed, conducted and reported, RCTs can produce an immediate impact on clinical practice and patient care [4].

The evolution of RCTs has been an enduring and continuing process [5,6,7,8,9,10,11,12,13,14,15]. Since the 1970s, the publication output for RCTs has grown exponentially. For example, a bibliometric analysis of the literature from 1965 to 2001 identified 369 articles published in 1970 compared to 11,159 published in 2000 [5]. The development of clinical trial registries (such as clinicaltrials.gov) [9, 10], the exponential increase in journals publishing trial protocols, results and secondary studies, and growing support for data-sharing policies [11, 12] have created an open research environment of transparency and accountability. Furthermore, the publication of reporting guidelines (such as CONSORT and SPIRIT) [4, 13,14,15] has served to facilitate the transition between research and reporting to ensure standardisation and ease of readability.
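To give a sense of scale for the two counts cited from reference [5], the following minimal Python sketch computes the average annual growth rate they imply. It assumes a constant growth rate between 1970 and 2000, which is a simplification made only for illustration; the function name is hypothetical.

```python
def annual_growth_rate(count_start, count_end, years):
    """Average annual growth rate implied by two counts, assuming constant growth."""
    if count_start <= 0 or years <= 0:
        raise ValueError("count_start and years must be positive")
    return (count_end / count_start) ** (1 / years) - 1


# Counts reported in reference [5]: 369 articles in 1970 and 11,159 in 2000.
rate = annual_growth_rate(369, 11_159, 2000 - 1970)
print(f"Implied average annual growth: {rate:.1%}")
```

Under this assumption the two counts correspond to growth of roughly 12% per year.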

RCTs published in major medical journals are highly cited and have an instrumental role in clinical practice and health policy decisions [5, 16, 17]. Previous studies have focused on the quality of the reporting of methods and results of RCTs [18,19,20,21,22] and publication practices [23,24,25,26,27,28] in selected samples of articles published in high-impact-factor (IF) medical journals. However, to the best of our knowledge, no mapping studies have been conducted on major medical journals to investigate the most common subjects, most productive scientists and countries, most prolific journals and “citation classics” across multiple specialties.

The objective of this study was to describe and characterise the global clinical research publication activity through RCT articles published in high-IF medical journals during the past decades.

Methods

Eligibility criteria

This cross-sectional analysis investigated RCT-related articles (that is, primary RCTs, secondary analyses and methodology papers using clinical data) published in major medical journals. We excluded narrative reviews, systematic reviews, meta-analyses, pooled analyses, letters and newspaper articles. To be eligible, RCT-related articles indexed in PubMed/MEDLINE had to be published in a major medical journal with an IF exceeding 10 (2016 IF according to the Journal Citation Reports [JCR] published in June 2017). These medical journals were chosen because they were identified as publishing clinical research with scientific merit and clinical relevance (see Table 1 for a list of the included medical journals).

Table 1 Included high-impact-factor medical journals

Search

On March 22, 2018, we systematically searched MEDLINE through PubMed (National Library of Medicine, Bethesda, MD, United States) for all RCT-related articles published in high-IF medical journals (from inception to December 31, 2017). A senior information specialist (AA-A) and a clinical epidemiologist (FC-L) designed an electronic literature search using a validated research methodology filter for RCTs (with 97% specificity and 93% sensitivity) [29]. The search was peer reviewed by members of the study team, including a second (senior) information specialist (RA-B). The full search strategy is provided in Additional file 1. On May 7, 2018, we searched the Web of Science (WoS) (Clarivate Analytics, Philadelphia, Penn., United States) by using PubMed IDs (PMIDs) from the PubMed/MEDLINE searches. Merging MEDLINE with other citation indices such as the WoS combines the advantages of MEDLINE (e.g., Medical Subject Headings [MeSH], a comprehensive controlled vocabulary for indexing journal articles) with the relational capabilities and data of the WoS [30].
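The linkage between the two sources can be pictured as a join on the shared PMID. The following is a minimal Python sketch under stated assumptions: the field names (`pmid`, `mesh`, `times_cited`) and the sample records are hypothetical and do not reflect the export formats actually used in this study.

```python
def merge_by_pmid(pubmed_records, wos_records):
    """Inner join of two lists of record dicts on the 'pmid' key."""
    wos_index = {rec["pmid"]: rec for rec in wos_records}
    merged = []
    for rec in pubmed_records:
        match = wos_index.get(rec["pmid"])
        if match is not None:
            # Combine MEDLINE fields (e.g. MeSH terms) with WoS fields (e.g. citations).
            merged.append({**rec, **match})
    return merged


# Hypothetical records for illustration only.
pubmed = [
    {"pmid": "1001", "mesh": ["Diabetes Mellitus"]},
    {"pmid": "1002", "mesh": ["Neoplasms"]},
]
wos = [
    {"pmid": "1001", "times_cited": 250},
]
linked = merge_by_pmid(pubmed, wos)
```

Records without a counterpart in the WoS (here, the hypothetical PMID 1002) are dropped by the inner join.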

Data extraction and normalisation

For each included article, raw (meta) data on the journal and article titles, subject category, the year of publication, keywords, and the authors’ names, institutional affiliation(s), funding source, and country was downloaded online through the WoS by one researcher (AA-A). We also used the WoS to determine the extent to which each article had been cited in the scientific peer-review literature using the “times cited” number (that is, the number of times a publication has been cited by other publications). Two researchers (FC-L, RA-B) independently verified the data to minimise potential information errors. A process of normalisation was conducted by two researchers to bring together the different names of an author or country and the keywords (further details are available in Additional file 2). Specifically, one researcher (AA-A) checked the names under which an individual author appeared in two or more different forms (for example, “John McMurray” or “John J. McMurray” or “John J.V. McMurray”) using coincidence in that author’s place(s) of work as the basic criterion for normalisation (for example, University of Glasgow, Scotland, United Kingdom) [31], and a second researcher (FC-L or RA-B) verified the data. Applying a threshold of 30 articles, we reviewed 200 names under which an individual author appeared in two or more different forms.
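The name normalisation in this study was checked manually. As a rough illustration of the underlying criterion (same name in different forms, same place of work), the following Python sketch flags candidate variants for human review. The matching key (surname plus first initial), the function names and the fourth sample record are assumptions made for this example.

```python
from collections import defaultdict


def name_key(full_name):
    """Reduce a name to (surname, first initial), ignoring middle initials."""
    parts = full_name.replace(".", " ").split()
    return (parts[-1].lower(), parts[0][0].lower())


def group_variants(records):
    """Group (name, affiliation) pairs sharing a name key and an affiliation."""
    groups = defaultdict(set)
    for name, affiliation in records:
        groups[(name_key(name), affiliation.lower())].add(name)
    # Keep only groups with more than one written form of the name.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


records = [
    ("John McMurray", "University of Glasgow"),
    ("John J. McMurray", "University of Glasgow"),
    ("John J.V. McMurray", "University of Glasgow"),
    ("Jane McMurray", "University of Oxford"),  # hypothetical record
]
candidates = group_variants(records)
```

A key this coarse will also group distinct people who share a surname, initial and institution, which is why a sketch like this can only propose candidates and a second reviewer remains necessary.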

We extracted both “author keywords” and “keyword plus” terms (the latter are automatically assigned by the WoS from the titles of the references of the articles) as topical (also called textual, linguistic or semantic) data [32]. To ensure consistency in the data, one researcher (RA-B) corrected keywords by unifying grammatical variants and using a single keyword for each concept (for example, “randomized trial” or “randomized clinical trial” or “randomized controlled trial” or “randomised controlled trial”). In addition, the same researcher (RA-B) removed typographical, transcription and/or indexing errors, and a second researcher (FC-L) verified the data. All potential discrepancies were resolved via consensus amongst these investigators. All these data were collected and entered into a Microsoft Access® (Microsoft, Redmond, WA, United States) database between May 7, 2018, and January 9, 2019.
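Keyword unification of this kind amounts to mapping each variant to one canonical form. The following minimal Python sketch uses the variants quoted above; the choice of “randomised controlled trial” as the canonical form and the lookup-table approach are assumptions for illustration, not a description of the study database.

```python
# Variant -> canonical keyword (the canonical form chosen here is an assumption).
CANONICAL = {
    "randomized trial": "randomised controlled trial",
    "randomized clinical trial": "randomised controlled trial",
    "randomized controlled trial": "randomised controlled trial",
}


def normalise_keyword(raw):
    """Lower-case, collapse whitespace and map known variants to one form."""
    cleaned = " ".join(raw.lower().split())
    return CANONICAL.get(cleaned, cleaned)


keywords = ["Randomized Trial", "randomized  clinical trial", "Chemotherapy"]
normalised = [normalise_keyword(k) for k in keywords]
```

Keywords that are not in the table pass through unchanged apart from case and spacing, so unmapped variants stay visible for manual review.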

Data analysis

We analysed data for the number of articles, citations, signatures (that is, the total number of authors included in all the articles of each author), collaboration index (that is, the mean number of authors’ signatures per article), countries, journals and keywords. Data were summarised as frequencies and percentages for the categorical items. The most prolific authors (>100 articles), countries (>100 articles), funding institutions (>100 articles), and the most cited papers (“top-100 citation classics”) were identified. Network plots were generated for intense scientific collaboration between countries (applying a threshold of 100 articles in collaboration).
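As a small worked example of the collaboration index defined above (mean number of author signatures per article), the following Python sketch uses hypothetical author lists; the function names and data are illustrative only.

```python
from collections import Counter


def collaboration_index(articles):
    """Mean number of author signatures per article."""
    if not articles:
        raise ValueError("at least one article is required")
    return sum(len(authors) for authors in articles) / len(articles)


def articles_per_author(articles):
    """Number of articles signed by each author (counted once per article)."""
    counts = Counter()
    for authors in articles:
        counts.update(set(authors))
    return counts


# Hypothetical author lists: three articles with 3, 2 and 1 signatures.
articles = [["A", "B", "C"], ["A", "D"], ["B"]]
index = collaboration_index(articles)
productivity = articles_per_author(articles)
```

With these three hypothetical articles there are six signatures in total, so the collaboration index is 2.0.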

We conducted an exploratory analysis of topical data using a set of unique keywords and their frequencies to examine the topic coverage, major topics (“word clouds” of keywords) and their interrelations (“co-words networks”) in RCT articles. The main goal of topical analysis is to understand the topical distribution of a dataset, that is, which topics are covered and how much of each topic is covered in a scientific discipline [32]. The most frequently used keywords were identified for the most prolific journals (with at least 1000 articles). A word cloud was created from the most frequently used keywords (with at least 500 articles), with greater emphasis placed on words that appeared with greater frequency. A “co-words network” was created to illustrate the co-occurrence of highly frequent words in the articles (applying a threshold of 100 articles in which the keywords co-occurred). The network analysis was carried out with PAJEK (University of Ljubljana, Slovenia) [33], a software package for constructing and analysing large network graphs that is free for non-commercial use. The PRISMA checklist [34] (http://www.prisma-statement.org/) guided the reporting of the present analysis (and is available in Additional file 3).
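The idea behind a co-words network can be sketched in a few lines of Python: count how many articles contain each pair of keywords and keep the pairs that reach a threshold. This is an illustrative sketch only; it does not use PAJEK, and the function name, sample keyword lists and threshold of 2 are assumptions chosen to keep the example small.

```python
from collections import Counter
from itertools import combinations


def coword_edges(keyword_lists, min_cooccurrence):
    """Count keyword pairs per article and keep pairs at or above the threshold."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        # Sort the unique keywords so each pair has one canonical ordering.
        unique = sorted(set(keywords))
        pair_counts.update(combinations(unique, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_cooccurrence}


# Hypothetical keyword lists for three articles.
articles = [
    ["clinical trial", "therapy", "efficacy"],
    ["clinical trial", "therapy"],
    ["therapy", "risk"],
]
edges = coword_edges(articles, min_cooccurrence=2)
```

Each retained pair becomes an edge whose weight is the number of articles in which both keywords appear; the same pair-counting logic would apply to countries that co-sign an article.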

Results

A total of 39,329 records were identified by the PubMed/MEDLINE search (Fig. 1), and 39,305 articles met the study inclusion criteria (Additional file 4) after 24 records had been excluded (Additional file 5). Table 2 details the general characteristics of the articles.

Fig. 1 Flow diagram with selection of articles

Table 2 General characteristics of the study sample

Publication trend

The number of articles increased exponentially over the period 1965–2017 (Fig. 2). Approximately 60% (n = 23,635) of the articles have been published since 2000.

Fig. 2 Number of articles by year of publication

Journals and subject category

Forty journals published 39,305 articles, and 23.8% of them (n = 9355) were published by four journals with an IF > 30. The Lancet (9.1%; n = 3593), the Journal of Clinical Oncology (8.5%; n = 3343) and The New England Journal of Medicine (8.3%; n = 3275) published the largest number of articles, followed by The BMJ (6.4%; n = 2516) and Circulation (5.9%; n = 2331). Most articles were classified as “medicine, general & internal” (30.7%; n = 13,688); “cardiac & cardiovascular systems” (13.1%; n = 5828); or “oncology” (12.9%; n = 5760) according to the WoS journal categorisations (Table 2).

Authors, institutions and countries

Most articles (62.3%; n = 24,496) were written by seven or more authors, and only 11.4% (n = 4469) of the articles were written by three or fewer authors. The first authors of the articles were based most commonly in North America and Western Europe; first authors from the United States were responsible for 36.9% (n = 14,508) of the articles (Table 2). We identified 17 authors who published 100 or more articles (Table 3). All of the most productive authors were male. The most prolific authors were Robert M. Califf, with 239 articles (from Duke University, United States); Eugene Braunwald, with 218 (from Harvard University, United States); Salim Yusuf, with 217 (from McMaster University, Canada); Eric J. Topol, with 212 (from Scripps Translational Science Institute, United States); Harvey D. White, with 186 (from University of Auckland, New Zealand); Lars Wallentin, with 144 (Uppsala University, Sweden); and Christopher B. Granger, with 140 (from Duke University, United States).

Table 3 Most productive authors and their institutions

Overall, 154 countries worldwide contributed to the analysed articles. The publication productivity ranking for countries (Table 4) was led by the United States (n = 18,393 articles, with 3.4 million citations), followed by the United Kingdom (n = 8028 articles, with 1.3 million citations), Canada (n = 4548 articles, with 1.0 million citations) and Germany (n = 4415 articles, with 0.9 million citations). A total of 37 countries had at least 100 articles in co-authorship. Figure 3 shows a visual representation of the most intense collaborative network between these 37 countries, in which we can see the relationships of some countries with respect to others and the position that each occupies in the network.

Table 4 Productivity and patterns of collaboration by top countries

Fig. 3 Global collaborative network between countries. Note: Most productive cluster of countries applying a threshold of 100 or more papers signed in co-authorship. Node sizes are proportional to the number of papers, and line thicknesses are proportional to the number of collaborations. Node colours: America = red; Asia = yellow; Africa = green; Europe = blue; Oceania = purple

Funding source

A total of 16,485 articles (41.9%) reported sources of funding. The 40 most frequent funding institutions (with 100 or more articles) are listed in Table 5. The main funders were the National Institutes of Health (NIH; n = 7422), followed by Hoffmann-La Roche (n = 1188), Pfizer (n = 1139), Merck Sharp & Dohme (n = 1097) and Novartis (n = 1052).

Table 5 Most frequent funding institutions

Most cited articles

Overall, the included articles received 5.9 million citations. Of these, 83.1% (n = 4,950,604) corresponded to the 15,142 articles (38.5%) with more than 100 citations. In addition, the 641 articles (1.63%) with more than 1000 citations accounted for 20.7% of the total citations (n = 1,234,462). The 100 most cited articles (“top-100 citation classics”) are listed in Table 6. All of the most cited papers were published in English. These most cited articles were published in nine journals, led by The New England Journal of Medicine, with 78 articles, followed by The Lancet (n = 9) and JAMA (n = 7). The list of most cited papers contained innovative research methodologies. For example, the most cited article was a method paper published in The Lancet (“Bland-Altman method”) [35]. This seminal paper changed how method comparison studies are performed in clinical research. The list of the most cited papers also reflected important studies examining the health effects of pharmacological interventions on patients with chronic diseases. Common themes in major advances in health interventions included diabetes control [36,37,38,39,40,41]; the effects of hormone replacement therapy in postmenopausal women [42, 43]; therapies for diverse cancers such as glioblastoma, colorectal cancer, breast cancer, melanoma and hepatocellular carcinoma [44,45,46,47,48,49,50]; and important interventional studies in the field of clinical cardiology, such as lipid-lowering statin therapy trials, antihypertensive trials, and antiplatelet and/or antithrombotic trials [51,52,53,54,55,56,57,58,59,60,61,62,63].
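The concentration figures above pair a share of articles with a share of citations for a given citation threshold. A minimal Python sketch of that calculation follows; the function name and the citation counts are hypothetical and are not the study data.

```python
def citation_concentration(citation_counts, threshold):
    """Return (share of articles, share of citations) for articles above threshold."""
    total = sum(citation_counts)
    if not citation_counts or total == 0:
        raise ValueError("citation counts must be non-empty with a positive total")
    above = [c for c in citation_counts if c > threshold]
    return len(above) / len(citation_counts), sum(above) / total


# Hypothetical "times cited" values for seven articles (total = 2000).
counts = [1500, 300, 120, 40, 25, 10, 5]
article_share, citation_share = citation_concentration(counts, threshold=100)
```

In this hypothetical sample, three of the seven articles exceed 100 citations and together they hold 1920 of the 2000 citations, the same kind of skew that the study reports at much larger scale.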

Table 6 Most cited articles

Common keywords

The most commonly used article keywords were “clinical trial” (16.1%; n = 6332 papers), followed by “therapy” (10.8%; n = 4267), “randomised controlled trial” (6.6%; n = 2587), “chemotherapy” (5.6%; n = 2224), “risk” (5.1%; n = 2026), “efficacy” (4.9%; n = 1933) and “double-blind” (4.9%; n = 1929). The most frequently used keywords in the most prolific journals are shown in Table 7. In addition, exploratory analyses of word clouds and networks based on keywords (co-words) showed the broad range of the topics covered (see Additional file 6).

Table 7 Most prolific journals and most commonly used keywords per journal

Discussion

In this cross-sectional analysis, we presented a global mapping of RCT-related articles published in high-IF medical journals for the period 1965–2017. We identified the most prolific scientists, institutions and countries, main funding sources, most common subjects and topics, “citation classics” and most prolific high-IF medical journals from multiple specialties over the last 50 years.

In general, we found a strong clustering of articles published in British and American medical journals (The Lancet, Journal of Clinical Oncology, The New England Journal of Medicine, The BMJ, Circulation, JAMA, JACC and Diabetes Care accounted for 53% of the RCT-related articles). Many of these journals have been developed by active medical associations, both nationally and internationally. We hypothesize that different publishing patterns between journals may reflect editorial policies and/or preferences, with some general medicine journals (such as The Lancet and The New England Journal of Medicine) and specialty journals (such as Journal of Clinical Oncology and Circulation) being more interested in and/or promoting the publication of RCTs. However, a substantial number of these articles are behind publication paywalls (very few of the medical journals in our study sample are Open Access), and thus, research results may not be accessible to a large fraction of the scientific community and society as a whole, including clinicians (and patients) who may want them to help inform their clinical practice.

The results of this study highlight the expanding collaborative networks between countries in multiple regions, revealing a discernible scientific community, with the most productive countries having an important number of collaborations. Publication activity efforts were global during the study period, with articles from scientists and institutions in more than 150 different countries. However, the scientific community is centred on a nucleus of scientists from Western countries, with the most intense global collaborations taking place among the United States, United Kingdom and Canada. The presence and influence that these countries have on biomedical research [64,65,66] may be due to their large multi-stakeholder research partnerships, greater financial investment in clinical research, and high population of active scientists and research centres compared to other countries.

Publication activity worldwide shows that low- and middle-income countries contribute few articles to high-IF medical journals. Difficulties in healthcare, education and research systems, information access and communication, language barriers and economic and institutional instability all represent challenges (and clear disadvantages) for productivity in low- and middle-income regions. In addition, restrictions and difficulties in conducting clinical research in resource-poor situations result in the exclusion of many of these countries from the planning, conduct and publication of RCTs [67,68,69]. As might be expected, our results support previous findings that low- and middle-income countries [31, 70, 71] made minimal contributions to articles published in major medical journals. For example, a previous study [70] showed that most of the authors of original papers published in five high-impact general medical journals (The New England Journal of Medicine, The Lancet, JAMA, The BMJ and Annals of Internal Medicine) were affiliated with institutions in the same country as the journal. To address some of these problems, scientists, institutions and funders should promote collaborations (beyond historical, cultural and political factors) to share knowledge, expertise and innovative methodologies for clinical research. This may involve partnerships with Western countries to support capacity and resource development and research training.

RCT-related articles were published most often in high-IF medical journals devoted to general and internal medicine, cardiology and oncology (nearly 57% of all articles). Similarly, the lists of the most cited articles identified topics which reflect major advances in the management of chronic conditions (such as diabetes, cardiovascular disorders and cancer). The large relative productivity in general internal medicine, cardiology and oncology may be explained by the important role of randomised evidence in evaluating novel treatments and preventive strategies for these chronic diseases. In line with previous research [72,73,74,75], most of these highly cited RCTs addressed interventions for burdensome conditions that are health priorities in Western countries [76, 77]. Funding of (international, collaborative) RCTs may come from varying sources including commercial and non-commercial sponsors. However, previous analyses of RCT-related articles published in high-IF journals have suggested that study sponsors may influence how RCTs are designed, conducted and reported, sometimes serving financial rather than public interests [78]. Given that research funding is often restricted, the scientific community is responsible for using the available resources most efficiently when exploring research priorities to address the needs of knowledge users and population health [76, 77, 79, 80].

Our findings suggest that women are vastly underrepresented in the group of most prolific scientists publishing in high-impact medical journals. This is consistent with recent studies that have identified a gender gap in research publications [81,82,83,84]. For example, a previous study [84] showed that women in first authorship positions increased from 27% in 1994 to 37% in 2014 in leading medical journals (including Annals of Internal Medicine, JAMA Internal Medicine, The BMJ, JAMA, The Lancet and The New England Journal of Medicine), but progress has plateaued or declined since 2009. An urgent need exists to investigate the underlying causes of this gender gap to help identify publication practices and strategies to increase women’s influence [82, 84].

Several limitations exist in our study. First, we characterised the knowledge structures generated by a large number of articles published in major medical journals that are included in the WoS database. However, our results are limited to a subset of all clinical-trial-related articles published in 40 leading medical journals. We suspect that these articles represent those that have great implications for clinical practice and that are relevant to clinical practice guidelines and healthcare regulators. Although the publication production analysed was drawn from an exhaustive analysis of the biomedical literature, the search may have missed some relevant articles (and journals). Some reports may be published in journals without being indexed as RCTs, making them difficult to identify. Second, as in many bibliometric analyses, the normalisation of the different names of authors, countries and funding sources is fundamentally important to avoid potential errors. We conducted a careful manual validation of the references and textual data to avoid typographical, transcription and/or indexing errors. However, we recognize that this procedure does not assure complete certainty. Third, the affiliation addresses of authors do not necessarily reflect the country where the research was conducted or the research funding source. Fourth, topical analysis that extracts a set of unique keywords, word profiles and co-words may indicate intellectual organization in publication production, albeit with inherent limitations [85, 86]. Fifth, the use of citation analysis carries some problems [87,88,89,90,91]. A potential length time-effect bias exists, which puts the more recent articles at a disadvantage. In addition, publication and citation preferences in the biomedical literature are shaped by many barriers and motivations [87], including self-citation (bias towards one’s own work) [88], language bias (bias towards publishing and citing English articles), omission bias (bias whereby competitors are purposely not cited), and selective reporting and publication bias (bias in which “negative” results are withheld from publication and citation) [89,90,91,92]. Moreover, citations are treated as equal regardless of whether research is being cited for its positive contribution to the field or because it is being criticized. Finally, our methods represent only a mapping approach, which could be complemented further by more detailed analyses such as by examining the content (e.g. differences in journal or author characteristics between publicly funded and industry-funded studies, designs/methodology, etc.), the reporting and the reproducible research practices through research of research (“meta-research”) studies [92,93,94,95,96,97,98].

Conclusion

The global analysis presented in this study provides evidence of the scientific growth of RCT-related articles published in high-IF medical journals. Over the last 50 years, publication activity in leading medical journals has increased, with Western countries (most notably, the United States) leading but with low- and middle-income countries showing very limited representation. Our analysis contributes to a better conceptualization and understanding of RCT articles and identifies the main areas of research, the most influential publication sources chosen for their scientific dissemination and the major scientific leaders. Given the dynamic nature of the field, it will be interesting to see whether the growth trend remains the same in the coming years and how the characteristics of the field change over time.