Background

Randomized controlled trials (RCTs) are considered the gold standard for evidence-based medicine because they are designed to minimize the risk of bias [1]. However, the applicability of their results has been criticized because of restrictive selection criteria, which commonly exclude older adults and people with co-morbidities or severe disease [2–4]. Also, conducting an RCT is sometimes impossible or inappropriate (eg, when studying rare or long-term events) [1, 3, 5], which results in critical information gaps.

In contrast, observational studies, the overarching term for all non-experimental non-randomized studies (including cohort, case–control, and cross-sectional studies) [6], generally are more likely to reflect clinical practice in real life because of their broader range of participants, longer follow-up time, and lower costs than RCTs [7–10]. With the aim of generating evidence that will guide healthcare decisions, the field of comparative effectiveness research (CER) emphasizes the need to incorporate data from observational studies to complement RCTs [8, 11–16]. A comprehensive assessment in 2009 indicated that 54 % of CER studies had an observational study design [17]. Therefore, an increasing number of systematic reviews and meta-analyses are including data from non-randomized studies to assess therapeutic interventions.

Similar to systematic reviews of RCTs, reviews including non-randomized studies are expected to follow the general recommendations for good conduct, such as retrieving all relevant studies and assessing their risk of bias. However, some elements should be adapted specifically to the inclusion of non-randomized studies because their study designs inherently differ from RCTs [7, 9, 14, 18–23]. Lacking randomization, they are likely subject to confounding bias, which results in an imbalance in prognostic factors associated with the outcome of interest that may severely compromise the validity of their results [24].

Previous methodological reviews have evaluated systematic reviews including observational studies [25–28]. However, these reviews had different objectives. One assessed the main characteristics of all systematic reviews indexed in MEDLINE in November 2004, whatever the type of included studies (ie, therapeutic, epidemiological, prognostic or diagnostic studies) [27]. Two others focused on the methods and reporting of harms in systematic reviews of adverse events [26, 28]. The last one was in the field of psychiatry and did not concern therapeutic evaluation but assessment of prevalence or association [25]. Further, none of these previous reviews evaluated the specific methodological problems raised by the inclusion of non-randomized studies.

In this study, we performed a methodological review of meta-analyses including non-randomized studies of interventions (NRSI) to evaluate key methodological components common to all meta-analyses and those specifically related to the inclusion of non-randomized studies.

Methods

Study design

This is a methodological review of meta-analyses including NRSI for therapeutic evaluation. For clarity and consistency, we refer to this article as a “methodological review”, the systematic reviews with meta-analyses included in this methodological review as “meta-analyses”, and the studies included in these meta-analyses as “studies”.

Search strategy

Our goal was not to create an exhaustive list of all meta-analyses that include NRSI but rather to identify a relatively representative sample of recently published meta-analyses that a health professional would most likely encounter when searching for meta-analyses. We therefore searched MEDLINE via PubMed because of its wide use among health professionals, combining keywords and MeSH terms for NRSI, systematic reviews, and meta-analyses (Appendix 1). The search was conducted on January 7, 2014 and restricted to the year 2013.

Eligibility criteria

To be eligible, a meta-analysis had to examine a therapeutic or preventive intervention (such as vaccines) for efficacy or safety, include data from at least one NRSI, and be published in 2013. We excluded meta-analyses that included studies without a comparison group and meta-analyses of etiological assessment. When it was difficult to distinguish an etiological from a therapeutic evaluation, we agreed to include the meta-analysis if the authors considered the inclusion of RCTs. To illustrate: a meta-analysis that investigated the association between the use of statins and risk of cancer was considered a therapeutic evaluation if the authors planned to include RCTs. Individual patient data meta-analyses were also excluded, as were non-randomized studies that conducted a meta-analysis of the literature as a secondary analysis. Finally, we did not include meta-analyses published in a language other than English or those for which the full text was not available.
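
The criteria above can be read as a single screening rule. The following Python sketch is only an illustration of that rule: the `Candidate` record and all of its field names are hypothetical and were not part of the tooling used for this review. Borderline etiological cases are assumed to have been resolved beforehand (by whether the authors considered including RCTs), so `etiological` holds the final classification.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical screening record for one candidate meta-analysis."""
    therapeutic_or_preventive: bool    # evaluates efficacy or safety of an intervention
    n_nrsi: int                        # number of included NRSI
    publication_year: int
    all_studies_have_comparator: bool  # no included study lacks a comparison group
    etiological: bool                  # final classification after resolving doubtful cases
    individual_patient_data: bool
    secondary_analysis_of_nrs: bool    # meta-analysis done as a secondary analysis of a non-randomized study
    language: str
    full_text_available: bool


def is_eligible(c: Candidate) -> bool:
    """Return True only when every eligibility criterion listed above is met."""
    return (
        c.therapeutic_or_preventive
        and not c.etiological
        and c.n_nrsi >= 1
        and c.publication_year == 2013
        and c.all_studies_have_comparator
        and not c.individual_patient_data
        and not c.secondary_analysis_of_nrs
        and c.language == "English"
        and c.full_text_available
    )
```

Expressing the criteria as one conjunction makes explicit that failing any single criterion is enough for exclusion.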

Selection of relevant meta-analyses

The selection of relevant meta-analyses was conducted in 2 steps. In the first step, one reviewer (CR) excluded clearly irrelevant studies based on the title, abstract, and full text. In the second step, a second reviewer (TF) performed the final selection, discussing all doubtful cases with a third reviewer (AD).

Data extraction

The data extraction form for this methodological review was developed from the MOOSE statement for reporting meta-analyses that include observational studies [29], the PRISMA statement for reporting systematic reviews and meta-analyses of studies evaluating healthcare interventions [30, 31], and the AMSTAR measurement tool for assessing the methodological quality of systematic reviews [32]. The data extraction form was tested by one reviewer (TF) with 10 studies before data extraction commenced.

Two reviewers (TF, CR) independently extracted all data in duplicate, resolving discrepancies with a third reviewer (AD) if necessary. The following characteristics were extracted from the full text and online appendix of each meta-analysis:

  • General characteristics: We collected whether the journal was a specialty or a general journal, the location of the corresponding author, and the medical area. We verified whether the meta-analysis was registered on the international prospective register of systematic reviews by the University of York’s Centre for Reviews and Dissemination (PROSPERO). We collected whether epidemiologists or statisticians were involved, relying on the definition given by Delgado-Rodriguez et al. [33], and assessed whether the authors reported the funding sources and declared conflicts of interest. We assessed whether the meta-analyses evaluated a pharmacological or non-pharmacological intervention. Non-pharmacological interventions were classified as surgical procedures or other interventions. We also assessed the type of studies included: only NRSI or both NRSI and RCTs.

  • Systematic review methods:

    • Search strategy: We collected how many and which electronic databases were searched, and whether the search strategy for at least one database was provided. We collected whether reference lists and journals were hand-searched and whether the authors searched for grey literature, and if yes, how: search of registries (eg, ClinicalTrials.gov), conference abstracts, or contacting experts. We assessed whether the authors restricted their searches by language.

    • Study selection and data extraction process: We assessed whether study selection and data extraction were conducted in duplicate.

    • Contact of the study authors: We noted whether it was mentioned that study authors were contacted for clarification or additional results.

    • Methodological quality/risk of bias assessment: We assessed whether methodological quality or risk of bias assessment was conducted, what tools were used, and whether the assessment was conducted in duplicate.

  • Meta-analysis methods:

    • Studies combined: We assessed the types of NRSI included. NRSI were categorized as concurrent (prospective) cohort, nonconcurrent (retrospective) cohort, case–control, or historically controlled studies according to the definition by Ioannidis et al. [21]. We also assessed whether the authors combined the results from NRSI and RCTs and whether they combined results from different types of NRSI (eg, cohort and case–control studies).

    • Meta-analysis model: We collected whether the authors used crude or adjusted estimates for NRSI and whether they used fixed- or random-effects models to pool the data. For adjusted estimates, we also collected whether the confounding factors taken into account were listed.

    • Assessment and exploration of heterogeneity: We collected whether and how the authors assessed heterogeneity and whether they conducted meta-regression, subgroup, or sensitivity analyses to explore heterogeneity.

    • Assessment of reporting bias: We collected information on whether the authors assessed reporting bias, and how.
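
Duplicate extraction of the items listed above amounts to comparing two independent records and passing the disagreements to a third reviewer. The sketch below illustrates that comparison in Python; the function name, the item names, and the example values are hypothetical and do not reproduce the actual extraction form.

```python
from typing import Any, Dict, List


def find_discrepancies(first: Dict[str, Any], second: Dict[str, Any]) -> List[str]:
    """List the items on which two independent extractions disagree.

    An item recorded by only one reviewer also counts as a discrepancy.
    """
    items = sorted(set(first) | set(second))
    missing = object()  # sentinel: a missing item never equals a recorded value
    return [k for k in items if first.get(k, missing) != second.get(k, missing)]


# Hypothetical extractions of one meta-analysis by two reviewers.
reviewer_a = {"grey_literature_searched": True, "model": "random", "n_databases": 3}
reviewer_b = {"grey_literature_searched": False, "model": "random", "n_databases": 3}

to_resolve = find_discrepancies(reviewer_a, reviewer_b)
print(to_resolve)  # ['grey_literature_searched']
```

Only the items returned by the function would need discussion with the third reviewer.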

Statistical analysis

The analysis of the data consisted of descriptive statistics, providing numbers and percentages for qualitative variables and median (minimum and maximum, or interquartile range) for quantitative variables. The results were stratified for meta-analyses including only NRSI and those including both NRSI and RCTs. We did not assess statistical differences between these strata. Statistical analysis was performed with R 3.0.2 (R Core Team [2013]. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL: http://www.R-project.org/).
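
The descriptive summaries described above can be sketched as follows. This is an illustration in Python rather than the R code actually used, which is not reproduced here; the function names, the dictionary keys, and the "inclusive" quartile method are assumptions.

```python
from statistics import median, quantiles
from typing import Dict, List, Sequence, Tuple


def count_and_percent(flags: Sequence[bool]) -> Tuple[int, int]:
    """Return (n, percentage rounded to the nearest integer) of True values."""
    n = sum(flags)
    return n, round(100 * n / len(flags))


def median_iqr(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, first quartile, third quartile)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")  # assumed quartile method
    return median(values), q1, q3


def stratified_percent(rows: List[dict], stratum_key: str, flag_key: str) -> Dict[str, Tuple[int, int]]:
    """Compute n (%) of a yes/no item separately within each stratum."""
    strata: Dict[str, List[bool]] = {}
    for row in rows:
        strata.setdefault(row[stratum_key], []).append(bool(row[flag_key]))
    return {name: count_and_percent(flags) for name, flags in strata.items()}


# 69 of the 188 included meta-analyses contained only NRSI.
only_nrsi = [True] * 69 + [False] * 119
print(count_and_percent(only_nrsi))  # (69, 37)
```

No hypothesis test appears in the sketch because differences between strata were not tested.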

Results

Study selection

Our MEDLINE search identified 3602 citations; among the 341 potentially relevant meta-analyses, 188 were eligible for this review (Fig. 1). Complete references for the included and excluded meta-analyses are in Appendixes 2 and 3, respectively.

Fig. 1 Study selection flowchart

General characteristics (Table 1)

Table 1 General characteristics of therapeutic meta-analyses published in 2013 and including non-randomized studies of intervention (n = 188)

Among the 188 included meta-analyses, 49 (26 %) concerned surgery, 33 (18 %) cardiology, and 25 (13 %) oncology. Half of the meta-analyses assessed non-pharmacological interventions (n = 92, 49 %), of which 74 involved surgical procedures. Approximately one third (n = 69, 37 %) included only NRSI, and two thirds included both NRSI and RCTs (n = 119, 63 %).

In total, 36 meta-analyses (19 %) involved epidemiologists or statisticians. Conflict of interest was declared in 166 (88 %), with 26 reporting a potential conflict of interest. About one-third of the meta-analyses did not report a source of funding (n = 69, 37 %).

Systematic review methods (Table 2)

Table 2 Systematic review methods of therapeutic meta-analyses published in 2013 and including non-randomized studies (n = 188)

Literature search

Overall, all but one of the meta-analyses reported searching at least 1 electronic database, and 147 (78 %) reported searching > 2 electronic databases. One third provided the search strategy for each database (n = 62, 33 %). MEDLINE, Embase, and the Cochrane Library were most frequently searched (187 [99 %], 149 [79 %], and 126 [67 %] meta-analyses, respectively). In addition to the search of electronic databases, 162 meta-analyses (86 %) reported screening the reference lists of included studies, and 12 (6 %) reported hand-searching journals. About one-third of the meta-analyses (n = 72, 38 %) reported searching for grey literature: 41 (22 %) searched conference abstracts, 33 (18 %) searched registries, and 15 (8 %) contacted experts. For 82 meta-analyses (44 %), the authors reported that they did not restrict their searches by language.

Methodological quality/risk of bias assessment

Methodological quality or risk of bias of included studies was assessed in 135 (72 %) meta-analyses.

For the 119 meta-analyses including RCTs and NRSI, risk of bias was assessed in 88 (74 %), with 4 assessing risk of bias for RCTs only. RCTs were assessed with the Cochrane Risk of Bias tool in 42 (35 %) meta-analyses. The assessment of risk of bias involved the same tool for both RCTs and NRSI in 27 (23 %) meta-analyses. For the assessment of NRSI, a variety of tools were used. The most frequently used tool was the Newcastle-Ottawa Scale (n = 68). GRADE and the Cochrane Collaboration Risk of Bias Tool were used in 13 and 10 meta-analyses, respectively. In 37 meta-analyses, authors used other tools; in 12, they developed their own tools; and in 12, they were unclear about the methods used for assessing methodological quality/risk of bias. Overall, the authors considered the risk of confounding bias in their risk of bias assessment in 33 meta-analyses (18 %). Of the 135 meta-analyses with an assessment of risk of bias, 87 (64 %) reported having performed it in duplicate.

Meta-analysis methods (Table 3)

Studies combined

For 130 meta-analyses (69 %), the authors did not clearly report the design for each individual study. Among the meta-analyses that included both NRSI and RCTs (n = 119), for 88 (74 %), the results of NRSI and RCTs were combined.

Table 3 Meta-analysis methods of therapeutic meta-analyses published in 2013 and including non-randomized studies (n = 188)

Concerning the types of NRSI combined, 52 meta-analyses (28 %) included only cohort studies (5 of these included only prospective cohort studies); 46 meta-analyses (24 %) combined cohort and case–control studies, and 23 (12 %) included all types of NRSI. The other 67 meta-analyses (36 %) included “observational studies” (without further details) (n = 28, 15 %), “prospective and retrospective studies” (n = 23, 12 %), or only “retrospective studies” (n = 16, 9 %).

Crude or adjusted estimates used for NRSI

For 131 meta-analyses (70 %), whether crude or adjusted estimates of treatment effect from the NRSI were used for the meta-analysis was unclear or not reported. For the remaining meta-analyses, the authors reported combining crude and adjusted estimates for 22 (12 %), only adjusted estimates for 21 (11 %), and only crude estimates for 6 (3 %). For 8 meta-analyses (4 %), the authors extracted both the crude and adjusted estimates and pooled them in 2 separate analyses. Among the 51 meta-analyses involving adjusted estimates, 17 (33 %) did not report the confounding factors adjusted for.

Meta-analysis model

A random-effects model was used for half of the meta-analyses (n = 95). For 52 (28 %), a fixed-effects model was used initially but then replaced with a random-effects model if high heterogeneity was observed. For 26 meta-analyses (14 %), the authors used both fixed- and random-effects models, and for 9 (5 %), only a fixed-effects model. The type of model was not reported or was unclear for 6 meta-analyses (3 %). We found 2 network meta-analyses (1 %).

Heterogeneity assessment

Almost all meta-analyses assessed heterogeneity (n = 182, 97 %). The I² statistic was used in 164 meta-analyses (87 %), Cochran Q χ² test in 115 (61 %), and between-study variance τ² in 6 (3 %). Heterogeneity was explored in 157 meta-analyses (84 %) by subgroup analyses (n = 126, 67 %), sensitivity analyses (n = 109, 58 %) and meta-regression analyses (n = 34, 18 %).

For 44 of 88 (50 %) meta-analyses combining results from RCTs and NRSI, a subgroup or sensitivity analysis was based on the type of study (RCT vs NRSI). For 28 meta-analyses (15 %), subgroup or sensitivity analyses were based on the type of NRSI included.
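
For readers less familiar with the quantities named in this section, the sketch below shows one common way of obtaining them: inverse-variance fixed-effect pooling, Cochran's Q, I², and the DerSimonian-Laird estimate of τ² with the corresponding random-effects estimate. It is a textbook illustration in Python, not the method of any particular included meta-analysis; the function name and the example values are hypothetical, and effects are assumed to be on an additive scale (eg, log odds ratios).

```python
from math import sqrt
from typing import Dict, Sequence


def pool(effects: Sequence[float], variances: Sequence[float]) -> Dict[str, float]:
    """Pool study effects and summarize between-study heterogeneity."""
    k = len(effects)
    w = [1.0 / v for v in variances]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0            # I^2 in percent
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0                # DerSimonian-Laird
    w_re = [1.0 / (v + tau2) for v in variances]                   # random-effects weights
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return {
        "fixed": fixed, "se_fixed": sqrt(1.0 / sw),
        "random": random, "se_random": sqrt(1.0 / sum(w_re)),
        "Q": q, "df": df, "I2": i2, "tau2": tau2,
    }


# Hypothetical log odds ratios and their variances from two studies.
summary = pool([0.0, 2.0], [0.5, 0.5])
print(summary["I2"], summary["tau2"])  # 75.0 1.5
```

When τ² is estimated as 0, the random-effects weights equal the fixed-effect weights and the two pooled estimates coincide.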

Reporting bias assessment

Reporting bias was assessed in 127 meta-analyses (68 %) by standard funnel plots (n = 111, 59 %), Egger’s test (n = 68, 36 %), or Begg’s test (n = 42, 22 %). Overall, 82 of the 105 meta-analyses (78 %) including 10 or more studies reported having assessed reporting bias.

Discussion

We systematically assessed key methodological components of a large sample of therapeutic meta-analyses including NRSI in a variety of medical areas. Our results highlight some important methodological shortcomings. Only 38 % of the meta-analyses reported having searched for grey literature. Specific points related to the inclusion of NRSI raise concerns, with 69 % of the meta-analyses not reporting the study design of the included NRSI, and 70 % not reporting whether crude or adjusted estimates were combined.

Strengths and limitations of study

To the best of our knowledge, no previous study has comprehensively assessed both key methodological components common to all systematic reviews and elements specific to the inclusion of non-randomized studies. Other studies that previously evaluated methods or reporting of systematic reviews including NRSI concentrated on the reporting of harms [26, 28] and on systematic reviews in psychiatric epidemiology [25].

Our study has some limitations. The representativeness of our sample could be debated because we searched for studies in only one online database (MEDLINE), and limited our selection to meta-analyses in English. In addition, for the assessment of the methods, we depended completely on the reporting; we did not assess protocols or contact the authors if methods were not clearly reported. Even though poor reporting does not necessarily reflect poor conduct, it may severely limit the reader’s comprehension of the systematic review process [34].

Before being able to apply the results of any meta-analysis to patient care, health professionals need to evaluate the credibility of the methods of the meta-analysis [35]. One of the key methodological elements is the search for relevant studies. Because not all studies (and particularly those with negative results) are published in scientific journals, a meta-analysis must involve a search for grey literature to try to avoid such publication bias (a type of reporting bias) [24, 35]. However, we found that only 38 % of our meta-analyses reported having searched for grey literature. Because registration is not mandatory for NRSI, as it is for RCTs, most NRSI are not registered, so searching for grey literature of NRSI is difficult [36]. However, a recent study found that for 32 % of the observational studies registered at ClinicalTrials.gov, unpublished results could be retrieved [37]. In contrast, we found that many meta-analyses assessed reporting bias (68 %). Reviewers may have compensated for the absence of searching for grey literature by assessing reporting bias. However, evaluating reporting bias does not exempt the reviewers from searching for grey literature, because the assessment of funnel plot asymmetry may be subjective and statistical methods to test for asymmetry of the plot may lack power [38, 39].

Another critical part of the systematic review process is assessing the methodological quality or risk of bias of the studies included, because the validity of the meta-analysis could be questionable with problems in the design and conduct of individual studies [40]. We found that 72 % of our meta-analyses reported having assessed the methodological quality or risk of bias but only 33 (18 %) considered the risk of confounding bias in their assessment. The Cochrane Collaboration has recognized the need to improve the assessment of risk of bias for NRSI and is currently developing a tool for this.

Finally, we found specific issues related to the inclusion of NRSI. In 69 % of the meta-analyses, the study design for each included study was unclear. The risk of bias may vary depending on the type of NRSI, with case–control studies generally considered as having a higher risk of bias than cohort studies. A description of the type of studies included in the meta-analysis is therefore crucial. In addition, NRSI are prone to confounding: an imbalance in prognostic factors associated with the outcome of interest [24]. NRSI are expected to at least present adjusted estimates from multivariate analyses [3, 4]. Many of our meta-analyses (70 %) did not report or were unclear about whether the crude or adjusted estimates of NRSI were combined. Among the meta-analyses involving adjusted estimates, 33 % did not report the confounding factors adjusted for. This information was likely poorly reported in the individual studies, in which case the reviewers should contact the study authors for clarification or report this clearly in the meta-analysis.

Conclusions

Some key methodological components of the systematic review process (search for grey literature, description of the type of NRSI included, assessment of risk of confounding bias, and reporting of whether crude or adjusted estimates were combined) are not adequately reported in meta-analyses including NRSI. Attention should be paid to improving these elements in such meta-analyses to increase confidence in their results.

Ethics

Not applicable. This article reports a meta-research study.

Consent

Not needed. This study does not include human participants.

Availability of supporting data

Data are available upon request for academic researchers.