Meta-analyses underpin guideline recommendations for clinical decisions and synthesize multiple studies into two important estimates: the pooled treatment effect and the non-random variation (heterogeneity) in treatment effect between studies. When different studies produce conflicting results, it is essential to identify the factors that vary between the studies in order to explain the heterogeneity in treatment effects. For example, perhaps corticosteroids are beneficial only in sepsis patients with high severity of illness [1]? Perhaps low tidal volume ventilation is beneficial only in patients affected by acute respiratory distress syndrome (ARDS) with low pulmonary compliance [2]? Here, we illustrate why identifying “treatment responders” requires individual patient data.
When relying on averaged patient characteristics reported in trial papers to explain between-trial differences in treatment effects, the observed direction of effect modification can be completely reversed from the direction that is relevant for individual treatment decisions: patient characteristics that appear to be associated with treatment benefit at the study level may actually be associated with harm at the individual patient level, and vice versa. We illustrate that the circumstances that lead to effect modification reversal can be expected to occur in intensive care-related research questions.
A reversal of conclusions
To illustrate the problem, we simulated study data. The R code to re-generate and analyze similar data can be found in the supplementary materials.
Suppose that four randomised controlled trials (RCTs) have been performed testing the same intervention. One trial showed benefit from the intervention, two trials showed no effect and one trial showed harm. Readers of the four trials noticed that the reported average severity of illness was lowest in the trial that showed benefit.
The hypothesis of effect modification by severity of illness is explored in a conventional meta-analysis that stratifies the trials by average illness severity. While the pooled effect of the two trials with low mean severity of illness is more consistent with benefit, the pooled effect of the two trials with high mean severity of illness clearly indicates harm. A meta-regression analysis is performed [3], which likewise shows a strong negative association between the studies’ mean severity of illness and treatment benefit (p < 0.01). The meta-analysts conclude that the treatment is harmful in patients with high severity of illness.
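As a minimal sketch of the study-level analysis described above, the following Python code regresses trial-level treatment effects on mean severity of illness using inverse-variance weights. The four trial summaries are hypothetical values chosen for illustration, not data from this article, and the model is a fixed-effect simplification that omits the between-trial variance term of a full random-effects meta-regression. The function name `meta_regression` is likewise an assumption.

```python
import numpy as np


def meta_regression(effects, std_errors, covariate):
    """Inverse-variance weighted least squares of trial effects on a trial-level covariate.

    Returns (intercept, slope, slope_se). Fixed-effect simplification:
    no between-trial variance component is estimated.
    """
    y = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(std_errors, dtype=float) ** 2
    X = np.column_stack([np.ones_like(y), np.asarray(covariate, dtype=float)])
    XtWX = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (w * y))
    cov = np.linalg.inv(XtWX)  # valid when the within-trial variances are treated as known
    return beta[0], beta[1], np.sqrt(cov[1, 1])


# Hypothetical summaries for four trials (illustrative only).
# Effect = mortality risk difference, treatment minus control; negative means benefit.
mean_severity = [40.0, 50.0, 60.0, 70.0]
risk_difference = [-0.06, 0.00, 0.00, 0.06]
std_error = [0.02, 0.02, 0.02, 0.02]

intercept, slope, slope_se = meta_regression(risk_difference, std_error, mean_severity)
print(f"change in risk difference per severity point: {slope:.4f} (SE {slope_se:.4f})")
```

With these made-up inputs the slope is positive on the risk-difference scale, which is the study-level pattern of less benefit at higher mean severity. The slope describes trials, not patients, so it cannot by itself say how an individual patient's severity modifies the treatment effect.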
Meanwhile, the investigators of the individual studies collaborate to perform an individual patient data meta-analysis. They find a strong positive association between severity of illness and treatment benefit (p < 0.01) and conclude—in direct contrast with the first study-level meta-analysis—that treatment is beneficial in patients with high severity of illness.
The reversed conclusion about the direction of effect modification by severity of illness has important consequences for individual patient care. Should the therapy be avoided or recommended in patients with high illness severity?
The results from the meta-analysis that uses individual patient data are correct: within each of the four RCTs, more severely ill patients benefited more from treatment. Figure 1 shows the true relationship between severity of illness and treatment effect (panel A), as well as the misleading study-level association between severity and treatment effect in a meta-analysis (panel B) and meta-regression (panel C), and the correctly estimated association between severity and treatment effect using individual patient data (panel D).
Even though the study-level meta-analysis was wrong, no technical error was made. The true direction of effect modification is inherently unobservable from aggregated data at the study level.
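The reversal can be reproduced with a small simulation. The Python sketch below is not the R code from the supplementary material. It is an independent illustration under stated assumptions: a continuous outcome where higher is better, four trials of 2000 patients each, and hypothetical parameter values. Within every trial, benefit increases with severity, while the trial-level effect decreases with the trial's mean severity because of an unmeasured trial-level factor.

```python
import numpy as np

rng = np.random.default_rng(2023)

# Hypothetical design parameters (assumptions for illustration).
trial_mean_severity = np.array([40.0, 50.0, 60.0, 70.0])
trial_baseline_effect = np.array([3.0, 1.0, -1.0, -3.0])  # effect at the trial's mean severity
within_trial_slope = 0.10  # true extra benefit per severity point for an individual patient
n_per_trial = 2000

trial_effects, within_slopes = [], []
for mean_sev, base in zip(trial_mean_severity, trial_baseline_effect):
    severity = rng.normal(mean_sev, 10.0, n_per_trial)
    treated = rng.integers(0, 2, n_per_trial)  # 1:1 randomisation
    benefit = base + within_trial_slope * (severity - mean_sev)
    outcome = -0.05 * severity + treated * benefit + rng.normal(0.0, 2.0, n_per_trial)

    # Study-level summary: difference in mean outcome between arms.
    trial_effects.append(outcome[treated == 1].mean() - outcome[treated == 0].mean())

    # Patient-level analysis within the trial: OLS with a treatment-by-severity interaction.
    centred = severity - severity.mean()
    X = np.column_stack([np.ones(n_per_trial), treated, centred, treated * centred])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    within_slopes.append(coef[3])

# Across-trial association (unweighted line through the four trial summaries).
across_slope = np.polyfit(trial_mean_severity, trial_effects, 1)[0]
# Simple average of the within-trial interaction estimates.
mean_within_slope = float(np.mean(within_slopes))

print(f"across-trial slope: {across_slope:.3f}")
print(f"mean within-trial interaction: {mean_within_slope:.3f}")
```

Under these assumptions the across-trial slope is negative and the within-trial interaction is positive, even though both are computed from the same simulated patients. Averaging the four within-trial estimates is a simplification. An actual individual patient data meta-analysis would pool them with appropriate weights or a mixed-effects model.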
Explaining effect modification reversal
Effect modification reversal is a special case of Simpson’s paradox: the reversal of a statistical association when the level of data aggregation changes [4]. It can be explained as follows.
Suppose that a variable (the confounder) is unevenly distributed across RCTs and that a true effect modifier is correlated with the confounder. For example, one could imagine that the percentage of patients with extrapulmonary (instead of pulmonary) ARDS differs between several different RCTs testing low vs. high positive end-expiratory pressure (PEEP). Suppose that high PEEP is more beneficial in patients with high alveolar recruitability, which tends to be higher on average in extrapulmonary ARDS.
Effect modification reversal occurs when a hypothesized effect-modifying variable is positively correlated with the confounder, but negatively correlated with the true effect modifier, or vice versa. Severity of illness may be positively correlated with extrapulmonary ARDS (those with extrapulmonary ARDS are more severely ill) but negatively correlated with recruitability (patients with high recruitability are less severely ill, given their status of pulmonary vs. extrapulmonary ARDS).
The correlation structure in this example leads to effect modification reversal: studies including more patients with extrapulmonary ARDS have included more patients with high recruitability and, therefore, demonstrate more benefit from a high PEEP strategy. These studies also included, on average, more severely ill patients (extrapulmonary ARDS was positively correlated with severity). At the study level, treatment benefit appears to be positively associated with severity of illness (Fig. 1, panels B, C). In fact, given that any patient has pulmonary or extrapulmonary ARDS, treatment benefit is negatively associated with severity of illness, as severity of illness was negatively correlated with recruitability (Fig. 1, panel D).
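The correlation structure in the PEEP example can be sketched in a few lines of Python. All numbers are hypothetical: the proportions of extrapulmonary ARDS per trial, the severity scale, and the unitless recruitability index are assumptions made for illustration. Because benefit from high PEEP is assumed to track recruitability, the sign of the severity-recruitability association stands in for the sign of effect modification by severity.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical case mix: proportion of extrapulmonary ARDS in each of four trials.
prop_extrapulmonary = [0.2, 0.4, 0.6, 0.8]
n_per_trial = 5000

trial_mean_severity, trial_mean_recruitability = [], []
extra_all, severity_all, recruit_all = [], [], []

for p in prop_extrapulmonary:
    extrapulmonary = rng.random(n_per_trial) < p
    type_severity = np.where(extrapulmonary, 70.0, 50.0)  # confounder raises severity
    type_recruit = np.where(extrapulmonary, 0.6, 0.3)     # and raises recruitability
    severity = type_severity + rng.normal(0.0, 8.0, n_per_trial)
    # Within an ARDS type, sicker patients are assumed to be less recruitable.
    recruitability = (type_recruit - 0.01 * (severity - type_severity)
                      + rng.normal(0.0, 0.05, n_per_trial))
    trial_mean_severity.append(severity.mean())
    trial_mean_recruitability.append(recruitability.mean())
    extra_all.append(extrapulmonary)
    severity_all.append(severity)
    recruit_all.append(recruitability)

extra_all = np.concatenate(extra_all)
severity_all = np.concatenate(severity_all)
recruit_all = np.concatenate(recruit_all)

# Study level: trials with higher mean severity also have higher mean recruitability.
across_trial_slope = np.polyfit(trial_mean_severity, trial_mean_recruitability, 1)[0]

# Patient level, given ARDS type: higher severity goes with lower recruitability.
within_type_slopes = {
    label: np.polyfit(severity_all[mask], recruit_all[mask], 1)[0]
    for label, mask in [("pulmonary", ~extra_all), ("extrapulmonary", extra_all)]
}

print(f"across-trial slope: {across_trial_slope:.4f}")
print({label: round(float(s), 4) for label, s in within_type_slopes.items()})
```

In this sketch the across-trial slope is positive only because the case mix differs between trials, while the slope within each ARDS type is negative.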
Failing to identify the true effect modifier is not the heart of the problem. When the data are analyzed at the individual patient level, the correct direction of effect modification will be borne out. In our example, an analysis that estimates the interaction between severity and treatment while conditioning on the study (for example in a mixed-effects model) will demonstrate the true direction of effect modification.
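One way to make the role of conditioning on the study explicit is to write the treatment-by-severity interaction with separate within-trial and across-trial terms. The notation below is an assumption introduced for illustration and is not taken from the original analysis: $y_{ij}$ is the outcome of patient $i$ in trial $j$, $t_{ij}$ the randomised treatment indicator, $x_{ij}$ the severity of illness, and $\bar{x}_j$ the mean severity in trial $j$.

```latex
\begin{align}
y_{ij}   &= \alpha_j + \gamma\, x_{ij} + \theta_j\, t_{ij}
            + \beta_W\, (x_{ij} - \bar{x}_j)\, t_{ij} + \varepsilon_{ij}, \\
\theta_j &= \theta_0 + \beta_A\, \bar{x}_j + u_j .
\end{align}
```

Here $\alpha_j$ is a trial-specific intercept (the conditioning on study), $\beta_W$ is the within-trial interaction that is relevant for individual treatment decisions, and $\beta_A$ is the across-trial association that a meta-regression of $\theta_j$ on $\bar{x}_j$ estimates. Because $\beta_A$ also absorbs any trial-level confounder that is correlated with $\bar{x}_j$, the two coefficients can have opposite signs, which corresponds to the reversal described in this section.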
Take-home message
The cause of between-trial heterogeneity cannot be reliably inferred from study-level data, whether assessed informally in the literature appraisal or formally through meta-regression analysis. Apparent and even statistically significant effect modification in one direction may actually be completely misleading. When only study-level data are available, interpreting between-study differences in patient characteristics as evidence of effect modification is misguided and potentially hazardous for individual care decisions. This underscores the need for international collaborative programs for data sharing and evidence synthesis using individual patient data [5,6,7,8].
Data availability
Not applicable.
References
1. Rochwerg B, Oczkowski SJ, Siemieniuk RAC et al (2018) Corticosteroids in sepsis: an updated systematic review and meta-analysis. Crit Care Med 46:1411–1420. https://doi.org/10.1097/CCM.0000000000003262
2. Deans KJ, Minneci PC, Cui X et al (2005) Mechanical ventilation in ARDS: one size does not fit all. Crit Care Med 33:1141–1143. https://doi.org/10.1097/01.ccm.0000162384.71993.a3
3. van Houwelingen HC, Arends LR, Stijnen T (2002) Advanced methods in meta-analysis: multivariate approach and meta-regression. Stat Med 21:589–624. https://doi.org/10.1002/sim.1040
4. Julious SA, Mullee MA (1994) Confounding and Simpson’s paradox. BMJ 309:1480–1481. https://doi.org/10.1136/bmj.309.6967.1480
5. Reade MC, Delaney A, Bailey MJ et al (2010) Prospective meta-analysis using individual patient data in intensive care medicine. Intensive Care Med 36:11–21. https://doi.org/10.1007/s00134-009-1650-x
6. Juschten J, Tuinman PR, de Grooth H-J (2023) Harmonization of reported baseline characteristics is a prerequisite for progress in ARDS research. Ann Am Thorac Soc. https://doi.org/10.1513/AnnalsATS.202212-1038IP
7. PRISM Investigators, Rowan KM, Angus DC et al (2017) Early, goal-directed therapy for septic shock – a patient-level meta-analysis. N Engl J Med 376:2223–2234. https://doi.org/10.1056/NEJMoa1701380
8. Young PJ, Bellomo R, Bernard GR et al (2019) Fever control in critically ill adults. An individual patient data meta-analysis of randomised controlled trials. Intensive Care Med 45:468–476. https://doi.org/10.1007/s00134-019-05553-w
Ethics declarations
Conflicts of interest
The authors declare that they have no competing interests.
Compliance with ethical standards
No patients or animals were directly involved in this work.
Supplementary Information
Supplementary file 1 (134_2023_7163_MOESM1_ESM.html; HTML, 872 KB): a document containing R code to perform simulations similar to those described in this paper, available as electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which permits any non-commercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc/4.0/.
About this article
Cite this article
de Grooth, HJ., Parienti, JJ. Heterogeneity between studies can be explained more reliably with individual patient data. Intensive Care Med 49, 1238–1241 (2023). https://doi.org/10.1007/s00134-023-07163-z
DOI: https://doi.org/10.1007/s00134-023-07163-z