Discovering novel therapies and defining the causal contribution of genes, proteins and lipids is challenging in sepsis, because sepsis is so heterogeneous; effective therapies in one patient may not be effective in another, explaining in part why sepsis trials have not been very successful.

We highlight Mendelian randomization (MR), a statistical method that uses genetics to help establish causal relationships between intermediate phenotypes, such as plasma proteins, and clinical phenotypes. In sepsis, there are innumerable studies showing significant associations between proteins, metabolites, other biomarkers and outcomes, yet the causal contribution of biomarkers is highly uncertain, potentially due to confounding, reverse causation, or because, for example, inflammatory markers are non-specific epiphenomena. Genetics is a powerful tool to interrogate which intermediate traits, including plasma proteins, metabolites, or even radiographic features, contribute to a disease. MR also enhances prognostic and predictive enrichment of sepsis trials by better defining biologically relevant clinical subgroups. The technique has been successfully used to define several traits with causal contributions to risk for acute respiratory distress syndrome (ARDS) [1,2,3] and to sepsis mortality [4, 5], highlighting pathways which warrant targeting and identifying specific at-risk populations. Thus, MR holds promise to advance successful precision sepsis trials.

MR uses observational data, with genotype serving as a statistical instrument for an intermediate variable (e.g., protein, metabolite), because humans are randomly “assigned” their genotypes at conception [6, 7]. Underlying MR is the assumption that the portion of the trait that is genetically determined is less vulnerable to measurement error or confounding, and not at risk for reverse causality, since genotype always precedes outcome. MR aids predictive enrichment by separating biologically causal pathways from the numerous non-causal biomarker and outcome associations.

Although MR lacks utility for individual subject classifications, we argue that it has value in preparing for precision medicine trials by highlighting the key variables that classify individuals and that influence disease risk or outcome. Exciting biomarker-enriched sepsis trials [8] and retrospective evaluations suggest that heterogeneity of treatment effect can be predicted by biologically defined subtypes [9, 10], indicating that precision sepsis treatment might soon be a reality.

Mendelian randomization relevance in sepsis

MR is an adaptation of instrumental variable analysis, a tool for causal inference from observational data, wherein the instrument is genotype. In other complex traits, MR very effectively identifies which intermediate variables (e.g., plasma proteins, imaging markers or physiologic measurements) contribute causally to disease, as opposed to being merely correlated [11]. In sepsis, few causal intermediates have been identified, and better elucidation of these traits might focus attention on the most promising targets.

In sepsis, MR links three types of evidence to infer causality: (1) genotype versus intermediate phenotype, (2) intermediate phenotype versus clinical phenotype, and (3) genotype versus clinical phenotype.

Figure 1A illustrates how a statistical instrument (tobacco tax) can infer causality (smoking causes cancer) from observational data. Figure 1B extends these concepts to illustrate how MR rules out alternative explanations for changes in the protein and changes in the phenotype in the absence of a causal relationship between the protein and clinical phenotype (e.g., severity of sepsis could independently cause both). In this case the genotype of the protein, which is randomized at conception according to the independent assortment of alleles, can be used as the instrument. If (1) the genotype of the protein is related to the activity of the protein (e.g., by protein abundance, isoform, or function) and (2) genotype of the protein is also related to the clinical phenotype (e.g., sepsis survival) then alternate explanations are avoided. Thus, the protein likely causally contributes to the clinical phenotype. Note that both cis regulation (‘genotype of the protein’) and trans regulation are relevant, since multi-locus MR uses all variants with strong effect and tests for consistency.
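The logic of Figure 1B can be illustrated with a small simulation. The sketch below is not taken from the cited studies: the effect sizes, allele frequency, and variable names are all assumptions chosen for illustration. It uses the simplest single-instrument estimator, the Wald ratio (genotype-outcome slope divided by genotype-protein slope), and compares it with the naive protein-outcome regression when an unobserved confounder (e.g., illness severity) drives both.

```python
import numpy as np

def wald_ratio(genotype, exposure, outcome):
    """Genotype-outcome slope divided by genotype-exposure slope."""
    # The slope of a simple regression of y on g is cov(g, y) / var(g).
    var_g = np.var(genotype, ddof=1)
    beta_gx = np.cov(genotype, exposure)[0, 1] / var_g
    beta_gy = np.cov(genotype, outcome)[0, 1] / var_g
    return beta_gy / beta_gx

rng = np.random.default_rng(0)
n = 200_000
true_effect = 0.5  # assumed causal effect of the protein on the outcome

# Allele count (0, 1 or 2), assigned independently of the confounder.
genotype = rng.binomial(2, 0.3, size=n)
# Unobserved confounder that raises both protein level and outcome.
confounder = rng.normal(size=n)
protein = 0.4 * genotype + confounder + rng.normal(size=n)
outcome = true_effect * protein + 2.0 * confounder + rng.normal(size=n)

# Naive observational slope: inflated by the shared confounder.
naive = np.cov(protein, outcome)[0, 1] / np.var(protein, ddof=1)
# Instrumental estimate: uses only the genetically determined variation.
mr_estimate = wald_ratio(genotype, protein, outcome)

print(f"naive slope: {naive:.2f}, MR (Wald ratio): {mr_estimate:.2f}")
```

In this simulated setting the naive slope is far above the assumed effect, while the Wald ratio lands close to it, because the genotype is unrelated to the confounder by construction. Real data would additionally require checking instrument strength and pleiotropy.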

Fig. 1

A Illustrates the underlying rationale of MR by raising the question of whether smoking causes cancer (red arrow). It is possible that an observed or even unobserved variable within the environment may entice a person to smoke and that same environmental variable may also contribute to causing cancer. If a tobacco tax (the instrument) (1) results in a decrease in smoking (straight horizontal black arrow) and (2) is also associated with a decrease in incidence of cancer (curved arrow), then smoking must indeed cause cancer, because a tobacco tax could not reduce cancer incidence in any other way. The instrument avoids confounders. B Is similar to Panel A in that it raises the question of whether a protein (e.g., IL-1b) causally alters (red arrow with question mark) a clinical phenotype (e.g., sepsis survival) or whether there is an alternate explanation. C IgG SNP effect and COVID-19. Each point corresponds to the SNP effect in each dimension. The grey lines represent the standard errors of each dimension. The blue line corresponds to the linear regression estimate of the relationship between the effects on the two variables.

We used Mendelian randomization to answer the following question: do low low-density lipoprotein (LDL) levels increase sepsis mortality? We used Proprotein Convertase Subtilisin/Kexin type 9 (PCSK9) genotype and 3-Hydroxy-3-Methylglutaryl-CoA (HMG-CoA) reductase (HMGCR) genotype as instrumental variables with demonstrable influence on LDL concentration [4].

HMG-CoA reductase is the rate-limiting enzyme for cholesterol synthesis; PCSK9 regulates LDL clearance. We reasoned that if both PCSK9 loss-of-function genotype (which increases LDL clearance) and HMGCR loss-of-function genotype (which reduces LDL production) similarly increase mortality, one would conclude that low LDL causes sepsis mortality. However, HMG-CoA reductase genetic score was not associated with increased sepsis mortality, whereas PCSK9 genetic score was associated with decreased mortality. Thus, increased LDL clearance via PCSK9 genotype effects may lower mortality [4] by increasing LDL-bound pathogen lipid clearance. Furthermore, PCSK9 genotype could be used to enrich trials of PCSK9 inhibitor(s) in sepsis.

MR also shows that body mass index [12] and high-density lipoprotein (HDL) levels causally contribute to sepsis mortality [13, 14] and vitamin D causally contributes to risk of bacterial pneumonia [15].

In another MR example, we evaluated whether native IgG antibody level alters risk of hospitalization due to coronavirus disease 2019 (COVID-19). First, we identified IgG genotypes associated with IgG level (via GWAS catalogue data set from 70 studies), restricting analysis to variants that function as strong statistical instruments (p < 5 × 10⁻⁸). Second, we used the GWAS statistics from the COVID-19 Host Genetics Initiative (freeze 6, B2_(12)ALL_leave_23andme) for COVID-19 hospitalization (n = 24,274) versus non-COVID population (n = 2,061,529) [16]. Genetic variants with a greater effect on IgG levels had a greater effect on decreasing COVID-19 hospitalization (Fig. 1C).
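A regression of per-SNP outcome effects on per-SNP exposure effects, as plotted in Fig. 1C, can be sketched as follows. The text does not state which estimator produced the plotted line, so the inverse-variance weighted (IVW) regression through the origin used here is an assumption, chosen because it is a common multi-variant MR estimator. The function name and the summary statistics are hypothetical and are not the values from the study.

```python
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted slope through the origin and its standard error."""
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    # Weight each variant by the precision of its outcome association.
    w = 1.0 / np.asarray(se_outcome, dtype=float) ** 2
    slope = np.sum(w * bx * by) / np.sum(w * bx ** 2)
    se = np.sqrt(1.0 / np.sum(w * bx ** 2))  # fixed-effect standard error
    return slope, se

# Hypothetical per-SNP summary statistics (illustrative only).
beta_igg = [0.10, 0.20, 0.30, 0.40]        # SNP effect on IgG level
beta_covid = [-0.02, -0.04, -0.06, -0.08]  # SNP effect on log-odds of hospitalization
se_covid = [0.01, 0.01, 0.02, 0.02]        # standard error of the outcome effect

slope, se = ivw_estimate(beta_igg, beta_covid, se_covid)
print(f"IVW slope: {slope:.3f} (SE {se:.3f})")
```

A negative slope in this toy data corresponds to the pattern described above: variants that raise IgG more are associated with a larger reduction in hospitalization risk. Forcing the line through the origin encodes the assumption that a variant with no effect on the exposure has no effect on the outcome.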

Limitations of MR include the need for large sample sizes and the potential for differential genetic predictors for quiescent versus evoked sepsis traits. Further limitations are inadequate phenotype definition, gene–environment interaction, measurement error, and linkage disequilibrium [17]. Violations of the Mendelian randomization assumptions can bias the causal estimate [17]. Sensitivity analyses can gauge the consequences of such violations and help to mitigate them [17]. Furthermore, an extension of MR, the MR–Egger method, provides (1) a test for directional pleiotropy (variants with multiple effects), (2) a test for a causal effect, and (3) an estimate of the size of the causal effect [18]. However, causal estimates from the MR–Egger method can be biased and can have inflated Type 1 error rates.
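The idea behind MR–Egger can be sketched as a weighted regression of outcome effects on exposure effects that, unlike a regression through the origin, estimates an intercept. A non-zero intercept suggests directional pleiotropy, and the slope is the pleiotropy-adjusted causal estimate. This is a minimal sketch under stated assumptions: the function name and the data are hypothetical, and standard errors and significance tests are omitted for brevity.

```python
import numpy as np

def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted regression with intercept; returns (intercept, slope)."""
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    # Orient each variant so that its exposure effect is positive.
    sign = np.where(bx < 0, -1.0, 1.0)
    bx, by = bx * sign, by * sign
    w = 1.0 / np.asarray(se_outcome, dtype=float) ** 2
    design = np.column_stack([np.ones_like(bx), bx])
    # Weighted least squares: solve (X' W X) coef = X' W y.
    xtw = design.T * w
    intercept, slope = np.linalg.solve(xtw @ design, xtw @ by)
    return intercept, slope

# Hypothetical summary statistics with a constant pleiotropic offset of 0.05
# added to every outcome effect (illustrative only).
beta_x = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
beta_y = 0.05 + 0.3 * beta_x
se_y = np.full(5, 0.01)

intercept, slope = mr_egger(beta_x, beta_y, se_y)
print(f"Egger intercept: {intercept:.3f}, slope: {slope:.3f}")
```

In this noise-free toy example the intercept recovers the built-in offset and the slope recovers the built-in effect. With real data both estimates are noisy, which is one reason MR–Egger is usually reported alongside other estimators rather than alone.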

In conclusion, familiarity with Mendelian randomization may help (1) better elucidate causality and (2) enhance selection of patients for trials through enrichment and classification of patients likely to benefit, thus catalysing more successful sepsis trials.