Introduction

Under normal physiological conditions, platelets and fibrin form clots to prevent blood loss at the site of vessel injury [1]. However, when clots (or thromboses) form abnormally they can disrupt blood flow [2, 3]; when this occurs in the deep veins of the limbs or pelvis, it is known as deep vein thrombosis (DVT). A complication of DVT is pulmonary embolism (PE), where a clot breaks away from a deep vein wall and becomes lodged in a pulmonary blood vessel, obstructing blood flow to the lungs and causing respiratory dysfunction. In 2021, there were approximately one million incident cases of venous thromboembolism (VTE) in the United States alone [4]. DVT accounts for approximately two-thirds of VTE events and PE is the primary contributor to mortality. While VTE was a primary cause for 10,511 deaths in the UK in 2020 [5], the actual contribution of VTE to annual deaths is estimated to be two- to threefold higher [6].

To prevent acute and chronic complications, it is essential to establish an accurate diagnosis of DVT. The symptoms of DVT alone are often not specific or sufficient to make a diagnosis, and about half of those suffering DVT will have no symptoms [7]. Symptoms are considered in conjunction with known risk factors to help estimate the likelihood of DVT and determine whether thromboprophylaxis is required [3]. Pharmacological thromboprophylaxis includes the use of anticoagulants, such as intravenous heparin and oral warfarin (a vitamin K antagonist), which have been used in combination to treat DVT for over 50 years, but require constant maintenance and monitoring [3]. More recently, direct oral anticoagulants (DOACs), such as dabigatran (which inhibits thrombin) or rivaroxaban (which inhibits factor Xa), have been employed with reduced economic costs relative to traditional treatments [8].

Risk factors for DVT include age, obesity and genetic factors (such as deficiencies in the anticoagulant proteins antithrombin, protein C and protein S, and the Factor V Leiden variant) [2, 9, 10]. However, the mechanisms through which these risk factors act have not been clearly established. The identification of novel causal risk factors and potential drug targets is required for improved DVT prophylaxis [3].

Mendelian randomization (MR) allows us to infer causality while addressing limitations of observational epidemiology such as confounding and reverse causation [11,12,13,14]. The design of an MR analysis is analogous to that of a randomised controlled trial (RCT), the “gold standard” method for evaluating the effectiveness of an intervention (Supplementary Fig. 1) [15]. It is an instrumental variable-based method that uses genetic variants as proxies (or instruments) for exposures to permit causal inference when interpreting relationships between these exposures and disease outcomes [16]. Here, we have used two-sample MR, which uses data from separate genome-wide association studies (GWAS) for exposures and outcomes of interest [17], to consider the effect of multiple exposures (phenotypes) on DVT risk.

To advance our understanding of DVT aetiology, we undertook an MR phenome-wide association study (MR-PheWAS). As 24 out of 57 exposures estimated to influence DVT were adiposity-related, we explored whether levels of circulating proteins, known to be altered by adiposity, were responsible for this association.

Methods

Study design

With the aim of identifying novel risk factors for DVT, we performed an MR-PheWAS to estimate the effects of 973 exposures on DVT risk. As 24 of the 57 exposures estimated to influence DVT were adiposity-related (see Table 1), we next investigated potential mediators of this relationship. We focussed our mechanistic investigations on circulating proteins altered by adiposity [18, 19] and performed a two-sample mediation MR to estimate the effect of BMI on DVT with BMI-associated proteins as mediators. An overview of the study design is shown in Fig. 1. All analyses were conducted using R version 3.6.1. The MR-PheWAS was conducted using the TwoSampleMR R package [14]. STROBE-MR [20] reporting guidelines were followed (Additional file 4).

Table 1 Traits passing the PhenoSpD significance threshold (5.43E-5) in the MR-PheWAS of all traits in UK Biobank on DVT risk with the Inverse Variance Weighted (SNP > 1) and Wald Ratio (SNP = 1). Exposures highlighted in orange are referred to as "adiposity-related" in the main text
Fig. 1 Overview of the study. First, an MR-PheWAS analysis to find risk factors for DVT was performed using the MR-Base database; many of the identified risk factors were adiposity-related (N = 24/57). This was followed by a two-sample mediation MR using pQTL data for BMI-associated proteins to estimate their effects on DVT risk. MR = Mendelian randomization; GWAS = genome-wide association study; VTE = venous thromboembolism; DVT = deep vein thrombosis; SNP = single-nucleotide polymorphism; pQTL = protein quantitative trait loci; PAI-1 = Plasminogen activator inhibitor-1; NOTCH1 = Neurogenic locus notch homolog protein 1; INHBC = Inhibin Subunit Beta C; S Table = Supplementary Table

Data preparation

Deep vein thrombosis GWAS data

Our outcome of interest (DVT) was presented in MR-Base as “Non-cancer illness code self-reported: deep venous thrombosis (dvt)”; these summary results are from a GWAS of individuals of European ancestry (6,767 cases and 330,392 controls) in which phenotypes were processed using the PHEnome Scan ANalysis Tool (PHESANT) and genotypic data were selected through SNP quality control (QC) [21, 22] (http://www.nealelab.is/uk-biobank).

GWAS data for exposures

Genetic data for exposures were obtained from the MR-Base platform of harmonised GWAS summary data [14]. The MR-Base platform permits the hypothesis-free analysis of the effects of all catalogued exposures on DVT. The exposures encompassed lifestyle, disease and biological traits. Non-European (N = 88) and duplicate (N = 138) studies were excluded. In the case of duplicate studies, those with the highest sample size were retained. VTE (DVT and PE) and VTE-related (e.g. phlebitis and thrombophlebitis) traits were removed (N = 9). The genetic instruments used for the analysis were single-nucleotide polymorphisms (SNPs) associated with each of the exposures at a genome-wide level of significance (P < 5e-8). As genetic confounding may bias MR estimates if SNPs are correlated [23], linkage disequilibrium (LD) clumping in PLINK [24] was conducted to ensure the SNPs used to instrument exposures were independent (radius = 10,000 kb; r2 = 0.001) using the 1000 Genomes European reference panel [25]. We also used the 1000 Genomes European dataset [25] to identify potential proxy SNPs (in LD with the initial SNP, r2 > 0.8) for those SNPs not present in the DVT summary statistics. Where not specified in Supplementary Table 2, the reported effect size for a given SNP was expressed, along with the standard error (SE), in standard deviation units of the level of the risk factor for a continuous exposure, or as a unit change in the exposure on the log-odds scale for a binary trait.
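The instrument-selection step described above (a genome-wide significance filter followed by LD clumping) can be sketched roughly as follows. This is an illustrative plain-Python outline of greedy clumping, not the PLINK implementation used in the study; the `Snp` class, the `clump` function, the rsIDs and the pairwise r2 lookup are all hypothetical names and values introduced for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: int
    pos: int      # base-pair position
    pval: float   # P-value from the exposure GWAS

def clump(snps, r2, p_threshold=5e-8, window_kb=10_000, r2_threshold=0.001):
    """Greedy clumping: walk SNPs from most to least significant and keep a SNP
    only if it is not in LD (within the window) with an already-kept SNP."""
    candidates = sorted((s for s in snps if s.pval < p_threshold),
                        key=lambda s: s.pval)
    kept = []
    for snp in candidates:
        in_ld_with_kept = any(
            snp.chrom == k.chrom
            and abs(snp.pos - k.pos) <= window_kb * 1000
            and r2.get(frozenset((snp.rsid, k.rsid)), 0.0) > r2_threshold
            for k in kept
        )
        if not in_ld_with_kept:
            kept.append(snp)
    return kept

# Hypothetical toy data: rsB is correlated with rsA and close to it, rsC lies
# outside the 10,000 kb window, and rsD fails the significance filter.
example_snps = [
    Snp("rsA", 1, 1_000, 1e-20),
    Snp("rsB", 1, 2_000, 1e-10),
    Snp("rsC", 1, 50_000_000, 1e-9),
    Snp("rsD", 1, 3_000, 1e-3),
]
example_r2 = {frozenset(("rsA", "rsB")): 0.5}
instruments = clump(example_snps, example_r2)
```

In practice the r2 values would come from a reference panel such as the 1000 Genomes European samples mentioned above rather than from a hand-written dictionary.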

Protein quantitative trait locus data

We aimed to determine whether BMI-associated proteins were mediating the relationship between adiposity and DVT. A list of BMI-associated proteins was obtained from two previous MR studies investigating the effect of BMI on the circulating proteome [18, 19]. We used protein quantitative trait loci (pQTL) data [26, 27] to identify SNPs associated with circulating protein levels at a genome-wide level of significance (P ≤ 5e-08). Protein detection platforms for the pQTL data included the SOMAScan® by SomaLogic and Olink (ProSeek CVD array I) [28,29,30,31]. Twenty-five proteins were identified using these criteria (Supplementary Table 1). PLINK clumping (radius = 10,000 kb; r2 = 0.001) was performed to ensure the genetic variants used to instrument protein levels were independent. Proxy SNPs for those SNPs that were not present in the DVT data were identified through the 1000 Genomes European dataset [25].

Data harmonisation

The majority of GWAS present the effects of a SNP on a trait in relation to the allele on the forward strand. However, the allele present on the forward strand can change as reference panels are updated. This requires correction (harmonisation) so that both exposure and outcome data reference the same strand [32]. For exposure and outcome data harmonisation, incorrect but unambiguous alleles were corrected, while ambiguous alleles were removed. In the case of palindromic SNPs (A/T or C/G), allele frequencies were used to resolve ambiguities. Harmonisation was not possible for 483 exposures (variants were not present in the DVT GWAS), resulting in a final list of 973 exposures to include in the MR-PheWAS (Supplementary Table 2). For our pQTL analysis, 21 out of 25 proteins had genetic variants (including proxies) available in the DVT GWAS, and only 15 proteins had valid SNPs after harmonisation (Supplementary Table 3). Finally, PhenoSpD was used for multiple testing correction in the MR-PheWAS analysis (P = 5.43e-5), while Bonferroni correction was used in the pQTL MR (P = 0.003) (Supplementary Methods).
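The harmonisation rules described above can be illustrated with a short sketch. It is a simplified, assumed version of the logic (align alleles, flip strand where unambiguous, use allele frequency for palindromic SNPs, drop what cannot be resolved) and is not the code used in the study; the function name `harmonise`, its arguments and the minor-allele-frequency cutoff of 0.42 are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def harmonise(exp_ea, exp_oa, exp_eaf, out_ea, out_oa, out_eaf, out_beta,
              maf_cutoff=0.42):
    """Return the outcome beta aligned to the exposure effect allele,
    or None if the SNP cannot be harmonised and should be dropped.

    ea/oa = effect/other allele, eaf = effect allele frequency.
    """
    palindromic = COMPLEMENT[exp_ea] == exp_oa  # A/T or C/G
    if palindromic:
        if {out_ea, out_oa} != {exp_ea, exp_oa}:
            return None  # different variant altogether
        # Allele codes cannot reveal the strand, so rely on allele frequency;
        # frequencies too close to 0.5 are treated as unresolvable.
        if (min(exp_eaf, 1 - exp_eaf) > maf_cutoff
                or min(out_eaf, 1 - out_eaf) > maf_cutoff):
            return None
        same_allele = (exp_eaf < 0.5) == (out_eaf < 0.5)
        return out_beta if same_allele else -out_beta

    if (out_ea, out_oa) not in {(exp_ea, exp_oa), (exp_oa, exp_ea)}:
        # Alleles do not match as given: try the opposite strand.
        out_ea, out_oa = COMPLEMENT[out_ea], COMPLEMENT[out_oa]
    if (out_ea, out_oa) == (exp_ea, exp_oa):
        return out_beta       # already aligned
    if (out_ea, out_oa) == (exp_oa, exp_ea):
        return -out_beta      # effect allele swapped: flip the sign
    return None               # incompatible alleles
```

For example, an outcome SNP coded G/A against an exposure SNP coded A/G has its effect sign flipped, whereas an A/T SNP with effect allele frequencies near 0.5 in both datasets is dropped.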

MR-PheWAS

A hypothesis-free MR-PheWAS was conducted using the TwoSampleMR R package [33]. The effect of a given exposure on DVT was estimated using the inverse-variance weighted (IVW) method for exposures with more than one SNP [34]. Wald ratios (WRs) were derived for exposures with a single SNP [35]. A full description of all MR analyses referenced in this study is available in the Supplementary Methods, while SNPs used in the MR analysis are available in Supplementary Table 5.
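For readers unfamiliar with the two estimators named above, the following minimal sketch shows the textbook forms of the Wald ratio and a fixed-effect IVW estimate computed from summary statistics. It is written in plain Python for illustration and is not the TwoSampleMR code used in the study; software implementations may handle standard errors differently (for example by allowing for heterogeneity), and the function names are hypothetical.

```python
import math

def wald_ratio(beta_exp, beta_out, se_out):
    """Single-SNP causal estimate: SNP-outcome effect divided by SNP-exposure
    effect, with a first-order (delta-method) standard error."""
    return beta_out / beta_exp, se_out / abs(beta_exp)

def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect inverse-variance weighted estimate: a weighted regression
    of SNP-outcome effects on SNP-exposure effects through the origin."""
    weights = [1 / s ** 2 for s in se_out]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exp, beta_out))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_exp))
    return num / den, math.sqrt(1 / den)

# Hypothetical summary statistics for two independent instruments.
estimate, se = ivw([0.1, 0.2], [0.05, 0.10], [0.01, 0.01])
```

With these toy inputs both SNPs imply the same ratio, so the IVW estimate equals the individual Wald ratios; with real data the IVW estimate is a precision-weighted average of them.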

Conditional analysis

We performed a conditional analysis for each single-SNP trait using the GCTA-COJO software [36] to identify any potential shared secondary signals in a 1 Mb region [37], with the aim of performing an additional colocalization analysis on those secondary signals if the primary colocalization analysis did not find a shared causal signal. We downloaded summary statistics for these traits from OpenGWAS (https://gwas.mrcieu.ac.uk/) [38] and used genotypic data from the Avon Longitudinal Study of Parents and Children (ALSPAC) as a reference panel. Further details of the cohort are described elsewhere [39, 40]. In brief, 14,541 pregnancies to women with an expected delivery date of April 1, 1991, to December 31, 1992, were enrolled. We used the genotypic data of 8,890 mothers to perform our conditional analysis. Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committee. The study website contains details of all available data through a fully searchable data dictionary and variable search tool (http://www.bristol.ac.uk/alspac/researchers/our-data/).

Colocalization analysis

Only one genetic instrument was available for some of the exposures investigated (N = 10). As the Wald ratio estimator is susceptible to genetic confounding, we performed a colocalization analysis on the un-pruned genetic dataset for each single-SNP trait. Genetic confounding in this case refers to confounding by LD, where the SNP associated with the exposure is in LD with a SNP affecting another trait that affects the outcome independent of the exposure, which invalidates MR assumptions [41]. Colocalization analysis uses Bayesian statistics to estimate whether an exposure and outcome share a causal signal in a region of the genome [42], which can then strengthen the evidence that there is a causal relationship by providing evidence that the detected effect in the MR analysis is not due to confounding by LD. We used the R package “coloc” (https://cran.r-project.org/web/packages/coloc/) approximate Bayes factor (coloc.abf) function with default settings for prior probabilities to conduct a colocalization analysis with the following hypotheses: H0 (no causal variant), H1 (causal variant for trait 1 only), H2 (causal variant for trait 2 only), H3 (two distinct causal variants) and H4 (one common causal variant) [42]. We then used LocusZoom (https://locuszoom.org/) to provide visual evidence for the presence of a shared signal between our exposures and DVT.
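To make the five hypotheses concrete, the sketch below shows how posterior probabilities for H0 to H4 can be derived from per-SNP approximate Bayes factors. It is a simplified plain-Python illustration of the general approach, not the coloc package code; the function names are hypothetical, and the prior standard deviation (0.2) and prior probabilities (1e-4, 1e-4, 1e-5) are assumptions chosen for the example.

```python
import math

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def logdiffexp(a, b):
    """log(exp(a) - exp(b)), valid only for a > b."""
    return a + math.log1p(-math.exp(b - a))

def log_abf(beta, se, prior_sd):
    """Log approximate Bayes factor for association at a single SNP."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1 - r) + r * z ** 2)

def coloc_posteriors(betas1, ses1, betas2, ses2,
                     prior_sd=0.2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0-H4 for two traits measured on the same
    SNPs in one region (needs at least two SNPs; assumes at most one causal
    variant per trait)."""
    l1 = [log_abf(b, s, prior_sd) for b, s in zip(betas1, ses1)]
    l2 = [log_abf(b, s, prior_sd) for b, s in zip(betas2, ses2)]
    lsum1, lsum2 = logsumexp(l1), logsumexp(l2)
    lsum12 = logsumexp([a + b for a, b in zip(l1, l2)])
    log_h = {
        "H0": 0.0,
        "H1": math.log(p1) + lsum1,
        "H2": math.log(p2) + lsum2,
        # All pairs of SNPs minus the pairs where both traits share one SNP.
        "H3": math.log(p1) + math.log(p2) + logdiffexp(lsum1 + lsum2, lsum12),
        "H4": math.log(p12) + lsum12,
    }
    total = logsumexp(list(log_h.values()))
    return {h: math.exp(v - total) for h, v in log_h.items()}

# Hypothetical region of five SNPs with a standard error of 0.1 throughout.
ses = [0.1] * 5
shared = coloc_posteriors([0.01, 0.02, 0.6, -0.03, 0.05], ses,
                          [0.02, -0.01, 0.55, 0.01, -0.04], ses)
distinct = coloc_posteriors([0.01, 0.02, 0.6, -0.03, 0.05], ses,
                            [0.55, -0.01, 0.02, 0.01, -0.04], ses)
```

In the first toy region both traits peak at the same SNP, so most posterior weight falls on H4; in the second the peaks sit at different SNPs and H3 is favoured instead.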

Results

MR-PheWAS

Of the 973 exposures investigated, 945 were identified as independent using PhenoSpD, setting the P-value threshold for our MR analysis at 5.43e-5. Fifty-seven exposures were estimated to influence DVT risk (Fig. 2, Table 1). Results of sensitivity analyses for all traits using additional MR methods are shown in Supplementary Table 4.

Fig. 2 A many-to-one forest plot of the exposures which passed the P-value threshold following multiple testing correction (5.43e-5). Each trait is accompanied by two additional descriptive columns (No. SNPs and P-value), while the log risk ratio (RR) is displayed to the right, alongside the confidence intervals. MR methods: Inverse variance weighted (SNP > 1) and Wald ratio (SNP = 1)

We observed strong causal evidence for a number of exposures including: “Hyperthyroidism/thyrotoxicosis” (IVW Log RR: 2.39, 95% CI: 1.88 to 2.90; P = 8.69e-18); “Treatment/medication code: carbimazole” (IVW Log RR: 3.60, 95% CI: 2.70 to 4.50, P = 2.41e-12); “Chronic obstructive airways disease/chronic obstructive pulmonary disease (COPD)” (WR Log RR: 3.72, 95% CI: 1.39 to 4.37; P = 9.21e-07); “Varicose veins” (IVW Log RR: 1.90, 95% CI: 1.30 to 2.50; P = 2.36e-07) and “Varicose veins of the lower extremities” (IVW Log RR: 3.40, 95% CI: 2.31 to 4.49; P = 5.13e-07) (Fig. 2, Table 1).

Adiposity, an established risk factor for DVT [43], and its related traits (N = 24, see Table 1 note) were all positively associated with DVT. These include traits identified in previous MR studies, such as “Body Mass Index” (IVW Log RR: 0.40, 95% CI: 0.32 to 0.47; P = 1.60e-22), fat mass, e.g. “Whole body fat mass” (IVW Log RR: 0.44, 95% CI: 0.36 to 0.51; P = 4.65e-27), and fat-free mass, e.g. “Whole body fat-free mass” (IVW Log RR: 0.41, 95% CI: 0.31 to 0.50; P = 3.90e-14) [44] (Fig. 2, Table 1). Another previously associated trait is “Height” (IVW Log RR: 0.15, 95% CI: 0.08 to 0.21; P = 5.92e-06) [45]. Other associated height-related traits not previously investigated in an MR framework include “Standing height” (IVW Log RR: 0.17, 95% CI: 0.09 to 0.24; P = 4.61e-06) and “Comparative height size at age 10” (IVW Log RR: 0.30, 95% CI: 0.20 to 0.40; P = 1.93e-06) (Fig. 2, Table 1).
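The estimates above are reported on the log risk ratio scale; exponentiating them gives risk ratios. As a small worked example (the helper name `to_rr` is hypothetical), the BMI estimate converts as follows, which agrees with the BMI risk ratio quoted in the Discussion.

```python
import math

def to_rr(log_rr, ci_low, ci_high):
    """Convert a log risk ratio and its confidence limits to the RR scale."""
    return tuple(round(math.exp(x), 2) for x in (log_rr, ci_low, ci_high))

bmi_rr = to_rr(0.40, 0.32, 0.47)
print(bmi_rr)  # (1.49, 1.38, 1.6)
```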

Over 50% of the exposures (N = 31) which passed our P-value threshold for multiple testing were found to have heterogeneous effects between instruments using the maximum likelihood method. Of these, most (N = 24) were traits related to body size (mass and adiposity). The remaining heterogeneous traits were: “basal metabolic rate” (PHet = 3.71e-03); “warfarin treatment” (PHet = 5.66e-40); “Height” (PHet = 1.58e-03); “Standing height” (PHet = 4.61e-06); “Comparative height size at age 10” (PHet = 1.93e-06); “Impedance of leg (right)” (PHet = 4.23e-06) and “Impedance of leg (left)” (PHet = 9.96e-21). These findings are consistent with our IVW and MR-Egger heterogeneity analyses (Table 1).

MR-Egger estimates indicated strong evidence of horizontal pleiotropy for “Qualifications: None of the above” (intercept = -5.69e-04, P = 3.35e-02), “Impedance of leg (right)” (intercept = 2.58e-04, P = 3.22e-04) and “Impedance of leg (left)” (intercept = 2.22e-04, P = 7.24e-03) (Table 1). The first of these traits refers to those who answered “None of the above” in the self-report questionnaire on education in UK Biobank (“College or University degree”, “A levels/AS levels or equivalent”, “O levels/GCSEs or equivalent”, “CSEs or equivalent”, “NVQ or HND or HNC or equivalent”, “Other professional qualifications eg: nursing, teaching”). We were unable to assess whether the “Prospective memory result” trait was pleiotropic, as this exposure was instrumented using only two SNPs. In bidirectional MR analyses, DVT was estimated to increase warfarin treatment (“Treatment/medication code: warfarin” (beta = 0.29; SE = 0.02; P = 1.79e-30)), implying reverse causation, and therefore violating MR assumptions (Table 2).

Table 2 Reverse MR of traits passing the P-value threshold from the main analysis in Table 1. Exposures highlighted in orange are referred to as "adiposity-related" in the main text

Estimated effects of BMI-driven proteins on DVT risk

Of the 57 traits estimated to influence DVT risk (Table 1, Fig. 2), 24 were adiposity-related. While adiposity is an established risk factor for DVT, the biological mechanisms underlying the effect of adiposity on DVT are not well understood. Two recent MR studies have demonstrated that BMI causally affects the levels of circulating proteins [18, 19]. We therefore used a two-sample MR mediation analysis to test whether altered levels of 15 of these circulating blood proteins, driven by adiposity, are responsible for this association. Three of these proteins were estimated to influence DVT risk: Neurogenic locus notch homolog protein 1 (NOTCH1; WR Log RR: 0.57, 95% CI: 0.45 to 0.68; P = 1.12e-23), Plasminogen activator inhibitor-1 (PAI-1; WR Log RR: 0.42, 95% CI: 0.30 to 0.54; P = 4.27e-12) and Inhibin beta C chain (INHBC; WR Log RR: -1.18, 95% CI: -2.18 to -0.69; P = 0.002). Mediation analysis was performed for PAI-1 (the only protein where BMI-protein and protein-DVT effect estimates were consistent in directionality): the proportion of the BMI-DVT effect mediated by PAI-1 was estimated to be 18.56% (Table 3, Fig. 3, Supplementary Table 3).
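As a rough illustration of how a proportion mediated can be obtained in a two-step MR, the sketch below uses the common product-of-coefficients approach: the indirect effect is the exposure-mediator estimate multiplied by the mediator-outcome estimate, divided by the total effect. Whether the study used exactly this calculation is an assumption; the function name is hypothetical, and the BMI to PAI-1 estimate is a placeholder rather than a value from the study, so the output is not expected to reproduce the reported 18.56%.

```python
def proportion_mediated(beta_exposure_mediator, beta_mediator_outcome, beta_total):
    """Product-of-coefficients proportion mediated (indirect / total effect)."""
    indirect = beta_exposure_mediator * beta_mediator_outcome
    return indirect / beta_total

beta_bmi_pai1 = 0.10  # hypothetical placeholder, NOT a value from the study
beta_pai1_dvt = 0.42  # PAI-1 -> DVT Wald ratio (log RR) reported in the text
beta_bmi_dvt = 0.40   # BMI -> DVT IVW estimate (log RR) reported in the text

share = proportion_mediated(beta_bmi_pai1, beta_pai1_dvt, beta_bmi_dvt)
```

With the placeholder value the share comes to roughly 10.5%; substituting the BMI to PAI-1 estimate from Table 3 would give the study's figure if the same formula was used.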

Table 3 Mediation MR analysis of BMI-associated protein levels on DVT passing the multiple testing P-value threshold (0.003), with a two-step MR of the indirect effect of BMI on DVT through protein levels and proportion mediated (%) by PAI-1
Fig. 3 A many-to-one forest plot of the three BMI-associated proteins which passed the multiple-testing corrected P-value threshold (0.003) in the MR analysis. Each protein is accompanied by two additional descriptive columns (type of analysis conducted and P-value), while the effect is displayed to the right, alongside the confidence intervals (Beta coefficient/Log RR ± 95% CI). Effect sizes of BMI on proteins taken from Goudswaard et al. [18] and Zaghlool et al. [19]

Conditional and colocalization analyses

Seven of the 57 traits in the MR-PheWAS and three proteins from the pQTL MR analyses could be instrumented using only one genetic variant, and therefore required conditional and colocalization analyses to provide additional evidence of causality. There were no secondary signals after conditioning on the top SNP for each exposure-DVT pair. There was evidence of a shared causal variant for PAI-1 (PP.S = 97.5%), strengthening the evidence that there is a true causal relationship between the levels of this protein and DVT (Table 4, Fig. 4). For the other traits, colocalization did not provide evidence of a shared causal variant, so we could not rule out that the effects seen in the MR were due to confounding by LD; in contrast to the PAI-1 findings, this limits the evidence for a causal effect of those traits on DVT.

Table 4 Colocalization analysis results for exposures instrumented through only one SNP
Fig. 4 LocusZoom plots of a 1 Mb region around the SNP used to proxy PAI-1 in both exposure (A) and outcome (DVT, B) data. The x-axis represents the position within the chromosome, while the y-axis is the -log10 of the P-value. Each dot is a SNP, and the colours indicate how much LD there is between the reference SNP and the other genetic variants

Discussion

With the aim of identifying novel causal risk factors for DVT, we performed a hypothesis-free MR-PheWAS of the effects of 973 exposures on DVT, of which 57 passed a conservative P-value threshold for evidence of causality. We confirmed causality for several previously established risk factors for DVT (such as BMI and height) and have identified several novel putative causal risk factors (such as hyperthyroidism and varicose veins). Of the 57 exposures estimated to influence DVT risk, 24 were adiposity-related traits. Therefore, we investigated whether the impact of adiposity on DVT is mediated by circulating proteins known to be altered by BMI [18, 19]. Here, we provide novel evidence that the circulating protein PAI-1 has a causal role in DVT aetiology and is involved in mediating the BMI-DVT relationship.

Height has been previously associated with increased DVT risk [46] and our results align with this finding. With increased height, a greater volume of blood is required, which can increase the stress on blood vessels, disrupting haemostasis [46]. Fat-free mass was also estimated to increase risk of DVT in our study. While counterintuitive, this effect could be mediated through height, as taller people usually have more fat-free mass [44, 45]. As expected, many body-size-related traits showed evidence of heterogeneity, likely due to the large number of SNPs used to instrument these traits and the many underlying biological pathways explaining variation in adiposity.

Venous blood stasis caused by immobility is also a known risk factor for DVT [3]. Here, we report evidence that long-standing illness, disability, or infirmity increases DVT risk. A proposed mechanism is stasis of blood flow in the veins, which can be due either to a particular neurological condition or to paralysis of the lower limbs [47].

Our study also provides evidence for novel DVT risk factors. Hyperthyroidism has previously been proposed to contribute to DVT, as indicated by a recent systematic review and meta-analysis of cohort studies showing an association with DVT (RR: 1.33, 95% CI: 1.28 to 1.39; I2 = 14%) [48]. In the present study, we provide novel evidence for a causal effect of hyperthyroidism/thyrotoxicosis on DVT risk (IVW RR: 10.91, 95% CI: 3.97 to 18.17; P = 3.14e-25). The underlying mechanism is not fully understood but may involve thyroid hormones (THs) promoting a hypercoagulable state and venous thrombi formation by increasing plasma concentrations of factor VIII, fibrinogen, PAI-1 and vWF [49]. The TH thyroxine (T4) may also directly enhance platelet function through integrin αvβ3 [50]. In addition, THs enhance basal metabolic rate (BMR) and thermogenesis, both of which affect body weight. Indeed, we found that an increase in basal metabolic rate is associated with DVT. While a higher BMR should lead to lower BMI and thus lower DVT risk, our results may be explained by the hyperthyroidism-associated mechanisms outlined above.

Our MR estimates also support evidence of a causal association between varicose veins and increased risk of DVT. Varicose veins can result in the inability of the blood to fully return to the heart, leading to the enlargement of the veins and, in time, potentially an increased risk of DVT due to stasis [51]. Varicose veins have been outlined as a possible risk factor in general practice patients in Germany [52], as well as in a Chinese retrospective study of over 100,000 people [51].

COPD was also associated with an increased risk of DVT. COPD is a severe chronic respiratory disease that has been studied extensively for its role in PE [53]. Indeed, both PE and DVT are more prevalent and underdiagnosed in people with COPD [54]. However, our colocalization analysis did not provide evidence that would support our MR estimates. Moreover, as the SNP used to proxy for COPD (rs9579496) is intergenic (i.e. located between genes), we were unable to compare our results with any locus-specific experimental studies.

Finally, as adiposity is an established risk factor for DVT, the estimates we observe between adiposity-related traits and DVT most likely reflect true causal relationships. The estimate we report here for BMI (RR: 1.49, 95% CI: 1.38 to 1.60; P = 3.14e-25) is consistent with a previous MR study conducted in individuals of Danish descent (OR: 1.57, 95% CI: 1.08 to 1.97; P = 3e-03) [10]. In addition, our results are in agreement with the estimated effect of BMI on VTE in the FinnGen consortium (MR RR: 1.58, 95% CI: 1.28 to 1.95; P = 2.00e-05) [44]. Higher adiposity is associated with dysregulated metabolism, which is one factor that can promote a hypercoagulable state and impair venous return, increasing the chance of thrombi formation [55]. Given that 42% of the traits we found to be associated with DVT were adiposity-related, and that previously we and others found that adiposity is associated with changes to the circulating proteome [18, 19], we hypothesised that adiposity-driven changes to the circulating proteome may promote DVT. BMI-driven candidates include proteins that can modulate coagulation (anti-thrombin III, PAI-1) [56, 57], platelet function (adiponectin, IGFBP/IGF) [58] and/or thrombosis (galectin-3) [59].

Using our MR approach, we were able to estimate the effect of 15 BMI-driven circulating proteins on DVT risk. Our analyses suggest a causal role for three of these proteins (NOTCH1, PAI-1 and INHBC). Given the established role of some of the circulating proteins (e.g. anti-thrombin III [56]) in coagulation and thrombosis, the lack of evidence for an effect is surprising. This could represent a true null result or reflect our limited ability to instrument circulating proteins using single SNPs.

PAI-1 was the only protein for which evidence was directionally consistent with mediation of the BMI-DVT relationship (circulating levels of PAI-1 were positively associated with BMI and with DVT). A study using data from the Million Veteran Program to identify novel VTE risk factors has also confirmed colocalization with DVT for the same PAI-1 SNP (rs6993770, ZFPM2 locus) used in our analysis [60]. Klarin et al. previously identified in their MR analysis that rs4602861 (ZFPM2 locus), which is in LD with the PAI-1 SNP used here (r2 = 0.93), increased the risk of VTE (OR: 1.08, CI: 1.03–1.15) [61]. In addition to replicating this previous finding, we have also shown that this locus increases DVT risk through regulating PAI-1 levels. Moreover, PAI-1 has been associated with an increase in VEGF levels [62,63,64], which was found to increase the risk of VTE in a previous MR study [65], further adding to the evidence that PAI-1 is involved in DVT development. A follow-up analysis in a murine model found that PAI-1-overexpressing mice had 1.5-fold larger thrombus size compared to PAI-1−/− mice [60]. Moreover, a recent observational study conducted in inhabitants of Tromsø, Norway (cases = 383, controls = 782) found that PAI-1 increased the risk of future VTE, and that PAI-1 mediated ~ 15% of the obesity-VTE relationship [66], a number comparable to our MR estimate (18.6%). These results are consistent with the known role for PAI-1 in inhibiting fibrinolysis (breakdown of a clot) [67]. In addition, PAI-1 expression has been previously found to be associated with DVT formation in mice [67] and in humans after total hip arthroplasty [57]. PAI-1 overexpression is enhanced in visceral fat tissue [68], and while waist-to-hip ratio (WHR) is highly correlated with visceral fat [69], we did not find evidence of an effect of WHR on DVT (Supplementary Table 4).
Finally, there has been extensive research into PAI-1 drug targets, ranging from synthetic peptides, RNA aptamers to monoclonal antibodies [70]. Rosuvastatin, an HMG-CoA reductase inhibitor, has been found to inhibit PAI-1 in vitro [71]. Randomised clinical trials using rosuvastatin have confirmed that it reduced occurrence of symptomatic venous thromboembolism [72] and increased plasma fibrinolytic potential [73], supporting a role for statins in VTE treatment and prevention, possibly via altered PAI-1.

Although we found evidence for a role of INHBC and NOTCH1 in DVT risk, estimates were inconsistent with mediation of the BMI-DVT relationship. We found that circulating INHBC levels were negatively associated with DVT, suggesting circulating levels of INHBC may have a protective effect. Inhibins are part of the growth and differentiation superfamily of transforming growth factor beta (TGF-β) [74] and play a role in inhibiting the levels of follicle-stimulating hormone (FSH) produced by the pituitary gland [75]. Although we did not find evidence of causality between FSH and DVT, a recent study showed that FSH can enhance thrombin generation [76]. This discrepancy could be due to INHBC acting through a different pathway compared to FSH. With regard to NOTCH1, we found that higher expression was associated with an increased risk of DVT. NOTCH1 plays a role in responses to microenvironmental conditions and in vascular development, and is a shear stress and flow sensor in the vasculature [77]. While NOTCH targeting has not been done in relation to VTE, current small-molecule drugs such as Crenigacestat [78] and targeting antibodies such as Brontictuzumab [79] are being used in clinical trials to inhibit NOTCH signalling for the treatment of T-cell acute lymphoblastic leukaemia and solid tumours, respectively [80]. Nevertheless, the pQTLs for these two proteins had a stronger association with DVT, and this might indicate reverse causation, horizontal pleiotropy or measurement error in the exposure (i.e. protein levels) [81, 82]. Therefore, the results for INHBC and NOTCH1 should be interpreted with caution, particularly as the colocalization analysis did not provide evidence for a shared signal between these two proteins and DVT at the instrumenting SNPs, making it more likely that these results are due to confounding by LD [41].

There are some limitations to our approach. Firstly, although the number of traits in MR-Base is large and continues to grow, and the approach was undertaken in a hypothesis-free manner, we were limited by the traits available in the platform at the time of the analysis. In addition, the availability of genetic instruments for some traits within the platform is limited, meaning a false null finding could be reported. While the number of exposures in OpenGWAS/MR-Base allows for a large analysis of aggregated data, this can also come at the cost of being limited by the GWAS data present in the database. For example, the COPD trait used here had only one instrument, while a more recent GWAS of COPD done in UKBB identified 82 associations with COPD [83]. Moreover, some of the exposures did not have a SNP or proxy present in the outcome (DVT) dataset, making it infeasible to perform MR analysis. Finally, we have chosen to investigate risk factors for DVT as opposed to PE (which is observed in about 40% of DVT cases [84]) to increase our power to detect causal risk factors. Future analyses could focus on PE specifically to identify predictive risk factors for this outcome.

In summary, we have confirmed the estimated effects of previously identified traits on DVT (e.g. adiposity-related traits and height) and identified novel putative risk factors for the disease (e.g. hyperthyroidism and varicose veins). We also provide evidence that the relationship between adiposity and DVT is partly mediated by dysregulated levels of a circulating protein (PAI-1). These findings improve the understanding of DVT aetiology and have notable clinical significance, particularly in regard to hyperthyroidism and PAI-1.