Risk prediction of recurrent venous thromboembolism: a multiple genetic risk model

A single genetic biomarker is unable to accurately predict the risk for venous thromboembolism (VTE) recurrence. We aimed to: (a) develop a multiple single nucleotide polymorphisms (SNPs) model to predict the risk of VTE recurrence and (b) validate a previously described genetic risk score (GRS) and compare its performance with the model developed in this study. Twenty-two SNPs, including established and putative SNPs associated with VTE risk, were genotyped in the Malmö thrombophilia study cohort (MATS; n = 1465, follow-up ~ 10 years) by using TaqMan PCR. Out of 22-SNPs, 12 had an association with the risk of VTE recurrence and were included for calculating GRSs. The risk of VTE recurrence was calculated by stratifying patients according to number of risk alleles. In 12-SNP GRS, patients with ≥ 7 risk alleles were associated with higher risk of VTE recurrence compared to patients having ≤ 6 risk alleles. In a simplified model (8-SNP GRS), the discriminative power of 8-SNP GRS was similar to that of 12-SNP GRS based on post-test probabilities (PP). Furthermore, 8-SNP GRS further improved the risk prediction of VTE recurrence in unprovoked VTE and male patients (PP% = 15.4 vs 8.3, 17.1 vs 7.2 and 19.0 vs 7.1 for high risk groups vs low risk groups in whole population, males and unprovoked VTE patients respectively). In addition, we also validated previously described 5-SNP GRS in our cohort and found that the 8-SNP GRS performed better than the 5-SNP GRS in terms of higher PP. Our results show that a multiple SNP GRS consisting of 8-SNPs may be an effective model for prediction of VTE recurrence, particularly in unprovoked VTE and male patients. Electronic supplementary material The online version of this article (10.1007/s11239-018-1762-7) contains supplementary material, which is available to authorized users.


Highlights
• A single genetic biomarker is unable to accurately predict the risk for VTE recurrence • We genotyped important genetic variants in multiple genes associated with VTE in a prospective follow-up study of 1465 objectively diagnosed VTE patients. • Our results showed that a genetic risk score (GRS) consisting of 8-single nucleotide polymorphisms (SNPs) can be an effective model for the prediction of VTE recurrence. • We also validated previously described 5-SNP GRS in Swedish population and showed that 8-SNP GRS developed in this study had modestly improved discriminating performance than previously described 5-SNP model based on the post-test probabilities of high and low risk groups.

Introduction
Venous thromboembolism (VTE) that comprises deep vein thrombosis (DVT) and pulmonary embolism (PE) is associated with significant rate of morbidity, mortality, substantial health-care costs and high rate of recurrence. In Europe, an estimated annual incidence rate of VTE ranges from 1 to 2 per 1000 person-years that is comparable to the incidence rate of stroke [1][2][3]. Patients diagnosed with primary VTE are always at the risk for recurrence, irrespective of the time elapsed since the primary event, nevertheless, the risk of recurrence does decline over time. About 20-30% of primary VTE patients develop recurrence within 5-years after diagnosis [4,5].
The risk of recurrence in unprovoked primary VTE patients is 2-3 fold higher as compared to provoked VTE i.e., VTE with transient risk factors (surgical intervention, immobilization or cast therapy within the last month, female hormone therapy, use of contraceptives pills, current pregnancy and postpartum period [first 6 weeks after delivery]) and malignancies [5][6][7][8]. Taking into consideration the high rates of morbidity and mortality, prediction and prevention of VTE recurrence is indispensable. Continued anticoagulation after the primary VTE decreases the risk of recurrence, however, it should be carefully weighed against the risk of serious side effects i.e. anticoagulant-related bleeding [9][10][11].
VTE is a multifactorial and complex disease influenced by several genetic and non-genetic risk factors. If the cause of primary VTE is genetic, the risk of recurrence is lifelong. Familial studies have estimated the heritability of VTE at about 50-60% [12,13]. Previously known genetic risk factors i.e. heritable thrombophilia: factor V Leiden, factor II G20210A mutation, protein C, protein S and antithrombin deficiency, does not appear to predict the risk for recurrent VTE [14]. Likewise, Bezemer et al., and Sundquist et al., reported that despite the presence of known genetic risk factors, family history is still a significant risk factor for primary and recurrent VTE, suggesting the presence of additional unidentified genetic risk factors [15,16].
Unfortunately, no single genetic biomarker can accurately predict the risk of VTE recurrence, therefore, it necessitates the risk classification based on multiple genetic biomarkers to precisely predict the risk of VTE recurrence [17,18]. For the risk prediction of VTE recurrence, Van Hylckama Vlieg et al. analyzed 31-SNPs from 22 genes that have been previously investigated for their role in risk assessment of primary VTE [19]. They proposed a simplified genetic risk score model (5-SNP GRS) i.e. rs6025, rs1799963, rs8176719, rs2066865, and rs2036914, which could adequately stratify patients with high and low risk of VTE recurrence. However, this model remains to be validated in an independent cohort.
To examine as to what extent the VTE related genetic variants can be used to predict the risk of VTE recurrence in whole population as well as in high-risk groups, i.e., unprovoked and in male VTE patients, we investigated 22-SNPs in 16 VTE associated genes. These genes are associated with various biological pathways (lipid metabolism, thrombosis and hemostasis, inflammation etc.) implicated in the pathogenesis and/or pathophysiology of VTE [20][21][22][23][24][25]. In this study, we analyzed these 22 genetic variants in the MATS cohort to: (a) develop a multiple SNPs model to predict the risk of VTE recurrence and (b) validate a previously described genetic risk model and compare its performance with a model developed in this study.

Study population
Malmö thrombophilia study (MATS; n = 1465) is a population based, prospective follow up study in which objectively diagnosed VTE patients at Skåne University Hospital Malmö, were included. MATS patients were followed from the time of inclusion until diagnosis for VTE recurrence, death or end of the study (1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008). The study design and complete information about MATS (including treatment, inclusion, exclusion criteria etc.) has been described elsewhere [26,27]. In short, inclusion criteria for MATS were: age > 18 years, objective diagnosis for VTE and/or PE with one or more of the following methods i.e. phlebography, computed tomography (CT), lung scintigraphy, duplex ultrasonography and magnetic resonance imaging (MRI) and willing to participate in MATS. The rate of consensual participation in MATS was 70% and remaining 30% patients did not participate due to one or more of the following reasons: language problems, presence of other severe diseases, dementia and/or unwillingness to participate in MATS. Thrombophilia was defined as the presence of the factor V Leiden (FVL, rs6025) or factor II G20210A mutation (rs1799963) or a level below the laboratory reference range of protein C (< 0.7 kilo international unit (kIU)/L) or free protein S (female < 0.5 kIU/L, male < 0.65 kIU/L) or antithrombin (< 0.82 kIU/L) in VTE patients without anticoagulant treatment. MATS patients were treated according to the standard treatment protocol at Malmö University Hospital, i.e., low-molecular weight heparin (LMWH) or unfractionated heparin (UFH) during the initiation of oral anticoagulants (until international normalized ratio [INR] value is ≥ 2.0 but at least 5 days). According to Malmö University Hospital treatment protocol recommendations, VTE patients will get 3-6 months of oral anticoagulant therapy for firsttime VTE with the consideration of extension of treatment if VTE recurrence occurs. Family history of VTE was defined as a history of VTE in first-degree relatives (son/daughter, sibling, or parent). Data on recurrent VTE and mortality was obtained from the patient files. Unprovoked VTE was defined as VTE in patients without acquired risk factors for VTE (surgical intervention, immobilization or cast therapy within the last month, female hormone therapy, use of contraceptives pills, current pregnancy and postpartum period [first 6 weeks after delivery]) and malignancies diagnosed prior to or at diagnosis of the first VTE event.
The follow-up period in MATS (mean ± SD, 3.9 ± 2.5 years) was calculated after stopping the anticoagulant treatment and until the diagnosis for VTE recurrence, death of the patient or end of the study (December 2008).

Ethical permission
All the patients provided a written consent according to the declaration of Helsinki. The study was approved by the Medical Ethical Committee of the Lund University.

DNA extraction and genotype determination
From the whole blood, genomic DNA was extracted by using QiAmp 96 DNA Blood Kit (Qiagen, Hilden, Germany) according to the supplier's instructions. Pre-designed or, in some cases, custom made TaqMan® Genotyping Assays were designed and used for genotyping (with VIC and FAM probes). Genotyping of selected SNPs was performed by TaqMan PCR according to the manufacturer's protocol (Applied Biosystems, Life Technologies Corporation, Carlsbad, CA, USA). To determine various alleles in investigated polymorphisms, BioRad CFX manager software was used. A full list of SNPs included in this study is presented in Table 2.
A chromogenic method using Berichrom® Protein C reagent (Siemens Healthcare Diagnostics, Upplands Väsby, Sweden) was used to analyse the Protein C activities [28]. Analysis of free Protein S antigen concentration was performed by latex immunoassay with Coamatic® Protein S-Free (Chromogenix, Haemochrom Diagnostica AB, Gothenburg, Sweden) [29]. Antithrombin III antigen concentration was analyzed by a thrombin-based method using Berichrom Antithrombin reagent (Siemens Healthcare Diagnostics) [30].

Statistical approach
Chi square test was used to compare dichotomous variables while Student t-test (if data was normally distributed) or Mann-Whitney U test (if data was not normally distributed) to compare continuous variables. For normally distributed variable, data was presented as mean ± SD whereas for the skewed variable, data was presented as median and IQR (interquartile range). To examine individual associations of all 22-SNPs with the risk of recurrent VTE, univariate and multivariate (adjusted for age, sex and family history of VTE) Cox regression analyses were performed (Table 2). Genotypes in all SNPs were analyzed and presented either as recessive [homozygous wild type (0) plus heterozygous (1) and compared with homozygous mutated form (2)] or dominant models (homozygous wild-type compared with heterozygous plus homozygous mutated form).
The number of risk alleles of the individual SNPs with a positive association to recurrent VTE (HR > 1.2) were summed to generate a genetic risk score (10 out of the 22-SNPs were excluded because of too low HR). Three different genetic risk scores (GRSs) were developed by summing different number of SNPs; 12-SNPs; 8-SNPs and 5-SNPs, a previously described model [19]. The distribution of the number of risk alleles for 12-SNP GRS is shown in Fig. 1 separated by recurrent and non-recurrent VTE.
The three GRSs were dichotomized into two groups (low and high risk) to evaluate their predictive ability, as measured by HRs (high vs low) and post-test probabilities (for low risk group as well as for high risk group). HRs were estimated to yield a measure of the association to the risk of recurrent VTE and post-test probabilities (predictive values after fitted logistic regression models) to estimate the possibility to differentiate recurrent from non-recurrent VTE. These probabilities were compared to a pre-test probability, calculated as the prevalence of recurrent VTE (number of individuals with recurrent VTE divided by total number of individuals). The pre-and post-test probabilities are presented which show the percentages of patients at risk of VTE recurrence before and after the application of genetic risk score respectively.
Confidence intervals of post-test probabilities were calculated by normal approximation with standard errors Fig. 1 The 12-single nucleotide polymorphisms and their risk alleles distribution among recurrent and non-recurrent VTE patients estimated using the delta method. The three models were evaluated in whole population, for males and females, and for provoked and unprovoked VTE patients (Table 3).
Kaplan-Meier curves were plotted for 12-SNPs and 8-SNP GRSs to calculate recurrence-free survival in whole population and for males and unprovoked VTE patients (Fig. 2). To assess the goodness-of-fit of these models, Akaike information criteria (AIC) and Harrell's Concordance Index (Harrell's C) were calculated. STATA version 15 (StataCorp LP) and SPSS version 21 (IBM, Armonk, NY, USA) were used for all statistical analyses.

Results
Of the 1465 patients in this cohort, the patients who had one or more thrombotic events before the start of this study (n = 154) were excluded because the follow-up period was started after stopping the anticoagulant treatment and we do not have information about the treatment and followup period of pervious episodes of VTE in these patients. Among the remaining patients (n = 1311), 148 had VTE recurrence during the follow-up period. By comparing recurrent and non-recurrent VTE patients, we found that the frequency of thrombophilia was significantly higher (49.6%) in recurrent VTE patients as compared to those with non-recurrent VTE (36.9%). Moreover, 32.4% of recurrent VTE patients had a family history of VTE as compared to 23.5% in non-recurrent VTE patients. Higher number of male patients had VTE recurrence compared to the female patients, though the difference didn't reach statistical significance. However, stratification according to age showed that the number of male patients ≤ 45 years of age was significantly higher in the recurrent group compared to female patients with ≤ 45 years of age. However, no difference was found between male and female patients above > 45 years of age. No significant differences were found between recurrent and non-recurrent VTE patients in the other variables, i.e., BMI, DVT, PE, malignancy and other acquired risk factors for VTE (Table 1).
For the risk assessment analyses of recurrent VTE among the 1311 VTE patients, those patients who had VTE recurrence during anticoagulant treatment, those who died during anticoagulant treatment or those for whom complete information was missing were excluded (n = 260). Of the remaining 1051 patients, 126 (12%) had recurrent VTE and 925 were recurrence-free during the follow-up period and these were included for the risk assessment analyses.

Risk of VTE recurrence in 12-SNP GRS
The individual association of each of 22-SNPs with the risk of VTE recurrence is shown in Table 2. Among these SNPs, the strongest positive associations were observed for A2M (rs3832852), ABO (rs8176719), FGG (rs2066865) and FVL (HR 2.60, 1.97, 1.77 and 1.62 respectively, adjusted for sex, age and family history of VTE). All other SNPs showed mild, no, or negative association with the risk of VTE recurrence when analyzed individually. Furthermore, to investigate whether increasing the number of risk alleles can increase the risk of VTE recurrence, we first developed a genetic risk score based on 12-SNPs that had a significant or trend of a positive association with the risk of VTE recurrence ( Table 2). The total number of risk alleles in all VTE patients analyzed in the study ranged from 0 to 13. A trend of increasing number of patients at risk of recurrence was observed with increasing number of risk alleles (Fig. 1). Thus, by using a 12-SNP GRS, we were able to stratify patients in low and high risk groups. Patients having ≥ 7 risk alleles were at higher risk of VTE recurrence compared to those with ≤ 6 risk alleles (HR 2.06, 95% CI 1.41-2.99).
We also tested the discriminatory performance of 12-SNP GRS in terms of post-test probabilities and showed that the high risk group stratified by 12-SNP GRS had a higher probability of VTE recurrence than the low risk group (PP% for high risk group = 15.5, 95% CI 12.5-18.5 vs PP% for low risk group = 8.0, 95% CI 5.7-10.4) ( Table 3). We also adjusted the 12-SNP GRS with sex, age and family history of VTE and the results remained unchanged (Supplementary Table 1).

Risk of VTE recurrence in 8-SNP GRS
To develop a genetic risk score using a more parsimonious model, we tested various combinations by adding

Risk prediction of VTE recurrence according to sex
In sub-analyses, we stratified VTE patients according to sex to investigate the performance of 8-SNP GRS for risk assessment of VTE recurrence. Male patients having ≥ 5 risk alleles had significantly higher risk (HR 2.42, 95% CI 1.39-4.21) of VTE recurrence than those having ≤ 4 risk alleles (Reference) ( Table 3). Furthermore, the discriminating performance of 8-SNP GRS was improved in male patients (PP% for high risk group = 17.1, 95% CI 12.7-21.5) compared to the whole population (PP% for high risk   (Table 3). Further stratification of patients into three categories i.e. low, medium and high was not possible due to the low number of patients in this subgroup.

Risk prediction of VTE recurrence in unprovoked VTE patients
In the sub-analyses of unprovoked first VTE patients (n = 619), the risk of recurrence was higher in patients having ≥ 5 risk alleles (HR 2.83, 95% CI 1.73-4.61) compared to those having ≤ 4 risk alleles (reference). The discriminating performance of the 8-SNP GRS was further improved in unprovoked VTE patients in terms of post-test probabilities (PP% = 19.0, 95% CI 14.7-23.4 vs PP% = 7.1, 95% CI 4.3-10.0 for high vs low risk groups, respectively). In provoked VTE patients however, the discriminating performance of the 8-SNP GRS was not improved in the high risk group as compared to the low risk group in terms of post-test probabilities (PP% = 10.7, 95% CI 6.7-14.6 vs PP% = 10.1, 95% CI 5.9-14.3) Table 3. Further stratification of patients into three categories i.e. low, medium and high was not possible due to the low number of patients in this subgroup.

Validation of the previously described genetic risk score and comparison with the 8-SNP GRS
We also validated the previously described 5-SNP GRS, (rs6025, rs1799963, rs8176719, rs2036914 and rs2066865), for the risk assessment of VTE recurrence by van Hylckama Vlieg et al. [19]. We found that the previously described 5-SNP GRS was also efficient in discrimination of patients in the whole population (PP% for high risk group patients = 14.1, 95% CI 11.3-16.9) as well as in unprovoked VTE (PP% for high risk group patients = 17.1, 95% CI 13.1-21.1) and male VTE patients (PP% for high risk group patients = 14.6, 95% CI 10.6-18.5) ( Table 3).
Comparison between the previously described 5-SNP GRS and the 8-SNP GRS presented in this study showed that the discriminatory performance of the 8-SNP GRS was modestly improved in terms of higher post-test probabilities in the whole population  (Table 3).
To measure how well the risk score predicts the distribution of time in the Cox regression models as well as to estimate the quality of the model relative to the others, Harrell's C and AIC were calculated, respectively (data not shown).
Kaplan-Meier curves were plotted to calculate the recurrence-free survival for the 12-SNPs and 8-SNP GRS. Patients having ≥ 7 risk alleles had a shorter recurrencefree survival as compared to patients having ≤ 6 risk alleles in the 12-SNP GRS (Fig. 2a). Similar results were found in the 8-SNP GRS for the whole population as well as for unprovoked first VTE and male patients, where a high number of risk alleles was associated with a shorter recurrencefree survival than a low number of risk alleles (Fig. 2b-d  respectively).

Discussion
In the current study, we investigated the role of 22-SNPs from 16 genes for the risk assessment of recurrent VTE. We have developed a GRS based on 8-SNPs [FII (rs1799963), FV (rs6025), ABO (rs8176719), ApoM (rs805297), F11 (rs2036914), FGG (rs2066865), PAI-1 (rs1799889) and TFAM (rs1937)] and showed that this model is effective in discriminating VTE patients into high and low risk groups. The discriminatory power of the 8-SNP GRS was even stronger in male and unprovoked VTE patients. We also validated previously described 5-SNP GRS and compared it's performance with our 8-SNP GRS and the results showed that our 8-SNP GRS had a modestly improved performance in discriminating patients with a higher risk of VTE recurrence in terms of higher post-test probabilities [19].
Efforts have been made to develop risk models based on genetic risk scores for better prediction of VTE [17][18][19]31]. In agreement with our findings, some previous studies have also found that an increase in the number of risk alleles in the prediction model significantly improves the risk prediction of VTE recurrence [18,19].
Genetic variants in the 8-SNP GRS represent genes that are associated with the pathophysiology of VTE. Factor V and Factor II are integral parts of the thrombotic pathway [32]. Having an ABO blood group was reported to be involved in thrombosis, affecting factor VIII and von Willebrand factor levels [33]. FGG, being the precursor of fibrin, is an essential component of the hemostatic system [34]. In previous studies, we have shown that genetic variants and protein levels of ApoM were associated with the risk of VTE recurrence in a sex-specific manner [22,26]. Similarly, PAI-1 polymorphism has been reported by us to be associated with an increased risk of VTE recurrence in the presence of FVL [35]. We have recently shown an association between risk of VTE recurrence and SNPs in mitochondrial regulating genes, suggesting a potential role of mitochondrial function in VTE [36]. Mitochondrial transcription factor A (TFAM) has been shown to be involved in mitochondrial biogenesis that in turn affects inflammation; a well-known mechanism in thrombosis [37][38][39]. The association between TFAM polymorphism and the risk of VTE recurrence warrants, however, further investigation of mitochondrial regulating genes in VTE.
The risk of VTE recurrence is higher after stopping anticoagulant therapy and proper selection of patients with the highest risk of VTE recurrence for long-term anticoagulant therapy is important. Although FVL and FII G20210A are the most important genetic variants associated with the risk of primary VTE, both genetic variants are unable to precisely predict the risk of recurrent VTE [14,40]. We investigated 22-SNPs (including FVL and FII G20210A) related to VTE and present a GRS which can be a useful model for prediction of VTE recurrence. VTE recurrence is higher in patients with unprovoked first VTE as compared to those with provoked, i.e., those having acquired risk factors for VTE [7,41]. We investigated our 8-SNP GRS in provoked and unprovoked VTE and found that the discriminatory performance in identifying patients at higher risk of VTE recurrence was improved in terms of post-test probability in unprovoked VTE patients. However, in provoked VTE patients, 8-SNP GRS was unable to stratify patients into high and low risk of VTE recurrence. The possible explanation for these results could be that in provoked VTE patients, the cause of VTE is not genetic rather it is the provoking factor which increases the risk and once that risk factor is removed, patients are no longer at the risk of VTE recurrence. In agreement with our findings, previous studies also showed that the risk of VTE recurrence is lower in those VTE patients whom first episode was provoked by transient risk factors [42]. Furthermore, sex is an important risk factor of VTE recurrence and male patients have approximately 2-4 fold higher risk of VTE recurrence as compared to females [43][44][45], though the pathophysiology underlying this phenomenon is still unknown. Our results showed that higher number of risk alleles in male patients was associated with higher risk of VTE recurrence and had better discriminatory performance than in female patients. These findings are consistent with previous reports that male and female patients have different risk factors for VTE recurrence [46] and our findings may partly suggest an explanation for the higher risk of VTE recurrence in males.
We also validated a previously described genetic risk model in our patient cohort [19]. Furthermore, we showed that the 8-SNP GRS had a modestly better risk prediction performance in the whole population as well as those with unprovoked VTE and male patients, compared to the previously described 5-SNP GRS. Both GRSs (8-SNPs, and 5-SNPs) had five SNPs in common. The improved discriminative performance for the 8-SNP GRS may possibly be due to the inclusion of 3 additional SNPs; however, our findings need to be validated in other cohorts as well.
One of the limitation of the current study is that it does not include patients ≤ 18 years of age; therefore, our model cannot be used for patients ≤ 18 years. Secondly, the current study was performed in Malmö and surrounding areas and therefore its validity may be applied only in patients with similar ethnic background, i.e., Swedish. Nevertheless, in this study we have not only validated a previously described GRS but also improved its performance in terms of prediction value in our new model.
The common genetic variants are usually located in noncoding regions of the chromosome, which don't code for specific functional proteins and therefore may have a weaker effect on VTE. By contrast, variants at exonic regions of the chromosome, which code for functional proteins are often rare, but may have a greater impact on disease development. The recent increase in studies based on exome and/or wholegenome sequencing will further identify novel genetic risk variants, which may, together with clinical risk factors, further improve the risk prediction of VTE recurrence [47,48].
The 8-SNP GRS developed in this study can differentiate patients into low and high risk of VTE patients. If validated, this model might be integrated for consideration in designing individualized therapeutic strategies for VTE patients; low risk group may stop anticoagulant treatment after initial treatment period while high risk group may continue anticoagulant therapy for a longer period.
In conclusion, we showed that 8-SNP GRS can be an effective model for prediction of VTE recurrence. Its performance was further improved in patients with high risk of VTE recurrence i.e., males and unprovoked VTE patients. However, to confirm the rationality of the presented model, validation in other cohorts is needed. distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.