In the ACCORD trial, 6537 (63.8%) participants had no CVD at baseline, of whom 6466 (63.1%) had <10% missing data and were included in the analyses. A detailed description of the covariates and inclusion/exclusion criteria is provided in ESM Table 1. A total of 44 variables were entered into the semi-supervised clustering procedure to identify the 20 most significant variables for use in the clustering analysis (Fig. 1). The validation analysis included 4211 of 4906 (85.8%) participants with available follow-up data and no prior CVD who were enrolled in the Look AHEAD trial and 1495 of 2368 (63.1%) participants enrolled in the BARI 2D trial. Baseline characteristics of the derivation and validation cohorts are provided in ESM Table 2.
Determination of the optimal phenomapping strategy and development of a phenogroup classifier
When comparing the four phenomapping methods, clustering with FMMs and three phenogroups had the best-performing internal validation metrics including lowest BIC and highest Dunn index (ESM Table 3, ESM Fig. 2). When phenogroup membership was added to the PCE risk score, the FMM method with three clusters had the greatest improvement in model discrimination (ESM Table 3). Similar to the derivation cohort, the FMM method was the top-performing phenomapping method in the external validation cohort (Look AHEAD) (ESM Table 4).
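As a rough illustration of the two internal validation metrics named above, the following Python sketch computes a BIC from a fitted model's log-likelihood and a Dunn index from labelled points. The function names, the toy data and the choice of Euclidean distance are assumptions made for illustration only; they are not taken from the study's analysis code.

```python
import math
from itertools import combinations

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better trade-off of fit and complexity."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest within-cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    # Largest distance between two points that share a cluster
    max_diameter = max(
        (math.dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points from different clusters
    min_separation = min(
        math.dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1
        for b in c2
    )
    if max_diameter == 0.0:
        return math.inf  # every cluster is a single point (or coincident points)
    return min_separation / max_diameter

# Hypothetical toy data: three tight, well-separated clusters
toy_points = [(0, 0), (0, 1), (5, 0), (5, 1), (10, 0), (10, 1)]
toy_labels = [0, 0, 1, 1, 2, 2]
toy_dunn = dunn_index(toy_points, toy_labels)
```

A higher Dunn index indicates compact, well-separated clusters, whereas a lower BIC indicates a better-fitting model after penalising for the number of parameters, which is why the two are read in opposite directions.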
Using the FMM phenogroups derived in the derivation cohort, we developed an MLR classifier to predict phenogroup membership for individuals (ESM Table 5). Using the MLR classifier, each participant in the validation cohort was successfully matched into one of the three previously defined clusters. We observed high agreement between the MLR classifier and phenomapping in the Look AHEAD dataset (Cohen’s κ = 0.91 [95% CI 0.88, 0.93]) and a minimum of ten non-missing covariates was needed to achieve an 80% accurate prediction in the MLR classifier (ESM Fig. 3).
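For readers less familiar with the agreement statistic reported above, the sketch below shows how Cohen's κ can be computed from two sets of phenogroup assignments. The function name and the ten example assignments are hypothetical and serve only to illustrate the calculation; they are not study data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Proportion of items on which the two assignments agree
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from the marginal label frequencies
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical assignments for ten participants (not study data)
classifier_groups = [1, 1, 2, 2, 3, 3, 3, 2, 1, 3]
clustering_groups = [1, 1, 2, 2, 3, 3, 3, 3, 1, 3]
kappa = cohens_kappa(classifier_groups, clustering_groups)
```

In this toy example the two assignments agree for nine of ten participants, but κ is lower than 0.9 because part of that agreement would be expected by chance alone.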
Prognostic utility of the FMM phenogroups in improving atherosclerotic cardiovascular disease risk prediction
Over 9.1 years of follow-up, 789 (12.2%) and 963 (14.9%) participants had a primary outcome and all-cause mortality event, respectively. Phenogroup membership from the FMM clustering method was significantly associated with risk of the primary composite outcome and all-cause mortality on follow-up, with a graded decrease in risk from phenogroup 1 (highest risk) to phenogroup 3 (lowest risk) (ESM Fig. 4). The mean PPV of the SNDR risk score was 0.16 (95% CI 0.14, 0.19). Addition of the FMM phenogroup significantly increased the mean PPV (PPV = 0.21 [95% CI 0.20, 0.23]). Similarly, in decision curve analysis, phenogroup membership also improved the prognostic utility of both risk scores (ESM Fig. 5). In the Look AHEAD external validation cohort, addition of the FMM phenogroup to the SNDR risk score significantly increased the mean PPV (increase from 0.14 [0.12, 0.17] to 0.16 [0.14, 0.19]). A similar improvement was observed in decision curve analysis (ESM Fig. 5).
Characterisation of FMM-based phenogroups
Baseline characteristics of study participants across the three phenogroups as determined by the FMM method are shown in Tables 1 and 2. Compared with the other groups, phenogroup 1 participants (n = 663, 10.3%) were more likely to be men, to be of self-reported black race and to have a higher burden of traditional cardiovascular risk factors, including higher BP, smoking prevalence, HbA1c, fasting blood glucose, low-density lipoprotein cholesterol and triacylglycerol levels. Phenogroup 2 (n = 2388, 36.9%) comprised participants with an intermediate burden of CVD risk factors. In contrast, phenogroup 3 (n = 3415, 52.8%) participants had the lowest burden of CVD risk factors, with lower BP, low-density lipoprotein cholesterol levels, HbA1c and smoking prevalence. The prevalence of type 2 diabetes-related complications such as foot ulceration, lower extremity amputations, proteinuria and eye surgery history was highest in phenogroup 1 and lowest in phenogroup 2 members. The pattern of baseline characteristics was similar when phenomapping was performed separately in men and women (ESM Table 6). The pattern of baseline characteristics across the FMM-based phenogroups in the Look AHEAD and BARI 2D validation cohorts was mostly similar to that observed in the derivation cohort (ESM Tables 7, 8).
Table 1 Baseline demographic and clinical characteristics of the study participants stratified by FMM-based phenogroup in the derivation cohort (ACCORD)
Table 2 Baseline ECG, laboratory and diabetes complication characteristics of the study participants stratified by FMM-based phenogroup in the derivation cohort (ACCORD)
Effect modification by FMM-based phenogroups for cardiovascular benefits associated with different therapies in type 2 diabetes
The proportion of participants randomised to intensive glycaemic control was similar across the three phenogroups in the derivation cohort (Table 1). We observed a significant interaction between randomisation to intensive glycaemic control (vs standard) and phenogroup for the risk of the primary composite outcome (p-interaction = 0.042) (Fig. 2). Intensive glycaemic control was associated with lower risk of the primary composite outcome in phenogroup 3 (adjusted HR [aHR] 0.65; 95% CI 0.51, 0.83; p value <0.001) and there was no significant association with risk in phenogroup 1 (aHR 1.25; 95% CI 0.91, 1.77; p value = 0.19). A similar pattern of association was noted with randomisation to combination lipid therapy with a significant treatment interaction by phenogroup membership (p-interaction < 0.001). Specifically, combination lipid therapy was associated with significantly lower risk in phenogroup 3 (aHR 0.71; 95% CI 0.52, 0.98; p value = 0.04) and higher risk in phenogroup 1 (aHR 1.49; 95% CI 0.98, 2.24; p value = 0.06). There was no significant treatment effect of intensive BP control on risk of the primary composite outcome across the phenogroups (Fig. 2). Event rates and number of events prevented across treatment groups and phenogroups are shown in ESM Table 9. Among participants in phenogroup 1, the risk of all-cause mortality was numerically higher in the intensive vs standard glycaemic control groups (28.5% vs 25.8%, respectively; p value = 0.34). In contrast, the risk of all-cause mortality was lower in the intensive vs standard glycaemic control group in phenogroup 3 (8.9% vs 10.9%, respectively; p value = 0.05) (ESM Fig. 6).
In the Look AHEAD external validation cohort, we observed a significant interaction between intensive lifestyle intervention (ILI) therapy (vs standard) and phenogroup for the risk of the primary composite outcome (p-interaction = 0.002) (Fig. 2). Specifically, ILI was associated with a lower risk of the primary composite outcome in phenogroup 3 (aHR 0.77; 95% CI 0.61, 0.98; p value = 0.03) and higher risk in phenogroup 1 (aHR 1.58; 95% CI 1.10, 2.47; p value = 0.03).
In the BARI 2D external validation cohort, we similarly observed a significant interaction between early coronary revascularisation (vs medical therapy) and phenogroup for the risk of the primary composite outcome (p-interaction = 0.003) (Fig. 2). Early revascularisation was associated with lower risk of the primary composite outcome in phenogroup 3 (aHR 0.64; 95% CI 0.47, 0.91; p value = 0.008) and higher risk in phenogroup 1 (aHR 1.90; 95% CI 1.07, 3.35; p value = 0.03).
Association of FMM-based phenogroups with SH events
In the derivation cohort (ACCORD trial), there were a total of 1478 SH events over a median follow-up of 8.8 (IQR = 5.7–10.1) years. The mean number of cumulative SH events per participant was highest in phenogroup 1 (0.32 events) followed by phenogroup 2 (0.26 events) and phenogroup 3 (0.19 events) (p value = 0.0004) (Fig. 3a). In adjusted analysis, the incidence rate ratio of an SH event was significantly higher in phenogroups 1 and 2 as compared with phenogroup 3 (phenogroup 1: 0.41; 95% CI 0.13, 0.69; p value = 0.005; and phenogroup 2: 0.22; 95% CI 0.04, 0.41; p value = 0.03). There was no significant interaction between phenogroup membership and intensive glycaemic control for the risk of SH events (p-interaction = 0.68). Additional adjustment for baseline glucose-lowering medications did not attenuate the associations between phenogroups and SH events.
In the Look AHEAD validation cohort, there were a total of 637 SH events over a median follow-up of 9.5 (IQR = 8.8–10.3) years. Similar to the derivation cohort, the mean cumulative number of SH events per participant was highest in phenogroup 1 (0.21 events) followed by phenogroup 2 (0.17 events) and phenogroup 3 (0.15 events) (p value = 0.03). In adjusted analysis, the incidence rate ratio of an SH event was significantly higher in phenogroup 1 (0.33; 95% CI 0.11, 0.54; p value = 0.002) as compared with phenogroup 3. There was no significant difference in SH events between phenogroups 2 and 3 and no significant interaction between phenogroup membership and randomisation to ILI for the risk of SH events (p-interaction = 0.40).
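The adjusted estimates reported for SH events are positive values below 1 that are described as indicating higher incidence, which would be consistent with coefficients on the log scale rather than ratios. That reading is an assumption; the text does not state the scale. Under that assumption, the sketch below shows how such a coefficient and its confidence limits would be converted to the ratio scale. The function name is hypothetical.

```python
import math

def exponentiate_estimate(coef, ci_low, ci_high):
    """Convert a log-scale coefficient and its CI limits to the ratio scale.

    Assumes the inputs are log incidence rate ratios, which the source
    text does not confirm.
    """
    return tuple(round(math.exp(value), 2) for value in (coef, ci_low, ci_high))

# Values as reported for phenogroup 1 vs phenogroup 3 in the derivation cohort
ratio_scale = exponentiate_estimate(0.41, 0.13, 0.69)
print(ratio_scale)  # (1.51, 1.14, 1.99)
```

If the assumption holds, an estimate of 0.41 would correspond to roughly a 1.5-fold higher incidence rate, and a confidence interval that excludes 0 on the log scale excludes 1 on the ratio scale.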
Association of FMM-based phenogroups with treatment non-adherence
In the derivation cohort, phenogroup 1 participants also had the highest rates of medication non-adherence, with a mean of 2.09 visits during follow-up with <80% medication adherence compared with 1.09 and 1.08 visits for phenogroups 2 and 3, respectively (p value = 0.008) (Fig. 3b). In adjusted analysis, the incidence rate ratio of medication non-adherence was significantly higher in phenogroup 1 compared with phenogroup 3 (0.35; 95% CI 0.10, 0.60) but not compared with phenogroup 2 (0.05; 95% CI –0.12, 0.29; p value = 0.65). Among phenogroup 1 participants randomised to either intensive glycaemic control or combination lipid therapy, medication non-adherence was significantly higher in individuals with a primary composite outcome event on follow-up compared with those without (ESM Table 10).
In the Look AHEAD validation cohort, similar to the derivation cohort, phenogroup 1 had the highest rate of treatment non-adherence, with a mean of 26.5% of participants attending <80% of intervention clinic visits compared with 17.9% and 16.4% in phenogroups 2 and 3, respectively (p value < 0.001). As in the derivation cohort, among individuals in phenogroup 1 randomised to ILI, treatment non-adherence was significantly higher in those with a primary composite outcome event than in those without (ESM Table 10).