Clinical variable-based cluster analysis identifies novel subgroups with a distinct genetic signature, lipidomic pattern and cardio-renal risks in Asian patients with recent-onset type 2 diabetes

Aims/hypothesis We sought to subtype South East Asian patients with type 2 diabetes by de novo cluster analysis on clinical variables, and to determine whether the novel subgroups carry distinct genetic and lipidomic features as well as differential cardio-renal risks.
Methods Analysis by k-means algorithm was performed in 687 participants with recent-onset diabetes in Singapore. Genetic risk for beta cell dysfunction was assessed by polygenic risk score. We used a discovery–validation approach for the lipidomics study. Risks for cardio-renal complications were studied by survival analysis.
Results Cluster analysis identified three novel diabetic subgroups, i.e. mild obesity-related diabetes (MOD, 45%), mild age-related diabetes with insulin insufficiency (MARD-II, 36%) and severe insulin-resistant diabetes with relative insulin insufficiency (SIRD-RII, 19%). Compared with the MOD subgroup, MARD-II had a higher polygenic risk score for beta cell dysfunction. The SIRD-RII subgroup had higher levels of sphingolipids (ceramides and sphingomyelins) and glycerophospholipids (phosphatidylethanolamine and phosphatidylcholine), whereas the MARD-II subgroup had lower levels of sphingolipids and glycerophospholipids but higher levels of lysophosphatidylcholines. Over a median follow-up of 7.3 years, the SIRD-RII subgroup had the highest risks for incident heart failure and progressive kidney disease, while the MARD-II subgroup had a moderately elevated risk for kidney disease progression.
Conclusions/interpretation Cluster analysis on clinical variables identified novel subgroups with distinct genetic and lipidomic signatures and varying cardio-renal risks in South East Asian participants with type 2 diabetes. Our study suggests that this easily actionable approach may be adapted in other ethnic populations to stratify the heterogeneous type 2 diabetes population for precision medicine.
Supplementary Information The online version contains peer-reviewed but unedited supplementary material available at 10.1007/s00125-022-05741-2.

construct PRS for type 1 diabetes. For the 7 non-DR3/DR4 SNPs, we applied a linear weighting of 0, 1 and 2 for genotypes containing 0, 1 and 2 risk alleles, and multiplied these by their effect sizes for risk of type 1 diabetes. For the DR3/DR4-DQ8 contribution, we imputed DR3/DR4-DQ8 haplotypes and assigned the corresponding weights to each individual's score [2][3][4]. The final type 1 diabetes PRS was obtained as the sum of the contributions from these two sets of SNPs, divided by 15. GWAS genotyping, quality control procedures and principal component (PC) analysis have been described previously [5].
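The scoring scheme described above can be illustrated with a minimal Python sketch. The SNP identifiers, effect sizes and haplotype weights below are hypothetical placeholders, not the weights used in the study. Only the linear 0/1/2 allele weighting, the added haplotype weight and the division by 15 are taken from the text.

```python
from typing import Mapping

# Hypothetical effect sizes for the non-DR3/DR4 SNPs (placeholders only;
# the study used 7 SNPs whose identifiers and weights are not given here).
SNP_WEIGHTS = {"snp_a": 0.30, "snp_b": 0.20, "snp_c": 0.10}

# Hypothetical weights for imputed DR3/DR4-DQ8 haplotype categories.
HAPLOTYPE_WEIGHTS = {"DR3/DR4-DQ8": 3.0, "other": 0.0}

SCALING_DIVISOR = 15  # divisor stated in the text


def type1_prs(risk_allele_counts: Mapping[str, int], haplotype: str) -> float:
    """Add weighted SNP dosages to the haplotype weight, then divide by 15."""
    snp_part = 0.0
    for snp, weight in SNP_WEIGHTS.items():
        count = risk_allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk allele count must be 0, 1 or 2")
        snp_part += count * weight  # linear weighting for 0, 1 or 2 risk alleles
    return (snp_part + HAPLOTYPE_WEIGHTS[haplotype]) / SCALING_DIVISOR


# Illustrative individual: two, one and zero risk alleles at the three SNPs.
example_score = type1_prs({"snp_a": 2, "snp_b": 1, "snp_c": 0}, "DR3/DR4-DQ8")
```

With these placeholder weights the example individual scores (2 × 0.30 + 1 × 0.20 + 0 × 0.10 + 3.0) / 15.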

Lipidomics Assay by LC-MS
Plasma samples were randomized within each cohort before the analytical assay. Batch quality control (BQC) samples were prepared for each cohort by pooling equal aliquots of all plasma samples in that cohort before lipid extraction. Plasma (10 µL) was mixed with 190 µL 1-butanol/methanol (BuMe, 1:1 v/v) containing internal standards. The mixture was vortexed for 30 s, then sonicated for 30 min at 20°C. The samples were then centrifuged at 14,000 × g for 10 min at 10°C, and the supernatant was carefully transferred into autosampler vials without disturbing the precipitated protein pellets. Extracted blanks were prepared using the same extraction protocol, with 10 µL BuMe in place of plasma. Technical quality control (TQC) samples were generated by pooling the lipid extracts of study samples to measure instrumental variability.
Reversed-phase chromatographic separation of plasma samples was based on a modified version of the method of Huynh et al. [6]. The analysis was carried out on an Agilent 6495 QQQ and Infinity-II LC-MS system, using a Zorbax RRHD Eclipse Plus C18, 95 Å (2.1 × 100 mm, 1.8 µm) column. The mobile phases consisted of (A) 10 mmol/L ammonium formate and formic acid (0.1%) in water/acetonitrile/2-propanol (50:30:20, v/v) and (B) 10 mmol/L ammonium formate and formic acid (0.1%) in 2-propanol/acetonitrile/water (90:9:1, v/v). At a flow rate of 0.4 mL/min, the gradient ran from 15% B to 50% B in 2.5 min, 50% B to 57% B in 0.1 min, 57% B to 70% B from 2.6 to 9 min, 70% B to 93% B in 0.1 min, 93% B to 96% B from 9.1 to 11 min and 96% B to 100% B in 0.1 min, where it was maintained until 11.9 min; the column was then re-equilibrated at 15% B for 3 min prior to the next injection. The injection volume was 2 µL. The autosampler and column thermostat temperatures were 15°C and 45°C, respectively. Total method run time, including needle wash, was 16.1 min. To test the linearity of the response, TQC extracts were injected at different volumes.
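The gradient programme can be written as a table of time and %B breakpoints. The Python sketch below reads the breakpoints from the description above and assumes linear ramps between them, which the text does not state explicitly. The re-equilibration step is left out because the time taken to return to 15% B is not specified.

```python
from bisect import bisect_right

# (time in min, % mobile phase B) breakpoints taken from the gradient description.
GRADIENT = [
    (0.0, 15.0),
    (2.5, 50.0),
    (2.6, 57.0),
    (9.0, 70.0),
    (9.1, 93.0),
    (11.0, 96.0),
    (11.1, 100.0),
    (11.9, 100.0),  # held at 100% B until 11.9 min
]


def percent_b(t_min: float) -> float:
    """Return %B at t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time lies outside the programmed gradient")
    i = bisect_right(times, t_min) - 1  # index of the segment start
    if i == len(GRADIENT) - 1:
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For example, `percent_b(1.25)` falls halfway along the first ramp from 15% B to 50% B.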
The mass spectrometer conditions were as follows: capillary voltage 3500 V; drying gas temperature and flow rate 150°C and 17 L/min; sheath gas temperature and flow rate 200°C and 10 L/min; nebulizer pressure 20 psi. Targeted analysis was performed in dynamic MRM positive ion mode, using "unit" resolution (0.7 amu) for Q1 and Q3 isolation width.
Chromatographic peaks were annotated based on retention time and specific MRM transitions using Agilent MassHunter Quantitative Analysis software (version B.10.1). Raw peak areas were normalized to the internal standard of the corresponding lipid class. Because endogenous species were quantified using one standard per lipid class, our method delivers only relative quantitation.
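The class-wise normalization can be illustrated with a short Python sketch. The peak areas and data layout below are hypothetical and are not MassHunter output. The sketch shows only the ratio of each species' raw area to the area of its class internal standard, which is why the result is a relative rather than an absolute quantity.

```python
# Hypothetical raw peak areas keyed by (lipid class, species); values are illustrative.
raw_areas = {
    ("Cer", "Cer d18:1/16:0"): 12000.0,
    ("Cer", "Cer d18:1/24:1"): 30000.0,
    ("PC", "PC 34:1"): 900000.0,
}

# Hypothetical peak areas of the single internal standard used per lipid class.
internal_standard_areas = {"Cer": 60000.0, "PC": 300000.0}


def normalize_to_internal_standard(areas, is_areas):
    """Divide each species' raw area by the area of its class internal standard."""
    normalized = {}
    for (lipid_class, species), area in areas.items():
        is_area = is_areas[lipid_class]
        if is_area <= 0:
            raise ValueError(f"no usable internal standard signal for {lipid_class}")
        normalized[species] = area / is_area  # unitless relative abundance
    return normalized


relative_abundance = normalize_to_internal_standard(raw_areas, internal_standard_areas)
```

Ratios obtained this way are comparable between samples for the same species, but not directly between species of different classes.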

ESM
PEPD C/T 0.0552
R, risk allele; A, alternative allele. Weight (effect size) for each SNP was derived from a meta-analysis of GWAS data from 77,418 individuals with T2D and 356,122 healthy individuals of East Asian ancestry (PMID: 32499647) and calculated as the natural log of the odds ratio. Please also refer to ESM methods.

ACR, albumin-to-creatinine ratio; RAS, renin-angiotensin system; MOD, mild obesity-related diabetes; SIRD-RII, severe insulin-resistant diabetes with relative insulin insufficiency; MARD-II, mild age-related diabetes with insulin insufficiency

# A positive coefficient indicates that lipid levels were higher in the SIRD-RII subgroup than in the MOD subgroup.
## A p value < 0.0011 was considered statistically significant in the discovery cohort, based on the Bonferroni correction threshold (0.05/45=0.0011).
### Lipids in the validation cohort with a nominal p < 0.05 and the same coefficient direction as in the discovery cohort were considered statistically significant.
PE, phosphatidylethanolamine; PC, phosphatidylcholine; Cer, ceramide; SM, sphingomyelin; MOD, mild obesity-related diabetes; SIRD-RII, severe insulin-resistant diabetes with relative insulin insufficiency

# A positive coefficient indicates that lipid levels were higher in the MARD-II subgroup than in the MOD subgroup; conversely, a negative coefficient indicates that lipid levels were lower in the MARD-II subgroup than in the MOD subgroup.
## A p value < 0.0011 was considered statistically significant in the discovery cohort, based on the Bonferroni correction threshold (0.05/45=0.0011).
### Lipids in the validation cohort with a nominal p < 0.05 and the same coefficient direction as in the discovery cohort were considered statistically significant.
PC, phosphatidylcholine; LPC, lysophosphatidylcholine; PE, phosphatidylethanolamine; Cer, ceramide; SM, sphingomyelin; MOD, mild obesity-related diabetes; MARD-II, mild age-related diabetes with insulin insufficiency

CKD, chronic kidney disease; MACE, major adverse cardiovascular events; AMI, acute myocardial infarction
# Only participants with baseline eGFR above 60 ml/min/1.73 m² were included in the analysis of progressive CKD.
## Incidence rates are presented as events per 1,000 person-years.
### Stroke and AMI events were not analysed separately because of small event numbers.
MOD, mild obesity-related diabetes; SIRD-RII, severe insulin-resistant diabetes with relative insulin insufficiency; MARD-II, mild age-related diabetes with insulin insufficiency
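
The significance rules stated in the lipidomics footnotes (a Bonferroni threshold of 0.05/45 in the discovery cohort, then a nominal p < 0.05 with the same coefficient direction in the validation cohort) can be sketched in Python as follows. The function name and example values are hypothetical. Requiring a lipid to pass both steps is an assumption about how the discovery–validation approach was applied.

```python
ALPHA = 0.05
N_TESTS = 45  # number of tests used for the Bonferroni correction in the footnotes
BONFERRONI_THRESHOLD = ALPHA / N_TESTS  # 0.05 / 45, approximately 0.0011


def is_replicated(disc_coef: float, disc_p: float,
                  val_coef: float, val_p: float) -> bool:
    """Apply the discovery threshold, then the validation rule (assumed sequential)."""
    passes_discovery = disc_p < BONFERRONI_THRESHOLD
    same_direction = disc_coef * val_coef > 0  # coefficients share the same sign
    passes_validation = val_p < ALPHA and same_direction
    return passes_discovery and passes_validation


# Illustrative values only, not results from the study.
example_replicated = is_replicated(0.20, 0.0005, 0.30, 0.01)
example_opposite_direction = is_replicated(0.20, 0.0005, -0.30, 0.01)
```

In this sketch, a lipid whose coefficient changes sign between cohorts is rejected even when both p values are small.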