
Introduction

Type 2 diabetes is a highly heterogeneous disease; almost half a billion people [1] living with this condition present with various profiles of physical characteristics, metabolic function and disease severity [2]. As a result, a large group of highly different people are classified as having the same condition, which makes it difficult to provide appropriate care for every individual.

In 2018, Ahlqvist et al identified four subgroups of people with type 2 diabetes based on disease severity and metabolic variables: moderate age-related diabetes (MARD), moderate obesity-related diabetes (MOD), severe insulin-deficient diabetes (SIDD) and severe insulin-resistant diabetes (SIRD) [3]. Subsequently, several studies have replicated the same subgroups using a clustering algorithm [4,5,6,7,8,9], and others have used a nearest centroid approach to allocate individuals to the proposed subgroups [10, 11]. Evaluation of subgroup characteristics showed that SIDD was the most therapy-resistant subgroup [9] and SIRD was the subgroup with the lowest physical fitness level [10] and the highest risk of chronic kidney disease (CKD) [3, 5, 6].

Although studies have replicated the subgroups and have shown important differences between them, these subgroups have not yet been implemented in clinical practice. Interestingly, the names given to the subgroups incorporate a qualitative description of disease severity or degree of metabolic derangement: two ‘moderate’ subgroups and two ‘severe’ subgroups. This disease-related nomenclature is not neutral, even though the subgroups are based on few (clinical) variables. However, it is unknown whether the moderate and severe states are reflected in individuals’ everyday well-being. As patient views are known to differ from those which doctors perceive as important [12, 13], and diabetes has been associated with impaired quality of life (QoL) [14, 15], it would also be relevant to study how the ‘moderate’ and ‘severe’ stages of the clusters are reflected in individuals’ QoL.

Therefore, we aimed to investigate the evolution of QoL in people in each cluster of type 2 diabetes compared with people without diabetes.

Methods

Data source

We used data from the Maastricht Study, an observational, prospective, population-based cohort study. The rationale and methodology have been described previously [16]. In brief, the study focuses on the aetiology, pathophysiology, complications and comorbidities of type 2 diabetes and is characterised by an extensive phenotyping approach. All individuals between 40 and 75 years of age living in the southern part of the Netherlands were eligible for participation. Participants were recruited through mass media campaigns, from municipal registries and from the regional Diabetes Patient Registry via mailings. Recruitment was stratified according to known type 2 diabetes status, with an oversampling of individuals with type 2 diabetes for reasons of efficiency. The present study included the first participants, who completed the baseline survey between November 2010 and November 2013. We used data until 2013 only, as some of the required variables, such as homeostatic model assessments, are not yet available in the newer data. The examinations of each participant were performed within a time window of 3 months after the baseline visit. The Maastricht Study has the approval of the institutional medical ethics committee (NL31329.068.10) and the Dutch Ministry of Health, Welfare and Sport (Permit 131088-105234-PG). All participants gave their written informed consent. How representative the study sample is of the source population in the study region is monitored continuously as described elsewhere [16].

The current study is part of the HTx Project. HTx is a Horizon 2020 project supported by the European Union and lasting for 5 years from January 2019. The main aim of HTx is to create a framework for the Next Generation Health Technology Assessment to support patient-centred, societally oriented, real-time decision-making on access to and reimbursement for health technologies throughout Europe.

Study population

From the Maastricht Study dataset, we selected all people with type 2 diabetes according to the WHO definition [17], based on the OGTT performed during their first (baseline) visit to the study centre or the use of glucose-lowering drugs. Type 2 diabetes was defined by a fasting glucose level ≥7.0 mmol/l or a 2 h post-glucose drink glucose level ≥11.1 mmol/l, or the use of glucose-lowering drugs, and the absence of a type 1 diabetes diagnosis. We excluded individuals with missing values in the variables needed for clustering and subsequently those with outliers (>3 SD from the mean) in these variables. People were allocated to the newly diagnosed group if they had never been diagnosed with diabetes before (based on a baseline questionnaire) and did not use medication for diabetes at baseline but were classified as having diabetes according to the OGTT at baseline. The other people were included in the already diagnosed group. Separately, we included people without diabetes from the Maastricht Study dataset, according to the baseline OGTT (fasting glucose level <6.1 mmol/l and 2 h post-glucose drink glucose level <7.8 mmol/l) and no use of glucose-lowering drugs, in order to plot their QoL for comparison with people with diabetes. People with prediabetes (fasting glucose level <7.0 mmol/l and 2 h post-glucose drink glucose level <11.1 mmol/l, and no use of glucose-lowering drugs) were excluded. Sex was self-reported and the Maastricht Study only provided the options ‘male’ and ‘female’. The definitions used are according to the Maastricht Study methodology [16].
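The selection rules above can be sketched as a small classification routine. This is a minimal illustration, assuming the WHO criteria under which either glucose threshold alone is sufficient for diabetes; the function and argument names are hypothetical and are not taken from the Maastricht Study codebase.

```python
def classify_glucose_status(fasting, two_hour, on_glucose_lowering_drugs,
                            type1_diagnosis=False):
    """Classify glucose metabolism status from OGTT values in mmol/l.

    Assumption: a type 1 diabetes diagnosis is handled as a separate
    category, since the text only requires its absence for type 2 diabetes.
    """
    if type1_diagnosis:
        return "type 1 diabetes"
    if on_glucose_lowering_drugs or fasting >= 7.0 or two_hour >= 11.1:
        return "type 2 diabetes"
    if fasting < 6.1 and two_hour < 7.8:
        return "no diabetes"
    # Everything between the normal and diabetes thresholds
    return "prediabetes"


def diagnosis_group(previously_diagnosed, on_diabetes_medication):
    """Split people classified as having type 2 diabetes into the two study groups."""
    if not previously_diagnosed and not on_diabetes_medication:
        return "newly diagnosed"
    return "already diagnosed"
```

In this sketch, a person with a fasting glucose of 6.5 mmol/l and a 2 h glucose of 8.0 mmol/l who uses no glucose-lowering drugs would be labelled as having prediabetes and would therefore be excluded.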

Clustering

Individuals were assigned to clusters with the nearest centroid approach, using the centroids published by Ahlqvist et al [3]. They identified clusters through a data-driven cluster analysis using k-means and hierarchical clustering in individuals with newly diagnosed diabetes from a Swedish cohort. The clustering variables for type 2 diabetes included age at diagnosis, BMI, HbA1c, HOMA-B and HOMA-IR (using HOMA2 and C-peptide levels). The resulting clusters were MARD, MOD, SIDD and SIRD. MARD was characterised by a higher age at diabetes diagnosis; MOD by a high BMI; SIDD by a relatively low BMI, lower age and low insulin secretion (i.e. low HOMA-B); and SIRD by a high BMI and high HOMA-IR. MARD and MOD were additionally characterised by moderate metabolic derangement, and SIDD and SIRD by severe metabolic derangement [3]. We used the published centroids and the same baseline variables to assign individuals to clusters.
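The nearest centroid approach can be illustrated as follows. This is a minimal sketch, assuming Euclidean distance on standardised variables; the centroid values below are hypothetical placeholders chosen only to mimic the qualitative cluster descriptions and are not the published centroids. How the variables were scaled before assignment is also an assumption here.

```python
import math

# Order of the clustering variables in every tuple below
VARIABLES = ("age_at_diagnosis", "bmi", "hba1c", "homa2_b", "homa2_ir")

# HYPOTHETICAL centroids in standardised (z-score) units, for illustration only.
# They are NOT the values published by Ahlqvist et al.
CENTROIDS = {
    "MARD": (0.8, -0.3, -0.4, 0.1, -0.3),
    "MOD": (-0.6, 1.2, -0.2, 0.3, 0.2),
    "SIDD": (-0.4, -0.2, 1.8, -1.0, -0.2),
    "SIRD": (0.5, 0.6, -0.5, 1.5, 1.6),
}


def standardise(values, means, sds):
    """Convert raw values to z-scores using reference means and SDs."""
    return tuple((v - m) / s for v, m, s in zip(values, means, sds))


def nearest_centroid(z_scores, centroids=CENTROIDS):
    """Return the name of the cluster whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda name: math.dist(z_scores, centroids[name]))
```

With these placeholder centroids, an individual with a markedly raised standardised HbA1c and a low HOMA2-B, for example `(-0.5, 0.0, 2.0, -1.2, 0.0)`, would be assigned to SIDD.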

Variables

To characterise the population and evaluate its comparability with the population in the Ahlqvist et al study, we explored a broad range of additional characteristics. All variables are listed in electronic supplementary material (ESM) Table 1.

HbA1c values were measured at baseline, and follow-up measurements were available from routine care through linkage with hospital data. The Short Form 36 (SF-36) questionnaire was completed at baseline and then once a year, with data currently available for 7 years of follow-up. The mental component summary (MCS) and physical component summary (PCS) scores were derived from the SF-36, which has been reported to be valid and reliable [18]. The SF-36 includes 36 questions contributing to eight health domains, which in turn contribute to the MCS and PCS scores. These scores are calculated from the question scores using weights derived from factor analysis, and are transformed to a mean of 50 and an SD of 10, as described elsewhere [19].
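The transformation to a mean of 50 and an SD of 10 can be sketched as below. This is an illustration only, assuming that the eight domain scores have already been standardised against a reference population; the weights are supplied by the caller and stand in for the factor-analysis coefficients described in the scoring manual [19], which are not reproduced here.

```python
def component_summary_score(domain_z_scores, weights):
    """Combine standardised domain scores into a norm-based summary score.

    domain_z_scores: z-scores of the SF-36 domains against a reference population.
    weights: factor score coefficients for the component (caller-supplied;
             any values used in examples are hypothetical).
    """
    if len(domain_z_scores) != len(weights):
        raise ValueError("one weight is required per domain")
    aggregate = sum(w * z for w, z in zip(weights, domain_z_scores))
    # Rescale so that the reference population has mean 50 and SD 10
    return 50.0 + 10.0 * aggregate
```

Under this scaling, a score of 45 lies half an SD below the reference population mean.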

Outcomes

We used several outcomes in this study to characterise the different clusters, including complications at baseline, and HbA1c and QoL during follow-up. The cluster-wise association with diabetes-related complications was determined at baseline, as follow-up data were not available for complications. The complications included CKD (defined as having an eGFR <60 ml/min per 1.73 m² and/or albumin excretion of at least 30 mg/day), neuropathy (defined as the presence of neuropathic pain and/or impaired vibration sense), retinopathy (based on fundoscopy), non-alcoholic fatty liver disease (NAFLD) (defined as having at least 5.56% liver fat [20]), CVD and cerebrovascular disease. Adequate glycaemic control was defined as an HbA1c <53 mmol/mol (7.0%) [21], and we evaluated the cluster-wise time to first reach this threshold during follow-up. Finally, QoL was determined at baseline and during follow-up based on the MCS and PCS scores of the SF-36.
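The threshold-based definitions above can be expressed compactly. The following sketch is illustrative: the function names are hypothetical, and the unit conversion uses the IFCC to NGSP master equation (NGSP % = 0.09148 × IFCC mmol/mol + 2.152) to show that 53 mmol/mol corresponds to 7.0%.

```python
def has_ckd(egfr, albumin_excretion_mg_per_day):
    """CKD: eGFR <60 ml/min per 1.73 m2 and/or albumin excretion >=30 mg/day."""
    return egfr < 60 or albumin_excretion_mg_per_day >= 30


def has_nafld(liver_fat_percent):
    """NAFLD: at least 5.56% liver fat."""
    return liver_fat_percent >= 5.56


def hba1c_mmol_mol_to_percent(ifcc_mmol_mol):
    """Convert HbA1c from IFCC units (mmol/mol) to NGSP units (%)."""
    return 0.09148 * ifcc_mmol_mol + 2.152


def adequate_glycaemic_control(hba1c_mmol_mol):
    """Adequate control: HbA1c below 53 mmol/mol (7.0%)."""
    return hba1c_mmol_mol < 53
```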

Statistical analyses

All analyses were performed separately for individuals who were newly diagnosed with type 2 diabetes during their baseline visit to the Maastricht Study centre and for those who were already diagnosed with type 2 diabetes.

We used descriptive statistics to summarise cluster-wise and total baseline characteristics and compared these characteristics with χ² tests for categorical variables and with one-way ANOVA for continuous variables.
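For readers less familiar with one-way ANOVA, the F statistic that compares a continuous variable across clusters can be computed as in the sketch below. It is a plain-Python illustration with made-up numbers, not the SPSS or R procedure used in the study, and it stops at the F statistic (obtaining a p value additionally requires the F distribution).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups (e.g. clusters)
    n = len(all_values)   # total number of observations
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

A larger F value indicates that the between-cluster variation is large relative to the variation within clusters.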

Logistic regression models estimated ORs for the cluster-wise associations with the presence of diabetes-related outcomes and depression at baseline. These models were adjusted for age, sex, diabetes duration (only in the already diagnosed group) and educational level (with the ‘low’ category as the reference group). We performed two sensitivity analyses in which we replaced educational level by the International Socio-Economic Index of occupational status 2008 (ISEI-08) classification and by equivalent income, to see if these proxies of socioeconomic position had a different effect.
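As a reminder of how ORs and their 95% CIs follow from a fitted logistic regression, the sketch below exponentiates a coefficient and its Wald interval. The coefficient and standard error are inputs supplied by the caller; any values used with it are hypothetical and not results from this study.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error.

    z=1.96 gives an approximate 95% Wald confidence interval.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )
```

Because the interval is symmetric on the log scale, the OR is the geometric mean of its confidence limits, which offers a quick consistency check for reported estimates.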

We depicted the evolution of HbA1c over time by dividing the follow-up time into 6 month intervals, taking the mean of the measurements per interval per cluster and plotting these points over time.
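The interval-based averaging can be sketched as follows. This is an illustration under the assumption that intervals are half-open, that is, 0 to <6 months, 6 to <12 months and so on; the record layout is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def interval_means(measurements, interval_months=6):
    """Average HbA1c per cluster within consecutive follow-up intervals.

    measurements: iterable of (cluster, months_since_baseline, hba1c) records.
    Returns a dict mapping (cluster, interval_index) to the mean HbA1c,
    where interval_index 0 covers months 0 to <6, index 1 covers 6 to <12, etc.
    """
    buckets = defaultdict(list)
    for cluster, months, hba1c in measurements:
        interval_index = int(months // interval_months)
        buckets[(cluster, interval_index)].append(hba1c)
    return {key: mean(values) for key, values in sorted(buckets.items())}
```

Plotting the resulting means against the interval index for each cluster gives curves of the kind described above.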

A Kaplan–Meier curve was created to visualise the time to reach adequate glycaemic control (HbA1c <53 mmol/mol [<7.0%]) and Cox proportional hazards models were used to estimate the HR of reaching adequate glycaemic control. These models were adjusted for age, sex, diabetes duration and educational level. We performed two sensitivity analyses in which we replaced educational level by the ISEI-08 classification and by equivalent income, to see if these proxies of socioeconomic position had a different effect. Moreover, we used Kaplan–Meier curves and Cox proportional hazards models to evaluate likely depression, defined as a deterioration of at least 3 points in MCS score. This proxy was used due to the absence of depression data during follow-up.
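The time-to-event set-up can be illustrated with a small product-limit (Kaplan–Meier) estimator. This sketch makes two assumptions that are not stated in the text: the event time is the first follow-up measurement below the threshold, and people who never reach the threshold are censored at their last measurement. The data layout is hypothetical.

```python
def time_to_control(series, threshold=53.0):
    """series: list of (months, hba1c) sorted by time.

    Returns (time, event): the time of the first HbA1c below the threshold,
    or the last measurement time with event=False if it is never reached.
    """
    for months, hba1c in series:
        if hba1c < threshold:
            return months, True
    return series[-1][0], False


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    S(t) is the estimated proportion that has NOT yet had the event
    (here: not yet reached adequate glycaemic control).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_at_t = 0
        # Group all observations sharing the same time point
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_at_t += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored observations both leave the risk set
        at_risk -= n_at_t
    return curve
```

The same estimator applies to the likely depression proxy by defining the event as the first MCS score that lies at least 3 points below the baseline score.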

In our main analysis, we depicted QoL over time by plotting the yearly mean component scores (MCS and PCS) per cluster. Generalised estimating equations were used to adjust the plots for age and sex. We performed a sensitivity analysis by adding a correction for BMI to this model. In a separate sensitivity analysis we analysed the data by sex to evaluate whether there was a sex-specific effect (without adjustment for sex in this analysis).

An α level of 0.05 was used and data were analysed using IBM SPSS Statistics v.26 for Windows (IBM, Armonk, NY) and R language v.4.1 and RStudio v.1.4 (https://www.R-project.org/).

Results

Participant selection

Figure 1 shows the selection of participants from the initial dataset. The final sets used in the QoL analysis comprised newly diagnosed individuals (n=127) and already diagnosed individuals (n=585), all with no missing values or outliers in the clustering variables, and a population of people without diabetes (n=1924).

Fig. 1 Flowchart of participant selection

Baseline characteristics

Tables 1 and 2 show the most important cluster-wise and total baseline characteristics of newly and already diagnosed individuals, respectively. Apart from the differences in clustering variables, there were significant differences in eGFR, MCS score, PCS score and number of people with neuropathy among newly diagnosed individuals. In already diagnosed individuals, there were significant differences in sex, diabetes duration, eGFR, PCS score, the number of people with retinopathy and the number of people using glucagon-like peptide-1 receptor agonists (GLP1-RAs), insulin and other glucose-lowering drugs. More characteristics of newly diagnosed and already diagnosed individuals are shown in ESM Tables 2 and 3, respectively, and characteristics of people without diabetes are shown in ESM Table 4.

Table 1 Total and cluster-wise baseline characteristics of newly diagnosed individuals
Table 2 Total and cluster-wise baseline characteristics of already diagnosed individuals

Complications at baseline

Table 3 shows the ORs for the presence of complications at baseline for newly diagnosed individuals. The SIRD cluster was associated with neuropathy. Overall, the numbers of complications were small in newly diagnosed individuals. Table 4 shows the ORs for the complications at baseline for already diagnosed individuals. The MOD cluster was associated with CKD (OR 3.04; 95% CI 1.62, 5.69). The SIDD cluster was associated with retinopathy, although this effect was no longer apparent after correction for education. The SIDD cluster was also associated with CVD, although this effect was no longer apparent after correction for diabetes duration. The presence of neuropathy, NAFLD and cerebrovascular disease did not differ between the clusters. The number of people with depression at baseline was too small to evaluate in newly diagnosed individuals. In already diagnosed individuals, the SIDD cluster was associated with depression (ESM Table 5).

Table 3 Cluster-wise ORs of complications at baseline for newly diagnosed individuals (n=127)
Table 4 Cluster-wise ORs of complications at baseline for already diagnosed individuals (n=585)

HbA1c during follow-up

The analyses of HbA1c during follow-up showed that already diagnosed individuals in the SIDD cluster were less likely to reach glycaemic control than individuals in the other clusters. Figure 2 shows the cluster-wise evolution of HbA1c over time during the 7 years of follow-up. The mean HbA1c of the SIDD cluster was consistently higher than that in the other clusters, in particular in the already diagnosed population. This is also reflected in the Kaplan–Meier curve of time to reach adequate glycaemic control (Fig. 3) and confirmed by the Cox regression, with an adjusted HR of reaching adequate glycaemic control of 0.31 (95% CI 0.22, 0.43) for the SIDD cluster compared with the MARD cluster. The Kaplan–Meier curve of newly diagnosed individuals (Fig. 3) did not indicate a difference in time to reach adequate glycaemic control between the clusters, and this was confirmed by the Cox regression (data not shown).

Fig. 2 Cluster-wise evolution of HbA1c (mmol/mol) during follow-up for newly diagnosed (a) and already diagnosed (b) individuals

Fig. 3 Kaplan–Meier curves of time to reach adequate glycaemic control (HbA1c <53 mmol/mol [<7.0%]) for newly diagnosed (a) and already diagnosed (b) individuals

QoL during follow-up

Figures 4 and 5 show the evolution of QoL based on the SF-36 during the 7 years of follow-up. The mean MCS score (Fig. 4) appeared to fluctuate between 50 and 55 over time but was similar in all clusters and in people without diabetes overall. The mean PCS score (Fig. 5) seemed to decline slightly over time, with a decrease of approximately 3 points in newly diagnosed individuals and 1–2 points in already diagnosed individuals. The mean PCS score was lower in all already diagnosed clusters than in people without diabetes. Among already diagnosed people, the mean PCS score in the MARD cluster (approx. 48) was slightly lower than the score in people without diabetes (approx. 52), whereas the MOD, SIDD and SIRD clusters scored much lower (approx. 43). The differences in mean PCS score between clusters in newly diagnosed individuals were less obvious than those in already diagnosed individuals. In this comparison, people without diabetes scored highest, followed closely by people in the MARD cluster (mean PCS scores decreased from approx. 52 to 51 [no diabetes] and 50 [MARD] at 7 years), with the MOD and SIDD clusters having slightly lower mean PCS scores over time (both mean PCS scores decreased from approx. 50 to 46 at 7 years). The SIRD cluster scored lowest, with the PCS score decreasing from around 45 to 43 at 7 years. Both MCS and PCS scores were lower in already diagnosed individuals than in newly diagnosed individuals.

Fig. 4 Cluster-wise evolution of MCS scores during follow-up for newly diagnosed (a) and already diagnosed (b) individuals compared with people without diabetes, adjusted for sex and age

Fig. 5 Cluster-wise evolution of PCS scores during follow-up for newly diagnosed (a) and already diagnosed (b) individuals compared with people without diabetes, adjusted for sex and age

The analysis in which we used a deterioration of at least 3 points in MCS score as a proxy for likely depression during follow-up showed no significant HRs (ESM Tables 6, 7), which was confirmed by the Kaplan–Meier curves (ESM Fig. 1).

Sensitivity analyses

Using ISEI-08 or equivalent income instead of educational level for confounder correction did not change the results of the logistic regression or Cox proportional hazards model. Correcting QoL over time for BMI did not considerably change the resulting graphs. Analysing QoL by sex did not lead to differences in evolution of QoL.

Discussion

The results of this study show that the terms ‘moderate’ and ‘severe’ used in the names of the novel clusters of type 2 diabetes are not reflected in individuals’ QoL. All individuals with diabetes scored lower than people without diabetes based on the PCS and MCS scores of the SF-36. The MOD cluster scored particularly low, although people in this cluster are labelled as having a ‘moderate’ degree of disease.

The use of the nearest centroid method led to four groups of people comparable to those reported by Ahlqvist et al [3]. Generally, the characteristics of the subgroups hold true in our population, with the differences between the clusters being smaller than those in the population studied by Ahlqvist et al. Their study population consisted of 14,652 Swedish adults with newly diagnosed diabetes. Compared with this population, our population was younger and showed less extreme values in HbA1c, HOMA-B and HOMA-IR.

Based on the Kaplan–Meier curves, our results match those reported by Ahlqvist et al: the SIDD cluster was less likely to reach glycaemic control and thus appeared more therapy resistant than the other clusters. However, we only replicated this finding in already diagnosed individuals, not in newly diagnosed individuals. This is potentially due to the small number of people with newly diagnosed diabetes, or to the limited metabolic derangement in this group to begin with.

Our results are also in line with studies reporting that diabetes is associated with a reduced QoL [14, 15], studies replicating the same clusters proposed by Ahlqvist et al by running a clustering algorithm [4,5,6,7,8,9] and those using the nearest centroid approach as we did [10, 11]. Similar to our findings, previous studies have shown that SIDD was the most therapy-resistant cluster [9] and SIRD was the cluster with the lowest physical fitness [10].

Some of the findings in previous studies were not confirmed in our study. First, there are reports of an increased risk of CKD in the SIRD cluster [3, 5, 6]. Although the absence of this association in our study could be due to population differences, it could also be due to the longer diabetes duration in the MOD cluster (11.3 years) than in the SIRD cluster (4.5 years). In addition, we did not replicate the typical evolution of HbA1c after diabetes diagnosis, with a drop in HbA1c after starting glucose-lowering treatment initially, but a subsequent slow increase in HbA1c as treatment is no longer sufficient, as reported by Dennis et al [4]. This could be because baseline HbA1c levels in newly diagnosed individuals were not markedly elevated. These individuals were diagnosed by chance (i.e. had no symptoms) during the visit to the study centre and probably did not seek treatment afterwards. The population in the study by Dennis et al was a trial population, with each individual starting on a glucose-lowering drug. Moreover, the 95% CIs of the cluster-wise associations with complications at baseline were wide. This indicates that the models had limited robustness, which could be due to the small number of people, in particular in the newly diagnosed group. The analysis of depression at baseline was also limited by the small number of events.

After confirmation that our clusters were comparable to those reported previously, we studied the cluster-wise evolution of QoL based on the MCS and PCS scores of the SF-36. People with type 2 diabetes had a lower QoL than those without, but the degree to which QoL was lower differed between the clusters. MOD, in particular, appeared not to be ‘moderate’ when looking at QoL, especially the PCS score. We hypothesised that this might be due to impaired functioning because of obesity, but adjustment for BMI turned out to have little effect. The question remains whether the observed differences in QoL scores are (clinically) relevant. This is a subjective matter, as there are no strict guidelines on the interpretation of MCS and PCS scores. Generally, a change of 2–3 points is considered to be relevant [19, 22]. This means that the observed differences in MCS and PCS scores can be interpreted on two levels: relevant difference (i.e. 2–3 points difference) and significant difference (i.e. no overlap in 95% CI).
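The two-level interpretation can be made explicit with a small helper. This is a sketch under one assumption: the lower bound of the 2–3 point range is used as the relevance threshold, which is a choice made here for illustration. The function name and example values are hypothetical.

```python
def compare_scores(mean_a, ci_a, mean_b, ci_b, relevance_threshold=2.0):
    """Compare two mean component scores on both interpretation levels.

    ci_a and ci_b are (lower, upper) 95% confidence limits.
    relevant: absolute difference of at least the threshold (assumed 2 points).
    significant: the two confidence intervals do not overlap.
    """
    relevant = abs(mean_a - mean_b) >= relevance_threshold
    significant = ci_a[1] < ci_b[0] or ci_b[1] < ci_a[0]
    return {"relevant": relevant, "significant": significant}
```

For example, two groups with hypothetical means of 50 and 46 and wide, overlapping intervals would differ relevantly but not significantly, which is the pattern that small groups tend to produce.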

For both already and newly diagnosed individuals, there were no significant or relevant differences in MCS scores between the clusters and people without diabetes. There were both significant and relevant differences in PCS scores among already diagnosed individuals between people without diabetes, MARD and the other three clusters (MOD, SIDD, SIRD). There were no significant differences in PCS scores among newly diagnosed individuals, but most of the differences were relevant. In newly diagnosed individuals, the MARD cluster scored relevantly lower in terms of PCS score than people without diabetes after approximately 5 years, whereas the other clusters scored relevantly lower from the beginning. All clusters, except for MOD and SIDD, scored relevantly differently from each other. In newly diagnosed individuals, PCS scores deteriorate over time and people in the SIRD cluster score lower than people in other clusters. In already diagnosed people, both the SIRD and SIDD clusters score lowest, followed by the MOD cluster. Assuming that Fig. 5a reflects PCS scores in the early stages of the disease and Fig. 5b reflects PCS scores in the late stages of the disease, it becomes evident that the SIRD cluster scores low from the beginning, whereas the SIDD and MOD clusters show a considerable decline in PCS score as diabetes progresses. People in the MARD cluster score higher than the other clusters, starting at a similar level to people without diabetes, but reach a relevantly lower score after approximately 5 years. We also evaluated QoL by sex, but this did not lead to different results.

From the QoL graphs, it is clear that people with diabetes report lower QoL than people without diabetes, and that QoL differs between clusters of people with diabetes. In particular, the MARD cluster scored higher than the other clusters, whereas the MOD cluster scored similarly to the ‘severe’ clusters (SIDD and SIRD). Differences in MCS score were smaller than differences in PCS score. In line with these small differences in MCS, the HRs of likely depression based on the MCS score, as well as the Kaplan–Meier curves, showed no significant differences during follow-up.

Our results show that there is probably an effect of diabetes duration on QoL. The graphs show a decline in PCS score over time and the scores in newly diagnosed individuals are higher than in already diagnosed individuals. QoL might deteriorate as diabetes progresses, and this could also be an explanation for the limited findings in newly diagnosed individuals. People at the beginning of their disease appear to be more similar to each other and to people without diabetes, whereas differences become more apparent as the disease progresses.

This study has several strengths. The extensive exploration of the clusters allowed for confirmation of similar clusters to those reported previously, before moving on to exploration of QoL. The similarity of our clusters to those reported previously confirms the external validity of this study. The large set of variables allowed for extensive characterisation of the clusters. Finally, this is the first study, to our knowledge, to explore cluster-wise QoL. The availability of follow-up data on QoL allowed for investigation of its evolution over time, and the use of validated methods supports the validity of the study.

There are some limitations to keep in mind when interpreting our results. Our sample size was relatively small, leading to limited numbers of people in each cluster. In particular, the number of newly diagnosed individuals was small, meaning on the one hand that the results in this group should be interpreted with caution, but on the other hand that the non-significant findings could be significant in larger populations. Additionally, our population consisted of relatively ‘healthy’ people with type 2 diabetes and 99% of our population was white. It has been reported that different clusters might apply to other ethnicities [23,24,25]. Moreover, only self-reported sex was available and we could not account for different gender identities. As we did account for biological sex, we expect the absence of gender identities would have impacted the analysis of QoL only to a limited extent. We could not replicate all previous findings because data on comorbidities were available at baseline only. Finally, the study population might not be fully representative of the mental health of the general population, as the Maastricht Study includes only those willing to participate and visit the research centre; people with mental health problems may be less inclined to participate. This might have led to the small differences in MCS scores.

This study shows that the ‘moderate’ and ‘severe’ nomenclature of previously suggested clusters of type 2 diabetes is not entirely reflected in QoL. Of the clusters with diabetes, people in the MARD cluster scored relatively highly in terms of PCS score, whereas people in the MOD cluster reported lower PCS scores, comparable to those in the SIDD and SIRD clusters. This indicates that the MOD cluster is not ‘moderate’ in terms of impaired QoL, as it is similar to the ‘severe’ clusters. Although the ‘moderate’ and ‘severe’ annotations are reflected in the degree of metabolic derangement, they do not entirely hold up from an individual’s perspective in terms of QoL. It might be better to remove the ‘moderate’ and ‘severe’ annotations from the names of the clusters before implementing the clusters in practice, as this non-neutral nomenclature could affect disease perception of people with type 2 diabetes, their healthcare providers and society at large. Further research, preferably in larger populations, is required to confirm these findings and provide more support for reconsideration of the nomenclature used in clusters of type 2 diabetes.