Key Points for Decision Makers

Patients with type 2 diabetes differ on comorbid conditions and complications.

There is no ‘average’ patient with diabetes.

The stratification into ten subgroups proposed in this paper may support clinicians in implementing effective preventive interventions and policy makers in designing resource allocation tailored to patients’ needs.

1 Introduction

Diabetes mellitus (DM) is a major public health problem with a high morbidity and mortality burden and high healthcare costs [1,2,3]. The International Diabetes Federation estimated that, in 2017, 451 million people were living with diabetes worldwide, and this figure is expected to increase to 693 million by 2045 [4] due to population growth, population aging, fast urbanization, excessive caloric intake and sedentary behaviors [5].

Type 2 diabetes (T2DM) is the most prevalent type of diabetes, affecting almost 90% of people with diabetes [6].

In Italy, it has been estimated that the prevalence of DM in 2016 was 6.3%, and that one in five people were not diagnosed [7].

The total global healthcare expenditure for people with diabetes aged 18–99 years was estimated at US$850 billion in 2017 and is expected to increase by 7% by 2045 [4]. In 2017, the North American and Caribbean Region accounted for 52% of the total amount spent worldwide on diabetes and the Europe Region accounted for a large share of the total global spending (26%), with a mean expenditure of 3432 international dollars [4]. In 2045, it is expected that healthcare costs for diabetes will remain stable for the population under the age of 50 years, but will increase by 37% for the population above 70 years [8]. The principal cost components are hospital and outpatient care, even if drug costs are becoming more relevant since the introduction of the expensive analog insulins and incretin-based agents and sodium–glucose cotransporter 2 (SGLT-2) inhibitors [9]. Many studies and systematic reviews [8, 10,11,12,13,14,15,16,17] have evaluated healthcare costs incurred by a population with diabetes using a cost-of-illness approach that includes all direct and indirect costs related to lost productivity. A recent systematic review [3] showed that the annual direct costs of diabetes per patient in international dollars, regardless of the costing method applied and the cost components included, range from $242 in Mexico in 2010 [18] to $11,917 in a study conducted in the USA in 2007 [19], while indirect costs range from $45 for Pakistan in 2006 [20] to $16,914 for the Bahamas in 2000 [21].

A recent study by Marcellusi et al. [22] reported that the total economic burden of patients with diabetes in Italy accounts for €20.3 billion/year, of which 54% is attributable to indirect costs and 46% to direct costs. It has been estimated that in Italy the mean annual healthcare cost for patients with diabetes is about €3000 [7, 21, 23, 24]; that is more than twice that of patients without diabetes.

Converging evidence from different studies suggests that few patients with great clinical complexity account for a large part of diabetes-related costs [10, 11]. In fact, diabetes complications, in particular, those due to microvascular and macrovascular damage (eye, renal and cardiovascular complications) account for a large proportion of healthcare expenditure [6, 10, 11, 25, 26], and comorbidities not directly related to diabetes also contribute to the economic burden of this condition [25, 27, 28].

Given the heterogeneity of patients with T2DM in terms of age, complications and comorbid conditions, the aim of the present study was to estimate, in a geographical area of North-Eastern Italy, the annual medical costs of T2DM, either all-cause or diabetes-related, and to stratify patients into different cost groups based on demographic and clinical characteristics. The Italian National Health System (NHS) provides universal coverage for all its citizens and funds each Regional Health System, which pays for almost all medical costs, except for the small part of the costs of drugs and services that are co-paid by patients. People with diabetes are exempt from co-paying. In this context, the identification of patient subgroups with similar healthcare needs and clinical complexity could be useful to improve quality and cost control and to provide effective and cost-effective healthcare interventions and implement a payment system based on a tailored budget perspective, as applied in other countries.

2 Methods

We conducted a prevalence-based cost-of-illness study using a comprehensive global bottom-up approach to calculate the direct medical costs incurred by the NHS for patients with T2DM over 1 year.

The comprehensive approach, which incorporates costs both attributable and non-attributable to T2DM, provides an accurate description of medical expenditure because diabetes is a chronic disease frequently accompanied by complications and comorbid conditions.

2.1 Data Source

Patient data including healthcare use (outpatient and inpatient care) and drugs were obtained by linking data from different regional administrative databases using a unique anonymized patient identifier.

The hospital discharge record (HDR) database includes demographic characteristics, admissions and discharge dates, the main and up to five secondary diagnoses and up to six interventions [identified using the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) coding system]. In Italy, HDRs are a reliable and accurate source of hospital data based on the diagnosis-related group (DRG) system that is used to allocate funds to hospitals and to monitor quality of care and outcomes at the national level.

The outpatient pharmaceutical database includes information on patients’ gender and age and on prescriptions [substance name, Anatomical Therapeutic Chemical (ATC) classification system code-V.2013, trade name, date of prescription filling, and number of packages]. This database includes drugs reimbursed by the healthcare system that are prescribed by the general practitioner (GP) or a specialist. Drug prescriptions are tracked in the AFT (outpatient pharmaceutical supply) and FED (direct supply drugs) databases.

The outpatient services database (ASA) includes laboratory tests, diagnostic, therapeutic and rehabilitation services and specialty visits, but does not include primary care visits, which cannot be traced.

Demographic information was retrieved from the different sources (HDR, ASA, AFT, and FED).

2.2 Study Population

The study population included adult patients with T2DM living in three Local Health Authorities (LHAs) of Emilia-Romagna Region (Italy), comprising about 2 million people. Patients were identified in 2014 through a regional algorithm [24] that combined information from different administrative databases (hospital admissions, outpatient care, drug prescriptions, exemption for disease) and LHA data. In particular, we selected patients who in 2014 met at least one of the following criteria:

  • Hospital discharge with ICD-9-CM primary or secondary diagnosis of diabetes (code 250.XX).

  • At least two consecutive prescriptions of drugs for diabetes classified with the ATC classification system (code A10).

  • Exemption from co-payment healthcare costs for DM.
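The three criteria above combine as a logical OR. The following Python sketch illustrates that logic only; it is not the regional algorithm [24]. The record layout and field names (`discharge_codes`, `a10_prescription_count`, `dm_exemption`) are hypothetical, and the requirement of "consecutive" prescriptions is simplified to a count of two or more.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary for 2014 (field names are illustrative)."""
    discharge_codes: list = field(default_factory=list)  # ICD-9-CM codes from HDRs
    a10_prescription_count: int = 0  # prescriptions with an ATC code starting with A10
    dm_exemption: bool = False  # co-payment exemption for DM


def meets_inclusion_criteria(p: PatientRecord) -> bool:
    """Return True if at least one of the three criteria is met."""
    has_diabetes_discharge = any(code.startswith("250") for code in p.discharge_codes)
    has_repeated_a10 = p.a10_prescription_count >= 2  # simplification of "consecutive"
    return has_diabetes_discharge or has_repeated_a10 or p.dm_exemption
```

Under these assumptions, a patient with a single A10 prescription and no diabetes-coded discharge or exemption would not be selected.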

Patients alive on 1/1/2015 were followed up to 31/12/2015 to collect health services use and calculate the related annual direct medical costs. Subjects with missing data were excluded. Comorbidities were tracked from drug prescriptions in the previous year using an algorithm developed by Maio et al. [29], and complications in the previous 3 years were retrieved from the HDR database and classified according to a regional dossier [24].

2.3 Outcomes

The outcomes of interest were the annual healthcare service use (outpatient and inpatient care and drug treatment) and costs of health services provided to patients with T2DM over the year 2015.

2.4 Costs

To calculate the annual medical costs for each patient with T2DM, we multiplied the number of services used by the respective unit cost. We used DRG tariffs as a proxy of costs for hospital admissions (http://salute.regione.emilia-romagna.it/siseps/sanita/sdo/files/DGR_1673_2014.pdf/view), the regional nomenclator (http://salute.regione.emilia-romagna.it/documentazione/nomenclatore-tariffario-rer) for specialty visits and the unit cost of prescriptions for medications.
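As an illustration of this bottom-up calculation, the sketch below sums the number of services multiplied by the unit cost for a single patient. The service names and unit costs are invented for the example and are not actual DRG tariffs or regional nomenclator prices.

```python
def annual_cost(service_counts: dict, unit_costs: dict) -> float:
    """Sum of (number of services used x unit cost) over all services of one patient."""
    return sum(n * unit_costs[service] for service, n in service_counts.items())


# Illustrative unit costs in euro (made-up values, not actual tariffs)
unit_costs = {"drg_admission": 3000.0, "specialty_visit": 20.0, "drug_package": 10.0}
# Illustrative one-year utilisation of a single hypothetical patient
usage = {"drg_admission": 1, "specialty_visit": 4, "drug_package": 12}

total = annual_cost(usage, unit_costs)  # 1*3000 + 4*20 + 12*10 = 3200.0
```

In practice each admission would carry its own DRG tariff; a single admission price is used here only to keep the example short.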

2.5 Statistical Analysis

Categorical variables were summarized using frequencies and percentages, and continuous variables using mean, standard deviation, median and range.

A generalized linear model (GLM) with gamma probability distribution and log link was used to investigate the determinants of costs. In this model, costs were regressed on gender, age group (under 65 years, between 65 and 75 years, over 75 years), LHA of residence (Bologna, Parma, Modena), duration of illness (less than 1 year, between 1 and < 5 years, 5 years or more), complications in the previous 3 years (coma, ischemic heart disease, stroke, peripheral revascularization, amputation, eye complications, renal complications and dialysis) and number of comorbidities (none, one, two, three or more), detected through drug prescriptions. The goodness of fit of the model was measured using the predictive ratio, which is calculated as the ratio of predicted costs to observed costs. The value of the predictive ratio is always positive, with a value closer to one indicating higher predictive performance.
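The predictive ratio can be sketched as follows. We assume here that it is computed on aggregate costs (total predicted over total observed); the text does not specify the level of aggregation, and the cost values are invented for illustration.

```python
def predictive_ratio(predicted, observed):
    """Ratio of total predicted cost to total observed cost (1.0 = perfect aggregate agreement)."""
    total_observed = sum(observed)
    if total_observed <= 0:
        raise ValueError("observed costs must sum to a positive value")
    return sum(predicted) / total_observed


# Made-up costs in euro for four hypothetical patients
observed = [500.0, 1200.0, 800.0, 7500.0]
predicted = [650.0, 1100.0, 900.0, 7450.0]

ratio = predictive_ratio(predicted, observed)  # 10100 / 10000 = 1.01
```

A ratio above one indicates that the model overpredicts costs on aggregate, and a ratio below one that it underpredicts them.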

Then a classification and regression tree (CART) model [30] was used to stratify patients into homogeneous cost subgroups based on demographic and clinical characteristics. This method offers advantages over other classification methods such as logistic regression or cluster analysis because it does not rely on assumptions about the type of relationship between the outcome and the explanatory variables, can handle multicollinearity among explanatory variables by selecting the best splitter, and deals easily with interactions among variables [31].

The independent variables used in the CART were the same as those included in the GLM, except for the number of comorbidities. This was done because more than 90% of patients had at least one comorbidity derived from drug prescriptions, and some of these (gastrointestinal and musculoskeletal) were very common and non-specific, thereby hindering the creation of distinct subgroups.

The CART automatically selects from the set of variables the one most strongly associated with costs and uses it to divide the population into two subgroups. If the independent variable is continuous (such as age), the procedure identifies the cut-off that maximizes the cost difference between the two subgroups. If the independent variable has more than two categories, the categories are merged to optimize the distinction into subgroups. Dichotomous variables are used as such. The procedure continues recursively, selecting the next best variable and subdividing each subgroup into two further subgroups, until no significant improvement in the classification of participants is possible. The maximum depth of the tree, i.e., the maximum number of subsequent divisions, was set to five. An internal validation of the CART was conducted using a split-sample approach, in which the tree was first generated using a training sample of 50% randomly selected cases and then tested for classification accuracy in the remaining 50% of cases. Moreover, the tree was pruned to avoid overfitting. The tree’s predictive accuracy was measured using the risk estimate, which denotes the within-node variance. A low risk estimate indicates node homogeneity.
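A single splitting step on a continuous variable can be sketched as below. The sketch assumes the usual least-squares criterion for regression trees (choosing the cut-off that minimizes the total within-node sum of squares); this is an assumption about the splitting rule, not a description of the SPSS implementation. The data are invented.

```python
def sum_sq(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_cutoff(ages, costs):
    """Return (cutoff, within_node_ss) for the split 'age < cutoff' vs 'age >= cutoff'
    that minimises the total within-node sum of squares of costs."""
    best = None
    for cutoff in sorted(set(ages))[1:]:  # skip the minimum: it would leave the left node empty
        left = [c for a, c in zip(ages, costs) if a < cutoff]
        right = [c for a, c in zip(ages, costs) if a >= cutoff]
        ss = sum_sq(left) + sum_sq(right)
        if best is None or ss < best[1]:
            best = (cutoff, ss)
    return best


# Made-up data: costs rise sharply from age 65
ages = [50, 55, 60, 65, 70, 75]
costs = [400.0, 500.0, 450.0, 1000.0, 1100.0, 1050.0]

cutoff, within_ss = best_cutoff(ages, costs)
risk = within_ss / len(costs)  # within-node variance, analogous to the risk estimate
```

On these data the selected cut-off is 65, because splitting there leaves two nodes that are each internally homogeneous. A full tree would repeat this step within each resulting subgroup and across all candidate variables.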

Statistical analyses were performed using IBM SPSS, version 25.

3 Results

Through the database linkage, we identified 102,638 patients with T2DM at 1/1/2015, of whom 1304 were excluded because of missing data. The final study population therefore included 101,334 patients.

Table 1 describes the characteristics of the study population. Patients had a mean age of 70.9 years and a slight predominance of males (54.1%); 72.8% had diabetes for at least 5 years; 65.6% had two or more comorbidities and 77.0% had no complications in the previous 3 years. Almost all patients were taking medications or had received outpatient care, and one in five was hospitalized during the follow-up.

Table 1 Characteristics of the study population (N = 101,334)

The total costs for diabetic patients incurred in 2015 by the three LHAs (Table 2) show a skewed frequency distribution, with a median value of €1012 and a mean value of €3086. Hospitalizations, drugs and outpatient care accounted for 53.1%, 29.5% and 17.4% of the total cost, respectively.

Table 2 Summary of direct medical costs

In the GLM estimating the total costs as a function of sociodemographic and clinical variables (Table 3), all independent variables were statistically significant (p < 0.001). Annual costs were higher in males, in patients aged more than 65 years compared with younger patients, and increased with the number of comorbidities. In addition, the complications that were the most relevant drivers of costs were dialysis (b = 2.165, 95% CI 2.045–2.286; p < 0.001), peripheral revascularization (b = 0.494, 95% CI 0.387–0.600; p < 0.001), renal complications (b = 0.470, 95% CI 0.427–0.514; p < 0.001), coma (b = 0.369, 95% CI 0.304–0.434; p < 0.001), and ischemic heart disease (b = 0.317, 95% CI 0.289–0.344; p < 0.001).
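Because the model uses a log link, each coefficient b can be read on the multiplicative scale: exp(b) is the ratio of expected costs relative to the reference category, holding the other covariates fixed. The sketch below applies this transformation to the coefficients reported above; it assumes that the reference category for each complication is its absence.

```python
import math

# Coefficients (b) for complications as reported in the text
coefficients = {
    "dialysis": 2.165,
    "peripheral revascularization": 0.494,
    "renal complications": 0.470,
    "coma": 0.369,
    "ischemic heart disease": 0.317,
}

# With a log link, exp(b) is the multiplicative effect on expected cost
cost_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, ratio in cost_ratios.items():
    print(f"{name}: x{ratio:.2f}")
```

Under this reading, dialysis is associated with expected costs roughly 8.7 times those of otherwise similar patients not on dialysis, and ischemic heart disease with costs about 1.4 times higher.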

Table 3 Predictors of overall costs. Results of the generalized linear model. The predictive ratio is 1.011, denoting a good concordance between predicted and observed costs

Notably, costs were also higher for newly diagnosed patients compared with those of patients with an established diagnosis and for patients living in the Parma LHA as compared with those living in Bologna or Modena LHAs.

In the CART model, the costs estimated by the GLM were used as the dependent variable. This procedure initially split the study population into two subgroups based on age (< 65 years and ≥ 65 years) and created further subdivisions using five variables (dialysis, ischemic heart disease, renal complications, stroke and duration of illness) to reach ten final homogeneous cost subgroups. The other complications, including coma, peripheral revascularization, amputation and eye complications, did not enter the model. The training and the test sample yielded overlapping results. The risk coefficient was low (0.126, with a standard error of 0.001) both in the training and in the test sample, denoting a good fit to the data. We report here the findings related to the test sample, which comprises 50% of cases (Fig. 1).

Fig. 1 Classification and regression tree based on the test sample, which comprises 50% of the study population (N = 50,799). Boxes include the absolute and percentage frequency of subgroups and the histogram of the cost distribution. The final nodes are marked in red

The empirical costs in euro in the final ten subgroups of the test sample ranged from a median of €483 to €39,578 (Table 4). The two subgroups with the highest cost (> €31,000), which comprise 0.3% of the population, consisted of dialysis patients, regardless of age and the presence of ischemic heart disease. Groups B, C, E, F and L, comprising 14% of cases, with median cost ranging between €1415 and €3303, included patients with ischemic heart disease, renal disease or stroke. Group I, which was the largest (57.9%), included subjects aged 65 years or over without complications, with a median cost of €1054. The last two groups, which together comprised about one fourth of the cases (27.8%), consisted of subjects aged less than 65 years and with a duration of illness ≥ 5 years or < 1 year (median cost €720) or a duration of illness between 1 and 4 years (median cost €483).

Table 4 Median empirical costs of the 10 subgroups identified using classification and regression tree analysis. Subgroups are arranged in decreasing order of costs

4 Discussion

In this cost-of-illness study, we analyzed the annual direct medical costs of a population of patients with T2DM, of whom over 70% were elderly, a large majority (72.9%) had a duration of illness > 5 years, and 65.6% had two or more chronic conditions.

The mean annual medical cost per capita was €3086, consistent with the results of other studies conducted in Italy or in Europe [7, 8, 11, 23, 32]. However, the median annual medical cost (€1012) is a more representative summary measure because the distribution of costs is skewed, with a few patients accounting for a large proportion of costs, in line with other studies [10, 12]. We confirmed that hospitalization accounts for the largest share of costs (53.1%), followed by drugs (29.5%) and outpatient services (17.4%) [7, 8, 11, 23].

Our results support the evidence [6, 10,11,12, 25, 33] that the most relevant cost drivers are age, complications and comorbidities. Duration of illness had an independent association with costs. In fact, we found that medical costs are higher in the first year of diagnosis. It is reasonable to assume that in the first year of illness, resource consumption and access to healthcare services are more relevant [14] for diagnosis and staging in order to identify the best therapeutic strategies. Notably, the area of residence was associated with different costs, after adjusting for patients’ demographic and clinical characteristics, suggesting a role of organizational factors that warrants further investigation.

Through CART analysis we identified ten homogeneous subgroups of patients, five aged < 65 years and five aged 65 years or more. The large majority (85.7% of the study population) fell into three subgroups including elderly or younger patients without ischemic heart disease, renal complications, stroke and not on dialysis, with annual median costs < €1055. Five patient groups with higher costs, up to €3303, were characterized by the presence of at least one complication. Dialysis, which represents the end stage of chronic kidney disease, was the most expensive intervention and was delivered to a small minority of patients (0.3%) with annual costs amounting to more than €31,000.

These results suggest that preventive interventions are required to control the increasing prevalence of the disease, due to the aging population, and the rate of complications.

In this way, we could not only improve the health status of T2DM patients but also reduce their medical costs [34, 35]. Because the natural process of ageing cannot be controlled, the focus of interventions should be mainly on correcting unhealthy behaviors related to poor diet or sedentary lifestyles to prevent both the onset of diabetes and its related complications [36, 37]. Concerning hospitalizations, which accounted for the majority (53.1%) of costs, a possible strategy would be to reduce those that are potentially preventable through more appropriate management of patients’ follow-up [6] and of drug treatments. In particular, some studies have highlighted that the use of some SGLT-2 inhibitors, compared with placebo, could lead to a significant reduction in hospitalization for heart failure, which is one of the major causes of hospital admission in patients with T2DM [38, 39]. Our results concerning the costs of chronic kidney disease are consistent with other reports showing that, starting from stage 3A, the financial burden of chronic kidney disease increases as renal function declines [40, 41]. This supports early identification and clinical interventions for chronic kidney disease in patients with diabetes to delay progression towards end-stage renal failure through optimal management strategies [42].

Given the budget constraints and the need to allocate healthcare resources in an equitable way, our stratification of T2DM patients into cost subgroups could inform treatment decisions and the allocation of related resources in a way that matches patients’ needs. Our results are especially useful given the growing importance of budget allocation from the perspective of activity-based provider payment systems [43], bundled payments and value-based bundled payments [44].

The results of the present study should be interpreted in light of strengths and limitations. To the best of our knowledge this is the first study to stratify T2DM patients into cost subgroups based on demographic and clinical characteristics using a classification tree approach. This procedure allows us to define subgroups of patients with combinations of demographic and clinical characteristics, while in general, stratification is performed using only one criterion. For instance, another Italian study applied a cost-of-illness approach to administrative data and stratified the patient population according to the number of comorbidities [22]. Thus, our results can provide useful insights as regards the implementation of healthcare policies and the organization of healthcare services.

In addition, the study is population-based, which limits selection bias and supports external validity. Moreover, it takes advantage of the linkage of individual data from high-quality administrative databases [45]. The inclusion criteria for this study have been validated in other studies [24, 46].

The implications of our study also need to be interpreted in light of some limitations. First, microcosting was not possible. In fact, we used tariffs that do not represent the real cost for the local health authority, but only the amount of money the LHA reimburses to providers. Second, in this study, we did not estimate indirect costs for patients and caregivers. In Italy, information on productivity losses and out-of-pocket expenditure is not available at the patient level, but more research should be conducted to acquire these data, as social costs associated with diabetes are substantial. Third, some people with T2DM may be untreated and therefore not captured by administrative databases.

Our results can be generalized to the other Italian regions because people with diabetes are exempt from co-payment of health services across the country and costs of drugs and outpatient and inpatient care are set at the national level, with small regional variations related to local agreements. Moreover, the methodology is replicable in other countries where the availability of administrative healthcare databases allows the reconstruction of the care pathway of patients with T2DM.

In conclusion, this study provides policy makers and clinicians relevant information about the cost of T2DM care based on patients’ demographic and clinical characteristics. This information, combined with available data on epidemiological trends, may support clinicians in implementing effective healthcare interventions (e.g., prevention of complications and associated comorbidities) and policymakers in designing resource allocation tailored to patients’ needs.