Key Summary Points

Why carry out this study?

This study was a large, population-level analysis of patients with type 2 diabetes mellitus in Ontario, Canada.

This study compared the demographic and clinical characteristics of patient populations with type 2 diabetes mellitus identified by using sensitive and specific cohort definitions to understand their impact on characterizing the disease.

What was learned from the study?

Although the sample sizes were different between sensitive (n = 1,093,812) and specific (n = 783,228) cohorts, the demographic and clinical characteristics were comparable. Patients in both the cohorts had multiple comorbidities.

The high proportion of patients with comorbidities further illustrates the importance of taking a comprehensive approach to patient management.

Introduction

Diabetes poses a substantial burden on the Canadian population [1]. In 2020, about 3,772,000 people in Canada were living with diabetes, with an estimated direct cost of $3.8 billion to the healthcare system. By 2030, both prevalence and direct costs are projected to increase by 30% [1]. People living with diabetes are at an increased risk of comorbidities, hospitalizations, premature death and all-cause mortality [1]. As the incidence and prevalence of diabetes are increasing, accurate estimates of the disease burden on the population can inform diabetes prevention and management efforts. Population-based research plays an important role in understanding the trends of diabetes and impacts of interventions [2, 3]. Additionally, it can help assess the economic impact of diabetes on individuals and the healthcare system, guide the optimal allocation of resources, provide valuable guidance for clinical practice, and identify research priorities [4]. National surveys, health systems data, registries, and administrative data are some of the sources used to estimate the current burden of diabetes [3]. Although population-based research is an important approach in addressing this increased burden of diabetes, its accuracy may be limited by underrepresentation during sampling, selection bias, confounding factors, and misclassification bias [3]. The accuracy of diabetes prevalence estimates can be improved by the development of accurate case definitions to identify people with diabetes among various sources (e.g., ambulatory and hospitalization records) of administrative data [3]. In Ontario, Canada’s most populous province, the ICES data repository captures comprehensive, census-level data of all publicly covered health care services since 1986 [5].
As such, the ICES data reflect day-to-day interactions of Ontarians with the health care system through the government-run health plan known as the Ontario Health Insurance Plan (OHIP), which provides medically necessary services to most residents of Ontario [5]. The ICES data repository currently includes health service records for over 14 million people [6].

The ICES Ontario Diabetes Database (ODD) is a cohort of Ontario residents with non-gestational diabetes identified through a clinically validated algorithm [7]. Previous work with ICES data has identified and validated case definitions for people with diabetes in the ODD, using various combinations of hospitalization and physician visit codes. These case definitions were validated against diagnoses of diabetes among adults (20 years of age and older) in electronic medical records (EMRs), with varying levels of sensitivity and specificity [2]. In administrative data research, case definitions can be chosen to increase the sensitivity or the specificity of case identification, depending on the research objectives of the study. A sensitive case definition detects as many true cases as possible but can also include individuals who do not have the disease (i.e., false positives). On the other hand, a specific case definition aims to include only those individuals who truly have the disease but may incorrectly classify some who have the disease as not having it (i.e., false negatives) [2].
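The two measures described above can be illustrated with a minimal sketch. The validation counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the validation study [2].

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people who truly have the disease that the definition flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people without the disease that the definition leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts from comparing a case definition against EMR diagnoses.
tp, fn = 90, 10    # people with diabetes: flagged / missed
tn, fp = 880, 20   # people without diabetes: not flagged / wrongly flagged

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```

Loosening a definition typically moves cases from the false-negative to the true-positive cell while also adding false positives, which is the trade-off between the two definitions compared in this study.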

This study was conducted to identify patient populations in Ontario administrative data using previously validated sensitive and specific diabetes case definitions, and to compare the demographic and clinical characteristics of these patient populations to understand their impact on characterizing the disease.

Methods

Study Design and Data Sources

This study used multiple linked administrative health data sources from the ICES data repository to identify and compare two cohorts of patients with type 2 diabetes using different case definitions. Encrypted OHIP card numbers of patients were used to link ICES datasets to follow patients through healthcare touchpoints. All data sources were linked at the patient level to facilitate longitudinal analysis of the patient journey through the healthcare system (see Supplementary File Table S1 for a description of the databases).

Study Population

Two cohorts of patients with type 2 diabetes mellitus were identified using previously validated sensitive and specific case definitions [2]. For both cohorts, patients were selected if they were registered in the ODD with at least one HbA1c value during the selection period (from October 1, 2013 to September 30, 2015), with the earliest HbA1c value during the selection period considered the index date (Fig. 1). As diagnosis codes for diabetes mellitus used by physicians outside of the hospital setting are not specific to type 1 diabetes mellitus or type 2 diabetes mellitus, the following exclusion criteria were used to select patients with type 2 diabetes mellitus: (1) patients with a diagnosis of type 1 diabetes (International Statistical Classification of Diseases and Related Health Problems, 10th Revision, Canada (ICD-10-CA) = E10) any time during the study period, (2) patients who were under 15 years of age when first diagnosed with diabetes, and (3) patients under 30 years of age at index. Additionally, patients were excluded if they had any gaps in OHIP coverage within the 1 year prior to the index date or if they had a date of death prior to their index date. To meet the sensitive case definition (Sensitivity: 90.0%, Specificity: 97.7%), patients were required to have either at least 1 hospitalization with a diagnosis code of diabetes (ICD-10-CA = E11, E13, or E14) OR at least 2 physician claims for diabetes (OHIP code = 248 or 250) in a 1-year period OR at least 1 prescription for an antihyperglycemic drug prior to the index date [2]. To meet the specific case definition (Sensitivity: 79.9%, Specificity: 99.1%), patients were required to have at least 3 physician claims for diabetes in a 1-year period [2].
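The logic of the two case definitions can be sketched as follows. This is an illustrative reading of the text, not the ICES analysis code (the study was run in SAS). The function names and the per-patient lists of event dates are hypothetical, and two points are assumptions: that all qualifying events must precede the index date, and that "a 1-year period" means any window of at most 365 days.

```python
from datetime import date, timedelta


def has_n_claims_within_year(claim_dates, n):
    """True if any n claims fall inside a window of at most 365 days (assumed)."""
    dates = sorted(claim_dates)
    for i in range(len(dates) - n + 1):
        if dates[i + n - 1] - dates[i] <= timedelta(days=365):
            return True
    return False


def before(dates, index_date):
    """Keep only events strictly before the index date (assumed look-back rule)."""
    return [d for d in dates if d < index_date]


def meets_sensitive(hosp_dates, claim_dates, rx_dates, index_date):
    """>=1 diabetes hospitalization OR >=2 physician claims in a year OR >=1 prescription."""
    return (
        len(before(hosp_dates, index_date)) >= 1
        or has_n_claims_within_year(before(claim_dates, index_date), 2)
        or len(before(rx_dates, index_date)) >= 1
    )


def meets_specific(claim_dates, index_date):
    """>=3 physician claims for diabetes in a 1-year period."""
    return has_n_claims_within_year(before(claim_dates, index_date), 3)


# Hypothetical patient with two physician claims and nothing else.
index = date(2014, 1, 15)
claims = [date(2013, 2, 1), date(2013, 9, 10)]
print(meets_sensitive([], claims, [], index), meets_specific(claims, index))
```

A patient like this one enters the sensitive cohort but not the specific cohort, which is why the sensitive cohort is the larger of the two.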

Fig. 1 Study design. HbA1c, hemoglobin A1c

Variables

Demographic information at baseline, including age, sex, and address, was collected from the Registered Persons Database (RPDB). Age was calculated at the time of index and patients were grouped as follows: 30–45, 46–55, 56–65, 66–75, 76–85, and 86+ years. Neighbourhood-level income quintile, Local Health Integration Network (LHIN) of residence [8, 9], rurality, and Rurality Index Ontario (RIO) [10] were estimated based on residential address using the Postal Code Conversion File Plus [11].

Data on hospital admissions were collected from the Discharge Abstract Database (DAD), while data on emergency department visits were retrieved from the National Ambulatory Care Reporting System (NACRS). Claims for physician services were extracted from the OHIP database, while the Ontario Drug Benefit (ODB) database was the source of all prescription claims dispensed under Ontario’s provincial public drug program. Additional information on home care was retrieved from the Home Care Database, and the Ontario Laboratory Information System (OLIS) database was used for laboratory data. These databases were used to determine the prevalence of comorbidities (myocardial infarction, stroke, (unstable) angina, congestive heart failure (CHF), chronic ischemic heart disease (CIHD), dyslipidemia, hypertension, transient ischemic attack, venous thromboembolism, peripheral vascular disease (PVD), microalbuminuria, macroalbuminuria, end-stage renal disease (ESRD), lower extremity amputation, autonomic neuropathy, retinopathy, depression, obesity, and cirrhosis) by looking back from April 2002 up to the index date. See Supplementary File Table S2 for the diagnosis codes used to classify comorbidities.

Clinical and laboratory values were also identified at baseline from the OLIS database using Logical Observation Identifiers Names and Codes (LOINC). Glycemic control was assessed based on HbA1c results (LOINC: 17855-8, 17856-6, 41995-2, 4548-4, 59261-8, and 71875-9) in the 365 days up to and including the index date. One-year average HbA1c was calculated by averaging all HbA1c values in that period, including the HbA1c value at the index date. Low-density lipoprotein (LDL [LOINC: 22748-8 and 39469-2]) and high-density lipoprotein (HDL [LOINC: 14646-4]) cholesterol test results closest to the index date in the 365 days prior to the index date were reported. Time from diagnosis of diabetes to index date was measured in years from the date of entry into the ODD, taken as the time of detection of diabetes, to the index date. This variable was categorized into five categories: ≤ 4, 5–9, 10–14, 15–19, and 20+ years. ODD entry source was determined based on the criteria by which the patient entered the ODD: one hospital admission for diabetes (DAD), one antihyperglycemic medication claimed through the provincial payer (ODB), three diagnostic codes for diabetes (OHIP) in one year, or some combination thereof.
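The one-year average HbA1c and the duration categories described above could be derived along the following lines. This is a minimal sketch with hypothetical function names and example values; it assumes that duration is counted in completed years, which the text does not state explicitly.

```python
from datetime import date, timedelta
from statistics import mean


def one_year_mean_hba1c(results, index_date):
    """Average HbA1c (%) over the 365 days up to and including the index date.

    `results` is a list of (test_date, value) pairs; returns None if no
    result falls inside the window.
    """
    window_start = index_date - timedelta(days=365)
    values = [v for d, v in results if window_start <= d <= index_date]
    return mean(values) if values else None


def duration_category(odd_entry_date, index_date):
    """Bin time from ODD entry to index date, in completed years (assumed)."""
    years = index_date.year - odd_entry_date.year - (
        (index_date.month, index_date.day) < (odd_entry_date.month, odd_entry_date.day)
    )
    if years <= 4:
        return "<=4"
    if years <= 9:
        return "5-9"
    if years <= 14:
        return "10-14"
    if years <= 19:
        return "15-19"
    return "20+"


# Hypothetical patient: the 2012 result lies outside the look-back window.
index = date(2014, 2, 1)
labs = [
    (date(2012, 12, 1), 9.0),
    (date(2013, 3, 1), 7.4),
    (date(2013, 11, 20), 7.0),
    (date(2014, 2, 1), 6.6),
]
print(one_year_mean_hba1c(labs, index))
print(duration_category(date(2005, 6, 15), index))
```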

Data Analysis

The prevalence of type 2 diabetes mellitus was estimated by dividing the number of patients identified in each cohort by the Statistics Canada’s Ontario population estimate as of July 1, 2015 (13,792,100 persons).
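As a worked check of this calculation, the sketch below divides the final cohort sizes reported in the Results by the population denominator given above. The variable and function names are illustrative only.

```python
ONTARIO_POPULATION_2015 = 13_792_100  # Statistics Canada estimate, July 1, 2015

# Final cohort sizes as reported in the Results.
cohort_sizes = {"sensitive": 1_093_812, "specific": 783_228}


def prevalence_pct(n_patients: int, population: int) -> float:
    """Crude prevalence as a percentage of the population."""
    return 100 * n_patients / population


for name, n in cohort_sizes.items():
    print(f"{name}: {prevalence_pct(n, ONTARIO_POPULATION_2015):.1f}%")
```

Rounded to one decimal place, this gives 7.9% for the sensitive cohort and 5.7% for the specific cohort.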

Descriptive analyses were conducted using numbers and percentages for categorical variables and mean (SD) and median (IQR) for continuous variables. In accordance with ICES policies, results based on fewer than six patients were reported as “1–5”. All analyses were conducted using SAS version 9.3 or higher (SAS Institute, Cary, NC). This study was conducted in accordance with ethical principles that have their origins in the Declaration of Helsinki and with applicable privacy laws. It was approved by Advarra IRB Services (IRB approvals Pro00049255 and Pro00039416). Analyses were conducted by an ICES analyst, using the analysis plan developed by the study authors. Access to ICES databases is granted through ICES Data Access Services. ICES does not allow third party direct access to their data holdings.

Results

In this study, based on their most recent HbA1c values, 1,265,329 patients were identified using the sensitive case definition and 938,952 patients using the specific case definition from the ODD (Supplementary File Table S3). Of these patients, 150,461 and 145,872 patients with type 1 diabetes were excluded from the sensitive and specific cohorts, respectively. Further, patients aged < 15 years at ODD case detection, those aged < 30 years at index date, those who were OHIP ineligible at index or in any of the previous four quarters, and those who died before index were excluded. The final sensitive cohort included 1,093,812 patients and the final specific cohort included 783,228 patients, corresponding to an estimated 7.9% and 5.7% of Ontarians living with type 2 diabetes mellitus, respectively.

Baseline Demographic and Clinical Characteristics

Baseline demographic and clinical characteristics were generally comparable between the patients in the sensitive and specific cohorts (Table 1). The mean (SD) age of patients in the sensitive and specific cohorts was 64.1 (13.27) years and 64.8 (12.87) years, respectively; slightly more than 50% of the patients in both cohorts were within the age group of 56–75 years. Male patients accounted for 52.4% and 53.6% of the sensitive and specific cohorts, respectively. About 60% of patients in both cohorts belonged to the medium–low (quintile 2) to medium–high (quintile 4) income quintiles, whereas approximately one-fourth of the patients in the sensitive (23.3%) and specific (23.4%) cohorts belonged to the lowest income quintile (quintile 1). The majority of patients resided in large urban areas, while only 10% resided in rural areas.

Table 1 Population description

The mean one-year HbA1c in the sensitive cohort was 6.9% (52 mmol/mol), whereas the mean one-year HbA1c in the specific cohort was 7.1% (55 mmol/mol). In the sensitive cohort, 64.4% of patients had a 1-year mean HbA1c level < 7% (53 mmol/mol) and 25.3% had a level between 7.0% and 8.5% (53–69 mmol/mol). For the specific cohort, these proportions were 55.7% and 31.5%, respectively (Fig. 2). In the sensitive cohort, the mean HDL and LDL cholesterol values were 48 mg/dl (1.2 mmol/L) and 85 mg/dl (2.2 mmol/L), respectively, whereas they were 48 mg/dl (1.2 mmol/L) and 79 mg/dl (2.1 mmol/L) in the specific cohort. The mean (SD) time from diagnosis of diabetes to index date in the sensitive and specific cohorts was 7.3 (6.53) years and 8.3 (5.89) years, respectively.

Fig. 2 One-year average HbA1c groups of patients with type 2 diabetes mellitus using the “sensitive” and “specific” case definitions. HbA1c, hemoglobin A1c

Comorbidities

Patients in the sensitive and specific cohorts had multiple comorbidities (Fig. 3; Supplementary File Table S4). Hypertension was the most common comorbidity in both cohorts, recorded in 77.3% and 79.8% of patients in the sensitive and specific cohorts, respectively. Other comorbidities recorded in at least 20% of patients in the sensitive and specific cohorts, respectively, included dyslipidemia (54.5% and 55.0%), CIHD (27.4% and 28.4%), microalbuminuria (29.0% and 22.4%), angina (20.9% and 21.6%), and obesity (21.2% and 21.3%).

Fig. 3 Prevalence of comorbidities of patients with type 2 diabetes mellitus using the “sensitive” and “specific” case definitions. Comorbidities reported in > 5% of patients are presented in the graph

Discussion

This study was a large, population-level analysis of patients with type 2 diabetes mellitus using administrative data in Ontario, Canada. The results from this study provide a comprehensive look at the demographic and clinical characteristics of patients with type 2 diabetes mellitus in Ontario. For this study, two case definitions for diabetes were considered: the sensitive definition (at least two physician visits with diabetes OR at least one drug claim for diabetes OR at least one hospitalization with diabetes within 1 year) and the specific definition (at least three physician visits with diabetes within 1 year). More patients with type 2 diabetes mellitus were identified by the sensitive case definition than by the specific case definition; however, the demographic and clinical characteristics were mostly comparable between the cohorts. The majority of patients in both cohorts presented with comorbidities.

In this study, the prevalence of type 2 diabetes in Ontario was estimated to be 7.9% based on the sensitive case definition and 5.7% based on the specific case definition. As per 2016–17 estimates from the Canadian Chronic Disease Surveillance System, the prevalence of diabetes in Canada was 8.8% [12], and about 90% of diabetes cases in Canada are attributed to type 2 diabetes [13], which implies a national type 2 diabetes prevalence of roughly 7.9% (8.8% × 0.90). Hence, the estimated prevalence of type 2 diabetes based on the sensitive definition in our study is in line with the national data. Patients in both cohorts had a mean age of approximately 64 years, and there was a slightly higher proportion of male patients than female patients. Nearly one-fourth of patients (23%) belonged to quintile 1 (lowest income quintile) and 16% to quintile 5 (highest income quintile). The demographics observed in our study are comparable with those of patients with diabetes included in a large population-based retrospective study whose diabetes status was identified from the ODD [14]. That patient population had a mean age of ~ 58 years and was more likely to be male. About 24% of patients were in quintile 1 and 16% of patients in quintile 5 [14]. In our study, while there were very few differences in most demographic characteristics, the specific cohort had a slightly lower proportion of patients with a 1-year mean HbA1c of < 7% (53 mmol/mol), at 56% compared with 64% in the sensitive cohort. Although patients with HbA1c greater than 7% (53 mmol/mol) are at an increased risk of comorbidities [15, 16], the proportions of patients with comorbidities in the two cohorts were comparable, which suggests that both case definitions identify similar patient populations.

Within our study cohorts, many patients with type 2 diabetes mellitus were identified with comorbid conditions. The most commonly reported comorbidities were hypertension followed by dyslipidemia, CIHD, microalbuminuria and angina. Thus, most of the patients with type 2 diabetes present with cardiovascular comorbidities, which is in line with the published literature [17, 18]. Notably, hypertension was reported in 77% of patients in the sensitive cohort and 80% of patients in the specific cohort. The 2006–2007 data from the Canadian Chronic Disease Surveillance System (CCDSS) reported that 63% of adult patients with diabetes had hypertension as a comorbidity [19]. In a retrospective study conducted from July 2014 to June 2015, the prevalence of hypertension in patients with type 2 diabetes was 82% [20]. This suggests that the prevalence of hypertension observed in this study is comparable with the data from other studies. Obesity, on the other hand, was reported in only around 21% of patients with type 2 diabetes in this study, whereas in another study, the prevalence of obesity/overweight in this patient population was about 78% [20], implying that obesity may have been underreported in our study.

Our study has applicability to other administrative databases where case definitions may be used to identify patients based on interactions with the healthcare system due to the lack of diagnosis information in an outpatient setting. Our findings show that the patients identified by the two case definitions are similar in demographic and clinical characteristics, supporting the use of both definitions for research purposes. However, we recommend using sensitive definitions where estimates of prevalence and population burden are required, because they capture a larger sample.

This study has several strengths. As Ontario is a single payer system for medical claims, the data used in this study are near census. Hence, the study is unlikely to miss a recorded diagnosis of diabetes mellitus when a patient receives care from several uncoordinated sources, a risk that a chart audit with incomplete medical history or utilization data may face. Moreover, the long study duration helped retrieve comprehensive data and better reflect real-world conditions, thereby increasing external validity. In addition, the study showed consistency in coding over time and used validated sensitive and specific case definitions.

The study also has a few limitations. Firstly, the study is limited by ambiguity in the OHIP diabetes diagnosis codes, which do not specifically identify type 1 or type 2 diabetes mellitus within the community setting. To overcome this, we used several inclusion/exclusion criteria to eliminate from the cohort those who had a high probability of having type 1 diabetes mellitus. Secondly, the administrative databases include only publicly reimbursed medical claims and therefore, cash and private prescription drug claims are not captured in this study. Moreover, only individuals who are aged at least 65 years, receive professional home and community care services, receive benefits from Ontario Works or the Ontario Disability Support Program, or are enrolled in the Trillium Drug Program are eligible for the Ontario Drug Benefit program. Thirdly, case definitions still carry a risk of misclassification, although this is minimized through high sensitivity/specificity [2]. Fourthly, individuals who are more effectively managed may have a lower probability of undergoing HbA1c testing, potentially introducing selection bias. Additionally, the comorbidities were derived using ICD-10-CA codes, some of which may be under-reported (e.g., obesity). Finally, the data for this study were collected before the widespread use of sodium–glucose cotransporter 2 (SGLT-2) inhibitors and glucagon-like peptide 1 (GLP-1) receptor agonists for treatment of type 2 diabetes mellitus, which may have impacted the prevalence of cardiovascular comorbidities.

Conclusion

The prevalence of diabetes in Canada is increasing over time, and about 90% of all diabetes cases are type 2. Striking the right balance between sensitivity and specificity in defining diabetes in administrative data is important to ensure accurate capture of patient populations and to better understand interventions and health outcomes for individuals with type 2 diabetes mellitus. Our results also highlighted that individuals with type 2 diabetes commonly experience comorbidities, regardless of the case definition used. This emphasizes the need for an interdisciplinary approach to diabetes care and associated comorbid conditions.