Study Design
We will perform a cross-sectional analysis of people with type 2 diabetes registered with primary care practices in the Royal College of General Practitioners (RCGP) Research and Surveillance Centre (RSC) network. Data will be extracted from the computerised medical records of patients registered with RCGP RSC practices on 31 July 2019.
Data Source
The RCGP RSC is a sentinel network of volunteer primary care practices distributed throughout England and parts of Wales. The RCGP RSC has been supported by the Department of Health for over 50 years and produces a weekly report to monitor trends in respiratory diseases, including influenza [7]. There are over 1500 practices within the network, which includes a representative sample of over 10 million registered patients [8]. The network represents approximately 10% of primary care practices in England and Wales.
Primary care is well suited to this type of research because it is where most people with type 2 diabetes are managed. In the UK, every patient has a unique identifier, the National Health Service (NHS) number. One advantage of this is that when an individual registers with a different practice, their medical history moves with them. The identifier also allows linkage with other data sets, such as genetic and hospital data [9].
Primary care data are computerised and recorded using clinical codes and free text. Until recently, the standard coding terminology used in primary care was the Read classification (Read Version 2 and Clinical Terms Version 3 [CTV3]) [10]. This was replaced by the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) in April 2018 [11]. Clinical codes include information for diagnoses, prescriptions, investigations, and processes of care.
Data completeness of the RCGP RSC database is high in the type 2 diabetes population for two reasons: the Quality and Outcomes Framework, a pay-for-performance incentive programme to improve coding for chronic diseases [8, 12, 13], and a dedicated team of practice liaison officers who work closely with the practices and provide feedback on coding.
Study Population
The study population will comprise all adults with type 2 diabetes registered in the RCGP RSC network on 31 July 2019. Included patients will be aged 18 years or older on this date. Patients who de-registered from a practice in the RCGP RSC network before 31 July 2019 will not be included in the study. The cohort will be identified using a two-step ontological process [13]. In the first step, all people with diabetes will be identified according to diagnostic codes, results from blood glucose tests, and use of glucose-lowering drugs. In the second step, these people will be categorised by diabetes type using a seven-step algorithm, which takes into account medications, BMI, age at first insulin prescription, and other features specific to type 1 diabetes or type 2 diabetes.
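The inclusion rules stated above (type 2 diabetes, registered on 31 July 2019, aged 18 years or older on that date) can be sketched in Python. This is only an illustration under stated assumptions: the function names, the argument names, and the use of a registration end date of `None` for patients who are still registered are hypothetical and are not taken from the RCGP RSC database. The diabetes classification itself [13] is treated as an input and is not reproduced here.

```python
from datetime import date
from typing import Optional

INDEX_DATE = date(2019, 7, 31)  # index date stated in the protocol
MIN_AGE_YEARS = 18              # adults only


def age_on(date_of_birth: date, on: date) -> int:
    """Return completed years of age on the given date."""
    had_birthday = (on.month, on.day) >= (date_of_birth.month, date_of_birth.day)
    return on.year - date_of_birth.year - (0 if had_birthday else 1)


def is_eligible(
    date_of_birth: date,
    registration_start: date,
    registration_end: Optional[date],  # hypothetical: None means still registered
    has_type2_diabetes: bool,          # output of the two-step process [13]
) -> bool:
    """Apply the inclusion criteria described in the Study Population section."""
    registered_on_index = registration_start <= INDEX_DATE and (
        registration_end is None or registration_end >= INDEX_DATE
    )
    is_adult = age_on(date_of_birth, INDEX_DATE) >= MIN_AGE_YEARS
    return has_type2_diabetes and registered_on_index and is_adult
```

A patient who de-registered on 30 July 2019 is excluded by this check, whereas one who turned 18 on 31 July 2019 is included.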
Statistical Analysis
We will determine the proportion of adults with type 2 diabetes ever prescribed an SGLT2i. The percentage of SGLT2is prescribed in this cohort will then be calculated according to renal function, using the following eGFR categories: < 45, 45–59, and ≥ 60 mL/min/1.73 m². In addition, we will calculate the proportion of prescriptions in people with heart failure and type 2 diabetes, stratified by BMI category (underweight, < 18.5 kg/m²; normal weight, 18.5–24.9 kg/m²; overweight, 25.0–29.9 kg/m²; obese, ≥ 30 kg/m²), and report the percentage of SGLT2is prescribed as an add-on to a diuretic or following discontinuation of a diuretic prescription. To identify the presence of heart failure, we will use the heart failure codes provided in the 2019/20 Quality and Outcomes Framework business rules [14]. These include heart failure diagnosis codes, codes for a diagnosis of left ventricular systolic dysfunction, and codes that confirm heart failure by echocardiogram (e.g. “echocardiogram shows left ventricular systolic dysfunction”). Demographic data (age, gender, and ethnicity) will be reported for these groups. Summary statistics will be reported as counts and percentages for categorical data and as means and standard deviations for continuous data.
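The eGFR and BMI strata described above, together with the count-and-percentage summaries, could be implemented along the following lines. This is a minimal sketch, not the study code: the function names and category labels are hypothetical, and the handling of values that fall between the stated bands (for example, a BMI of 24.95 kg/m² or an eGFR of 59.5 mL/min/1.73 m²) is an assumption, resolved here by assigning them to the lower band.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def egfr_category(egfr: float) -> str:
    """Assign an eGFR value (mL/min/1.73 m^2) to one of the three study bands."""
    if egfr < 45:
        return "<45"
    if egfr < 60:  # assumption: values between 59 and 60 fall in the 45-59 band
        return "45-59"
    return ">=60"


def bmi_category(bmi: float) -> str:
    """Assign a BMI value (kg/m^2) to one of the four study categories."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:  # assumption: values between 24.9 and 25.0 count as normal weight
        return "normal weight"
    if bmi < 30.0:  # assumption: values between 29.9 and 30.0 count as overweight
        return "overweight"
    return "obese"


def counts_and_percentages(labels: Iterable[str]) -> Dict[str, Tuple[int, float]]:
    """Return {category: (count, percentage of all labels)}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}
```

For example, `counts_and_percentages(egfr_category(v) for v in egfr_values)` would summarise a list of eGFR measurements by band, where `egfr_values` is a hypothetical list of one value per patient.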
To determine whether the presence of heart failure or renal function influences the propensity to prescribe SGLT2is, we will perform a multilevel logistic regression analysis (using clustering to account for inter-practice variation). The model will be adjusted for age, gender, ethnicity, socioeconomic status (using the Index of Multiple Deprivation [15]), BMI, blood pressure, glycated haemoglobin (HbA1c), presence of cardiovascular disease, and use of diuretics. As a sensitivity analysis, we will repeat the logistic regression after applying multiple imputation by chained equations to account for missing values in key variables in the database. Odds ratios with 95% confidence intervals and P values will be reported for each variable.
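To illustrate how an odds ratio and its 95% confidence interval relate to a fitted logistic regression coefficient, the sketch below exponentiates a log-odds coefficient and the limits of a Wald-type interval around it. The protocol does not state how the intervals will be computed, so the Wald approach is an assumption, and the function name and arguments are hypothetical. The coefficient and standard error would come from whichever software fits the multilevel model.

```python
import math
from typing import Tuple

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(coef: float, se: float, z: float = Z_95) -> Tuple[float, float, float]:
    """Convert a log-odds coefficient and its standard error into
    (odds ratio, lower limit, upper limit) using a Wald-type interval."""
    odds_ratio = math.exp(coef)
    lower = math.exp(coef - z * se)
    upper = math.exp(coef + z * se)
    return odds_ratio, lower, upper
```

A coefficient of 0 corresponds to an odds ratio of 1 (no association), and the interval is symmetric on the log scale rather than on the odds ratio scale.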