Background

The Charlson comorbidity index (CCI) is the most commonly used case-mix adjustment method in health outcome studies that use administrative data [1,2,3,4,5,6,7,8,9,10,11,12,13]. In short, using a population of general medical inpatients at one hospital over 30 years ago, Charlson identified 17 comorbidities that were associated with one-year mortality and assigned weights to these conditions that, when summed, created an index that predicted mortality [11]. However, recent work by Quan et al. suggests that reweighting the original Charlson score may be appropriate as new data sources become available and as the management and outcomes of patients with chronic conditions evolve [3]. Although stroke remains a common reason for hospitalization, stroke hospitalization and mortality rates have declined significantly over the past fifteen years [13,14,15,16,17,18]. Because the median age of stroke patients is 75 years, many of these patients have a high pre-existing comorbid burden that can affect mortality, making adjustment for differences in comorbidity burden critical when comparing quality of care across provider groups, regions, and countries [13, 17]. The validity of the CCI for comorbidity adjustment in stroke care was assessed over 10 years ago, using ICD-9-CM coded inpatient administrative data from acute stroke patients across nine Veterans Affairs hospitals [4]. These circumstances support testing the accuracy of the original Charlson weights in a contemporary cohort of ischemic stroke inpatients.

We sought to determine whether the original Charlson comorbidities and corresponding weights were applicable in a cohort of ischemic stroke patients in Ontario by applying Quan et al.’s approach to recalculating Charlson comorbidity weights [3]. We also examined the impact of adding atrial fibrillation to the CCI, given that it is a highly prevalent comorbid condition among ischemic stroke patients and is associated with higher mortality [19,20,21,22,23].

Methods

Setting

This is a retrospective population-based cohort of adults hospitalized with acute ischemic stroke in Ontario, Canada (population 13.3 million). Ontario has a publicly funded health care system that covers all medically necessary services delivered in hospitals, emergency departments and physician offices, as well as prescription medications for adults aged 65 years and older.

Data sources

The Ontario Stroke Registry’s (OSR) provincial acute stroke audit was used to identify hospitalized ischemic stroke cases. The stroke audit database consists of audited medical records of a population-based sample of patients aged 18 years and older discharged from provincial hospitals between April 1, 2012 and March 31, 2013 with transient ischemic attack (TIA), ischemic stroke, or intracerebral hemorrhage. Details of the audit methodology have been reported previously [18]. The audit was conducted by trained research personnel with access to stroke specialists for consultation, and validation by duplicate chart abstraction has demonstrated excellent agreement for stroke type and stroke severity [24]. We restricted the cohort to patients with ischemic stroke. However, because several of the 17 Charlson comorbid conditions that make up the CCI were not captured in the chart audit, we linked the OSR ischemic stroke cohort to the Discharge Abstract Database (DAD) and Same-day Surgery database (SDS) using unique encoded identifiers to identify all Charlson comorbidities as well as atrial fibrillation. We included atrial fibrillation as a comorbid condition given its association with stroke mortality [19, 20, 22].

The DAD and SDS are compiled and maintained by the Canadian Institute for Health Information (CIHI) and contain information on all inpatient and day-surgery discharges from acute care hospitals in the province. Data elements in the DAD and SDS include the most responsible diagnosis and up to 24 other diagnoses, coded according to the International Classification of Diseases, 10th revision, Canada (ICD-10-CA) standard. For each ischemic stroke record, we determined the presence or absence of individual Charlson comorbidities, as defined by the Quan et al. ICD-10-CA codes, and of atrial fibrillation, as defined by its ICD-10-CA code (see Additional file 1), using a two-year look-back of DAD and SDS records. For the index event (i.e., hospitalization for acute stroke), all coded diagnoses of any diagnosis type (most responsible, pre-admission or post-admission) were also included in the identification process, with the exception of cerebrovascular disease [25,26,27]. In-hospital death was captured in the DAD; to obtain 30-day and one-year mortality, we linked the ischemic stroke cohort to the Ontario Registered Persons Database (RPDB), a database of health insurance plan registrants that includes date of death. All linkages were done using unique encoded identifiers, and the linked data were analyzed at ICES.
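The look-back logic described above can be sketched as follows. This is a minimal illustration in Python, not the study's SAS code: the function name `flag_comorbidities`, the record layout, and the 730-day approximation of two years are assumptions, and `CODE_PREFIXES` holds only a small illustrative subset of ICD-10 code prefixes, not the full definitions listed in Additional file 1.

```python
from datetime import date, timedelta

# Illustrative subset only (assumption); the study's full code list is in Additional file 1.
CODE_PREFIXES = {
    "myocardial_infarction": ("I21", "I22"),
    "atrial_fibrillation": ("I48",),
    "cerebrovascular_disease": ("I63", "G45"),
}

LOOKBACK = timedelta(days=730)  # approximation of a two-year look-back (assumption)


def flag_comorbidities(index_date, index_codes, prior_records):
    """Return {condition: bool} for one patient.

    index_codes: diagnosis codes on the index stroke hospitalization.
    prior_records: list of (discharge_date, [codes]) from earlier DAD/SDS records.
    """
    flags = {name: False for name in CODE_PREFIXES}

    def mark(codes, skip=()):
        for name, prefixes in CODE_PREFIXES.items():
            if name in skip:
                continue
            if any(code.startswith(prefixes) for code in codes):
                flags[name] = True

    # Index event: all diagnosis types count, except cerebrovascular disease,
    # which would otherwise just re-capture the index stroke itself.
    mark(index_codes, skip=("cerebrovascular_disease",))

    # Earlier records count only if they fall inside the look-back window.
    for discharge_date, codes in prior_records:
        if index_date - LOOKBACK <= discharge_date < index_date:
            mark(codes)
    return flags
```

Prefix matching is used so that the sketch works whether codes are stored with or without the decimal point.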

Analysis

The ischemic stroke cohort was randomly split into two cohorts: a test cohort (two-thirds) and a validation cohort (one-third). Descriptive analysis compared characteristics of the test and validation cohorts. In the test cohort, we developed a multivariable Cox proportional hazards (Cox-PH) model with one-year mortality as the dependent variable; predictor variables included age group (< 45 years, 45–54, 55–64, 65–74, 75–84, 85+), sex, and individual Charlson comorbidities as well as atrial fibrillation. Charlson comorbidities and atrial fibrillation were considered if they had a frequency of at least 10 patients and a bivariate association with one-year mortality of p ≤ 0.15. After adding all eligible comorbidities to the model, we retained conditions with a hazard ratio (HR) greater than or equal to 1.2 and a p-value < 0.05. The revised comorbidity weights were assigned to the individual comorbidities according to the following algorithm developed by Quan et al.: a weight of 1 for a risk-adjusted hazard ratio of ≥ 1.2 but < 1.5, a weight of 2 for a hazard ratio of ≥ 1.5 but < 2.5, a weight of 3 for a hazard ratio of ≥ 2.5 but < 3.5, a weight of 4 for a hazard ratio of ≥ 3.5 but < 4.5, a weight of 5 for a hazard ratio of ≥ 4.5 but < 6, and a weight of 6 for a hazard ratio of ≥ 6 [3]. These steps were repeated with the addition of atrial fibrillation. In summary, we 1) re-weighted the CCI from an ischemic stroke cohort (ISCCI); and 2) re-weighted the CCI adding atrial fibrillation as a comorbid condition (ISCCI-AF). In the validation cohort, we 1) assigned the original Charlson weights to the 17 comorbidities in the ischemic stroke cohort; 2) assigned the ISCCI recalibrated weights; and 3) assigned the ISCCI-AF weights.
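The weighting rule above can be written as a small lookup. The following is a minimal sketch in Python, not the code used in the study; the function names `hr_to_weight` and `comorbidity_score` are hypothetical, and the sketch assumes the significance filter (p < 0.05) has already been applied upstream.

```python
def hr_to_weight(hazard_ratio: float) -> int:
    """Map a risk-adjusted hazard ratio to an integer weight using the bands in the text."""
    # (inclusive lower bound, weight), checked from the highest band down
    bands = [(6.0, 6), (4.5, 5), (3.5, 4), (2.5, 3), (1.5, 2), (1.2, 1)]
    for lower_bound, weight in bands:
        if hazard_ratio >= lower_bound:
            return weight
    return 0  # HR below 1.2: the condition is not retained in the index


def comorbidity_score(weights: dict, conditions_present: set) -> int:
    """Sum the weights of the conditions a patient has; unlisted conditions score 0."""
    return sum(weights.get(condition, 0) for condition in conditions_present)
```

For example, with purely hypothetical hazard ratios of 1.3 and 2.0 for two conditions, the weights would be 1 and 2, and a patient with both conditions would score 3.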
We then used logistic regression to model three outcomes: in-hospital, 30-day and one-year mortality following ischemic stroke, adjusting for age, sex and each of the three comorbidity indices (the original CCI, the ISCCI and the ISCCI-AF), which created a total of nine separate models. We used the Hosmer-Lemeshow goodness-of-fit test to assess model calibration and the c-statistic to assess model discrimination. A c-statistic between 0.7 and 0.8 indicates a reasonable model, and a c-statistic > 0.8 indicates a strong model [28]. We also computed the continuous net reclassification index (cNRI) to determine whether the ISCCI model had an improved ability to predict risk of death compared to the original Charlson index (ISCCI vs CCI). The cNRI measures the improvement in correctly classifying patients as high (or low) risk for death by summing the net proportion of patients who died and were reclassified upward and the net proportion of patients who did not die and were reclassified downward [29]. We used SAS Enterprise Guide, version 6.1 for all analyses (SAS Institute, Cary, NC).
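To make the two discrimination measures concrete, the sketch below computes a c-statistic and a continuous NRI from predicted probabilities. It is an illustration in plain Python under stated assumptions, not the SAS procedures used in the study: the function names are hypothetical, the c-statistic uses the pairwise definition (the share of death/survivor pairs in which the patient who died has the higher predicted risk, ties counted as one half), and the cNRI uses the category-free form in which any increase or decrease in predicted risk counts as a reclassification.

```python
from typing import Sequence


def c_statistic(died: Sequence[int], risk: Sequence[float]) -> float:
    """Concordance over all (death, survivor) pairs; ties count as 0.5."""
    events = [r for d, r in zip(died, risk) if d == 1]
    non_events = [r for d, r in zip(died, risk) if d == 0]
    if not events or not non_events:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0
            elif risk_event == risk_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def continuous_nri(died: Sequence[int], risk_old: Sequence[float],
                   risk_new: Sequence[float]) -> float:
    """Net upward moves among deaths plus net downward moves among survivors."""
    up_events = down_events = up_non = down_non = 0
    n_events = n_non = 0
    for d, old, new in zip(died, risk_old, risk_new):
        if d == 1:
            n_events += 1
            up_events += new > old
            down_events += new < old
        else:
            n_non += 1
            up_non += new > old
            down_non += new < old
    if n_events == 0 or n_non == 0:
        raise ValueError("need at least one death and one survivor")
    return (up_events - down_events) / n_events + (down_non - up_non) / n_non
```

The cNRI ranges from -2 to 2; a value near 0 means the new model moves predicted risks in the correct direction about as often as in the wrong one.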

Results

Characteristics of ischemic stroke patients in the test and validation cohorts were similar (Table 1). The majority of ischemic stroke patients had mild stroke (59.7% in the test cohort and 58.6% in the validation cohort). In addition to the most responsible diagnosis of ischemic stroke, a median of four diagnosis codes was found in the index hospitalization record in both cohorts. All-cause death within one year of admission was similar for the test (24.6%) and validation (23.4%) cohorts. The median survival time among those who died within one year was 30 days in the test cohort and 32 days in the validation cohort. The frequency of individual comorbid conditions was similar, with the exception of mild liver disease, for which the validation cohort had a slightly higher proportion (Table 1). The most frequently reported comorbidities were diabetes with complication (28%), atrial fibrillation (27%), and hemi- or paraplegia (17%).

Table 1 Characteristics of Ischemic Stroke test and validation cohorts, Ontario Stroke Audit, 2012/13

Comorbidity hazard ratios and weights for the ISCCI derived from the test cohort are shown in Table 2. For comparison purposes, the original Charlson weights are also shown [11]. The weights did not change when atrial fibrillation was included in the model (ISCCI-AF), except that cerebrovascular disease became non-significant and atrial fibrillation became significant with a weight of 1 (data not shown). Ten of the 17 Charlson comorbid conditions in the ISCCI were statistically significant (i.e., an HR ≥ 1.2). Compared to the Charlson weights, the ISCCI weights were one point higher for four conditions (myocardial infarction (MI), congestive heart failure (CHF), dementia, rheumatologic disease), remained the same for three conditions (cerebrovascular disease (CVD), chronic obstructive pulmonary disease (COPD), any malignancy), and were one point lower for three conditions (diabetes with chronic complications, hemi- or paraplegia, metastatic solid tumor). Figure 1 compares the frequency of CCI scores with ISCCI and ISCCI-AF scores in the validation cohort. The distribution of scores shows a similar proportion of patients with a score of 0 (~ 38%) and with scores of 5 and higher (~ 6%), but the two new models had more than double the proportion of patients with a score of 1 (26.9% and 30.3% vs 12.3%) and a smaller proportion of patients with scores of 2 and 4 compared to the original CCI. Figure 1 also illustrates the observed 30-day and one-year mortality. As the comorbidity score increases, one-year mortality correspondingly increases. For example, among patients with an ISCCI of 5 or greater, 33.3% died within 30 days and 61.9% within one year.

Table 2 Comorbidity weights derived from the ischemic stroke test cohort (April 1, 2012 to March 31, 2013) compared with weights of the original Charlson comorbidity index [11]
Fig. 1 Distribution of comorbidity score, by index, and mortality associated with ISCCI among the validation cohort (N = 2331), April 1, 2012 to March 31, 2013

Table 3 shows the calibration and predictive accuracy of modelling the probability of death in hospital, within 30 days, or within one year of admission for the three models. For all mortality outcomes, the c-statistics for the ISCCI models were higher than those for the CCI model. For example, for 30-day mortality the c-statistic for the original CCI was 0.732 compared to 0.746 for the ISCCI (p = 0.009). However, the differences between the CCI and the ISCCI models were negligible and non-significant for in-hospital death and one-year mortality (0.722 vs 0.729, p = 0.343 and 0.760 vs 0.764, p = 0.398, respectively). Including atrial fibrillation (ISCCI-AF) did not improve the ISCCI model discrimination. The cNRI analysis did not include the ISCCI-AF model given its lower c-statistic compared to the ISCCI. The cNRI showed that the ISCCI model did not improve net patient mortality risk reclassification compared to the CCI model (Table 3).

Table 3 Comparison of Model performance among ischemic stroke patients in the validation cohort (N = 2331, discharged between April 1, 2012 and March 31, 2013)

Discussion

Using a large population-based ischemic stroke cohort discharged from acute hospitals in Ontario, Canada between April 1, 2012 and March 31, 2013, we examined the association between the Charlson comorbidities and mortality to determine which Charlson comorbidities are relevant among ischemic stroke patients. Similar to other studies, we found that several Charlson comorbidities were not significant predictors of one-year mortality in ischemic stroke patients [3, 6, 31,32,33]. In fact, we found just ten of the 17 Charlson comorbidities to be associated with one-year mortality following an acute ischemic stroke. The very low prevalence of moderate or severe liver disease and AIDS/HIV (n < 10) meant they were not considered for the ISCCI. Of the remaining Charlson comorbidities, peripheral vascular disease, peptic ulcer disease, mild liver disease, diabetes without complications, and renal disease were not associated with one-year mortality in our ischemic stroke cohort (p > 0.05).

Although the ISCCI model demonstrated only marginally improved model performance (< 2% increase in c-statistic) compared with the original CCI, the extent of model improvement is similar to that in other studies comparing the performance of revised CCI models to the original CCI [3, 7, 31, 32]. Not surprisingly, there was little difference in model performance between the ISCCI and ISCCI-AF models across all outcomes, given no difference in the total number of comorbidities and the corresponding weights. Additionally, the ISCCI did not result in a significant gain or loss in net patient mortality risk reclassification.
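The size of the improvement can be checked directly from the c-statistics reported in the Results. The short Python sketch below assumes that the "< 2% increase" refers to the relative change in the c-statistic; the helper name `relative_increase_pct` is hypothetical.

```python
# c-statistics reported for the validation cohort: (original CCI, ISCCI)
reported = {
    "in-hospital": (0.722, 0.729),
    "30-day": (0.732, 0.746),
    "one-year": (0.760, 0.764),
}


def relative_increase_pct(old: float, new: float) -> float:
    """Relative change from old to new, expressed as a percentage."""
    return 100.0 * (new - old) / old


increases = {
    outcome: round(relative_increase_pct(cci, iscci), 2)
    for outcome, (cci, iscci) in reported.items()
}
```

Under that assumption, the relative increases are about 0.97% for in-hospital death, 1.91% for 30-day mortality, and 0.53% for one-year mortality, all below 2%.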

A criticism of the CCI derived from administrative databases is the concern over misclassifying complications as comorbidities [5, 26, 34]. Although the CIHI DAD and SDS databases have diagnosis type fields that allow differentiation between pre-admission conditions and conditions that developed during the hospitalization, reabstraction studies have found modest validity for the diagnosis type field, and therefore the ability to distinguish conditions that arise as a consequence of natural disease progression from complications of care is limited [27, 34]. However, most Charlson comorbidities, with the possible exception of hemi- or paraplegia, would not be considered complications in ischemic stroke patients. The prevalence of existing hemi- or paraplegia among acute stroke patients in the OSR is ~ 2% (data not shown). We examined the 1138 ischemic stroke patients identified as having hemi- or paraplegia and determined that all were coded in the index hospitalization record. In the majority (87.7%), hemi- or paraplegia was classified as a pre-existing condition, in 2.8% as a post-admission condition or complication, and in 9.5% as a secondary diagnosis. A secondary diagnosis does not meet the requirement for determining comorbidity (i.e., it may capture a symptom of the stroke) [27]. CIHI DAD data quality studies have revealed that overall coder agreement on diagnosis type was 76%, ranging from 65% for post-admission diagnoses and 67% for pre-existing diagnoses to 86% for the most responsible diagnosis [27]. Given that coders are limited to physician documentation when assigning diagnosis types, and that physicians tend to document only conditions relevant to patient treatment and management, it is not surprising that coder assignment of diagnosis type is challenging [35, 36].

We suspect the coding of hemi- or paraplegia may be capturing either 1) symptoms associated with the acute stroke or 2) prior stroke, given the low prevalence of prior stroke in our cohort (cerebrovascular disease ~ 6%). If hemi- or paraplegia coding in the index acute stroke event is capturing the symptoms of the acute stroke, this may reflect stroke severity; given that stroke severity is not available in administrative databases, it may be reasonable to include hemi- and paraplegia in a risk-adjustment model derived from administrative data.

Stroke severity is strongly associated with mortality and is recommended for inclusion in risk adjustment models [37]. Because our ischemic stroke cohort was from the OSR, which captures stroke severity through the Canadian Neurological Scale, we added stroke severity to the ISCCI model as an additional covariate, and model performance improved substantially (c = 0.855 vs 0.746 for 30-day mortality, data not shown).

If we consider the 87.7% of the 1138 ischemic stroke patients with hemi- or paraplegia coded as a pre-existing condition to be the result of a previous stroke, and combine them with the patients with cerebrovascular disease (n = 397), the prevalence of previous stroke would be ~ 20%, which is within the range reported in the literature [17, 18, 38,39,40]. Further investigation of including hemi- or paraplegia diagnostic codes in algorithms to identify prior stroke in ICD-coded databases is warranted.

Our study is not without limitations. First, we only examined mortality, and our findings may not apply to other outcomes, including length of stay, cost, readmission and patient functional outcomes. We also limited comorbidity identification to acute inpatient and same-day surgery hospital-based claims with a two-year lookback. However, we found little gain in model performance when we used a three-year lookback (data not shown), and a longer lookback was not examined. Second, the accuracy of ICD codes and the number of diagnostic code fields available or completed to capture comorbid conditions are jurisdiction-dependent [3, 41]. In Ontario, a low prevalence of comorbidities in administrative databases compared with clinical data obtained in reabstraction studies has been reported, although prevalence improved for several comorbidities when a lookback period was applied [42,43,44,45,46]. Other stroke-related comorbidities associated with mortality and with a higher population attributable risk of stroke, such as obesity, smoking and hypertension, were not examined, given the low prevalence and unreliable coding of smoking and obesity in hospital-based administrative data and because hypertension has been shown to be negatively associated with mortality [23, 47,48,49]. Additionally, we did not examine other administrative data sources, such as physician billing, emergency department, drug and laboratory databases, for the comorbidity history of our cohort. However, little gain in model performance has been observed when physician billing data were included with hospitalization data [5, 50].
We did not use the OSR to identify comorbidities because the 17 Charlson comorbidities could not be mapped one to one (e.g., the OSR does not distinguish between diabetes with and without complications or between mild and moderate/severe liver disease, and does not capture history of rheumatic disease) and because our intended focus was on an administrative data-derived CCI, not on comparing a claims-based CCI with a medical record-based CCI. The ability to identify comorbidities in various databases, such as physician claims, laboratory, diagnostic imaging and drug databases and electronic medical records, is worthy of future research, especially with the growing access to these data sources, increasing computational capacity, and advanced analytical techniques such as machine learning algorithms that allow for the integration of time-dependent data. Finally, our findings are based on Ontario administrative data within the context of a universal health care system with mandatory hospitalization data submission and processes for error checking; therefore, our results may not be generalizable to other settings or populations. Despite this, our findings are from a large, province-wide sample of ischemic stroke inpatients with complete follow-up for deaths over varying time frames.

Conclusion

We have shown that the ISCCI model had similar performance to the original CCI model; therefore, in the context of an ischemic stroke cohort, the CCI remains a valid measure of comorbidity when using administrative data. The key advantage of the ISCCI model is that it includes seven fewer comorbidities (10 vs 17) and is therefore easier to implement in situations where coded data are unavailable (e.g., chart reviews, clinical trials, surveys and clinical registries).