Background

The growing trend to benchmark health care performance indicators, in order to compare the quality of care between institutions, requires careful consideration of the methodology used [1, 2]. Healthcare is evolving towards a value-based framework with more emphasis on Patient-Reported Outcome Measures (PROMs), which may not only support performance improvement at the individual patient level when used in clinical practice, but may also be useful for benchmarking across providers [3,4,5]. PROMs can be defined as feedback obtained directly from patients on their own health condition (e.g. symptoms and health-related quality of life), without external interpretation [6]. PROMs can be either disease-specific (e.g. Neuro-QOL [7]) or generic (e.g. EQ-5D [8]).

An important consideration for meaningful comparisons across hospitals is the case-mix adjustment of the patient populations for each health care provider [9, 10]. By adjusting for the heterogeneity of patient characteristics in inter-hospital comparisons, a larger part of the estimated variation between hospital performances will be attributable to the quality of care provided to patients rather than factors outside of the healthcare providers’ control.

In stroke, the most commonly used clinical outcome measures are mortality and the modified Rankin Scale (mRS). There has been considerable research conducted on prognostic models for these clinical outcomes, which also encompass variables for case-mix adjustment [11]. Although there is a strong trend to use PROMs for benchmarking purposes [12], there still remains a lack of case-mix models to predict patient-reported outcomes as compared to clinical outcomes [11]. The aim of this study was to identify the specific variables for case-mix adjustment for a generic PROM (EQ-5D) and compare them to case-mix variables for clinical outcomes in acute ischemic stroke.

Methods

Patient population and data collection

A core set of baseline patient characteristics, performance indicators and outcome measures was registered from March 2014 until August 2016 at four stroke care centers in the Netherlands, of which one was a university hospital and three were district hospitals. The original database contained consecutive, acutely admitted ischemic stroke patients for whom demographic characteristics, process indicators and outcome measures were registered.

The three outcome measures were mortality at 3 months, modified Rankin Scale (mRS) score at 3 months and EuroQol-5D index score at 3 months. The mRS is a commonly used clinician-reported scale that measures the degree of disability after a stroke, with scores ranging from 0 to 6 (Fig. 1) [13]. The mRS score at 3 months post-discharge was generally recorded by trained nurses, either by phone or at the outpatient clinic. The EQ-6D, a generic health-related quality of life (HRQOL) instrument, is based on the EQ-5D (dimensions: usual activities, self-care or autonomy, mobility, pain/discomfort, and anxiety/depression) with an additional question on cognitive functioning. The survey has been translated into Dutch and validated in previously published literature [8, 14, 15]. The post-discharge EQ-6D data were captured through either face-to-face or telephone interviews with patients themselves or their proxies. Because no validated EQ-6D index tariff exists, the utility score was derived through the EQ-5D index tariff by ignoring the “cognitive” dimension of the EQ-6D [16,17,18]. This EQ-5D tariff is an algorithm for attaching values to all 3125 health states and is often used in economic evaluations. The resulting utility score can be compared to population norms or used to calculate quality-adjusted life years (QALYs) [16, 17, 19]. For the remainder of this article, the authors refer solely to the “EQ-5D index score” as the patient-reported outcome of interest to avoid any confusion. This EQ-5D index score ranged from 0 (death) to 1 (perfect health) and signified the patient’s perspective on his/her own health. Because there is still no consensus on the minimal clinically important difference on the EQ-5D utility score in stroke populations [20], the EQ-5D utility score was kept as a continuous outcome rather than converted to an ordinal outcome based on arbitrary cut-offs.

Missing baseline patient characteristics (among which case-mix variables) were imputed 10 times using multiple imputation in the original database (N = 2733 patients of 4 stroke centers), assuming missingness at random. Predictors (including stroke center) and outcomes were also included in the imputation model [21]. Figure 2 shows the substantial proportion of patients with missing mRS and EQ-5D data who were filtered out before the three case-mix adjustment models were developed. Thus, all three regression models were developed using an imputed (10 imputations) dataset containing the “original” 1022 patients. Potential reasons for missing patient-reported outcome data were patients being too sick to fill out questionnaires, and loss to follow-up at 3 months (patients being unreachable because they were staying at a nursing home/rehabilitation center, or because of their strong recovery).
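When a coefficient is estimated separately in each of the 10 imputed datasets, the estimates are commonly combined with Rubin's rules. The text does not state how the imputed estimates were pooled, so the following is only a minimal sketch under that assumption; the function name `pool_rubin` and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)          # average of the m estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical coefficients and standard errors from 10 imputed datasets.
example_estimates = [0.050, 0.060, 0.055, 0.058, 0.052,
                     0.061, 0.057, 0.054, 0.059, 0.056]
example_variances = [0.022 ** 2] * 10
pooled_beta, pooled_se = pool_rubin(example_estimates, example_variances)
```

The pooled standard error is never smaller than the average within-imputation standard error, because the between-imputation term reflects the extra uncertainty caused by the missing values.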

Fig. 1 Modified Rankin Scale (mRS)

Fig. 2 Flowchart of Study Population Selection

Fig. 3 Prognostic Value of Univariable and Full Models for Three Outcomes, Expressed in Percentage Explained Variance (R2)

Case-mix models

Patient characteristics that could differ between hospitals and could be predictive of outcomes were considered potential candidate case-mix variables and were identified based on clinical experience and past literature. Those included: age, sex, nationality, socio-economic status (SES), smoking, cardiovascular comorbidity (e.g. hypertension, hyperlipidemia), stroke in past history, diabetes, cancer, connective tissue disease, Charlson comorbidity index (CCI), stroke onset-door time, initial National Institutes of Health Stroke Scale (NIHSS) score and the presence of a caregiver. The SES was generated by the ranking of status scores (based on neighborhoods/ zip codes) that have been calculated and published by Social and Cultural Planning Office (SCP), a Dutch governmental institution [22, 23].

Statistical analysis

Descriptive statistics were presented as counts (percentages) or median (inter-quartile range, IQR). Nonparametric tests were used where appropriate to determine (unadjusted) differences in patient populations between the four healthcare providers, using the Pearson chi-squared statistic for categorical variables and the Kruskal-Wallis test for continuous variables. A p-value < 0.05 was considered significant. To assess the adjusted effect of the potential case-mix variables, models were developed using logistic, ordinal and linear regression for mortality, mRS score and EQ-5D index score, respectively, with stepwise backward selection. This stepwise (regression-based) method initially fits a model with all the predictors and subsequently eliminates the least significant variable, one step at a time, until the remaining predictors meet a certain cut-off p-value [21]. In this study, the AIC (Akaike information criterion) [24], equivalent to p < 0.157 for a predictor with one degree of freedom, was used as the selection criterion.
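The correspondence between the AIC and p < 0.157 can be checked directly. With AIC = 2k − 2 ln L, dropping a single one-degree-of-freedom predictor lowers the AIC whenever the likelihood-ratio statistic is below 2, and the upper-tail probability of a chi-squared distribution with one degree of freedom at 2 is about 0.157. The sketch below verifies this with the standard library only; the function names are illustrative and not taken from the study's code.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom.

    For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    return math.erfc(math.sqrt(x / 2.0))


def dropping_lowers_aic(lr_statistic):
    """True when removing a 1-df predictor lowers the AIC.

    lr_statistic = 2 * (lnL_full - lnL_reduced); removal saves a penalty of 2.
    """
    return lr_statistic < 2.0


# p-value that corresponds to the AIC threshold for a 1-df predictor.
aic_equivalent_p = chi2_sf_1df(2.0)
```

For predictors with more than one degree of freedom (e.g. a categorical variable with several levels) the equivalent p-value differs, so 0.157 applies only to the one-degree-of-freedom case.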

For the logistic and ordinal regression models, the odds ratios (ORs) with 95% confidence intervals (CI) were calculated per predictor. Beta coefficients (β) and 95% CI were calculated for predictors in the linear regression model. The β coefficient indicates the change in outcome (units on the EQ-5D index score scale) for a one-unit change in the predictor variable. The ability of the case-mix models to explain the variability (‘goodness-of-fit’) of these 3 outcomes was expressed by calculating the R2 (R-squared) statistics [25]. To show the contribution of each predictor, the predictors were added to each model one at a time, ordered by p-value (lowest to highest) and coefficient, and the explained variance was reported after each addition until the model was complete.
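As an illustration of how an odds ratio and its 95% CI follow from a fitted coefficient, the sketch below uses the usual Wald (normal approximation) interval. The text does not specify how the intervals were computed, so the Wald approach is an assumption, and the coefficient and standard error in the example are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # 97.5th percentile of the standard normal distribution


def wald_ci(coef, se, z=Z_95):
    """Wald confidence interval for a regression coefficient (e.g. a linear beta)."""
    return coef - z * se, coef + z * se


def odds_ratio_with_ci(log_odds_coef, se, z=Z_95):
    """Convert a logistic/ordinal coefficient on the log-odds scale to an OR with CI."""
    lower, upper = wald_ci(log_odds_coef, se, z)
    return math.exp(log_odds_coef), math.exp(lower), math.exp(upper)


# Hypothetical predictor: log-odds coefficient ln(1.5) with standard error 0.10.
example_or, example_lower, example_upper = odds_ratio_with_ci(math.log(1.5), 0.10)
```

Because the interval is built on the log-odds scale and then exponentiated, it is symmetric around the OR only in a multiplicative sense: the OR is the geometric mean of the two limits.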

Because the R2 measure is not immediately comparable between different regression models (logistic vs. ordinal vs. linear), the AUC (area under the receiver operating characteristic curve) statistic was also included to get a sense of the comparability between the three risk-adjustment models. For this additional analysis, both mRS (0–2 vs. 3–6) and EQ-5D (< 0.65 vs. ≥ 0.65) were transformed to a binary outcome variable in order to compare the three logistic regression models. The EQ-5D index score of 0.65 was chosen as a cut-off value, as it was the estimated median score in this study sample. The statistical analysis was carried out using IBM SPSS Statistics 21 and RStudio version 1.0.153.0 (© 2009–2016 RStudio, Inc.).
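The dichotomization and the AUC can be sketched in a few lines. The AUC is computed here as the probability that a randomly chosen patient with the event receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. The coding of the binary outcomes follows the cut-offs in the text; the example data and function names are hypothetical.

```python
def dichotomize_mrs(score):
    """Return 1 for mRS 3-6 and 0 for mRS 0-2."""
    return 1 if score >= 3 else 0


def dichotomize_eq5d(index_score, cutoff=0.65):
    """Return 1 when the EQ-5D index score is at or above the cut-off, else 0."""
    return 1 if index_score >= cutoff else 0


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(positives) * len(negatives))


# Hypothetical mRS scores and model-predicted risks of mRS >= 3 for six patients.
mrs_scores = [0, 1, 2, 3, 4, 6]
predicted_risk = [0.10, 0.40, 0.20, 0.35, 0.80, 0.90]
labels = [dichotomize_mrs(s) for s in mrs_scores]
example_auc = auc(labels, predicted_risk)
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of roughly a thousand; a rank-based formulation gives the same value more efficiently on larger datasets.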

Results

Patient characteristics

In total, 1022 ischemic stroke patients were included. The number of patients per studied hospital varied from 29 to 555 patients. 57% of total study participants were men (Table 1). The unadjusted median age was significantly different (range 70–78 years, p = 0.001) across the four stroke care centers. Most patients (87%) were native Dutch inhabitants. Both the Charlson Comorbidity Index and the stroke rate in patient history were similar across the 4 stroke patient cohorts. There was a significant unadjusted difference (range 113–275 min, p = 0.002) in the onset-to-door time between the patient populations of the four stroke care centers.

Table 1 Characteristics of all Ischemic Stroke Patients (N = 1022) and per Dutch Stroke Center, Admitted from March 2014 – August 2016 in Four Dutch Stroke Hospitals

The 3-month mortality was 24.5% (Table 2). The unadjusted difference in 3-month mortality rates between the four stroke centers was significant (p < 0.001). 581 (57%) of all patients had a favorable degree of disability (mRS < 3). There was also a significant unadjusted difference in mRS scale scores between the four stroke centers. The median EQ-5D index score at 3 months for all patients was 0.65 (inter-quartile range 0.10–0.83), and the unadjusted difference across the four stroke centers was also significant (p < 0.001). Missing mRS outcomes were 1205/2733 (44.1%) in the original database, with most missing mRS data being observed in stroke center IV (192/238 = 80.7%) (data not shown).

Table 2 Outcome Measures of Ischemic Stroke Patients (N = 1022)

Case-mix models

Table 3 shows the remaining predictors in the regression models after backward selection for mortality, mRS and EQ-5D utility scores. The “strongest” (based on lowest p-values) independent variables in the model for mortality were age (OR = 1.07), NIHSS score on admission (OR = 1.17) and the Charlson’s comorbidity index (OR = 1.22). The strongest predictors for mRS at 3 months were age (OR = 1.04), NIHSS score at admission (OR = 1.17), heart failure (OR = 3.58) and previous stroke (OR = 1.74). There were only three overlapping predictors for the three different outcomes: age, NIHSS score on admission and heart failure. Exclusive predictors for the EQ-5D index score were sex (β = 0.041), socio-economic status (β = − 0.019), and nationality (β = − 0.074).

Table 3 Case-Mix Risk Adjustment Models for Mortality, mRS and EQ-5D

The binary logistic regression model for mortality had an R2 = 0.44 (Table 3), compared to the ordinal regression model for the mRS which had an R2 = 0.42, and the linear regression model for the EQ-5D utility score with an R2 = 0.37. The largest increase in R2 was after the addition of NIHSS to the models for mortality and mRS, and age to the EQ-5D index score model (Fig. 3). After mRS and EQ-5D index scores were both transformed to a dichotomous outcome, AUCs were compared between all three binary logistic regression models: AUC = 0.87 (mortality) vs. AUC = 0.83 (mRS ≥ 3) vs. AUC = 0.78 (EQ-5D index score ≥ 0.65). As opposed to the models for mortality and mRS, it took more predictors in the model for EQ-5D index scores for the predictive ability to reach a plateau.

Discussion

The objective of this study was to construct and compare case-mix adjustment models for three different outcomes, of which two were clinical (mortality and modified Rankin Scale at 3 months) and one patient-reported (EQ-5D utility score at 3 months). The three case-mix models had several predictors in common: age, NIHSS score at hospital admission, and heart failure. However, the most important difference in the case-mix adjustment models was that sex, nationality, and socio-economic status remained significant case-mix variables specifically for the PROM in contrast to the models for the clinical outcomes. It has to be stated that even if a predictor is significantly associated with the outcome, it does not necessarily have to be included as a case-mix variable if the distribution of the variable and its effect on the outcome of interest are similar across hospitals. The R-squared (R2) statistic of the model for the patient-reported outcome measure (PROM) was somewhat lower than the R2 statistics for mortality and the modified Rankin Scale (mRS), even though the PROM model contained more variables.

Multiple models have previously been developed and validated to predict clinical outcomes after stroke [11]. Bray et al. [26] developed and externally validated two case-mix models with 30-day post-stroke mortality as an outcome. The predictors included in the final models were similar to the findings of the current study: age, NIHSS on admission and atrial fibrillation. On the other hand, there has not been much research conducted on the development of case-mix factors for patient-reported outcomes (e.g. EQ-5D) in stroke care [27]. There was some overlap between the remaining case-mix variables in this study and those identified in previously published articles [28]. A review by Carod-Artal et al. [29] identified age, sex, stroke severity, physical impairment, functional status, and mental impairment as predictors for the health-related quality of life (HRQOL) after stroke. Mar et al. [30] also found male sex and the NIHSS to be significantly associated with better EQ-5D values. The association between (a history of) cancer and a lower quality of life (lower EQ-5D scores) in this study confirms previously published literature [23, 31, 32].

A striking observation is that caregiver presence post-discharge was a statistically significant predictor for the mRS score at 3 months, with an OR = 0.67 (95% CI 0.51–0.86; p = 0.002), and for the EQ-5D utility score at 3 months, with a β = 0.057 (SE 0.022). This observation of caregiver presence at hospital discharge being associated with a lower mRS score (better clinical outcome) and a higher EQ-5D utility score (better quality of life) at 3 months highlights the potential benefits that a caregiver could offer (e.g. patient motivation, facilitating rehabilitative care), leading to improved functional status and quality of life. However, definite conclusions cannot be drawn about this association, because the definitions of “caregiver” and “caregiver presence” were not the same across the stroke centers; it was unclear whether the absence of a caregiver implied no indication (e.g. low mRS score) or no need (admittance to a rehabilitation center or nursing home).

Strengths and limitations

A considerable strength of this study is that it explores a relatively new field of research, namely case-mix adjustment for PROMs in order to make inter-hospital performance comparisons. A further strength is that the case-mix variables for a PROM do not imply an additional registration burden for quality registries, because general (relevant) demographic variables (e.g. age, gender and socio-economic status) are already captured in a standardized fashion.

An important limitation of this study is the notably large amount of missing outcome (mRS and EQ-5D) data in the original database. This problem is not uncommon in registries that are routinely acquired for the purpose of quality of care assessment, and it was the main reason this study solely focused on the development of case-mix risk adjustment models rather than benchmarking the included stroke centers. Although the estimated regression coefficients of all three case-mix models might be somewhat biased due to the substantial missingness, this is of less concern in this context, because the focus was on the differences in case-mix variables between the models rather than on the exact coefficient values. The missing data issue was partially countered by the use of multiple imputation for the predictor variables. The distribution of patient characteristics was compared between missing and non-missing mRS and EQ-5D groups (data not shown). These analyses showed significant differences in the distribution of NIHSS score, SES rank (low, middle, high), nationality, and some cardiovascular comorbidities (hypertension, heart failure, hyperlipidemia) between missing and non-missing mRS and EQ-5D data. This observation implies that the generalizability of the final set of case-mix variables observed in this study should be corroborated in future research.

Another limitation is the loss-to-follow-up bias in this stroke registry: if missing 3-month mRS and EQ-5D data could be attributed to either patients’ full recovery or an extended stay in a rehabilitation center/nursing home, it is quite possible that the known outcome data are skewed (in either direction), seeing that they were mostly recorded at outpatient clinics. Other stroke registries (e.g. European Safe Implementation of Thrombolysis in Stroke-Monitoring Study (SITS-MOST) [33] and UK Sentinel Stroke National Audit Program (SSNAP) [34]) have also incorporated patient-reported outcomes, which are typically collected at 3–6 months post-discharge through diverse methods like face-to-face interviews, telephone interviews or mailed questionnaires [35]. As the collection of PROMs at these time points can be challenging due to varying post-discharge patient trajectories and/or substantial resource requirements (personnel and costs), future research should focus on efficient methods to optimally capture PROM data as part of value-based stroke care. This is an essential step that should be taken before (case-mix) risk adjustment models are further developed.

In this study, R2 values were compared to pseudo-R2 values although they are not directly comparable. However, the objective of this study was to showcase the differences in predictors between the three models. It has to be noted that some potentially relevant psychological case-mix variables (e.g. depression, anxiety, EQ-5D scores at baseline) were not recorded in the database, even though they could influence PROM responses and thus ultimately impact the case-mix adjustment model. This suggests that, based on these data, the specific predictors for the EQ-5D have not yet been fully identified.
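To make the distinction concrete, the sketch below contrasts the ordinary R2 of a linear model with one common pseudo-R2 for likelihood-based models, Nagelkerke's R2. The text does not state which pseudo-R2 was used, so Nagelkerke's version is an assumption chosen for illustration, and all inputs are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Ordinary R2: share of the outcome variance explained by the predictions."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    return 1 - ss_residual / ss_total


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke pseudo-R2 from the log-likelihoods of the null and fitted models.

    The Cox-Snell R2 is rescaled by its maximum attainable value so the
    result can reach 1 for a perfectly fitting model.
    """
    cox_snell = 1 - math.exp(2 * (loglik_null - loglik_model) / n)
    max_cox_snell = 1 - math.exp(2 * loglik_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods for a logistic model fitted on 200 patients.
example_pseudo_r2 = nagelkerke_r2(loglik_null=-100.0, loglik_model=-60.0, n=200)
```

The first measure is based on squared residuals and the second on likelihood improvement, so equal numerical values do not imply equal explanatory performance across model types.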

Conclusions

In conclusion, this study shows that different predictors (e.g. psychological and social factors) should be considered as potential case-mix variables for patient-reported outcome measures (PROMs) than for clinical outcomes in ischemic stroke patients. It is important that these specific case-mix variables be included in order to benchmark hospitals legitimately on PROMs. One of the principles of value-based healthcare is to benchmark clinical outcomes and PROMs across different diseases and healthcare providers/institutions to ensure quality improvement and competition [36]. This study identified a lower socio-economic status to be specifically associated with lower EQ-5D index scores. Future research should focus on finding other important predictors specific to PROMs in acute ischemic stroke to be able to further develop valid case-mix models.