Background

In recent years, health-related quality of life (HrQoL) has been measured increasingly often, in particular to evaluate chronic disorders and to analyse cost-effectiveness. Different instruments have been developed to assess HrQoL. They include profile-based instruments, which depend on the aggregation of several outcome values (e.g., the Parkinson’s disease questionnaire eight dimensions (PDQ-8) [1]), and index instruments, which represent HrQoL with a single index value (e.g., EuroQol – 5 dimensions (EQ-5D) [2]). Disease-specific (e.g., PDQ-8 [1]) and generic instruments (e.g., EQ-5D [2]) are also available.

The guidelines for health-economic evaluations call for the implementation of quality of life as a patient-relevant outcome and the use of utility-based patient preferences [3–5]. However, utility-based instruments are not routinely applied, even in recent clinical trials. Clinical scales are regularly used, and study designs frequently include disease-specific HrQoL or profile instruments.

Cost-utility studies require both HrQoL data and clinical effectiveness parameters. We aimed to develop a mapping algorithm based on Unified Parkinson's Disease Rating Scale (UPDRS) and PDQ-8 data for use in Parkinson’s disease studies in which utilities are needed but were not assessed.

Methods

Clinical evaluation

The data were collected from a study population of patients (n=138) with idiopathic Parkinson’s disease following recruitment at several study centres in Hessia, Germany. A detailed description of the patients and scales applied was previously published [6]. The severity of Parkinson’s disease was assessed with the UPDRS [7]. Our analysis relied on subsets to calculate several scores, including the summed scores of parts II-IV. The latter data were also divided into subscores for dyskinesias (IVa), motor fluctuations (IVb), and other complications (IVc).

The HrQoL was evaluated with the generic EQ-5D and the disease-specific, profile-based HrQoL Parkinson’s disease questionnaire in its short version (PDQ-8) [1]. The health states identified by the EQ-5D were converted into EQ-5D indices employing weights from the German population valued with the time trade-off approach (hereafter referred to as the EQ-5D Germanindex) ranging between 0 and 1 [8], and weights from a pooled European population valued by a visual analogue technique ranging from 0 to 100 (EQ-5D Europeanindex) [9].

We validated our results with three independent datasets: (1) our own unpublished data, (2) data from Siderowf et al. [10], and (3) data from Schrag et al. [11]. Siderowf et al. reported data for the UPDRS II and III, the PDQ-8, and the EQ-5D. The data from Schrag et al. consisted of the EQ-5Dindex, PDQ-8, and all UPDRS subscores. Our own data included the EQ-5Dindex and the UPDRS II and III. We predicted the EQ-5D values with these independent data sets and calculated R2 resulting from the predicted and observed values.
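As a rough illustration of this validation step, the sketch below computes R2 from paired observed and predicted EQ-5D values. It assumes that R2 is taken as the squared Pearson correlation between the two series, which is one common reading; the text does not state which definition was used, and the function name is hypothetical.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values.

    Assumption: R2 is defined as the squared correlation; other definitions
    (e.g. 1 - SSE/SST) can give different values on external data.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sums of cross-products and squares around the means
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("R2 is undefined when either series is constant")
    return cov ** 2 / (var_o * var_p)
```

With this definition, R2 reflects how well the predictions track the observed values in rank and spacing, but it does not penalise a systematic offset between them.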

The study protocol for our own data and data from Spottke et al. [6] was approved by the local ethics committee and all patients gave informed consent. Schrag et al. [11] obtained ethics approval from the National Hospital for Neurology and Neurosurgery and the Institute of Neurology Joint Medical Ethics Committee. The study provided by Siderowf et al. [10] was reviewed by the Research Review Committee of Pennsylvania Hospital, and informed consent was obtained from all subjects prior to administration of study instruments.

Statistical analysis

A two-sided Spearman’s rank correlation test was used to determine any monotonic relationship between each predictor and the dependent variable. A multiple linear regression analysis was applied to develop a prediction rule for the EQ-5D (i.e., Germanindex and Europeanindex) from the UPDRS and PDQ-8 variables. Interaction terms and squares of the variables, including the PDQ-8 and UPDRS subscores, were considered. Following the algorithm established by Cheung et al. [12], we built quadratic terms of these scales to consider non-linear relationships. We conducted a fractional polynomial regression analysis [13] to provide an alternative analytical approach to model the non-linear relationships between the outcomes and predictors. We investigated a logarithmic relationship and relationships up to the third degree between the EQ-5D and the independent variables. In addition, the relationship between the EQ-5D and the predictor variables was estimated nonparametrically by local polynomial smoothing in a generalised additive regression, without making a functional assumption about the relationship. This approach serves as a graphical check of the parametric model fit to the data.

In a second analysis, each EQ-5D dimension item was predicted, and the EQ-5Dindex values were subsequently calculated. Several items of the UPDRS II-IV cover aspects similar to some EQ-5D items (e.g., activities of daily living/self-care by the UPDRS II, or mobility and pain by the UPDRS III). To investigate the relevance of this overlap for the overall association between the UPDRS II-IV and the EQ-5D, we repeated our analyses after eliminating UPDRS items 9, 10, 11, 12, 13, 14, 15, 17, 22, 29, 30, and 31 and recalculating the UPDRS II-IV scores.
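The construction of squared and interaction terms described above can be sketched as follows. This is only an illustration of how a candidate set might be generated for one patient; the exact set of terms considered in the analysis is not listed in the text, and the function name and the predictor labels are hypothetical.

```python
from itertools import combinations

def candidate_terms(scores):
    """Return linear, squared and pairwise interaction terms for one patient.

    `scores` maps a predictor label (e.g. "UPDRS_II") to its numeric value.
    The labels and the choice of pairwise-only interactions are assumptions.
    """
    terms = dict(scores)  # linear terms are kept unchanged
    for name, value in scores.items():
        terms[f"{name}^2"] = value ** 2  # quadratic term
    for (a, va), (b, vb) in combinations(scores.items(), 2):
        terms[f"{a}*{b}"] = va * vb  # pairwise interaction term
    return terms
```

For example, `candidate_terms({"UPDRS_II": 10, "UPDRS_III": 20})` yields the two linear terms, the squares 100 and 400, and the interaction 200. A selection procedure would then decide which of these terms remain in the model.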

Four basic regression models were built as follows:

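\begin{align*}
\text{M1 (UPDRS II--III):} &\quad \text{EQ-5D} = \text{UPDRS II} + \text{UPDRS III} \\
\text{M2 (UPDRS II--IV):} &\quad \text{EQ-5D} = \text{UPDRS II} + \text{UPDRS III} + \text{UPDRS IV} \\
\text{M3 (UPDRS II--IVa--c):} &\quad \text{EQ-5D} = \text{UPDRS II} + \text{UPDRS III} + \text{UPDRS IVa} + \text{UPDRS IVb} + \text{UPDRS IVc} \\
\text{M4 (PDQ-8):} &\quad \text{EQ-5D} = \text{PDQ1} + \text{PDQ2} + \text{PDQ3} + \text{PDQ4} + \text{PDQ5} + \text{PDQ6} + \text{PDQ7} + \text{PDQ8}
\end{align*}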

The models were constructed applying backward selection. For the model validation R2 and root mean square error (RMS) were calculated. To be consistent with other published work [12, 14, 15], we considered values for R2 ≥0.3 as acceptable and R2 values ≥0.5 as good predictions.
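The two validation measures and the stated thresholds can be illustrated with a short sketch. It assumes that the RMS is computed by dividing the sum of squared residuals by the number of observations; statistical packages often divide by the residual degrees of freedom instead, and the text does not say which variant was used. The label for values below 0.3 is likewise an assumption, since the text names only the "acceptable" and "good" categories.

```python
from math import sqrt

def rmse(observed, predicted):
    """Root mean square error, dividing by n (assumed variant)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty series of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))

def rate_r2(r2):
    """Classify an R2 value using the thresholds given in the text."""
    if r2 >= 0.5:
        return "good"
    if r2 >= 0.3:
        return "acceptable"
    return "below threshold"  # label not taken from the text
```

Under these thresholds, `rate_r2(0.712)` returns "good" and `rate_r2(0.31)` returns "acceptable".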

The alternative model fit was evaluated with the Pregibon link test [16] and the Bayesian information criterion (BIC). The link test was used to detect model specification error by checking the linearity of the EQ-5D on its prediction scale, and the BIC was used to select between alternative models. We graphically compared the linear regression analysis and the fractional polynomials against the local polynomial smoothing.
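For orientation, the two checks can be written in their usual general form. In the link test, the outcome is regressed again on the fitted value \hat{y} and its square; a squared term that does not differ significantly from zero gives no evidence of misspecification. The BIC is shown in its likelihood form, where \hat{L} is the maximised likelihood, k the number of estimated parameters and n the number of observations. The symbols are chosen here for illustration and do not come from the original analysis.

```latex
\begin{align*}
  \text{EQ-5D} &= \gamma_0 + \gamma_1\,\hat{y} + \gamma_2\,\hat{y}^{2} + \varepsilon,
  \qquad H_0\colon \gamma_2 = 0 \\
  \mathrm{BIC} &= -2\ln\hat{L} + k\,\ln n
\end{align*}
```

Among candidate models fitted to the same data, the one with the smaller BIC is preferred, because the k ln n term penalises additional parameters.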

All analyses were calculated with the statistical packages STATA and R (Stata 12, StataCorp LP, Texas USA; R-2.15.1 Comprehensive R Archive Network, Institute for Mathematics, TU Vienna, Austria).

Results

Seventeen patients were excluded because of missing data. We therefore evaluated a total of 121 patients. The mean patient age was 67.1 years (SD 9.1) and 66.1% were males. Approximately 2/3 of the population was classified into Hoehn and Yahr (HY) stage II, III or IV, with 6.6% in stage I and 6.6% in stage V. No differences in age and sex were observed between included and excluded cases, but excluded cases had higher HY stages, with nearly 3/4 of these cases being in HY stages IV or V.

The correlation analysis demonstrated that the following variables were associated with the EQ-5D indices: PDQ1, PDQ2, UPDRS II and UPDRS III with the EQ-5D Germanindex, and PDQ1, PDQ2, PDQ7, UPDRS II and UPDRS III with the EQ-5D Europeanindex (all rs >0.6 and p <0.05).

On average, 50.0% (n=9) of the models analysed in “UPDRS II-III”, 42.6% (n=23) in “UPDRS II-IV”, 24.3% (n=118) in “UPDRS II-IVa-c” and 1.5% (n=197) in “PDQ-8” solely consisted of coefficients with a significant p-value (p <0.05). We will refer to these models as “significant models”.

The equations for best data fit of the EQ-5D Germanindex were represented by

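\begin{align*}
\text{M1 (UPDRS II--III):} &\quad \text{EQ-5D} = 0.9042 - 0.0001 \cdot \text{UPDRS III}^{2} \\
\text{M2 (UPDRS II--IV):} &\quad \text{EQ-5D} = 0.9275 - 0.0001 \cdot \text{UPDRS III}^{2} - 0.0134 \cdot \text{UPDRS IV} \\
\text{M3 (UPDRS II--IVa--c):} &\quad \text{EQ-5D} = 0.9628 - 0.0001 \cdot \text{UPDRS III}^{2} + 0.0031 \cdot \text{UPDRS IVa}^{2} - 0.0052 \cdot \text{UPDRS IVb}^{2} - 0.0448 \cdot \text{UPDRS IVc}^{2} \\
\text{M4 (PDQ-8):} &\quad \text{EQ-5D} = 0.9298 - 0.00004 \cdot \text{PDQ1}^{2} - 0.00002 \cdot \text{PDQ2}^{2} - 0.00004 \cdot \text{PDQ8}^{2}
\end{align*}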

For the EQ-5D Europeanindex, we determined the following:

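\begin{align*}
\text{M1 (UPDRS II--III):} &\quad \text{EQ-5D} = 79.272 - 0.775 \cdot \text{UPDRS II} - 0.008 \cdot \text{UPDRS III}^{2} \\
\text{M2 (UPDRS II--IV):} &\quad \text{EQ-5D} = 76.850 - 0.010 \cdot \text{UPDRS III}^{2} - 1.520 \cdot \text{UPDRS IV} \\
\text{M3 (UPDRS II--IVa--c):} &\quad \text{EQ-5D} = 80.054 - 0.010 \cdot \text{UPDRS III}^{2} + 0.242 \cdot \text{UPDRS IVa}^{2} - 2.236 \cdot \text{UPDRS IVb} - 3.919 \cdot \text{UPDRS IVc}^{2} \\
\text{M4 (PDQ-8):} &\quad \text{EQ-5D} = 81.960 - 0.380 \cdot \text{PDQ1} - 0.003 \cdot \text{PDQ2}^{2} - 0.003 \cdot \text{PDQ8}^{2}
\end{align*}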
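To show how such a prediction rule might be applied to a single patient, the sketch below evaluates the two M3 (UPDRS II-IVa-c) equations. It assumes the sign convention in which every term except the UPDRS IVa term is subtracted from the intercept; the coefficients are taken from the equations above, while the function and argument names are illustrative only.

```python
def eq5d_german_m3(updrs_iii, updrs_iva, updrs_ivb, updrs_ivc):
    """Predicted EQ-5D German index from the M3 equation.

    Assumption: all terms except UPDRS IVa enter with a negative sign.
    """
    return (0.9628
            - 0.0001 * updrs_iii ** 2
            + 0.0031 * updrs_iva ** 2
            - 0.0052 * updrs_ivb ** 2
            - 0.0448 * updrs_ivc ** 2)

def eq5d_european_m3(updrs_iii, updrs_iva, updrs_ivb, updrs_ivc):
    """Predicted EQ-5D European index from the M3 equation.

    Assumption: all terms except UPDRS IVa enter with a negative sign.
    """
    return (80.054
            - 0.010 * updrs_iii ** 2
            + 0.242 * updrs_iva ** 2
            - 2.236 * updrs_ivb          # linear, not squared, in this model
            - 3.919 * updrs_ivc ** 2)
```

For a hypothetical patient with UPDRS III = 30, IVa = 2, IVb = 1 and IVc = 1, these functions return approximately 0.835 and 65.87, respectively. The sketch does not restrict predictions to the valid range of either index; whether predictions were bounded in the original analysis is not stated.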

The models were compared for the best data fit, defined by maximum R2 values and minimum RMS values. The model “UPDRS II-IVa-c” showed the best fit for both the EQ-5D Germanindex and the EQ-5D Europeanindex (R2 = 0.712 and 0.684, respectively) (Table 1). The same model also showed the smallest RMS values (0.14 and 13.38, respectively). The R2 and RMS values for all other models for the EQ-5D were in the ranges of 0.538-0.603 (R2) and 0.16-0.17 (RMS) for the Germanindex and 0.561-0.666 (R2) and 13.75-15.78 (RMS) for the EQ-5D Europeanindex (Table 1). The elimination of similar items from the UPDRS II-IV resulted in R2 values of 0.684 for the Germanindex and 0.682 for the EQ-5D Europeanindex.

Table 1 Results from regression analysis

The model structure and complexity were evaluated with the Pregibon link test [16] and the BIC. The link test did not indicate model misspecification for any of the models constructed. This result indicates that the functional relationship was correctly specified for all significant predictors considered in the models. The smallest coefficients were observed for the “UPDRS II-IVa-c” model, for both the Europeanindex and the Germanindex prediction (Table 1). This result was further supported by a small BIC for the M3 model.

The fractional polynomial regression resulted in the same models with optimal R2. The original EQ-5D data and a graphical comparison of the estimated regression models (linear, fractional polynomial and general additive regression) are shown in Figure 1.

Figure 1

Presentation of the EQ-5D data fit for the regression line (left panel: EQ-5D German index; right panel: EQ-5D European index): comparison of the fitted values estimated by alternative analytic approaches (dots: EQ-5D values; solid line: fitted regression line by ordinary linear regression; dash-dot line: fitted regression line by a generalised additive model; dashed line: fitted regression line by fractional polynomials). The UPDRS II was not presented because its categorical nature led to the accumulation of points by a small number of values.

The regression analysis for the single EQ-5D questions 1–5 resulted in an R2 of 0.31 for the EQ-5D Europeanindex and 0.26 for the EQ-5D Germanindex items.

The validation of our results with our own independent data (M1 model), data from Siderowf et al. [10] (M1 and M4 models) and data from Schrag et al. [11] (all models) showed R2 values ranging from 0.11 to 0.56 for the EQ-5D Germanindex and from 0.24 to 0.64 for the EQ-5D Europeanindex. With the exception of the prediction of the EQ-5D Germanindex by the UPDRS II-III model, these values confirm the results from our primary data, suggesting that the models are robust and externally valid.

Discussion

We present an algorithm for the estimation of the EQ-5D from the UPDRS parts II-IV and the PDQ-8, both of which are standard clinical classification schemes that are widely used in the evaluation of Parkinson’s disease patients within clinical studies. Our prediction models based on the UPDRS explained more than 71% and 68% of the variation, with minimal RMS values of 0.14 and 13.38, in the EQ-5D Germanindex and EQ-5D Europeanindex, respectively. The results were reproduced by our own independent data and data from Siderowf et al. [10] and Schrag et al. [11]. We note, however, that empirical utility data are preferable if available; our approach is intended for situations in which utility data are missing.

The mapping algorithm based on the UPDRS explained slightly more of the variance in the EQ-5D than the one based on the PDQ-8 (PDQ-8: 60.3% and 66.6%; UPDRS: 71.2% and 68.4% for the EQ-5D Germanindex and the EQ-5D Europeanindex). This finding was supported by the RMS (PDQ-8: 0.16 and 13.75; UPDRS: 0.14 and 13.38 for the EQ-5D Germanindex and EQ-5D Europeanindex). This result is surprising because we expected the PDQ-8, which measures Parkinson-specific quality of life, to have a greater conceptual resemblance to the EQ-5D. The fractional polynomial regression tested different types of models, and we concluded that the “PDQ-8” data have a poorer fit compared to “UPDRS II-IVa-c”. One possible explanation for this result is the different nature of the items in the two instruments; the EQ-5D has a stronger focus on the perceived impairment of general health due to physical illness, and the PDQ-8 considers more of the social and psychological consequences of Parkinson’s disease. The focus of the UPDRS on physical constraints makes this instrument more likely to have a relationship conceptually closer to the EQ-5D. Additional analyses showed that similar items in the UPDRS II-IV and the EQ-5D did not have a relevant impact on the association between the UPDRS and the EQ-5D, thus supporting our potential explanation.

The link test and the BIC indicated that the model includes all important terms (see also Figure 1). Although we do not expect to find a relevant bias, we cannot completely rule out residual bias and model misspecification. The maximal R2 and minimal RMS represent the best fit of the data, but not necessarily the most logical relationship between the predictors and the dependent variable investigated. The unexplained variance of approximately 30% may result from conceptual differences between the scales (e.g., in comorbidities such as depression) or differences in the evaluation technique (i.e., self- vs. professional-rating). Furthermore, regression analysis does not consider pseudo-correlation or multicollinearity.

We attempted to detect country-specific responses to the EQ-5D questionnaire with an analysis of the EQ-5D with German and European weights, but the marginal differences indicate the robustness of our models.

In contrast, when the model suggested by Cheung et al. [12] was applied to our data, it failed to result in a satisfying model fit (R2 close to zero and RMS = 3651.5), suggesting that the model is inappropriate for our data. However, Cheung et al. calculated an Asian EQ-5Dindex, so our German data must be compared with their results carefully. We therefore expanded our analysis beyond the work of Cheung et al. and analysed the quadratic, cubic and logarithmic relationships between the EQ-5Dindex and PDQ-8 or UPDRS. However, non-linear effects did not contribute to the association in a relevant way.

Another recently published study by Borchani et al. [14] dealt with the prediction of EQ-5D dimensions from PDQ-39 items using sophisticated simulation-based methods. The authors showed a better prediction with their method compared to several regression analysis methods. This is consistent with our results for the prediction of EQ-5D items 1–5, which resulted in R2 of 0.26 for the EQ-5D Germanindex and 0.31 for the EQ-5D Europeanindex. However, the approach described by Borchani et al. is probably not easily applicable in the clinical setting.

Conclusion

The EQ-5Dindex values were best estimated with a model based on the UPDRS subscales II-IVa-c regardless of whether we applied German or European weights to calculate the EQ-5D. The data fit as measured by the maximum R2 and minimum RMS is best for these models. The prediction rule could be validated with several independent data sets, indicating the potential for general usefulness. However, the results from the application of the instrument in large and independent studies should be reported prior to general application.