Key Summary Points

Why carry out this study?

Psoriasis Area Severity Index (PASI) is the most common disease severity measure used in clinical trials. PASI assessment is a complex and time-consuming measure, requiring collection of 16 measurements and performance of 13 calculations, and is therefore generally not used in real-world practice.

A simpler assessment measure which is more sensitive to changes in symptom severity and predictive of patients’ quality of life is therefore needed to assist dermatologists in patient management and could facilitate value-based payment agreements between payers and pharmaceutical manufacturers.

What was learned from the study?

The findings of this analysis show that the Optimal Psoriasis Assessment Tool (OPAT) could be an alternative to PASI that requires only body surface area (BSA) and a single patient-reported outcome.

The OPAT could be easily incorporated into real-world practice using a digital calculation tool.

Digital Features

This article is published with digital features, including a summary slide, to facilitate understanding of the article. To view digital features for this article go to https://doi.org/10.6084/m9.figshare.14496078.

Introduction

Psoriasis is an inflammatory disease with a chronic course, which is often associated with various serious comorbidities and compromised health-related quality of life (HRQoL) [1, 2]. Patients with psoriasis may experience low self-esteem, feelings of shame and frustration, and frequently report seriously compromised social life, difficulties at work and in daily activities, and suicidal ideation in more serious cases [3].

Quantifying the severity of psoriasis in a consistent and clinically meaningful way is important when conducting clinical research. Although several psoriasis severity tools have been used over the past few decades, no instrument meets all validity criteria [4]. The most commonly used instrument, the Psoriasis Area Severity Index (PASI), was introduced in an etretinate trial for psoriasis in 1978 [5].

PASI is widely considered the gold standard measure of disease severity and is frequently used as a primary efficacy endpoint in clinical trials of moderate-to-severe psoriasis [6]; however, it does have several limitations. Assessment of disease severity by clinicians unfamiliar with the PASI technique can lead to large intra- and inter-rater variability in scores, thus leading to more subjective measurements of treatment efficacy [7]. One of the biggest issues with PASI is that it was arbitrarily created without consideration of how erythema, induration, scale, and body surface area (BSA) should be combined to produce a meaningful metric. Also, the BSA component (area score) is nonlinear. As a complex index to measure psoriasis severity, PASI assessment is time-consuming, cumbersome, and difficult to interpret [8, 9]. In particular, for PASI assessment, an evaluator assigns erythema, induration, desquamation, and area scores to each of four body sections (head/neck, trunk, upper extremities, and lower extremities), measures BSA in these four body sections, and then mathematically calculates a score that ranges from 0 to 72. Consequently, PASI assessments and scores are typically not used in real-world practice.
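The scoring procedure described above can be sketched in code. This is a minimal illustration only: it assumes the conventional PASI region weights (0.1, 0.2, 0.3, 0.4) and the conventional mapping from percent involvement to a 0 to 6 area score, neither of which is spelled out in this article, and the names `area_score`, `RegionAssessment`, and `pasi` are hypothetical.

```python
from dataclasses import dataclass

# Assumed conventional PASI weights per body section.
REGION_WEIGHTS = {
    "head_neck": 0.1,
    "upper_extremities": 0.2,
    "trunk": 0.3,
    "lower_extremities": 0.4,
}


def area_score(percent_involved: float) -> int:
    """Map the percentage of a body section involved (0-100) to a 0-6 area score."""
    if not 0 <= percent_involved <= 100:
        raise ValueError("percent_involved must be between 0 and 100")
    if percent_involved == 0:
        return 0
    # Exclusive upper bounds for area scores 1..5; 90-100% scores 6.
    for score, upper in enumerate((10, 30, 50, 70, 90), start=1):
        if percent_involved < upper:
            return score
    return 6


@dataclass
class RegionAssessment:
    erythema: int            # 0-4
    induration: int          # 0-4
    desquamation: int        # 0-4
    percent_involved: float  # 0-100, share of this body section affected


def pasi(regions: dict) -> float:
    """Weighted sum over the four body sections; ranges from 0 to 72."""
    total = 0.0
    for name, weight in REGION_WEIGHTS.items():
        r = regions[name]
        severity_sum = r.erythema + r.induration + r.desquamation
        total += weight * severity_sum * area_score(r.percent_involved)
    return total


# Worst case: maximal severity and full involvement in every section.
worst_case = {name: RegionAssessment(4, 4, 4, 100) for name in REGION_WEIGHTS}
worst_case_pasi = pasi(worst_case)
```

Under these assumptions, four scores for each of four body sections give 16 inputs, and the step in `area_score` between 9% and 10% involvement illustrates why the area component is nonlinear.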

In recent years, the US Food and Drug Administration (FDA) has not accepted PASI as a standalone efficacy endpoint and has required a static Physician Global Assessment (PGA) for most late phase clinical trials of psoriasis therapies [8, 9]. The most commonly used versions of PGA assess only plaque qualities, as they measure degrees of erythema, induration, and desquamation representing the entire body. Most PGA instruments do not provide an overall measure of psoriasis severity because they do not explicitly account for BSA involvement. For example, a patient with extensive BSA involvement (e.g., 30%) could have the same PGA score as a patient with limited BSA involvement (e.g., 1%), provided that the degrees of lesion erythema, induration, and desquamation are the same.

The use of PASI by health care payers in the USA is very limited, especially for establishing eligibility criteria and access for biologics, for which BSA is generally used. Furthermore, value-based agreements between biologics manufacturers and health care payers have been limited to nonclinical measures (e.g., persistence or drug survival) [10].

The impact of psoriasis on patients’ quality of life is substantial. Thus, an assessment of the impact of treatment should reflect improvements in both objective clinical assessments and patients’ subjective assessments of the impact of treatment on their quality of life. Hence, there is a need for a simpler assessment measure that is more sensitive to changes in symptom severity and patient assessment, and predictive of patients’ HRQoL as measured by the Dermatology Life Quality Index (DLQI). Physicians are likely to have more interest in a simple, useful, real-world tool (e.g., the Optimal Psoriasis Assessment Tool [OPAT]) that could serve as a substitute for PASI in the treatment of patients with psoriasis. The objective of this study was to evaluate the potential of OPAT as an alternative to PASI, based on simpler and more relevant clinical measures and patient outcomes that are correlated with PASI and DLQI.

Methods

Data Source

We analyzed data from the UNCOVER-1, -2, and -3 trials to investigate an alternative to PASI assessments for psoriasis disease severity based on clinical and patient-reported outcome (PRO) measures. The alternative clinical measures included the product of BSA and overall “worst” sum of severity measures based on PASI, and BSA alone. PRO measures included patient-reported itch numeric rating scale (NRS), patient-reported skin pain, and a patient global assessment of disease severity (PatGA).

Eligibility criteria and study designs of individual studies have been described earlier [11, 12]. Figure 1 shows the study design of all the UNCOVER trials. In this integrated analysis of three UNCOVER trials, intent-to-treat patients (N = 3866) with moderate-to-severe psoriasis in the 12-week induction period were randomized to subcutaneous 80-mg ixekizumab injection once every 2 weeks (Q2W) (n = 1169) or once every 4 weeks (Q4W) (n = 1165) after an initial 160-mg starting dose, or placebo (n = 792) or etanercept 50 mg twice weekly (n = 740; UNCOVER-2, -3). For the analyses, imputation of missing data was not performed and all patients with available data were included.

Fig. 1 Study design

The studies were approved by the ethical review board at each participating site before the start of the study. The trials conform with Good Clinical Practices, International Council for Harmonization, local laws and regulations, and were conducted in accordance with the Declaration of Helsinki principles. All patients enrolled provided written informed consent before participating in the trials.

Statistical Analysis

Correlation analyses were performed to investigate the relationship between the product of BSA and severity components (overall worst scaling, overall worst erythema, overall worst thickness, and sum of the three worst overall scores) and PASI, and between BSA alone and PASI. On the basis of pre-analyses, Pearson correlation coefficients were high between PASI and BSA alone (r = 0.745–0.852), and between PASI and the product of BSA with each symptom severity and worst sum of severity measures (r = 0.824–0.939). Given that the worst symptom severity measures are based on the PASI assessment and to reduce the number of measures in creating a more relevant assessment tool, BSA alone was selected for further analyses and development of the OPAT.

Pearson correlations were computed for patient-reported measures (skin pain, itch, and PatGA) with PASI at baseline and weeks 4, 8, and 12, and with DLQI at baseline and weeks 4 and 12. A standard ordinary least-squares regression model with two independent variables (BSA and one of the patient assessment measures) was used to model PASI and DLQI at baseline and week 12, with \(\sqrt{R^{2}}\) results presented.
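The two-variable regression and the reported \(\sqrt{R^{2}}\) can be illustrated with a short sketch. The data below are synthetic and the generating coefficients are arbitrary; they are not trial data, and the helper name `fit_ols` is hypothetical. The sketch shows that, for a least-squares model with an intercept, \(\sqrt{R^{2}}\) equals the Pearson correlation between fitted and observed values, which is what makes it comparable with the single-variable correlations.

```python
import numpy as np


def fit_ols(predictors: np.ndarray, outcome: np.ndarray):
    """Ordinary least squares with an intercept; returns (coefficients, fitted values, R^2)."""
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((outcome - fitted) ** 2)
    ss_tot = np.sum((outcome - outcome.mean()) ** 2)
    return coef, fitted, 1.0 - ss_res / ss_tot


# Synthetic illustration only (arbitrary ranges and coefficients).
rng = np.random.default_rng(0)
n = 500
bsa = rng.uniform(0, 60, n)                   # percent body surface area
itch = rng.integers(0, 11, n).astype(float)   # 0-10 numeric rating scale
pasi_obs = 1.0 + 0.4 * bsa + 0.5 * itch + rng.normal(0, 2.0, n)

coef, fitted, r2 = fit_ols(np.column_stack([bsa, itch]), pasi_obs)
multiple_r = float(np.sqrt(r2))
pearson_fitted_observed = float(np.corrcoef(fitted, pasi_obs)[0, 1])
```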

The predicted value produced by the regression model was defined as proxy PASI score. Sensitivity analyses were completed, the proxy scores versus actual PASI results were plotted, and the concordance rates were assessed for the proxy score measures versus PASI at week 12. These analyses were utilized to develop a simpler tool (i.e., OPAT) that is more sensitive to changes in symptom severity, incorporates patient self-assessment, and is predictive of patients’ quality of life based on DLQI.
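One plausible reading of the concordance assessment is sketched below: a patient counts as concordant when the proxy score and the actual PASI agree on responder status. The exact definition used in the analysis is not given here, so this is an assumption, and the patient values and function names are invented for illustration.

```python
def percent_improvement(baseline: float, follow_up: float) -> float:
    """Percentage improvement from baseline (positive means improvement)."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - follow_up) / baseline


def concordance_rate(pasi_pairs, proxy_pairs, pasi_threshold=75.0, proxy_threshold=75.0):
    """Share of patients whose proxy responder status matches their PASI responder status."""
    if len(pasi_pairs) != len(proxy_pairs) or not pasi_pairs:
        raise ValueError("inputs must be non-empty and of equal length")
    agreements = 0
    for (pasi_base, pasi_wk12), (proxy_base, proxy_wk12) in zip(pasi_pairs, proxy_pairs):
        pasi_responder = percent_improvement(pasi_base, pasi_wk12) >= pasi_threshold
        proxy_responder = percent_improvement(proxy_base, proxy_wk12) >= proxy_threshold
        agreements += pasi_responder == proxy_responder
    return agreements / len(pasi_pairs)


# Invented (baseline, week 12) score pairs for four hypothetical patients.
pasi_pairs = [(20, 4), (24, 10), (18, 1), (30, 9)]
proxy_pairs = [(19, 5), (22, 12), (20, 2), (28, 6)]

rate_same_cutoff = concordance_rate(pasi_pairs, proxy_pairs)
# A different cutoff on the proxy scale (73 here, chosen for illustration).
rate_tuned_cutoff = concordance_rate(pasi_pairs, proxy_pairs, proxy_threshold=73.0)
```

Allowing a separate cutoff for the proxy reflects that a given percentage change in the proxy score need not correspond to the same percentage change in PASI.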

A variety of regression models adjusted for BSA and patient assessments (itch, PatGA, or skin pain) were used for evaluation at week 12 followed by a comprehensive model selection procedure using least-squares regression modeling of PASI and DLQI total scores. This process involved fitting 18 models with different independent variables including individual or combinations of BSA, the patient assessment variables (itch, PatGA, or skin pain), and the interactions between BSA and patient assessment variables (see supplementary materials).
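A model comparison of this kind can be sketched as follows. The candidate list is a small, hypothetical subset rather than the 18 models actually fitted, the data are synthetic, and AIC and BIC are computed with the usual Gaussian least-squares forms up to an additive constant; the analysis itself may have used different software conventions.

```python
import numpy as np


def fit_and_score(design: np.ndarray, outcome: np.ndarray) -> dict:
    """Fit OLS on a design matrix (intercept column included) and return fit criteria."""
    n, k = design.shape
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    rss = float(np.sum((outcome - design @ coef) ** 2))
    return {
        "aic": float(n * np.log(rss / n) + 2 * k),
        "bic": float(n * np.log(rss / n) + k * np.log(n)),
        "rmse": float(np.sqrt(rss / n)),
    }


# Synthetic illustration only: the outcome depends on BSA and PatGA, not on itch.
rng = np.random.default_rng(1)
n = 400
bsa = rng.uniform(0, 60, n)
patga = rng.integers(0, 6, n).astype(float)
itch = rng.integers(0, 11, n).astype(float)
outcome = 0.5 + 0.35 * bsa + 1.2 * patga + rng.normal(0, 2.0, n)

ones = np.ones(n)
candidates = {
    "BSA": [bsa],
    "BSA + PatGA": [bsa, patga],
    "BSA + itch": [bsa, itch],
    "BSA + PatGA + BSA x PatGA": [bsa, patga, bsa * patga],
}
results = {
    name: fit_and_score(np.column_stack([ones, *columns]), outcome)
    for name, columns in candidates.items()
}

# Rank candidates by BIC (lower is better); adding terms never increases in-sample RMSE.
ranking = sorted(results, key=lambda name: results[name]["bic"])
```

Because in-sample RMSE can only decrease as terms are added, penalized criteria such as AIC and BIC, or error on a held-out validation set, are needed to judge whether the extra terms are worthwhile.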

Results

Patient Demographics

In this pooled analysis of 3866 patients, the average age was 45.5 years, 67.8% were male, and 92.6% were white. Mean baseline values for PASI and DLQI were 20.2 and 12.5, respectively. Baseline demographics and disease characteristics are shown in Table 1.

Table 1 Baseline demographics and disease characteristics

Correlation Analyses

The patient-assessed measures of skin pain, itch, and PatGA were significantly correlated with both PASI and DLQI. Similarly, the correlation of BSA with PASI and DLQI was also established.

PASI and BSA were significantly (p < 0.001) correlated with each other at baseline (r = 0.759), week 4 (r = 0.745), week 8 (r = 0.804), and week 12 (r = 0.852) (Fig. 2a). There was a significant correlation between PASI and the product of BSA with worst scaling (r = 0.836–0.923), worst sum of severity (r = 0.857–0.939), worst erythema (r = 0.824–0.918), and worst thickness (r = 0.842–0.929) at baseline and weeks 4, 8, and 12 (Fig. 2a). Similarly, BSA alone was also significantly correlated with DLQI total score from baseline (r = 0.131; p < 0.001) to week 12 (r = 0.536; p < 0.001) (Fig. 2b).

Fig. 2 a PASI correlations with simpler clinically assessed measures; b DLQI correlations with physician’s assessment measures

Patient-assessed measures correlated more strongly with DLQI than with PASI for itch NRS and skin pain at week 12 (itch NRS, 0.771 vs 0.695; skin pain, 0.749 vs 0.635), and similarly for PatGA (0.748 vs 0.753) (Fig. 3). The correlations were higher at week 12 than at baseline for both PASI (baseline vs week 12: skin pain, 0.154 vs 0.635; itch NRS, 0.128 vs 0.695; PatGA, 0.186 vs 0.753) and DLQI (baseline vs week 12: skin pain, 0.507 vs 0.749; itch NRS, 0.490 vs 0.771; PatGA, 0.350 vs 0.748). Correlations between DLQI and the product of BSA with worst erythema, thickness, scaling, or sum of severity at week 12 ranged from 0.584 to 0.596 (Fig. 2b).

Fig. 3 PASI (and DLQI) correlations of patient-assessed measures with PASI as reference

Sensitivity and Specificity Analyses for Prediction

Model with BSA and Itch

For the ordinary least-squares (OLS) regression model with BSA and itch, Table 2 shows the sensitivity and specificity of the association between the percentage change in proxy PASI score and PASI 75 and PASI 90 at week 12. For PASI 75, a 73% change in proxy PASI score provided the maximum association (based on the Youden index), whereas an 83% change in proxy PASI score provided the maximum association for PASI 90.

Table 2 Percentage improvement in proxy PASI score

Sensitivity, specificity, positive predictive value, and negative predictive value were very high (greater than or equal to 80%) for both PASI 75 and PASI 90. PASI 75 had a sensitivity of 87.5% whereas PASI 90 had a sensitivity of 88.0%.

Model with BSA and PatGA

For the OLS model with BSA and PatGA, Table 2 shows the sensitivity and specificity of the association between the percentage change in proxy PASI score and PASI 75 and PASI 90 at week 12. For PASI 75, a 71% change in proxy PASI score provided the maximum association (based on the Youden index), whereas an 89% change in proxy PASI score provided the maximum association for PASI 90.

Sensitivity, specificity, positive predictive value, and negative predictive value were very high (greater than or equal to 81%) for both PASI 75 and PASI 90. PASI 75 had a sensitivity of 88.6% whereas PASI 90 had a sensitivity of 81.4%.
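The cutoff selection by Youden index used for both models can be sketched as below. The Youden index is J = sensitivity + specificity − 1, and the chosen cutoff is the one that maximizes J. The patient values are invented for illustration and the function name is hypothetical; ties are resolved in favor of the lowest cutoff, which is an arbitrary choice.

```python
def youden_optimal_cutoff(proxy_changes, is_responder, cutoffs):
    """Return (cutoff, J, sensitivity, specificity) maximizing the Youden index J."""
    positives = sum(1 for y in is_responder if y)
    negatives = len(is_responder) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one responder and one non-responder")
    best = None
    for cutoff in cutoffs:
        # A patient is predicted to respond when the proxy change reaches the cutoff.
        true_pos = sum(1 for s, y in zip(proxy_changes, is_responder) if s >= cutoff and y)
        true_neg = sum(1 for s, y in zip(proxy_changes, is_responder) if s < cutoff and not y)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Invented data: percentage change in proxy score and actual PASI 75 status.
proxy_changes = [95, 88, 80, 76, 74, 68, 65, 50, 40, 20]
is_responder = [True, True, True, True, True, False, True, False, False, False]

cutoff, j_index, sensitivity, specificity = youden_optimal_cutoff(
    proxy_changes, is_responder, cutoffs=range(0, 101)
)
```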

Concordance/Discordance

The concordance summaries showed high concordance rates for PASI 75 and PASI 90, respectively, for the models with PatGA (77.8% and 60.2%) and itch (74.4% and 55.1%) (Fig. 4).

Fig. 4 Concordance rates by residual BSA severity at week 12 in proxy BSA (all treatment arms)

Models with BSA and itch, and BSA and PatGA were similar in terms of percentage improvement in proxy PASI score and sensitivity (Table 2). Furthermore, the most severe patients at baseline who had good responses (PASI 75 and PASI 90) after 12 weeks often continued to have BSA involvement greater than 10%, suggesting that relying on BSA changes alone does not fully capture the benefit or amount of clinical improvement.

In assessing these models, the scatterplots (Fig. 5) visually displayed the alignment of the proxy PASI versus PASI. The concordance summaries showed very high concordance rates for PASI 75 and PASI 90, especially for the models with PatGA and itch.

Fig. 5 Predicted vs observed PASI scores

Model Comparison Based on Regression Analysis

At week 12, the correlations between PASI and BSA combined with patient assessments [PatGA (\(\mathrm{Corr }(\sqrt{{R}^{2}})\) = 0.904), itch NRS (\(\mathrm{Corr}\) = 0.898), and skin pain (\(\mathrm{Corr}\) = 0.890)] were higher than between PASI and BSA alone [\(\mathrm{Corr}\) = 0.852 (Fig. 6)].

Fig. 6 PASI (and DLQI) correlations with combined clinical and patient assessments at week 12

For the DLQI total score, the correlations with BSA combined with the same measures [PatGA (\(\mathrm{Corr}\) = 0.756), itch NRS (\(\mathrm{Corr}\) = 0.785), and skin pain (\(\mathrm{Corr}\) = 0.775)] were stronger than with BSA alone (\(\mathrm{Corr}\) = 0.536) or with PASI (\(\mathrm{Corr}\) = 0.662) at week 12. The proxy PASI versus actual PASI results were plotted for each of the two-parameter models. In addition, for each model, the concordance rates were summarized for the proxy measure versus PASI.

For the overall population at week 12, of the 18 models tested for PASI (Supplementary Table 1), the results for the MAE, PRESS, and RMSE from overall datasets ranged from 2.12 to 4.02, 12.61 to 34.62, and 3.52 to 5.87, respectively. MAE, sqrt(PRESS), and RMSE represented percentage errors of up to 5.6%, 8.2%, and 8.2%, respectively, for the overall PASI range. In the PASI models, out of the three patient assessment variables, those models with PatGA had the best performance with respect to common model selection criteria (AIC, BIC, or RMSE in validation dataset), followed by itch and skin pain. In the DLQI models (Supplementary Table 2), MAE, PRESS, and RMSE from overall datasets ranged from 2.21 to 3.62, 11.98 to 26.42, and 3.42 to 5.14, respectively. Percentage errors for MAE, sqrt(PRESS), and RMSE were up to 12.1%, 17.1%, and 17.1%, respectively, for the overall DLQI range. The DLQI model with itch had the best performance, followed by skin pain and PatGA.
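The percentage errors quoted above can be reproduced from the upper ends of the reported ranges. The sketch assumes that each error is expressed relative to the full score range (0 to 72 for PASI and the standard 0 to 30 for DLQI, the latter not stated in this article) and that the reported PRESS values are on a per-observation scale, so that their square root is comparable with RMSE.

```python
import math

PASI_RANGE = 72.0  # PASI ranges from 0 to 72
DLQI_RANGE = 30.0  # assumption: standard DLQI total score range of 0 to 30


def percent_of_range(error: float, score_range: float) -> float:
    """Express an absolute error as a percentage of the score range."""
    return 100.0 * error / score_range


# Upper ends of the reported ranges for MAE, PRESS, and RMSE.
pasi_errors = {
    "MAE": percent_of_range(4.02, PASI_RANGE),
    "sqrt(PRESS)": percent_of_range(math.sqrt(34.62), PASI_RANGE),
    "RMSE": percent_of_range(5.87, PASI_RANGE),
}
dlqi_errors = {
    "MAE": percent_of_range(3.62, DLQI_RANGE),
    "sqrt(PRESS)": percent_of_range(math.sqrt(26.42), DLQI_RANGE),
    "RMSE": percent_of_range(5.14, DLQI_RANGE),
}

for label, errors in (("PASI", pasi_errors), ("DLQI", dlqi_errors)):
    for metric, value in errors.items():
        print(f"{label} {metric}: {value:.1f}%")
```

Rounded to one decimal place, these give 5.6%, 8.2%, and 8.2% for PASI and 12.1%, 17.1%, and 17.1% for DLQI, matching the figures in the text.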

As shown in Supplementary Tables 1 and 2, the most complicated models with BSA, itch, PatGA, skin pain, and their interaction terms have the smallest AIC, BIC, or RMSE for the validation dataset. A summary of estimated coefficients of selected models is presented in Tables 3 and 4. However, the best two-term models [i.e., BSA + PatGA (\(R^{2}\) = 0.82), and BSA + itch (\(R^{2}\) = 0.81) for PASI and BSA + itch (\(R^{2}\) = 0.62) for DLQI] or three-term models [i.e., BSA + PatGA + BSA × PatGA (\(R^{2}\) = 0.84) for PASI and BSA + itch + skin pain (\(R^{2}\) = 0.65) for DLQI] are still preferred, as they showed minor differences in \(R^{2}\) and RMSE in comparison to the most complicated models.

Table 3 Summary of coefficients (estimates) of selected models (without interaction and without higher order terms) for PASI at week 12
Table 4 Summary of coefficients (estimates) of selected models (without interaction and without higher-order terms) for DLQI at week 12

A sample digital prototype of OPAT is presented in Fig. 7.

Fig. 7 Digital prototype of OPAT

Discussion

Health care payers and dermatologists are paying more attention to the assessment of disease severity outcomes and improvements in patients. In clinical settings (mostly randomized clinical trials), the most common measure used is PASI. However, a recent systematic review evaluating the variation between PASI and PGA scores used in clinical trials showed high variability when measuring the same results [13]. That review highlights the need for a more consistent and reliable outcome measure for assessing the efficacy of biologic therapies. PASI is rarely used in the real world and it does not adequately reflect patients’ quality of life; therefore, an alternative to PASI with potential for use in clinical practice is needed.

Currently, two proxy measures for PASI are receiving significant attention as potential outcome measures for use in clinical practice and are tied to effective treatment. First, the BSA measurement has been suggested as a target goal by the National Psoriasis Foundation in a recent publication [14]. The target responses were defined by a consensus panel of dermatologists with specific expertise in psoriasis treatment. BSA 75, BSA 90, and BSA 100 could be used in a similar manner to PASI 75, PASI 90, and PASI 100 [15]. Second, the International Dermatology Outcome Measures (IDEOM) group has worked to develop and validate measures for skin diseases by using the Delphi survey process [16]. The IDEOM consensus opinion focuses on the product of BSA and the static PGA (BSA × sPGA), which captures more clinical information than BSA alone, as a potential outcome that can be used consistently across clinical trials and in real-world practice [17]. Although these two proxy measures (BSA and BSA × sPGA) appear to show promising results, neither one considers any patient-reported assessments of disease impact.

In this study, we pursued a more holistic approach for assessing an alternative to PASI by combining a clinical assessment with a PRO measure using a large, integrated database from three clinical trials (n = 3866). Statistical analyses revealed high correlations between the OPAT inputs (BSA and PROs) and both PASI and DLQI scores. Regression modeling was used to analyze multiple model specifications (including BSA plus one, two, or three PROs, as well as interaction terms and other higher-order terms). On the basis of model fit parameters, models with BSA and one PRO performed well (BSA and PatGA generally had the best fit, although only slightly better than BSA with itch or skin pain). The two-term and three-term models were preferred, as they differed only slightly in \(R^{2}\) and RMSE from the most complicated models with eight terms. The concordance rates between OPAT model predictions and observed PASI 75, PASI 90, and PASI 100 scores were highest in the model using BSA and PatGA, followed by the model with BSA and itch, and provide additional support for the performance of the OPAT model.

Thus, our use of both BSA and a patient-reported assessment for the OPAT is an important step toward a viable alternative to PASI in real-world clinical practice.

The digital prototype demonstrates the practical potential of the OPAT for health care providers: entering the BSA measure and a single PRO yields quick, real-time estimates of disease severity and of the impact on patients’ HRQoL. A prototype for a potential interactive, digital version of OPAT was created on the basis of the regression models with the best statistical fit for predicting PASI scores and for predicting DLQI scores. The prototype was built with R Shiny (Fig. 7).
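The calculation behind such a tool reduces to evaluating a fitted linear model on two inputs. The sketch below is not the prototype itself (which was built in R Shiny) and the coefficients are placeholders chosen for easy arithmetic, not the estimates reported in Tables 3 and 4; the class name `LinearModel` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearModel:
    """Two-input model: outcome = intercept + b_bsa * BSA + b_pro * PRO."""
    intercept: float
    b_bsa: float
    b_pro: float

    def predict(self, bsa_percent: float, pro_score: float) -> float:
        if not 0 <= bsa_percent <= 100:
            raise ValueError("BSA must be a percentage between 0 and 100")
        if pro_score < 0:
            raise ValueError("PRO score must be non-negative")
        return self.intercept + self.b_bsa * bsa_percent + self.b_pro * pro_score


# PLACEHOLDER coefficients for illustration only; NOT the fitted values.
example_pasi_model = LinearModel(intercept=1.0, b_bsa=0.5, b_pro=0.25)

# A hypothetical patient with 10% BSA involvement and a PRO score of 4.
proxy_pasi = example_pasi_model.predict(bsa_percent=10, pro_score=4)
```

A second `LinearModel` instance with DLQI coefficients would give the corresponding quality-of-life estimate from the same two inputs.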

PASI is considered the gold standard measure of psoriasis severity and is the most common disease severity measure used in clinical trials. However, the PASI assessment is complex and time-consuming, requiring collection of 16 measurements and performance of 13 calculations, and is generally not used by community dermatologists. In this study we evaluated the use of OPAT, which requires only two inputs, BSA and a single PRO measure (itch, PatGA, or skin pain), to predict PASI. Therefore, OPAT is a simpler and time-saving method for dermatologists and health care providers to measure psoriasis severity.

A limitation of these analyses is that they were performed in a moderate-to-severe psoriasis population (BSA ≥ 10%); thus, the performance of OPAT in patients with less than 10% BSA involvement is not known. Additional validation, through replication in other patient populations receiving other biologic treatments in both randomized clinical trials and real-world databases, is needed. Another limitation is that the population in this study could differ from that in a clinical practice setting. Additional analysis of real-world data would allow the results to be generalized.

Conclusion

Our study demonstrates that only two data inputs, BSA and a single PRO measure, can strongly predict a full PASI assessment and quality of life in over 3800 patients from three randomized clinical trials. In addition, we presented a prototype of a user-friendly OPAT application for dermatologists and health care providers with input selection, prediction results, and density distributions for predicted PASI and DLQI scores. The OPAT provides an alternative to PASI to measure disease severity in a simple, practical, and time-saving manner. The OPAT can be easily incorporated into real-world clinical practice via a digital calculation tool.