Psoriasis has been demonstrated to have substantial impacts on dermatology-related functional limitations and health-related quality of life (HRQL) [17]. The itching and other physical symptoms associated with psoriasis negatively impact patients' functional well-being and their social relationships. Concerns about the appearance of one's skin can result in distress [8], worry, and embarrassment, leading to restrictions in social, recreational, and work activities. Sexual functioning has also been demonstrated to be adversely impacted by psoriasis [9]. Psoriasis results in economic burden in terms of the direct cost of care [10] and the indirect cost of lost productivity at work and home [11]. In addition, some psoriasis treatment regimens have time costs associated with administration, such as doctor visits for ultraviolet treatment or safety monitoring for topical administration. Studies of psoriasis patients will often include both physician-assessed clinical endpoints and dermatology-specific patient-reported HRQL and limitations of functional ability to obtain a more comprehensive view of the impacts of the disease and its treatment on the patient [12]. A number of measures have been developed for measuring psoriasis-related HRQL and functional limitations [2].

Along with measuring patient-reported HRQL and dermatology-specific functional limitations, measuring patient-reported psoriasis symptoms is an important adjunct to clinical assessments of the disease. In fact, for some symptom domains, such as itch or pain associated with psoriatic plaques, the patient report is the primary source of assessment. Some disease-specific measures of HRQL used in dermatology contain items that assess patient-reported symptoms as well [13–17].

The purpose of the current analyses was to examine the psychometric properties of four patient-reported measures – a psoriasis symptom assessment measure, two single-item itch scales, and one dermatology-specific HRQL measure – when used in a moderate to severe psoriasis population. Specifically, the psychometric properties investigated were reliability, validity, and responsiveness to underlying clinical change for each of the patient-reported measures. The HRQL measure under investigation is the Dermatology Life Quality Index (DLQI), a validated questionnaire that was developed as a practical measure for use in dermatology clinical settings [17]. The first symptom measure, the Psoriasis Symptom Assessment (PSA) scale, was adapted from the symptom scale of the validated Skindex-29 [14], a measure intended to assess the effects of skin disease on patients' HRQL. The single-item itch measures were an itch Visual Analog Scale (VAS), used in Study A, and the itch scale from the National Psoriasis Foundation [18] (the NPF itch scale), used in Study B. These four patient-reported measures were used in two randomized clinical trials of a new therapy for patients with moderate to severe psoriasis (combined n = 1095), thereby providing a substantial data set for in-depth examination of their psychometric properties. Evidence supporting psychometric qualities such as reliability, validity, and responsiveness of patient-reported outcomes is needed for clinical trials comparing therapies and to support claims of HRQL and symptom benefit [19].

Methods

The data for the analyses came from two Phase III randomized, double-blind, parallel group, placebo-controlled, multi-center clinical trials that were conducted to assess the efficacy, safety, and tolerability of weekly subcutaneous administration of efalizumab for the treatment of psoriasis. The two studies are labeled Study A and Study B, and were similar in design. Details of these studies are reported elsewhere [20]. Subjects were randomized to receive either the study drug or placebo. Both studies involved an initial twelve week treatment period that included assessments at baseline and after twelve weeks; the primary safety and efficacy endpoints of the trials focused on this twelve week treatment period. The analyses described in the present report do not evaluate treatment effects. This report focuses on a blinded examination of psychometric properties of the patient-reported instruments included in the two clinical studies. Subjects signed informed consent forms, and the study complied with FDA Good Clinical Practices, Health Protection Branch guidelines, and all other applicable ethical, legal, and regulatory requirements.

Subjects and Inclusion Criteria

Subjects volunteering for the study were screened for eligibility. Subjects had to have been classified with at least moderate psoriasis (e.g., at least 10% of body surface area covered with plaques and screening PASI score ≥ 12) for at least six months, and had to be between the ages of 18 and 70. Excluded from the study were patients with any of a number of concomitant diseases or allergies to the class of medications used in the trials, as well as pregnant or lactating females. There were 498 subjects in Study A and 597 subjects in Study B. Both studies recruited subjects from multiple clinical centers located throughout the United States and Canada.

Clinical Measures

Psoriasis Area and Severity Index (PASI)

The PASI [21] is frequently used as an endpoint in psoriasis clinical trials [22], and the proportion of subjects achieving PASI-75, defined as ≥ 75% improvement in PASI score from baseline, was the primary efficacy endpoint in both Study A and Study B. The PASI is a composite index indicating the severity of the three main characteristics of psoriatic plaques (erythema, scaling, and thickness) weighted by the amount of coverage of these plaques in the four main body areas (i.e., head, trunk, upper extremities, and lower extremities). PASI scores can range from 0 to 72, with higher scores indicating greater severity.
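The composite score and the PASI-75 criterion described above can be illustrated with a minimal sketch. The regional weights (0.1, 0.2, 0.3, 0.4), the 0 to 4 severity ratings, and the 0 to 6 area scores are the conventional PASI scoring rules and are an assumption here, since this report does not state them; the function and variable names are hypothetical.

```python
# Conventional PASI regional weights (assumed; not stated in this report).
REGION_WEIGHTS = {
    "head": 0.1,
    "upper_extremities": 0.2,
    "trunk": 0.3,
    "lower_extremities": 0.4,
}


def pasi_score(regions):
    """Compute a PASI score.

    `regions` maps each region name to a tuple
    (erythema, scaling, thickness, area_score), where the three
    severity ratings are assumed to be 0-4 and the area score 0-6.
    """
    total = 0.0
    for name, weight in REGION_WEIGHTS.items():
        erythema, scaling, thickness, area = regions[name]
        total += weight * (erythema + scaling + thickness) * area
    return total


def pasi_percent_improvement(baseline, follow_up):
    """Percent improvement in PASI from baseline (positive = better)."""
    return 100.0 * (baseline - follow_up) / baseline


def achieved_pasi75(baseline, follow_up):
    """True if the subject improved by at least 75% from baseline."""
    return pasi_percent_improvement(baseline, follow_up) >= 75.0
```

Under these assumed rules, a subject rated at the maximum on every component in every region scores 72, which matches the upper bound of the range given above.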

Overall Lesion Severity Scale (OLS)

The OLS is a physician global rating of psoriasis severity at a given point in time. The physician assigns a number from 0 ("clear") to 5 ("very severe") indicating his/her judgment of psoriasis severity. The physician is guided in this judgment by the status of three characteristics of the plaque (i.e., plaque elevation, scaling and erythema). The OLS was developed for use in these and related clinical trials.

Physician's Global Assessment of Change (PGA)

The PGA corresponds to the physician's global assessment of changes in all psoriatic lesions compared to the baseline condition (using photographs from baseline to aid in making the assessment). The possible scores are "Cleared" (100% improvement); "Excellent" (75%–99% improvement); "Good" (50%–74% improvement); "Fair" (25%–49% improvement); "Slight" (1%–24% improvement); "Unchanged"; and "Worse". For purposes of the analyses in this report, these seven categories were scored from 5 to negative 1, corresponding to "Cleared" through "Worse", respectively.

Patient-Reported Outcome Measures

Four patient reported outcome measures were used in these studies: the PSA, DLQI, and two measures of itching.

Psoriasis Symptom Assessment (PSA) Scale

The PSA contains eight symptom questions related to psoriasis, to which the patient responds twice, once in terms of frequency of occurrence (4-point scale, ranging from "always" to "never"), and once in terms of troublesomeness/bothersomeness (4-point scale, ranging from "a great deal" to "not at all") [14]. Two scores are derived from the PSA, one for frequency and one for severity (i.e., from the troublesomeness/bothersomeness ratings). Each PSA score ranges from 0 to 32, with higher scores indicating worse psoriasis symptoms.

The PSA was adapted from the symptom scale of the validated Skindex-29 [14, 15]. The first seven symptoms of the PSA are identical to the original Skindex-29 symptoms scale. An eighth item ("Your skin was scaling") was added based upon clinical judgment as to the importance of this symptom and based on a review of results from earlier efalizumab clinical trials. The PSA requires the subject to rate both frequency and severity over the previous two weeks. The Skindex-29 requires the subject to respond only in terms of frequency, and the subject is asked to respond to each item in terms of their perception at present (e.g., "my skin hurts").

Dermatology Life Quality Index

The DLQI was developed as a simple and practical questionnaire for use in dermatology clinical settings to assess limitations related to the impact of skin disease [17]. The instrument contains ten items dealing with the subject's skin. The ten items were based upon the most commonly identified impacts upon dermatology-specific HRQL that were elicited from patients with skin disease. The score on the DLQI has a possible range of 0 to 30, with 30 corresponding to the worst HRQL. The DLQI has evidence supporting reliability and validity when used in a dermatology setting [17]. The DLQI was developed to contain six subscale scores: symptoms and feelings; daily activities; leisure; work/school; personal relationships; and treatment.
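As an illustration of how the total and subscale scores described above are formed, the following is a minimal sketch. It assumes that each of the ten items is scored 0 to 3 and that items map to subscales as in the published DLQI scoring instructions (items 1-2, 3-4, 5-6, 7, 8-9, and 10); neither detail is stated in this report, and the names used are hypothetical.

```python
# Item-to-subscale mapping (1-based item numbers). This follows the
# published DLQI scoring instructions and is an assumption here.
DLQI_SUBSCALES = {
    "symptoms_and_feelings": (1, 2),
    "daily_activities": (3, 4),
    "leisure": (5, 6),
    "work_school": (7,),
    "personal_relationships": (8, 9),
    "treatment": (10,),
}


def score_dlqi(items):
    """Return subscale scores and the total for ten item scores (0-3 each)."""
    if len(items) != 10:
        raise ValueError("the DLQI has exactly ten items")
    if any(not 0 <= value <= 3 for value in items):
        raise ValueError("each item is assumed to be scored 0-3")
    scores = {
        name: sum(items[number - 1] for number in numbers)
        for name, numbers in DLQI_SUBSCALES.items()
    }
    scores["total"] = sum(items)
    return scores
```

With every item at its maximum the total is 30, the worst HRQL score noted above.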

Itch Measures

Study A and Study B used slightly different single-item measures to assess itchiness. Study A used a single item itch VAS, anchored by the terms "no itching" at the 0 point to "severe itching" at the 10 end of the scale. Subjects were asked to respond in terms of their itching "at the present time." Study B used the itch scale from the National Psoriasis Foundation score [18]. The itch scale is a single, 6-point Likert-type item assessing itching over the past 24 hours, with responses from 0 ("no itching") to 5 ("severe, constant itching, distressing; frequent sleep disturbance; interferes with activities").

Statistical Methods

The psychometric properties of the patient-reported outcome measures were assessed to determine each instrument's reliability, validity, and responsiveness [23, 24]. For the PSA, internal consistency reliability – i.e., the extent to which items on a scale are all measuring the same concept – was evaluated for the original 7-item and new 8-item scales, and for both the PSA frequency and severity scores. Internal consistency reliability was evaluated for all the multi-item scores using Cronbach's coefficient α statistic [25].
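For readers unfamiliar with the statistic, coefficient α for a k-item scale is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below is a generic illustration of that formula, not the analysis code used in these studies; the function name and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's coefficient alpha.

    `responses` is a list of rows, one per subject, each row holding
    that subject's scores on the k items of the scale.
    """
    k = len(responses[0])
    # Sample variance of each item across subjects.
    item_variances = [
        variance([row[j] for row in responses]) for j in range(k)
    ]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses from four subjects to a three-item scale.
example = [[0, 1, 1], [1, 1, 2], [2, 3, 2], [3, 3, 3]]
alpha = cronbach_alpha(example)
```

Comparing the 7-item and 8-item versions of a scale amounts to calling the function twice, once with the added item's column dropped from each row.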

Validity represents the extent to which the instrument actually measures the construct it is intended to measure [23]. Validity of patient-reported outcomes is assessed by specifying and testing hypotheses about the relationship between the patient-reported outcome measures and other independent measures (e.g., severity of disease, clinician rated clinical status) and other health related measures. The strength of the correlation between measures intended to assess similar concepts was examined. Specifically, the correlation of the single-item itch scale (i.e., VAS or NPF itch) with item 3 on the PSA ("your skin itched") was compared to the correlation of the itch scale with other items on the PSA. The correlation of the PSA score with the symptom scale of the DLQI was compared with its correlation with the other DLQI scales, with the hypothesis that as a measure of patient-reported symptoms, the PSA would correlate higher with this DLQI scale than with the other five DLQI scales. To further investigate the validity of the scales, an assessment was made of the relationship between the patient-reported outcomes and the clinical measures, both at baseline and at week 12. Moderately strong correlations between the clinician rated and patient-reported symptom measures, and between clinician rated symptoms and the DLQI were expected. Pearson product-moment correlations were used to assess relationships between the clinical and patient-reported outcome measures.
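The convergent comparison described above (the itch scale against the PSA itch item versus the other PSA items) can be sketched with a plain Pearson product-moment correlation. The data below are invented solely for illustration and do not come from either study; the function and variable names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented scores for six subjects (illustration only).
itch_vas = [1.0, 3.5, 5.0, 6.5, 8.0, 9.0]
psa_itch_item = [0, 1, 2, 2, 3, 3]       # PSA item 3, "your skin itched"
psa_other_item = [2, 0, 3, 1, 2, 1]      # any other PSA item

r_itch = pearson_r(itch_vas, psa_itch_item)
r_other = pearson_r(itch_vas, psa_other_item)
# Convergent validity is supported when r_itch exceeds r_other.
```

The same function applies to the comparison of the PSA scores with each DLQI subscale and to the correlations with the clinical measures.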

Responsiveness of a patient-reported outcome deals with the extent to which changes in the scores are associated with changes in the underlying clinical status of the subject over some defined time period. Two approaches were used to assess the responsiveness of the patient-reported outcome measures. First, changes in these measures from baseline to week 12 were correlated with the changes in the physician-assessed clinical measures over this 12-week period, as well as with the physician's global assessment of change at week 12. The second approach to assessing responsiveness involved categorizing subjects into three groups based on the change in their PASI scores from baseline to week 12: PASI improved ≥ 75%; PASI improved 50% to 74.9%; and PASI got worse or improved < 50%. Analyses of variance were performed among these three groups on the changes from baseline to the 12-week endpoint in the patient-reported outcome measures.

Analyses were performed on blinded data; that is, the status of the subject with respect to his/her assigned treatment group was not known. Given that Study A and Study B were independently conducted studies, the results of the two clinical trials were analyzed separately. Two-sided tests with alpha = 0.05 were used to determine statistical significance, and there was no adjustment for multiple statistical tests.

Results

The average age of the 498 subjects enrolled in Study A was 44.1 (s.d. = 12.0) and of the 597 subjects in Study B was 45.6 (s.d.= 12.7). Males comprised 72.3% of the sample in Study A and 64.8% of the sample in Study B. The baseline PASI scores for Study A and Study B were 18.84 (7.05) and 20.01 (8.35), respectively; and the baseline OLS scores for the two studies were 3.46 (0.63) and 3.31 (0.85), respectively. These scores indicate that the subjects in both studies had moderate to severe psoriasis (e.g., the OLS score of 3 corresponds to physician ratings of "moderate" psoriasis, and a 4 corresponds to "severe" psoriasis).

The means, standard deviations, and reliability coefficients (Cronbach's coefficient α) of the patient-reported outcomes are shown in Table 1. Complete baseline and 12-week data were available for 89.7–90.5% of the subjects in Study A and for 93.5–97.4% of the subjects in Study B. In general, the PSA, DLQI, VAS itch, and NPF itch scores decreased over the 12-week studies.

Table 1 Means, Standard Deviations, and Coefficient α for Patient Reported Outcome Measures, Baseline and Week 12

The internal consistency reliability coefficients of all the patient-reported measures are satisfactory, ranging from approximately 0.86 to 0.95 (see Table 1). Internal consistency reliabilities were somewhat better at the 12-week follow-up in both studies. For the PSA, no substantive differences in reliabilities were observed between the 7-item and 8-item versions.

The correlations at baseline and week 12 among the patient-reported outcomes are summarized in Table 2. The PSA frequency and severity scores correlate in the range of 0.59 to 0.63 with the DLQI total score at baseline and in the range of 0.78 to 0.79 at 12 weeks. As hypothesized, the PSA frequency and severity scores correlate more strongly with the DLQI symptoms and feelings subscale than with the other DLQI subscales. The PSA scores correlate in the range of 0.59 to 0.82 with the two itch scores.

Table 2 Correlations Between the PSA Frequency and Severity Scores and the DLQI Total and Scale Scores and the Itch Measures, Baseline and Week 12

The correlations between scores on the DLQI and scores on the VAS (Study A) and NPF (Study B) itch scales are significant (see Table 3). The DLQI symptoms and feelings subscale correlates strongly and significantly with the itch measures. Combined, these data demonstrate that the itch scale correlates more highly with the patient-reported outcome measures that deal specifically with the symptom of itching (e.g., PSA; DLQI symptoms and feelings subscale) than with the other patient-reported outcome measures.

Table 3 Correlations Among DLQI Scores and Itch Measures, Baseline and Week 12

The relationship between the patient-reported outcomes and the physician-based clinical assessments was also evaluated. The patient reported outcome measures – the PSA, the DLQI, and the itch scales – all correlate significantly with both the PASI score and the OLS (Table 4). These correlations were significantly stronger at the end of the study than at baseline. For example, the PSA severity score at baseline was correlated 0.19, in both studies, with PASI scores, but at 12 weeks this correlation increased to 0.57 in Study A and 0.53 in Study B. Similar patterns of small to moderate correlations were observed for the two itch scales and the PASI and OLS.

Table 4 Correlations Between Patient Reported Measures and the Psoriasis Area and Severity Index (PASI) and Overall Lesion Severity Scale (OLS) at Baseline and at Week 12.

The responsiveness of the patient-reported outcomes to clinical change was evaluated using the two approaches discussed in the methods section. The correlations between the change scores for the clinical and patient-reported outcomes over the 12-week period are reported in Table 5. The patient-reported outcomes used in these studies demonstrate very good associations with changes in clinical status. Specifically, the correlations of changes in the PSA scores, the NPF itch score, and the DLQI total score with the changes in the PASI and OLS clinical measures and with the PGA are all highly significant and in the 0.44 to 0.57 range. The DLQI subscale demonstrating the strongest relationship with changes in PASI or OLS scores was the symptoms and feelings subscale. The other subscales, which focus on other HRQL impacts of psoriasis, are not quite as strongly correlated with clinical changes, as might be expected. The 8-item PSA was slightly more correlated to changes in clinical status than the original 7-item version (see Table 5). There was also a moderate and significant correlation between change in the patient-reported outcome measures and the PGA of change in the patient's psoriatic symptoms.

Table 5 Correlations Among Change Scores for PSA Frequency and Severity Scores, DLQI Scores and NPF Itch Scores with Change Scores on Clinical Measures and with the Physician's Global Assessment of Change (PGA) at Week 12

The second approach to assessing responsiveness of the patient reported outcomes relied upon the classification of subjects into three categories based upon the change in PASI scores from baseline to the end of the 12-week treatment period. The results of analysis of variance to assess differences among the three groups categorized by % of PASI improvement from baseline to week 12 are summarized in Table 6. There were significant differences among the groups on all the outcome measures, and in all cases, the post hoc analyses indicated significant differences between all pair-wise comparisons of groups. For all patient-reported outcomes there was a consistent trend of greater improvement on the respective patient reported outcome corresponding to greater improvement on the PASI.

Table 6 Analyses of Variance of Patient Reported Outcome Measures Among Three Groups of PASI Improvement Scores: ≥ 75%; Between 50% and 75%; and < 50%.

Discussion and Conclusions

Two randomized clinical trials for a new therapy for psoriasis provided the opportunity to evaluate the reliability, validity, and responsiveness to change in clinical status of several patient-reported outcome measures. The results of the present study provide evidence supporting the reliability, construct validity and responsiveness of the PSA, the DLQI, and the two itch scales.

Although developed for a general dermatologic population, the DLQI has been applied to patients with psoriasis. For example, Badia et al. [26] used the DLQI in a sample of patients with mild to moderate eczema or psoriasis (approximately half the sample had psoriasis). Their results on the combined sample demonstrated that the DLQI was responsive to change, although they reported that the great majority of change occurred in the symptom and feeling scale of the DLQI. Zachariae et al. [27] tested a Danish version of the DLQI on a mixed sample of patients with dermatological diseases, including psoriasis. They found significant relationships between DLQI scores and physician-rated severity. Nichol et al. [28] reported a significant relationship between the DLQI and psoriatic symptoms in a sample enrolled in multi-center clinical trials. Ellis et al. [29] used the DLQI and the SF-36 in a Phase II clinical trial of a new treatment for psoriasis. Their results show greater responsiveness to treatment for the DLQI than to the SF-36. These and other studies [30, 31] support the use of the DLQI as a HRQL measure in psoriatic populations.

The current study further establishes the reliability and validity characteristics of the DLQI and expands on this by demonstrating the responsiveness of the DLQI to change in clinical status over the course of two independently conducted clinical trials with initial treatment periods of 12 weeks. Changes in the DLQI total score, as well as in all of the DLQI subscales, demonstrated significant and sizeable correlations with independently obtained physician-assessed changes in the clinical status of patients. As in the Badia et al. study [26], changes in the symptom and feelings scale demonstrated the greatest association with changes in clinical status. However, the other scales were also very responsive. For example, the daily activities scale, which is comprised of two items dealing with the interference of the skin disease on activities like shopping or with the choice of clothing, demonstrates the second highest responsiveness to clinical changes. This indicates that the alleviation of psoriatic symptoms, as determined by clinical assessments, results in significant and marked improvements to the dermatology-specific HRQL of psoriasis patients.

In addition to the dermatology-specific HRQL measure, this study included patient-reported symptom scales. Both the PSA and itch scales demonstrated responsiveness to changes in clinical status. They also demonstrated significant correlations with the clinical indicators obtained at the same time period, especially so at the end of the 12-week portion of the trial. Taken together, these data indicate that patient reports of their symptoms are an important and valid indication of clinical status.

In addition to demonstrating that the PSA is a reliable and valid instrument for a psoriasis population, the present study also confirms that the addition of an item on scaling of skin specific to psoriasis to the symptom scale of the Skindex-29, as well as slight changes in instructions to the subjects (as described in Methods) results in good reliability and validity, and increases its responsiveness. The 8-symptom scale was shown to be as reliable as the 7-symptom scale. In addition, the very high correlations between the PSA frequency and severity scores would indicate that it may not be necessary to ask both questions about the symptoms – one measure would suffice.

The present study demonstrates the validity, reliability, and responsiveness of several patient-reported outcome measures in the study of psoriasis, specifically the PSA, the DLQI and two itch scales. Although the DLQI has been used frequently in studies of psoriasis, the results presented here represent a very broad assessment involving the DLQI, two different itch measures, and patient-reported symptom assessments. Moreover, the results were derived from two independently conducted clinical trials using a combined study population that is greater than that of any other study using patient-reported outcomes to assess the impact of a psoriasis drug on these outcomes. The results of the psychometric analyses confirm that these patient-reported measures are appropriate endpoints in clinical trials of psoriasis and that they relate well to the clinical endpoints of PASI, OLS and PGA. The use of patient-reported outcome measures with demonstrably good psychometric properties is highly recommended when using such measures in a psoriatic population [32]. Based upon previous evidence and the results of the present study, the DLQI, the PSA, the VAS itch measure, and the NPF itch measure are considered useful tools for the measurement of dermatology-related limitations of functional ability and the frequency, severity and impact of psoriasis symptoms on patients' lives. These measures provide information about the change in subjects' psoriatic symptoms that supplements, and offers a different perspective from, the clinical assessments obtained by physicians.

Authors' contributions

RS developed the analytical approach and had primary responsibility for interpreting results and preparing the manuscript.

BB reviewed results of the analyses, suggested additional analyses, and helped prepare the manuscript.

SS provided critical revision of the manuscript for important intellectual content.

CT programmed the analyses reported here, assisted in the interpretation of results, and helped to modify the analysis plan.

JK provided critical revision of the manuscript for important intellectual content.

DR provided critical revision of the manuscript for important intellectual content.