Introduction

The choice of outcomes in clinical trials is crucial. The results of a trial inform the use of a treatment, as well as judgments about its safety and efficacy. It is thus essential to choose outcomes that capture what really matters in health care: patients’ health and well-being. However, guidelines indicating what should be measured in clinical trials are lacking. One example of the systematization of measures in clinical trials is Outcome Measures in Rheumatology Clinical Trials (OMERACT), a network aimed at defining a common set of measures in rheumatology, which would allow comparison between trials and use in meta-analyses [1]. No such network has yet been created for psoriasis, and a plethora of outcomes are used in clinical trials of this disease.

Psoriasis is a disease with a strong involvement of psychosocial aspects. It is well known that severity measures alone cannot fully depict the burden of psoriasis on patients [2••], since the same severity level may have a different impact on health-related quality of life. The outcomes used in trials have changed over the last 20 years. Morsy et al. [3] compared outcomes in clinical trials of psoriasis from 2004 to 2005 with those analyzed by Marks et al. [4] in 1989 and observed that the main difference was the introduction of quality of life measures. However, a lack of standardization for all kinds of measures was reported. Even for the assessment of psoriasis severity (traditionally scored by the clinician on the basis of area involved, erythema, scaling, and induration) there was no homogeneity. Naldi et al. [5] found that in 171 randomized controlled trials of psoriasis from 1977 to 2000, 44 different scoring systems were used. The aim of this review is to analyze in detail the outcomes of clinical trials of psoriasis published between January 2011 and March 2012.

Methods

We performed a search for clinical trials in psoriasis using the PubMed MeSH database. We used the keyword psoriasis, restricted the type of study to “phase III clinical trial,” and limited the time period to January 1, 2011, through March 31, 2012. The search identified 123 articles, of which we selected 60 that directly dealt with the structured evaluation of psoriasis treatments. Of these 60 articles, six were excluded: three did not concern the efficacy of treatment, one was in Chinese, one focused on arthritis and included only six out of 60 patients with psoriatic arthritis, and one was a commentary on a trial. Thus, we analyzed outcomes in 54 clinical trials.
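The selection flow can be retraced with simple bookkeeping. The sketch below is only an arithmetic check of the counts given above, assuming that the listed exclusions do not overlap; the dictionary keys are labels introduced here for illustration.

```python
# Bookkeeping check of the article-selection flow described in the Methods.
# Counts come from the text; key names are illustrative labels, and the
# exclusions are assumed to be mutually exclusive.
identified = 123   # articles returned by the PubMed search
selected = 60      # articles on the structured evaluation of treatments

exclusions = {
    "not about treatment efficacy": 3,
    "published in Chinese": 1,
    "arthritis-focused, few psoriatic arthritis patients": 1,
    "commentary on a trial": 1,
}

analyzed = selected - sum(exclusions.values())
print(f"{identified} identified -> {selected} selected -> {analyzed} analyzed")
```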

Each article was read by both authors and, when not explicitly stated, a consensus was reached about what was considered as primary outcome. For example, when the methods did not indicate the primary outcome, we referred to the results section and to the tables and figures to see which variable(s) was (were) used in the analysis. The same process applied to the secondary outcomes. We did not extract information about the variables indicated as “exploratory outcomes.” Other information that we extracted from each paper included the main interventions, the reported sample size, the presence of a reported power calculation, and the presence and the type of masking.

Table 1 provides a summary of the interventions and outcomes in the 54 studies we reviewed. We organized the studies according to the different clinical types of psoriasis: plaque, arthropathica, palmoplantar, and scalp. One study reported on chronic plaque psoriasis of the hands and feet only, so we listed it separately. As for the interventions, we included in Table 1 only information about the type of treatment, not about the doses; however, we indicated the number of different dose/treatment regimens.

Table 1 Summary of the interventions and outcomes used in the evaluation of psoriasis in the reviewed 54 studies

Results

The characteristics of the 54 studies selected at the end of the search and screening process are summarized in Table 1, listed in alphabetical order by first author within each clinical type of psoriasis. Forty-three studies were conducted on patients with plaque psoriasis, five on patients with psoriatic arthritis, three on patients with palmoplantar psoriasis, two on patients with scalp psoriasis, and one on patients with plaque psoriasis of the hands and feet. Seventeen studies included a description of the power analysis/sample size calculation, whereas the others either were continuations of previous trials (and thus based on variable principles of efficacy or inefficacy) or did not justify the choice of sample size. Of the studies, 25 were double-blind (one of these was quadruple-blind), seven were single-blind, and 22 were open-label or made no mention of patient or investigator blinding. Of the 16 studies with the power calculation, 11 were double-blind and two were single-blind.

Primary Outcomes

Of the 54 selected studies, 41 had a primary outcome based exclusively on the clinician/investigator assessment. The most frequently used measures were the Psoriasis Area and Severity Index (PASI) (usually PASI 75 but also Δ PASI) and the Physician’s Global Assessment (PGA) (mainly with a target score of 0 or 1 corresponding to “clear” or “almost clear”). The combination of these measures, including their various modifications (eg, Psoriasis Severity Index [PSI], which excludes the area from the PASI calculation; Palmoplantar Pustular Psoriasis Area and Severity Index [PPPASI]) or variations in which different names are applied to the same instrument (eg, overall disease severity [ODS]; Investigator’s Global Assessment [IGA]), covers 49 of the 54 studies. One study without these clinical severity measures evaluated a treatment with narrowband UVB and used the number of treatments to clearance and/or minimal residual activity (MRA) as the primary outcome.
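As a point of reference, PASI 75 denotes a reduction of at least 75 % in the PASI score relative to baseline, whereas Δ PASI refers to the change in score itself. A minimal sketch of the percent-improvement calculation follows; the function names and the example scores are hypothetical and are not taken from any of the reviewed trials.

```python
def pasi_percent_improvement(baseline: float, follow_up: float) -> float:
    """Percent reduction in PASI score from baseline (positive = improvement)."""
    if baseline <= 0:
        raise ValueError("baseline PASI must be positive")
    return 100.0 * (baseline - follow_up) / baseline


def reaches_pasi(baseline: float, follow_up: float, threshold: float = 75.0) -> bool:
    """True if the percent improvement meets the threshold (75 for PASI 75)."""
    return pasi_percent_improvement(baseline, follow_up) >= threshold


# Hypothetical patient: PASI 20.0 at baseline and 4.0 at follow-up.
improvement = pasi_percent_improvement(20.0, 4.0)
print(improvement, reaches_pasi(20.0, 4.0), reaches_pasi(20.0, 4.0, threshold=90.0))
```

The same helper covers the other conventional cutoffs (PASI 50, 90, 100) by changing the threshold argument.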

The four studies using only patient-reported outcomes are secondary reports. Two focused on work productivity, using specific measures such as the Work Productivity and Activity Impairment Questionnaire: Specific Health Problem (WPAI-SHP) and the Work Limitations Questionnaire (WLQ). The other two focused on quality of life, using the Dermatology Life Quality Index (DLQI), the Psoriasis Disability Index (PDI), and the Health Assessment Questionnaire (HAQ). In addition, nine studies listed patient-reported measures among their primary outcomes. Three more studies used the DLQI and another the HAQ; quality of life was also measured using the EuroQol (EQ-5D), the Skindex-16, and the Physical Functioning (PF) scale of the Medical Outcomes Study short form (SF-36). The remaining patient-reported primary outcomes included the Visual Analog Scale (VAS) for itch, pain, and well-being; the Patient Global Assessment (PtGA); and the patient’s overall assessment of treatment response. Each of these measures was listed in only one study.

Secondary Outcomes

Twenty studies did not report any secondary outcome. Although this may reflect publication bias (eg, because of word-count restrictions), it still amounts to approximately 37 % of the studies considered (20 of 54). Ideally, an optimal analysis of such studies would include a review of the study protocols. Of the remaining 34, only seven had secondary outcomes based exclusively on the clinician/investigator assessment. Twenty-four studies listed the PASI in several of its possible forms (eg, PSI, m-PASI) and at the conventional cutoff points (eg, PASI 50, -90, -100). In addition, these PASI measures were often considered at different times and, in a number of studies, even at every follow-up visit. The PGA was listed in 16 studies and, as described for the PASI, different cutoff points and observation times were often used.

Of the 27 studies listing at least one patient-reported measure as a secondary outcome, four had only patient-reported outcomes: one had the VAS for itch, one the WLQ for work productivity, one a mixture of quality of life and work productivity instruments, and one quality of life questionnaires and a willingness-to-pay assessment. Among the studies with a patient-reported outcome, 14 listed the generic dermatologic DLQI and only two the psoriasis-specific PDI. The SF-36, a questionnaire used to describe general health status, was used in five studies. The PtGA was listed five times among the secondary outcomes.

Discussion

This review analyzes the outcomes of 54 psoriasis trials published between January 2011 and March 2012. We observed that the majority of the studies had a primary outcome based exclusively on the clinician/investigator assessment. The PASI, although expressed in many different forms (eg, proportion of patients reaching PASI 75, absolute or percent change in PASI score), is still the most used measure to evaluate clinical severity of psoriasis, even though it has never been standardized and its reliability has never been demonstrated. As Naldi suggested [6], PASI should be considered passé, and better clinimetrics of disease severity should be developed. Its lack of validity and its inability to capture important symptoms arising from psoriasis [7] seem to support Naldi’s view.

However, the PASI score is widely used in clinical trials, and a valid alternative has not been suggested. Reasons for its widespread use may include its apparently “objective” nature, which some may perceive as making it comparable and interpretable. In addition, other measures of clinical severity (eg, the PGA, IGA, PSI) have been even less rigorously examined than PASI. Comparability of PASI is most likely a myth, not only because of a lack of standardized comparative studies across institutions and countries but also because of the vast diversity of times at which it is assessed as an outcome. In this review, even when only the primary outcomes are considered, the change in PASI score was assessed in the different studies at weeks 4, 6, 8, 10, 12, 14, 16, 20, 25, 52, and 160. In addition, one study assessed it at 35 sessions of phototherapy. Taken together, this adds up to 12 different evaluation criteria for an apparently single measure and clearly limits comparability across studies, even though the same measure of clinical severity was used.
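The count of 12 follows directly from the assessment times listed above: 11 week-based time points plus one session-based time point. As a simple check (the variable names are illustrative):

```python
# Assessment times for the primary PASI outcome, as listed in the text.
weeks = [4, 6, 8, 10, 12, 14, 16, 20, 25, 52, 160]
session_based = ["35 sessions of phototherapy"]

# Each distinct week, plus the session-based endpoint, is a separate criterion.
distinct_criteria = len(set(weeks)) + len(session_based)
print(distinct_criteria)
```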

The other commonly used measure for clinical outcome was the PGA. The PGA is a very simple measure to use both in clinical trials and in clinical practice, and it provides an overall score indicating the degree of severity on a scale, generally from 1 to 5. It does not include the details of PASI (ie, area, erythema, desquamation, and induration); however, the final result of PASI is also a single score. No studies detail whether PASI and PGA are evaluated by the same person; however, it seems likely that the same person does both and that PGA is “driven” by PASI. In fact, in a study analyzing 30 clinical trials for biologics in psoriasis, Robinson et al. [8•] showed that the correlation between PASI 75 and a score of clear or almost clear on the PGA was 0.916 at 8 to 16 weeks and 0.892 at 17 to 24 weeks. The high correlation, according to the authors, indicated that the two assessment tools are redundant and that one might be enough to assess psoriasis severity.

Although the correlation between these two physician-reported measures is extremely high, we have recently observed [9] that the agreement between PGA and the equivalent patient-assessed measure, PtGA, is poor. In a sample of over 2500 dermatologic outpatients, the overall Cohen’s κ between the two clinical severity evaluations was very low (κ = 0.25), possibly further illustrating the lack of standardization and validity of PGA.
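For readers less familiar with Cohen’s κ, it compares the observed agreement between two raters (p_o) with the agreement expected by chance (p_e): κ = (p_o − p_e) / (1 − p_e). The sketch below computes the unweighted κ from a cross-tabulation of physician and patient ratings. The 3 × 3 table is invented purely for illustration and does not reproduce the data in [9]; whether that study used a weighted or unweighted κ is not stated here.

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa from a square agreement table.

    table[i][j] = number of subjects placed in category i by rater A
    and in category j by rater B.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of subjects on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: sum of products of the raters' marginal proportions.
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Invented counts (rows: physician mild/moderate/severe; columns: patient).
toy_table = [
    [20, 15, 5],
    [10, 20, 10],
    [5, 5, 10],
]
print(round(cohens_kappa(toy_table), 3))
```

A κ of 1 indicates perfect agreement and a κ of 0 indicates agreement no better than chance, which is why a value of 0.25 is read as low.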

In general, the correlation between physician- and patient-reported measures in psoriasis has been shown to be low [2••]. This should be of concern because these two classes of assessments seem to measure different constructs, so what a physician considers a treatment “success” may be a disappointing outcome for the patient. Despite this, of the 54 clinical trials analyzed in this review, only about one fourth included a patient-reported measure among the primary outcomes. These figures are quite similar to those of Townshend et al. [10], who, in 2003, analyzed 125 dermatologic trials and found that only 32 of them (25.6 %) mentioned participant efficacy outcomes. Another study [11] observed that, even when information on quality of life was available, methods and results were not adequately reported. Of note, in our review, we observed that no power calculation was based on a patient-reported outcome.
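The “about one fourth” figure can be retraced from the counts reported in the Results: four studies with exclusively patient-reported primary outcomes plus nine that listed patient-reported measures alongside clinician-assessed ones. The short check below assumes these two groups do not overlap, as the wording of the Results suggests; the variable names are illustrative.

```python
# Counts taken from the text of this review.
total_trials = 54
patient_only_primary = 4   # primary outcomes exclusively patient-reported
mixed_primary = 9          # patient-reported measures among the primary outcomes

# Assumes the two groups are disjoint.
with_patient_primary = patient_only_primary + mixed_primary
share_here = 100 * with_patient_primary / total_trials

# Townshend et al. [10]: 32 of 125 dermatologic trials.
share_townshend = 100 * 32 / 125

print(f"{share_here:.1f} % vs {share_townshend:.1f} %")
```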

When looking at quality of life assessment, it is somewhat surprising to observe the clear preference for generic dermatologic rather than psoriasis-specific instruments. In fact, the DLQI is listed four times as a primary outcome and 14 times as a secondary outcome, whereas the PDI is listed once among the primary and twice among the secondary outcomes. No other psoriasis-specific questionnaires are mentioned, whereas several generic nondermatologic questionnaires are reported (eg, SF-36, HAQ, EQ-5D). In no instance were the recommendations contained in an in-depth critical review of generic and dermatology-specific quality of life instruments [12••] followed. In fact, not only was the combination of SF-36 and Skindex-29 never observed, but the SF-36 was included only once among the primary outcomes and five times among the secondary outcomes, while the Skindex-29 was not mentioned among either the primary or the secondary outcomes.

Surprisingly, when considering other patient-reported outcomes, we found that PtGA was used very rarely. Although PtGA obviously shares the convenience of PGA, in this review it was found in only one of 54 studies as a primary outcome and in five of the 34 studies listing at least one secondary outcome. In addition, an important measure such as the patient’s overall assessment of treatment response appeared in only one study as a primary outcome. Among these other outcomes, we found that different indexes of work productivity were used in the evaluation of treatment efficacy: in two studies as a primary outcome and in two other studies as a secondary outcome.

As for the clinical types other than plaque psoriasis, we saw that several indexes of clinical severity in addition to PASI and PGA or their modifications were used. Such indexes included, for example, the American College of Rheumatology (ACR) response criteria, the 28-joint Disease Activity Score (DAS28), the psoriatic arthritis MRI scoring system (PAMRIS), the total sign score (TSS) for scalp involvement, and the Nail Psoriasis Severity Index (NAPSI). On the other hand, no specific patient-reported severity or quality of life tools such as the Psoriatic Arthritis Screening and Evaluation (PASE), the Scalpdex for scalp psoriasis, or the nail psoriasis quality of life scale (NPQ10) were used.

The risk of relapse/rebound during or after treatment, an important aspect of the clinical course of psoriasis, was also scarcely evaluated. Not only were follow-up times usually too short, as shown in Table 1, to measure such occurrences, but only two trials [13•, 14•] explicitly listed relapse/rebound among their primary or secondary outcomes. However, as noted previously, this may also be because of publication bias, and an analysis of the study protocols could add useful insight into this important aspect of treatment efficacy evaluation in psoriasis.

Conclusion

Compared to previous reports, we observed an increase in patient-reported outcomes in psoriasis clinical trials. A promising trend had already been reported in 2010 [15••], when the proportion of trials incorporating a quality of life measure was 7.7 % compared to 0.4 % in 2003. Here the proportion of studies with a quality of life measure listed as a primary outcome was 11 %, and this proportion rose to 24 % when other patient-reported measures such as PtGA or WPAI-SHP were considered.

However, we recommend that the increase in trials incorporating patient-reported measures be combined with an increase in the quality of the measures used [12••]. Also, regulatory agencies, trial registers, and medical journals should require investigators to adopt more thoroughly validated and more standardized clinical measures and times of follow-up.