Digital Features for this article can be found at https://doi.org/10.6084/m9.figshare.16823503.

Key Points

There is a lack of comprehensive clinical disease measures used consistently for assessment and monitoring of patients with generalized pustular psoriasis (GPP) in clinical trials and routine clinical practice.

The use of psoriasis disease measures in the assessment of patients with GPP is not optimal because of the unique pathology and clinical features of GPP, and the acute and life-threatening nature of GPP flares.

Modified psoriasis clinical disease measures, including the Generalized Pustular Psoriasis Physician Global Assessment and Generalized Pustular Psoriasis Area and Severity Index, have recently been developed to specifically assess patients with GPP.

1 Introduction

The accurate assessment of disease severity and treatment outcomes is essential for providing effective patient care and for the development and evaluation of novel therapies. In plaque psoriasis, many disease severity and outcomes measures have been used with varying levels of clinical validity and reliability [1]. Generalized pustular psoriasis (GPP) is a rare neutrophilic skin disease characterized by recurrent episodes of widespread eruption of sterile macroscopic pustules that can occur with or without systemic inflammation [2]. Based on the historical classification of GPP as a variant of psoriasis, assessment of GPP severity and treatment outcomes has commonly involved disease measures developed for plaque psoriasis. However, accumulating evidence based on genetic, histologic, immunologic, and clinical studies demonstrates that GPP is a distinct disease entity with unique skin-related and systemic symptoms that require specific disease measures for accurate assessment.

According to the European Rare and Severe Psoriasis Expert Network (ERASPEN) consensus criteria, GPP is characterized by primary, sterile, macroscopic neutrophilic pustules on non-acral skin, but not pustulation that is restricted to psoriatic plaques [2]. GPP is heterogeneous and can be characterized as a relapsing disease with recurrent flares or a persistent disease with intermittent flares. Symptom severity can vary with each flare within an individual patient [3]. Unlike plaque psoriasis, GPP flares, in certain patients, are considered medical emergencies that can worsen rapidly and may require hospitalization because of the severity of the skin lesions and systemic symptoms, and the potential for life-threatening complications [4]. The rarity of GPP poses substantial challenges for its effective treatment and the development of novel therapies. Currently, there is a lack of validated GPP-specific clinical disease measures and an indirect comparison of new GPP therapies is not feasible because of the lack of consistent application of disease measures in clinical trials and routine practice. In this article, we provide an overview of the disease measures currently used to assess GPP in clinical practice and clinical trials, highlighting their advantages and limitations.

2 Psoriasis Disease Measures Used for the Assessment of Patients with GPP

Owing to the lack of GPP-specific clinical disease measures, psoriasis disease measures have been used to evaluate patients with GPP. However, these tools are not optimal for assessing patients with GPP and GPP flares because the disease must be evaluated rapidly and accurately to prevent life-threatening complications such as acute respiratory syndrome and sepsis [4,5,6]. In a recent estimate, 5–10% of deaths in patients with GPP were due to complications related to GPP flares [4]. The Physician Global Assessment (PGA), also known as the Physician’s or Investigator’s Global Assessment, provides a single estimate of a patient’s overall disease severity determined by the physician or investigator; typically, a seven-point scale from clear (PGA = 0) to severe (PGA = 6) is used. The PGA evaluates disease severity more intuitively than the Psoriasis Area and Severity Index (PASI). There are two versions of the PGA: dynamic, which assesses improvement relative to a baseline severity level, and static (sPGA), which assesses disease severity at an independent timepoint; bias is reduced with the sPGA versus the dynamic PGA because the physician does not need to recall the baseline disease state [7]. In both versions, the individual elements of psoriasis plaque morphology and degree of body surface area (BSA) involvement are not quantified [8, 9].

The sPGA was included as one of several disease measures in a 52-week Japanese study of ixekizumab, an anti-interleukin (IL)-17A humanized monoclonal antibody, for the treatment of patients with different forms of psoriasis, including five patients with GPP [10]. Among the efficacy outcomes assessed were the proportion of patients achieving an sPGA score of 0 or 1 and the proportion of patients achieving remission of psoriasis plaques according to sPGA (a score of 0) [10].

The PGA has several key limitations. It does not assess the extent of the disease, which may limit its use as a tool to assess inclusion criteria in clinical trials. In addition, the PGA does not take into consideration comorbid conditions and is solely focused on skin assessment, which is a key drawback when used in GPP, a systemic inflammatory disease. Furthermore, multiple versions of the PGA have been used in clinical trials, which complicates the feasibility of cross-trial comparisons. For psoriasis, Pascoe et al. investigated the possibility of using the PGA in daily clinical practice [9]. Based on the correlation of PGA scores with a longitudinal patient global assessment, the PGA can be used in clinical practice; however, further real-world evidence is needed to demonstrate its positive impact on treatment plans [9].

The European Medicines Agency and the US Food and Drug Administration recommend the use of the PASI combined with the PGA for the evaluation of novel psoriasis treatments [7]. The PASI was developed in 1978 for use in psoriasis clinical trials and is considered the best-validated assessment tool for the severity of plaque psoriasis [11]. However, the PASI has several limitations, including poor sensitivity when the extent of BSA involvement is small, inability to assess disease on all body areas, such as the hands, nails, feet, and genitals, and poor reproducibility because of the inherent variability in measuring BSA [7]. In addition, the PASI is complex and can be resource intensive for routine use in clinical practice. Additional real-world evidence is needed to assess the applicability of PASI in routine clinical practice [12].

Currently, a PASI 90 response, which denotes a ≥ 90% improvement in PASI score from baseline, is considered the treatment goal for psoriasis, indicating substantial improvement in the affected skin. However, such thresholds have not been used rigorously in GPP. Recently, the PASI score has been modified for the assessment of patients with GPP (see Sect. 3) [13].
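As an illustration of how a PASI response threshold is applied, the following minimal sketch computes the percentage improvement from baseline and the proportion of patients reaching a given threshold. The function names and the example scores are hypothetical and are not taken from any of the cited trials.

```python
def pasi_percent_improvement(baseline: float, current: float) -> float:
    """Percentage reduction in PASI score relative to baseline."""
    if baseline <= 0:
        raise ValueError("baseline PASI must be greater than 0")
    return 100.0 * (baseline - current) / baseline


def achieves_pasi(baseline: float, current: float, threshold: float = 90.0) -> bool:
    """True if the improvement from baseline meets the threshold (e.g. PASI 90)."""
    return pasi_percent_improvement(baseline, current) >= threshold


def response_rate(pairs, threshold: float = 90.0) -> float:
    """Proportion of (baseline, current) pairs that meet the threshold."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one patient is required")
    responders = sum(achieves_pasi(b, c, threshold) for b, c in pairs)
    return responders / len(pairs)


# Hypothetical scores for two patients: (baseline PASI, PASI at follow-up)
example = [(20.0, 2.0), (10.0, 5.0)]
print(response_rate(example, threshold=90.0))
```

The same functions cover PASI 50 and PASI 75 by changing the `threshold` argument.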

Despite the success of the PGA and PASI in the assessment of patients with plaque psoriasis, the use of these measures in GPP is inadequate because of the unique clinical symptoms of GPP. In addition, the induration component used in the PASI to assess the epidermal hyperplasia associated with plaque psoriasis is not as relevant to patients with GPP. Finally, the pain and systemic inflammation associated with GPP flares are not assessed by these measures.

3 GPP-Specific Disease Measures

To improve the applicability of the PGA and PASI in the assessment of GPP, certain modifications have been implemented based on input from international GPP experts. This led to the development of the Generalized Pustular Psoriasis Area and Severity Index (GPPASI) and the Generalized Pustular Psoriasis Physician Global Assessment (GPPGA), which replace the induration component with a pustular component. The development of these modified measures involved the review and discussion of hundreds of photographs of patients with GPP, which led to the creation of a pocket guide that includes patient photographs and descriptions.

These measures were first used in a proof-of-concept study of spesolimab in patients presenting with a GPP flare [13]. Training on the use of these measures was provided by the sponsor and the primary investigators to ensure consistency and accuracy in scoring GPP flares. These measures are also being used in the Effisayil™ 1 trial, a placebo-controlled randomized clinical trial of spesolimab in patients presenting with a GPP flare, the Effisayil™ 2 trial, a phase IIb, dose-finding study to evaluate the efficacy and safety of spesolimab for the prevention of GPP flares, and a phase III clinical trial of imsidolimab in patients with GPP based on a report by the sponsoring company [14,15,16].

For both the GPPGA and GPPASI, the five grades of severity for erythema, scaling, and pustulation correspond to 0 = clear, 1 = almost clear, 2 = mild, 3 = moderate, and 4 = severe. The GPPGA score is based on averaging the individual scores for erythema, scaling, and pustulation (Fig. 1); by contrast, for GPPASI and similar to PASI, the score for each body region is calculated (the product of the sum of severity scores and its corresponding BSA score for erythema, scaling, and pustulation, multiplied by a weighting factor for each body region) and then the total GPPASI score determined (the sum of the individual scores from all body regions; Fig. 2).

Fig. 1 Generalized Pustular Psoriasis Physician Global Assessment (GPPGA). The score is based on averaging the individual scores for erythema, scaling, and pustulation [13]. GPP generalized pustular psoriasis

Fig. 2 Generalized Pustular Psoriasis Area and Severity Index (GPPASI). The score for each body region is calculated (the product of the sum of severity scores and its corresponding body surface area score for erythema, scaling, and pustulation, multiplied by a weighting factor for each body region) and then the total GPPASI score is determined [13].
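The arithmetic described for the GPPGA and GPPASI can be sketched as follows. This is an illustrative sketch only: the regional weighting factors and the 0 to 6 area score are assumptions borrowed from the conventional PASI, because the text does not list the GPPASI values, and the rule for converting the GPPGA mean into a final grade (shown in Fig. 1) is not reproduced here. All function names are hypothetical.

```python
# Assumption: regional weights and the 0-6 area score follow the conventional
# PASI; the article text does not state the GPPASI values explicitly.
REGION_WEIGHTS = {"head": 0.1, "upper_limbs": 0.2, "trunk": 0.3, "lower_limbs": 0.4}


def _check(value, low, high, name):
    """Reject scores that fall outside the allowed range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def gppga_mean(erythema, pustules, scaling):
    """Mean of the three component grades (each 0 = clear to 4 = severe)."""
    for name, value in (("erythema", erythema), ("pustules", pustules), ("scaling", scaling)):
        _check(value, 0, 4, name)
    return (erythema + pustules + scaling) / 3


def gppasi_region(erythema, pustules, scaling, area_score, weight):
    """(Sum of severity grades) x (area score) x (regional weighting factor)."""
    for name, value in (("erythema", erythema), ("pustules", pustules), ("scaling", scaling)):
        _check(value, 0, 4, name)
    _check(area_score, 0, 6, "area_score")  # assumed PASI-style area score
    return (erythema + pustules + scaling) * area_score * weight


def gppasi_total(regions):
    """Sum of regional scores.

    `regions` maps a region name to (erythema, pustules, scaling, area_score).
    """
    return sum(
        gppasi_region(*scores, REGION_WEIGHTS[name])
        for name, scores in regions.items()
    )


# Hypothetical patient with involvement of the head and trunk only
patient = {"head": (2, 1, 1, 2), "trunk": (3, 4, 2, 4)}
print(gppga_mean(3, 4, 2))
print(gppasi_total(patient))
```

Under these assumed weights and area scores, the maximum total would be 72, matching the range of the conventional PASI.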

The Clinical Global Impression (CGI) scale was developed to provide a brief assessment of the clinician’s view of a patient’s global functioning before and after treatment initiation, and is the most commonly used measure in clinical trials for GPP in Japan [17,18,19]. It provides an overall clinician-determined summary that considers all available information, including knowledge of the patient’s medical history, symptoms, and behavior, and the impact of the patient’s symptoms on their ability to function [20]. The CGI scale can be adapted to identify the proportion of patients achieving treatment success at a specific timepoint [18]. In a clinical trial of guselkumab, which included ten patients with GPP, the primary endpoint was defined as the proportion of patients achieving the treatment goal of a CGI score of “very much improved”, “much improved”, or “minimally improved” at week 16 [18]. Different CGI scales have been used in other small GPP clinical trials [18, 19, 21].

To overcome the limitations of the CGI scale and the lack of systemic measures, another GPP-specific measure, the Japanese Dermatological Association Severity Index of GPP (JDA-GPPSI), was introduced in the GPP Medical Practice Guideline to assess GPP severity based on skin symptoms and systemic involvement (Table 1) [22]. Cutaneous symptoms are assessed using a total skin score that ranges from 0 (none) to 9 (severe) and includes three components: overall erythema area, erythema area with pustules, and edema area. The systemic symptoms of GPP are assessed using a systemic/laboratory score that ranges from 0 to 8 and has four components: pyrexia, white blood cell (WBC) count, C-reactive protein (CRP) levels, and serum albumin level. Each component is based on measurements of four anatomic regions that are summed to give the total BSA involvement (head [including neck, 10%], upper extremities [20%], trunk [including axillae and genitals, 30%], and lower extremities [including buttocks, 40%]) [22]. The total GPP score is the sum of the total skin score and the systemic/laboratory score and ranges from 0 to 17. The JDA-GPPSI classifies disease severity as mild (0–6), moderate (7–10), or severe (11–17) [22].

Table 1 GPP severity assessment based on the Japanese Dermatological Association recommendations
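The JDA-GPPSI total and severity category described above can be expressed as a short sketch. It takes the two sub-scores as inputs and uses only the ranges and cut-offs stated in the text; the per-component scoring rules in Table 1 are not reproduced, and the function names are hypothetical.

```python
def jda_gppsi_total(skin_score: int, systemic_score: int) -> int:
    """Total GPP score: total skin score (0-9) plus systemic/laboratory score (0-8)."""
    if not 0 <= skin_score <= 9:
        raise ValueError("total skin score must be between 0 and 9")
    if not 0 <= systemic_score <= 8:
        raise ValueError("systemic/laboratory score must be between 0 and 8")
    return skin_score + systemic_score


def jda_gppsi_category(total: int) -> str:
    """Severity category from the total GPP score (0-17)."""
    if not 0 <= total <= 17:
        raise ValueError("total GPP score must be between 0 and 17")
    if total <= 6:
        return "mild"
    if total <= 10:
        return "moderate"
    return "severe"


# Hypothetical patient: total skin score 5, systemic/laboratory score 3
total = jda_gppsi_total(5, 3)
print(total, jda_gppsi_category(total))
```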

The JDA-GPPSI was used to assess the efficacy of adalimumab in Japanese patients with GPP [22]. However, although systemic symptom measures are incorporated in the JDA-GPPSI, there is no global agreement on certain components, such as the cut-off values of CRP levels and the edema assessment, which is specific to Japan. In addition, although the JDA-GPPSI could be implemented in clinical practice, the assessment of BSA with edema may be challenging. The JDA-GPPSI has been used under the term “pustular symptom score” in a clinical trial of brodalumab that included 12 patients with GPP [19].

In summary, with the increasing requirement for accurate assessment of GPP severity in clinical practice and improved evaluation of new therapies, there is a need to develop clinical disease measures that are easy and simple to use in routine daily practice and standardized across clinical trials worldwide. In addition, future studies should focus on the clinical validation of the GPPGA, GPPASI, CGI, and JDA-GPPSI to adapt their application to clinical trial and clinical practice settings.

4 Other Disease Measures Used in GPP

Owing to the heterogeneous nature of GPP and variation in the disease symptoms, there is a need to quantitatively assess GPP severity. Severity assessment of skin symptoms is a component of various GPP clinical disease measures described earlier, such as the JDA-GPPSI, GPPGA, and GPPASI. The severity of skin symptoms associated with GPP can generally be categorized as mild, moderate, or severe based on the total score obtained from rating the symptoms, which usually include erythema, pustules, and edema. However, there is a lack of consistency in the implementation of a severity assessment of skin symptoms among different clinical measures, which complicates the ability to compare disease outcomes across GPP studies. The systemic inflammation associated with GPP can be assessed using specific clinical and laboratory measures, including fever, WBC count, serum CRP levels, erythrocyte sedimentation rate, and serum albumin levels [3, 23]. Complete blood count and results of a comprehensive metabolic panel are also important to assess hypocalcemia, which is associated with GPP flares. Furthermore, to identify potential complications, an assessment of electrolytes and renal and liver function should be conducted [23, 24]. Although these assessments are relevant to GPP symptoms, they are not GPP specific and do not necessarily correlate with GPP disease activity as they can be altered as a result of other pathologies and comorbid conditions.

The Autoinflammatory Diseases Activity Index (AIDAI), a validated tool originally developed to evaluate disease activity and treatment response in inherited autoinflammatory diseases that involve skin and systemic symptoms, was recently implemented in the assessment and monitoring of disease activity in a patient with deficiency of IL-36-receptor antagonist (DITRA) [25]. The AIDAI measures 12 items including fever ≥ 38.5 °C, overall symptoms, abdominal pain, nausea/vomiting, diarrhea, headaches, chest pain, painful nodes, arthralgia or myalgia, swelling of the joints, eye manifestations, and skin rash. In a 4-year-old Caucasian boy with GPP, the AIDAI was used to monitor disease activity; however, several limitations were identified [25]. The systemic symptoms associated with DITRA are usually acute and require immediate intervention, which makes it difficult to obtain a baseline assessment. In addition, several symptoms assessed by the AIDAI, such as chest pain and painful lymphadenopathy, are not relevant to GPP and AIDAI does not assess symptom severity [25].

Recently, Yamamoto et al. investigated the correlation of the level of serum cytokines with GPP disease severity in a sample of six patients with varying disease severity, and assessed their connection with GPP scores, WBC count, and CRP levels during the disease course [26]. The serum levels of IL-1β, IL-1 receptor antagonist, IL-6, IL-10, IL-12p70, IL-18, IL-22, interferon-γ, and vascular endothelial growth factor correlated with the clinical severity of GPP based on WBC count and CRP levels. In addition, IL-10 and IL-22 were identified as potential markers for response to GPP treatment. Large prospective clinical trials are needed to further validate these findings.

5 Patient-Reported Outcome Measures Used in GPP

GPP is a chronic disease that imposes a considerable burden on patients’ quality of life (QoL). Assessment of clinical disease measures alone is insufficient to evaluate GPP severity and there is a lack of studies on the correlation of GPP clinical measures with the impact of disease on QoL. Recent findings from a study by Choon et al., which included 95 patients with GPP, demonstrated that the impact of GPP on QoL is substantial [27]. In addition to using the Dermatology Life Quality Index, the assessment of other patient-reported outcome (PRO) measures is critical for the assessment of disease severity and adjustment of treatment strategies. The implementation of PROs in the evaluation of novel therapies could also provide further validation of meaningful clinical activity. Several PRO measures have been implemented in clinical trials that included patients with GPP to evaluate the efficacy of several biologics, including adalimumab (patients with GPP only) [22], brodalumab (patients with GPP and psoriatic erythroderma) [19], ixekizumab (patients with severe plaque psoriasis, erythrodermic psoriasis, or GPP) [10], and spesolimab (patients with GPP only) [13]. PRO measures to assess symptoms include the pain Visual Analog Scale (VAS), Psoriasis Symptom Scale, and Functional Assessment of Chronic Illness Therapy (FACIT)-Fatigue, while the Dermatology Life Quality Index and 36-Item Short Form Health Survey have been applied to assess the impact on QoL [10, 14, 21, 28]. However, the implementation and reporting of PROs across GPP studies are limited by the lack of standardized assessment tools [4]. In addition, the dynamic nature of GPP flares and the rapidly changing disease state pose additional challenges to capturing PROs that reflect disease severity.

6 Application of Clinical Disease Measures and PROs in GPP Clinical Trials and Clinical Practice

Because the disease measures discussed are complementary, several measures are usually used in conjunction, which adds to the complexity of clinical trial design and patient evaluation in clinical practice. In a clinical trial of ixekizumab (UNCOVER-J), which included patients with plaque psoriasis, erythrodermic psoriasis, and GPP, an sPGA score ≥ 3 and BSA involvement ≥ 10% were used as inclusion criteria for patients with plaque psoriasis, while BSA involvement ≥ 80% was used for patients with erythrodermic psoriasis [10]; the five patients with GPP were included based on meeting the criteria set by the Japanese Ministry of Health, Labour and Welfare [10].

In a clinical trial of adalimumab in Japanese patients with GPP, patient inclusion criteria included a total skin score ≥ 3 and erythema with pustules (skin score ≥ 1), but a total GPP score of < 14 in the JDA-GPPSI [22]. In this trial, complete response was defined as either remission (total skin score of 0) or improvement from baseline (reduction of ≥ 1 point from a baseline total skin score of 3 or ≥ 2 points from a baseline total skin score of ≥ 4). In addition, PASI 50, 75, and 90 response measures, a PGA score of clear or minimal, and improvement of ≥ 1 point in JDA-GPPSI from baseline were evaluated every 4 weeks for up to 52 weeks.

In certain clinical trials, such as the evaluation of brodalumab in patients with GPP and psoriatic erythroderma, no specific disease measures were used as inclusion criteria. Patients with GPP were included based on meeting the diagnostic criteria provided in the therapeutic guidelines for the treatment of GPP in Japan [19]. Recently, the GPPASI was implemented in the assessment of the activity of spesolimab in patients with GPP. The inclusion criteria used in that trial included presentation with a GPP flare with BSA involvement ≥ 10% with erythema and pustules and a GPPGA score of at least moderate severity.

These examples highlight the lack of consistency in implementing clinical disease measures in GPP clinical trials and the need for validation of, and international consensus on, the consistent use of these measures. An overview of the clinical disease measures used in GPP and their associated limitations is shown in Table 2.
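To make the rule-based nature of such criteria concrete, the inclusion and response definitions reported for the adalimumab trial [22] can be written as explicit checks. This is a sketch based only on the thresholds quoted in this section; the function and parameter names are hypothetical and the trial protocol may contain further conditions not described here.

```python
def meets_inclusion(total_skin_score: int, pustule_erythema_score: int,
                    total_gpp_score: int) -> bool:
    """Inclusion thresholds as quoted in the text for the adalimumab trial:
    total skin score >= 3, erythema-with-pustules score >= 1,
    and total JDA-GPPSI score < 14."""
    return (total_skin_score >= 3
            and pustule_erythema_score >= 1
            and total_gpp_score < 14)


def classify_response(baseline_skin: int, current_skin: int) -> str:
    """Classify response from the total skin score at baseline and follow-up."""
    if current_skin == 0:
        return "remission"
    reduction = baseline_skin - current_skin
    # A baseline of 3 requires a reduction of at least 1 point;
    # a baseline of 4 or more requires a reduction of at least 2 points.
    if baseline_skin == 3 and reduction >= 1:
        return "improvement"
    if baseline_skin >= 4 and reduction >= 2:
        return "improvement"
    return "no response"


# Hypothetical patient: baseline total skin score 5, follow-up score 3
print(meets_inclusion(5, 2, 9))
print(classify_response(5, 3))
```

Writing the criteria this way also shows how differently each trial defines eligibility and response, which is the inconsistency noted above.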

Table 2 Disease measures used in GPP and their associated limitations

7 Conclusions

The lack of GPP-specific clinical disease and QoL measures, as a result of the rarity and multisystemic nature of the disease, is a key hurdle for the optimal treatment and monitoring of patients with GPP and the assessment of new therapies in clinical trials. The current lack of GPP-specific and validated disease measures makes it challenging to conduct meta-analyses for comparison of GPP treatments. In addition, the incorporation of systemic measures into assessment scores would improve the quantitative evaluation of overall disease severity by accounting for the systemic inflammation associated with GPP. Currently, there are no defined standard severity inclusion criteria used consistently across GPP trials and minimal clinically important differences for most disease measures are not well defined within the context of GPP. Consistent implementation of the modified psoriasis disease measures that address specific aspects of GPP (GPPGA and GPPASI) will improve the assessment and interpretation of the disease. Further validation of the GPPGA, GPPASI, CGI, and JDA-GPPSI and characterization of standard minimal clinically important differences for these measures will provide the much-needed tools to improve the design of clinical trials. In addition, establishing the most appropriate timeline to assess the severity of GPP using defined tools that evaluate skin symptoms and systemic disease manifestations could guide therapeutic decisions. The standardization of the use of GPP-specific disease measures in clinical practice will allow physicians to accurately define disease severity and adjust treatment strategies accordingly. Following further validation, the adoption of the GPPGA in routine clinical practice may provide consistent data for individualized disease management.

In addition, in the clinical development of new therapies, consistent implementation of disease assessment tools will provide reliable data that can be compared among different patient populations and various therapeutics. The advances in our understanding of the pathogenic mechanisms of GPP and the development of targeted therapies highlight the importance of defining standardized, quantitative GPP-specific disease measures.