Key Summary Points

Why carry out this study?

Sjögren’s disease activity can be assessed from the physician perspective using European Alliance of Associations for Rheumatology (EULAR) Sjögren's Syndrome Disease Activity Index (ESSDAI), and patient-defined Sjögren’s symptom severity can be assessed using EULAR Sjögren's Syndrome Patient Reported Index (ESSPRI)

Although both instruments are commonly used as clinical trial endpoints and have been psychometrically validated, evidence supporting content validity and what constitutes a meaningful change is limited

This study investigated the appropriateness of ESSDAI and ESSPRI, as well as meaningful improvements on ESSDAI from the physician perspective and ESSPRI from the patient perspective

What was learned from the study?

Most physicians and patients considered ESSDAI and ESSPRI appropriate, with most physicians reporting a 3-point improvement in ESSDAI total score as meaningful, and most patients reporting a 1-to-2-point improvement in ESSPRI total score as meaningful

The findings support the use of ESSDAI and ESSPRI as Sjögren’s clinical trial endpoints, in clinical practice and in other research settings, and qualitative data exploring meaningful change support existing minimal clinically important improvement (MCII) thresholds

Introduction

Sjögren’s is a chronic autoimmune disease of unknown aetiology, characterised by lymphoid infiltration and progressive destruction of exocrine glands [1]. Approximately 30–40% of patients have potentially serious systemic organ involvement that can greatly affect morbidity and mortality [2,3,4,5]. Cardinal symptoms include dryness of the eyes, mouth, skin and female genitalia; however, the inflammatory process can target any organ [6, 7]. As such, Sjögren’s is heterogeneous and characterised by a combination of clinical features and subjective symptoms best assessed using clinical tests and patient reports, respectively [8, 9]. The European Alliance of Associations for Rheumatology (EULAR) Sjögren’s Syndrome Disease Activity Index (ESSDAI) [10] and the EULAR Sjögren’s Syndrome Patient Reported Index (ESSPRI) [11] were developed for this purpose and are increasingly being used within clinical practice [11,12,13] and as clinical trial endpoints to evaluate treatment benefit [9, 14,15,16,17,18].

ESSDAI is a 12-domain clinician-reported outcome (ClinRO) instrument assessing systemic disease activity, developed with the involvement of physicians who specialise in Sjögren’s [10]. For each domain, clinicians rate patients’ disease activity using pre-defined descriptions. ESSDAI domains are weighted to reflect their contribution to total disease activity. Multiple regression modelling was used to determine domain weights (ranging from one to six), according to the strength of each domain’s relationship with a Physician Global Assessment (PhGA) of disease activity item [10]. Domain scores are obtained by multiplying the level of activity by the domain weight, and the 12 domain scores are combined to obtain the ESSDAI total score, ranging from 0 to 123. Low, moderate and high disease activity is defined by scores of < 5, 5–13 and ≥ 14, respectively. A minimal clinically important improvement (MCII) of ≥ 3 points has also been defined using anchor-based analysis [16, 19].
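The scoring rule described above can be illustrated with a minimal Python sketch. It is not an official implementation. Only the glandular (2), articular (2), biological (1) and muscular (6) weights are quoted in this paper; the remaining weights and the maximum activity level per domain are assumptions recalled from the published index and should be checked against the original ESSDAI publication [10] before any real use. The function and variable names are hypothetical.

```python
# Illustrative ESSDAI scoring sketch (not an official implementation).
# Each entry: domain -> (weight, maximum activity level).
# ASSUMPTION: apart from the glandular, articular, biological and muscular
# weights, these values are not stated in this paper and must be verified
# against the original ESSDAI publication.
ESSDAI_DOMAINS = {
    "constitutional": (3, 2),
    "lymphadenopathy_lymphoma": (4, 3),
    "glandular": (2, 2),
    "articular": (2, 3),
    "cutaneous": (3, 3),
    "pulmonary": (5, 3),
    "renal": (5, 3),
    "muscular": (6, 3),
    "peripheral_nervous_system": (5, 3),
    "central_nervous_system": (5, 3),
    "haematological": (2, 3),
    "biological": (1, 2),
}


def essdai_total(activity_levels):
    """Sum of (activity level x domain weight); unlisted domains count as 0."""
    total = 0
    for domain, level in activity_levels.items():
        weight, max_level = ESSDAI_DOMAINS[domain]
        if not 0 <= level <= max_level:
            raise ValueError(f"{domain}: level {level} outside 0-{max_level}")
        total += level * weight
    return total


def essdai_category(total):
    """Low (< 5), moderate (5-13) or high (>= 14) disease activity."""
    if total < 5:
        return "low"
    if total < 14:
        return "moderate"
    return "high"


def essdai_improved(baseline_total, follow_up_total, mcii=3):
    """True when the decrease in total score reaches the MCII (>= 3 points)."""
    return baseline_total - follow_up_total >= mcii


# Hypothetical patient: moderate glandular and articular activity,
# low biological activity at baseline.
baseline = essdai_total({"glandular": 2, "articular": 2, "biological": 1})
follow_up = essdai_total({"glandular": 1, "articular": 1, "biological": 1})
print(baseline, essdai_category(baseline))    # 9 moderate
print(essdai_improved(baseline, follow_up))   # True (9 - 5 = 4 points)
```

With the assumed weights and maximum levels, the highest attainable total is 123, which agrees with the score range given above.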

ESSPRI is a 3-item patient-reported outcome (PRO) instrument assessing the severity of dryness, fatigue and joint/muscle pain over the past 2 weeks [11]. Items were selected on the basis of qualitative interview data regarding the importance of Sjögren’s symptoms obtained during development of the Sicca Symptoms Inventory [20] and Profile of Fatigue and Discomfort (PROFAD) [12] and were confirmed using multiple regression modelling. Items are rated on a 0–10 numerical rating scale (NRS), and a mean total score is calculated where higher scores indicate worse symptom severity. Quantitative research suggests that patients consider their symptom severity acceptable when their ESSPRI total score is ≤ 5 (Patient Acceptable Symptom Score; PASS), with an MCII of ≥ 1 point or 15% score reduction [19].
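As a rough illustration of the ESSPRI calculation described above, the sketch below computes the mean of the three 0–10 item scores and checks a change against the MCII. It assumes the MCII is met when the total score falls by at least 1 point or by at least 15% of the baseline score, which is one reading of the threshold reported in [19]. The function names are hypothetical and this is not a validated scoring tool.

```python
# Illustrative ESSPRI scoring sketch (not a validated scoring tool).

def esspri_total(dryness, fatigue, pain):
    """Mean of the three item scores, each rated on a 0-10 NRS."""
    items = (dryness, fatigue, pain)
    for value in items:
        if not 0 <= value <= 10:
            raise ValueError(f"item score {value} outside 0-10")
    return sum(items) / 3


def esspri_improved(baseline, follow_up, min_points=1.0, min_fraction=0.15):
    """True when the reduction reaches the MCII.

    ASSUMPTION: the MCII is treated as met if the score falls by at least
    `min_points` OR by at least `min_fraction` of the baseline score.
    """
    reduction = baseline - follow_up
    if reduction >= min_points:
        return True
    return baseline > 0 and reduction / baseline >= min_fraction


# Hypothetical patient: dryness 7, fatigue 6, pain 5 at baseline.
before = esspri_total(7, 6, 5)
after = esspri_total(5, 5, 5)
print(before, after, esspri_improved(before, after))
```

Because the relative criterion depends on the baseline, a 0.5-point reduction would count as an improvement from a baseline of 2 but not from a baseline of 6 under this reading.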

Quantitative evidence to support the reliability and validity of both ESSDAI [4, 19, 21,22,23] and ESSPRI [14, 19, 21, 24] is well documented. However, there is limited qualitative evidence to support content validity of these instruments, specifically the appropriateness of ESSDAI’s domain weights and scoring approach, and ESSPRI’s 2-week recall period. Regulators such as the US Food and Drug Administration (FDA) recognise the critical role that patients and physicians play in developing Clinical Outcome Assessment (COA) instruments for use as clinical trial endpoints, and increasingly seek qualitative evidence to support their content validity [25,26,27,28,29,30]. Furthermore, there is a lack of qualitative evidence in Sjögren’s to support and contextualise meaningful change thresholds generated using anchor-based analysis, as recommended by the FDA [28, 29].

PhGA of disease activity and Patient Global Assessment (PaGA) of symptom severity items using a 0–10 NRS are commonly used to support anchor-based analyses relating to ESSDAI/ESSPRI [14, 31,32,33]. Anchor-based analyses are central to assessing meaningful within-patient change on COA instruments by exploring the relationship between the concept of interest and an external anchor. Existing PhGA/PaGA items using a 0–10 NRS do not meet FDA criteria for anchors, so new PhGA/PaGA items were developed and evaluated in this study [27, 28, 34, 35].

Methods

Sample and Recruitment

Two cross-sectional, non-interventional, qualitative interview studies involving patients with Sjögren’s and physicians who specialise in Sjögren’s were conducted. Patients from diverse locations in the USA (Baltimore, Maryland; Chicago, Illinois; Los Angeles, California; Pittsburgh, Pennsylvania; and St Louis, Missouri) were recruited via MedQuest Global Market Research using physician referrals. Sampling quotas relating to key demographic and clinical characteristics were employed to ensure insights were obtained from a diverse sample of patients who were representative of the wider Sjögren’s population, in line with best practice guidelines for collecting comprehensive and representative input (Table 2) [30].

Physicians who specialise in Sjögren’s from the USA, the UK and Germany were approached by the sponsor to participate in an interview. As most physicians involved in the original development of ESSDAI and ESSPRI were European, meaning their perspectives were well incorporated [10, 11, 19], a sampling quota was employed to ensure greater representation of US physicians in this study to provide additional clinical perspectives from this region. Similarly, ESSPRI items were developed on the basis of qualitative interview data obtained from patients from the UK during development of the Sicca Symptoms Inventory and PROFAD. Therefore, the inclusion of US patients in this study provides an additional patient perspective and supplements the previous research. Patients and physicians were required to meet pre-defined eligibility criteria (Table 1).

Table 1 Patient and physician eligibility criteria for qualitative interviews

Although no minimum sample size is required for interview studies, it is acknowledged that Sjögren’s can be highly heterogeneous, meaning a relatively large number of patients would be required to fully explore the disease experience. However, the focus of this study was to debrief ESSDAI and ESSPRI using cognitive interview methods, and research suggests that seven to ten participants are sufficient to comprehensively assess the content validity of COA instruments [36]. Further, as both ESSDAI and ESSPRI are well established, and quantitative evidence to support the reliability and validity of both instruments is well documented, the aim of this study was to generate additional, supportive qualitative evidence to fill specific evidence gaps. Therefore, the patient (n = 12) and physician (n = 10) samples were considered adequate.

Qualitative Interviews

Interviews were conducted by trained Adelphi Values interviewers via telephone or Microsoft Teams video call, between March and June 2021. Semi-structured interview guides were used to facilitate patient (approximately 90 min) and physician (approximately 60 min) interviews. Interviews were designed and conducted in line with best practice guidance for qualitative research, such as ensuring questions were framed in an unbiased manner by using open-ended and non-leading questions [28].

For ESSDAI, the purpose was to obtain physician feedback on (i) the appropriateness of the 12 organ-specific domain weights (constitutional, lymphadenopathy & lymphoma (L&L), glandular, articular, cutaneous, pulmonary, renal, muscular, peripheral nervous system (PNS), central nervous system (CNS), haematological and biological), (ii) the level of change that would be clinically meaningful at the individual domain (e.g. a one-category improvement from moderate to low) and total score level (e.g. what decrease would suggest a clinically meaningful improvement) and (iii) the best approach to calculating a total score.

For ESSPRI, the objectives were to (i) explore the appropriateness and feasibility of the 2-week recall period from the patient and physician perspective, and (ii) explore patients’ perspectives on item relevance and the level of change they would consider meaningful at the total score level.

The existing (0–10 NRS) and newly developed (4-point Likert scale) versions of the PhGA and PaGA items were also debriefed to explore physician and patient interpretation (Fig. 1). The FDA Patient Focused Drug Development (PFDD) guidance series (including example items) [28, 29] was used to design the Likert versions of the PhGA and PaGA. Specifically, the items were developed to match the concepts assessed by ESSDAI (overall disease activity) and ESSPRI (severity of symptoms), with distinct, non-overlapping response categories, and recall periods equivalent to ESSDAI (today) and ESSPRI (past 2 weeks) [27, 28, 34, 35].

Fig. 1 Versions of the PhGA and PaGA items tested. *Please note that two additional versions of the Likert scale PaGA item using recall periods of ‘today’ and ‘the past week’ were also tested but are not discussed in this paper as they do not relate to ESSPRI. PhGA Physician’s Global Assessment, PaGA Patient’s Global Assessment, NRS numerical rating scale

Qualitative Analysis

All interviews were audio-recorded and transcribed verbatim. Qualitative analysis was conducted using ATLAS.ti software [37] and framework and thematic analysis methods [36, 38, 39]. An induction–abduction approach was taken to identify themes emerging directly from the data (inductive inference), and by applying prior knowledge (abductive inference). Separate coding schemes were derived for each set of interviews and used throughout the analysis process to ensure consistent application and grouping of codes by trained and experienced researchers.

Ethical Approval

Ethical approval and oversight were provided by Salus Independent Review Board (IRB), a centralised IRB in the USA (physician protocol ID: NO9051A; patient protocol ID: NO9052A). All participants provided oral and written informed consent prior to the conduct of any research activities. Both studies were designed and conducted in accordance with best practice guidelines [40, 41] and the ethical principles laid down in the Declaration of Helsinki and its later amendments [42].

Results

Patient Characteristics

A total of 12 patients were interviewed. All demographic and clinical sampling quotas implemented to promote heterogeneity and representation of the Sjögren’s population (Table 2) were met or exceeded except for the education level category ‘completed high school or below only’ (≥ 3 target; 2 actual). Patients were mostly female (n = 8/12; 67%) with a mean age of 56.1 years (range 20–80 years). Although Sjögren’s is more common in white individuals (hence the racial quota relating to white and non-white) [43], white and non-white individuals were equally represented (n = 6/12; 50%, each) (Table 2). Most patients (n = 10/12; 83%) had received a Sjögren’s diagnosis within the last 10 years and were predominately classified as having moderate (n = 5/12; 42%) or high (n = 4/12; 33%) disease activity on the basis of PhGA score. At screening, most patients (n = 9/12; 75%) had an unsatisfactory symptom state (≥ 5 ESSPRI score) and experiences of eye dryness (n = 12/12; 100%), tiredness/fatigue (n = 11/12; 92%) and mouth dryness (n = 8/12; 67%), among other symptoms.

Table 2 Demographic and clinical characteristics as reported by patients at screening (N = 12)

Physician Characteristics

A total of ten physicians, all rheumatologists, were interviewed and met all sampling quotas. Half of the physicians were female (n = 5/10; 50%), and most were from the USA (n = 8/10; 80%). All physicians had been qualified for at least 10 years, had been treating patients for at least 5 years and were treating patients with Sjögren’s on a weekly (n = 9/10; 90%) or monthly (n = 1/10; 10%) basis at the time of interview. On average, physicians treated at least 20 patients with Sjögren’s per month and worked in a range of settings, including academia (n = 8/10; 80%), private practice (n = 3/10; 30%) and/or hospital-based care (n = 2/10; 20%). Physicians reported using ESSDAI (n = 7/10; 70%) and ESSPRI (n = 4/9; 44%) to assess Sjögren’s disease activity in their clinical practice.

ESSDAI

Appropriateness of Domain Weights

Most domain weights were considered clinically appropriate by most (≥ 50%) physicians (Fig. 2). However, the glandular (domain weight of 2), articular (2) and biological (1) domains were considered slightly underweighted and the muscular (6) domain was considered slightly overweighted by ≥ 50% of physicians. Physicians who considered the glandular (n = 5/10; 50%), articular (n = 6/10; 60%) and/or biological (n = 6/10; 60%) domains slightly underweighted commented on the domains representing an important/prevalent aspect of disease activity, being significantly impactful to patients’ feelings and functioning, and/or being relevant to assess in the context of a clinical trial. However, physicians suggested that these domain weights were only ‘a little low’, and recommendations for increased weights tended to only be 1–2 points higher. Physicians who suggested that the muscular domain was slightly overweighted (n = 7/10; 70%) stated that muscular involvement due to Sjögren’s is rare and/or not as severe as other organ involvement. Additionally, physicians reported that it can be difficult to separate muscular involvement due to Sjögren’s from other comorbid conditions and weakness due to steroids. Again, the weight was only considered ‘a little high’, and despite these suggestions of increasing/decreasing domain weights, no physicians suggested that ESSDAI was inappropriate for use in its current format.

Fig. 2 Appropriateness of ESSDAI domain weights

Meaningful Improvement

All physicians asked reported that a one-category improvement (e.g. moderate to low activity level) would be clinically meaningful for the constitutional (n = 10/10; 100%), pulmonary (n = 10/10; 100%), renal (n = 8/8; 100%) and biological domains (n = 8/8; 100%). For the remaining domains, a minority of physicians reported that a two-category improvement (e.g. moderate to no activity level) would be clinically meaningful (≤ 30%). Reasons for a two-category improvement included the domain typically being responsive to treatment (glandular, articular, cutaneous, muscular), there being natural fluctuation in the domain (glandular, articular), variation/subjectivity in how the disease activity levels are interpreted by physicians (cutaneous, PNS), and requiring a change substantial enough to make the patient feel better (CNS, haematological). For seven domains, all or most physicians (≥ 80%) reported that patients would also consider the same level of improvement meaningful (one or two categories). For the L&L, muscular and haematological domains, a few physicians (≤ 30%) reported that the change may not be meaningful to patients, and some physicians (≤ 40%) reported that patients were unlikely to notice changes in the renal and biological domains, as these changes would be more evident in objective tests of disease activity (e.g. complement and immunoglobulin G (IgG) levels) as opposed to patients’ feeling and functioning.

The existing MCII threshold of 3 points on the ESSDAI total score [16, 19] was most frequently reported as clinically meaningful by physicians (n = 5/10; 50%). However, notable caveats included meaningfulness being dependent upon the domains that have changed (n = 1/2; 50%), and patients’ baseline ESSDAI score (n = 1/2; 50%).

Best Approach to Calculating an ESSDAI Score

Owing to interview time constraints, only four physicians (n = 4/10; 40%) discussed the best approaches to calculating and tracking ESSDAI total scores. Four approaches were recommended: the validated approach of summing weighted domain scores to calculate a total score (n = 2/4; 50%) [10, 19, 21], tracking individual domain scores (n = 2/4; 50%), calculating separate total scores for domains in which disease activity is reversible versus irreversible (n = 1/4; 25%), and generating a total score without domain weights (n = 1/4; 25%). The two physicians (n = 2/4; 50%) who recommended tracking individual domain scores did so to avoid changes within domains being diluted within a total score (n = 1/2; 50%) and to separately track domains in which disease activity is reversible or irreversible (n = 1/2; 50%). Of note, the latter reflects a misconception: ESSDAI domains should not be scored when damage is present. The physician who suggested calculating an ESSDAI score without domain weights only did so to reduce the complexity of the calculation (n = 1/4; 25%).

ESSPRI

Relevance and Interpretation of ESSPRI Items

ESSPRI items were considered relevant to most patients within the 2-week recall period: dryness (n = 12/12; 100%), fatigue (n = 10/12; 83%) and pain (n = 11/12; 92%). However, there was variation in interpretation of the dryness item, with most patients considering more than one area [eye dryness (n = 9/11; 82%), mouth dryness (n = 4/11; 36%) and skin dryness (n = 3/11; 27%)] and tending to focus on the most severe areas of dryness.

Recall Period

The 2-week recall period was considered appropriate to assess average symptom severity by most physicians (n = 7/10; 70%). Reasons for this included the recall period accounting for day-to-day variability of symptoms (n = 2/7; 29%) and patients with Sjögren’s being ‘focused’ on their symptoms allowing for reliable self-reports (n = 2/7; 29%). However, two physicians (n = 2/10; 20%) suggested that patients may think only about their worst symptoms, and one physician (n = 1/10; 10%) felt the recall period is too long relative to shorter recall periods used in other rheumatologic diseases and may diminish treatment efficacy. A subset of physicians (n = 5/6; 83%) considered the 2-week recall period to be clinically meaningful. Only one physician provided a rationale, explaining that, as Sjögren’s symptoms fluctuate day to day, having a general trend is more clinically meaningful when considering treatment choices.

Almost all patients (n = 11/12; 92%) reported that it would be easy to remember symptom severity over the past 2 weeks. Most patients demonstrated an understanding of the 2-week recall period (≥ 92%) and considered the recall period appropriate (≥ 83%) across individual items.

Meaningful Change

Most patients reported that a 2-point (n = 5/12; 42%) or 1-point (n = 3/12; 25%) (range 1–6) improvement in their total ESSPRI score would be meaningful, and that a 2-point improvement (n = 5/11; 45%) would justify using a new treatment (range 1–4). Patients reported that a 1-to-2-point improvement would reduce symptom severity (n = 1/5; 20%), frequency (n = 1/5; 20%) and bothersomeness (n = 1/5; 20%), reduce impact to activities of daily living (n = 1/5; 20%) and improve emotional wellbeing (n = 1/5; 20%). Similarly, most patients felt a 1-point (n = 4/12; 33%) or 2-point (n = 6/12; 50%) worsening in their total ESSPRI score would be meaningful. Table 3 provides patient quotes reflecting how ESSPRI total score improvements would affect how they feel and function.

Table 3 Impact of improvement in ESSPRI total score on how patients feel/function

PhGA

Both PhGA items were well understood by physicians, with the majority (≥ 80%) considering both systemic and glandular disease activity when responding to both items. However, the Likert version was interpreted more consistently as referring specifically to disease activity than the 0–10 NRS version, for which physicians reported that they would consider a range of concepts in addition to disease activity, including symptoms (n = 9/10; 90%), impacts (n = 4/10; 40%) and global health status (n = 3/10; 30%). Physicians frequently reported that they would require objective laboratory test results (e.g. blood work, inflammatory markers or complement antibodies levels; ≥ 71%), conduct physical examinations (≥ 56%) and/or collect patient-reported information (≥ 43%) before answering each version.

PaGA

Although both PaGA items were well understood, the Likert version was interpreted more consistently as an assessment of overall symptom severity with most patients considering all their symptoms when responding (n = 11/12; 92%). The 0–10 NRS version varied in interpretation, with fewer patients considering all their symptoms (n = 5/12; 42%) and some considering the effects/impacts of Sjögren’s (n = 2/12; 17%) and/or specific symptoms only [eye dryness (n = 4/12; 33%), mouth dryness (n = 1/12; 8%) and pain (n = 1/12; 8%)]. Patient and physician preferences for different response scales, including the 0–10 NRS and Likert scale used in the PaGA/PhGA items, were also explored during interviews, and the findings are published elsewhere [44].

Discussion

Although quantitative evidence to support the reliability and validity of both ESSDAI [4, 19, 21,22,23] and ESSPRI [14, 19, 21, 24] is well documented, to our knowledge, this paper is the first to present qualitative data supporting the content validity and exploring meaningful change on ESSDAI and ESSPRI, and the content validity of corresponding PhGA/PaGA anchors. In doing so, the data address a key evidence gap and support use of ESSDAI and ESSPRI in their current formats as clinical trial endpoints, as well as their use in routine clinical practice and other research settings.

The data support the majority of ESSDAI domain weights in their current format, aligning with and providing supplementary evidence to the opinions of 44 experts from 15 countries who confirmed during development of ESSDAI that the domain weights generated using regression modelling were appropriate from a clinical perspective [10]. Where physicians suggested a small number of domains were slightly underweighted/overweighted, their reasoning did not always consider the aim of ESSDAI domain weights. For example, some physicians considered the articular, biological and glandular domains to be slightly underweighted, suggesting they represent prevalent aspects of Sjögren’s that are impactful to patients’ feelings and functioning. While it is interesting to note that these domains may have a more direct impact on patient-reported symptoms and impacts, ESSDAI is ultimately a clinical assessment of overall disease activity. As such, during ESSDAI development EULAR agreed that domain weights should reflect the type and severity of involvement, irrespective of their frequency, particularly as potentially lethal organ manifestations are likely to have a greater impact on morbidity and prognosis [10]. Further, physicians stated that domain weights were only ‘a little’ low, and none reported that ESSDAI was inappropriate in its current format. In line with the FDA’s PFDD guidance and the greater emphasis on the use of qualitative data to support and contextualise meaningful change thresholds [28, 29], this study also generated qualitative insights that support and complement the existing MCII threshold of ≥ 3 points on ESSDAI, generated using anchor-based analysis [19]. The findings therefore support the use of this responder definition in clinical trial endpoints.

Regarding ESSPRI, the 2-week recall period was considered both appropriate and feasible by patients, and clinically relevant by expert physicians. Although variability in interpretation of the dryness item was observed, multiple regression modelling during ESSPRI development found that the individual PROFAD dryness items were highly correlated, suggesting conceptual redundancy. Therefore, EULAR deemed it appropriate to use a generic dryness item [11]. Again, in line with the FDA’s PFDD guidance and the greater emphasis on the use of qualitative data to support and contextualise meaningful change thresholds [28, 29], this study found that most patients considered a 1-to-2-point improvement meaningful. Importantly, these qualitative insights support the recently published MCII threshold of ≥ 1.5 points generated using anchor- and distribution-based analyses [19, 45]. Although patients frequently reported a slightly greater improvement of 2 points as meaningful, a 1-point improvement was the next most frequent response, and qualitative meaningful change data should only be used to support and contextualise responder definitions generated using psychometric analyses [29]. Further, research suggests that satisfaction with total score improvements in Sjögren’s can depend on which symptoms improve, owing in part to the heterogeneity of the condition [46]. This was not assessed in relation to ESSPRI as part of the present study but would be interesting to explore in future research.

The data also demonstrate that the newly developed Likert versions of the PhGA/PaGA items were well understood and consistently interpreted by physicians and patients. In line with regulatory guidance, this supports the content validity of these items and their use as anchors for total disease activity (ESSDAI) and symptom severity (ESSPRI) when administration matches trial endpoints [27,28,29]. The findings also support use of these items in routine clinical practice and other research settings to assess disease activity and symptom severity from the physician and patient perspective, respectively. Notably, physicians frequently reported that they would rely on physical examinations, patient-reported information and objective laboratory tests to complete the PhGA. These findings provide useful insights for clinical trial designs, suggesting there would be value in physicians completing the PhGA following ESSDAI to ensure physicians have access to all relevant information and test results to inform responses and decision-making.

Limitations

It could be argued that male and non-white individuals were overrepresented in the sample as Sjögren’s has a strong female and white predominance [4, 43, 47]. However, the aim was to sufficiently represent white females while promoting heterogeneity in the sample so that understanding/relevance of items could be assessed across patients with varying characteristics. It could also be suggested that the sample sizes were relatively small, therefore limiting the generalisability of the findings, particularly given the fluctuation and heterogeneity of Sjögren’s symptoms, meaning a relatively large number of patients would be required to fully explore the disease experience. However, the aim of this study was to debrief ESSDAI and ESSPRI using cognitive interview methods, and research suggests that seven to ten participants are adequate to comprehensively assess the content validity of COA instruments [36]. Further, given that a number of experts were involved in the development and validation of ESSDAI, and that ESSPRI development was driven by patient-reported data, the samples were deemed appropriate to supplement this evidence and gain additional insights on specific areas of interest in Sjögren’s.

Physicians were recruited by the study sponsor via convenience sampling, which may have introduced sponsorship bias that would not have existed had an external sample of physicians been recruited. However, interviews were conducted by a research agency, and physicians were reminded of their right to confidentiality and anonymity and were encouraged to share their honest opinions. Further, the sample was diverse in terms of other sociodemographic characteristics and clinical experience, providing variability in experiences/perspectives. The physician sample was also composed of mostly US physicians, which provided clinical perspectives from a different region, given that European physicians’ perspectives were well incorporated during the development of ESSDAI and ESSPRI. Similarly, qualitative interview data obtained from UK patients during the development of the Sicca Symptoms Inventory and PROFAD were used to develop the ESSPRI items. Therefore, the inclusion of US patients in this study provides an additional patient perspective and supplements the previous research. Nevertheless, future research could be conducted using translated versions of ESSDAI and ESSPRI to confirm their content validity in physicians and patients with Sjögren’s from different countries.

Finally, owing to time constraints, it was not possible to obtain feedback from all participants on all aspects of ESSDAI and ESSPRI. However, both instruments are well established [9, 14, 16], and quantitative evidence to support the reliability and validity of both ESSDAI [4, 19, 21,22,23] and ESSPRI [14, 19, 21, 24] is well documented. As such, the focus of this study was to generate additional qualitative evidence relating to specific evidence gaps. Future research should explore topics of interest not included in the present study. For instance, although patients and physicians considered ESSPRI’s recall period appropriate and feasible, a separate validation study would be beneficial to test the reliability of patients’ responses.

Conclusion

The present study supports and complements existing evidence regarding the use of ESSDAI and ESSPRI in their current format as instruments that are complementary to each other; the majority considered the current ESSDAI domain weights appropriate for assessment of Sjögren’s disease activity, and the ESSPRI 2-week recall period is appropriate and feasible. These data therefore ultimately support use of the instruments in the context of clinical trial endpoints, routine clinical practice and other research settings. The data also support the content validity of the new PhGA and PaGA items (with Likert response scales) as anchors for ESSDAI and ESSPRI, as well as providing recommendations for their administration within Sjögren’s clinical trials. Patient- and physician-reported thresholds for meaningful improvements on the COA instruments are valuable in interpreting statistically derived responder definitions for clinical trial endpoints, and perspectives on ESSDAI scoring will be valuable in supporting psychometric exploration of scoring approaches for this instrument. Taken together, the qualitative findings address key evidence gaps and provide valuable insights informing use of ESSDAI and ESSPRI as clinical trial endpoints, use of PhGA and PaGA items in anchor-based analyses, and the interpretation of responder definitions.