Key Summary Points

Patient-centered outcome measurement (PCOM) in rare disease presents unique challenges, with limited sampling and recruitment methods.

Using pragmatic solutions, we assembled a bespoke patient-reported outcome tool based on patient and caregiver interviews in NUT (nuclear protein in testis) carcinoma, a rare cancer with various manifestations and tumor locations primarily in the head, neck, and lungs.

Collection of enough quantitative data to document initial item performance was an additional challenge in this hard-to-reach population.

Using mixed-methods research, we developed a content valid patient-reported outcome (PRO) to measure disease symptoms and impacts in patients with NUT carcinoma with preliminary item performance results.

Further psychometric validation evidence is needed to confirm the bespoke PRO constructed is fit for purpose and will bring solutions to the research and clinical community attempting to assess treatments for NUT carcinoma patients.

Introduction

NUT (nuclear protein in testis) carcinoma is a rare and devastating disease hallmarked by the chromosomal rearrangement of the NUT gene. The carcinoma is characterized by the growth of epithelial malignant neoplasms and is typically found in the midline supradiaphragmatic structures (head, neck, and lungs), although tumors can be found anywhere in the body. Despite increased awareness of NUT carcinoma (NUTca) and access to diagnostic testing over the past decade, patients are often undiagnosed and/or misdiagnosed, leaving its true incidence and prevalence unknown [1,2,3]. Previous publications have estimated that the median age at diagnosis ranges from 16 [2] to 21.9 years [3]; however, NUTca has been identified in patients from less than 1 year to 81.7 years of age [2,3,4].

NUTca presents with rapidly enlarging tumors that, at advanced stages, metastasize to locoregional lymph nodes or distal sites. Therefore, patients most often exhibit mass-related symptoms, while non-specific symptoms, such as fever and weight loss, are rarely seen [1]. In a sample of 54 patients, the median overall survival was 6.7 months, with only 19% (CI 7–31%) of patients alive after 2 years, indicating that this carcinoma is aggressive and devastating without treatment [2].

Treatments to combat NUTca include surgery, radiation therapy, and/or chemotherapy; however, there are no publications that report evidence to support the effectiveness of these treatments or report the impact and severity the treatments have on the patient’s quality of life. While the current treatment modalities intend to increase survival, no patient-reported outcome (PRO) instrument exists to measure the symptomatic experience of patients. PROs are essential to understand the patient experience within clinical trials and address the high unmet need to improve care for this vulnerable patient population.

Within this context and using pragmatic solutions given rare disease challenges, we developed a bespoke PRO tool based on patient input and qualitative interviews in NUTca patients and their caregivers [5]. The tool is based on the EORTC QLQ-C30 (European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire C30) with additional items selected from the EORTC Item Library and newly created items based on patient experience data. Given recruitment challenges, psychometric analyses of this tool were limited, but by employing mixed-methods analyses, we were able to create a content valid PRO tool with initial item performance results. Given the extreme unmet need in this population, we believe this tool will bring solutions to the research and clinical community attempting to assess treatments for NUT carcinoma patients.

Methods

Interviews Conducted to Generate Conceptual Framework

Twenty-seven elicitation interviews (~ 60 min) were conducted via telephone by experienced qualitative researchers who received training in Good Clinical Practice (GCP) guidelines. The researchers had at least 2 years of experience conducting qualitative interviews, and all attended a mock interview to familiarize themselves with the interview guides. As previously published, both patients and caregivers were asked about the diagnosis process, symptoms experienced, and specific impacts; the patient interview guide also included probes for their specific experience of the disease [5]. These interviews and their analysis resulted in a conceptual framework describing the patient experience of NUTca. The conceptualization was based on these 27 elicitation interviews; from it, we developed a questionnaire that was then tested in 20 debriefing interviews. Across the 27 elicitation interviews and 20 debriefing interviews, there were 33 unique participants (14 participants completed both an elicitation interview and a debriefing interview).

Participants

Our overall sample comprised 33 participants (13 with NUTca and 20 caregivers).

Patients were included if they were ≥ 12 years of age, had received a diagnosis of NUTca, were willing to participate in the interviews and survey assessment, lived in the United States, and could read, speak, write, and understand English. Patients were excluded if they had any visual, auditory, cognitive, or linguistic impairment that would prevent them from understanding and answering the interviewer’s questions or if they were being treated for another cancer. Caregivers were ≥ 18 years old and cared for or had cared for someone with a diagnosis of NUTca; all other inclusion/exclusion criteria were the same. Potential participants were asked to complete an online screener; those who were eligible signed an informed consent form and were then scheduled for an interview. After completion of the interview, participants received $85 for their time.

Compliance with Ethics Guidelines

Study documents were approved by a central institutional review board (Copernicus IRB tracking 20200591), and informed consent was obtained for all participants. Consent for publication was obtained. All guidelines of the Declaration of Helsinki of 1964 and later amendments (relevant to this non-interventional, qualitative study) were followed.

PRO and Item Selection, Item Generation

Based on the previously published conceptual framework [5], we developed a PRO tool for testing. We started with the EORTC QLQ-C30 as a basis for the bespoke tool, given the ability to append items to a core measure and select pertinent items from the EORTC Item Library to increase conceptual coverage [6]. To do this, we compared the conceptual framework to the items on the QLQ-C30 to assess its conceptual coverage. Where gaps were identified, we selected items from the EORTC Item Library. If no item existed, we developed new items in similar style to EORTC items.
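The selection logic described above can be sketched as a three-tier lookup. This is an illustrative sketch only, not the tooling used in the study: the function name `source_for_concept` and the tiny example sets are assumptions, filled with a few item labels that appear elsewhere in this article.

```python
def source_for_concept(concept, core_items, library_items):
    """Return where an item covering `concept` would be taken from.

    Order of preference follows the text: the core QLQ-C30 first,
    then the EORTC Item Library, and a newly written item last.
    """
    if concept in core_items:
        return "QLQ-C30"
    if concept in library_items:
        return "EORTC Item Library"
    return "new item"


# Hypothetical, deliberately small inputs (not the full framework).
core_items = {"pain"}
library_items = {"headaches"}
framework = ["pain", "headaches", "pressure around your sinuses"]

# Item-tracking matrix: concept -> source of the item that covers it.
tracking = {c: source_for_concept(c, core_items, library_items) for c in framework}
```

A mapping of this kind corresponds to the item-tracking matrix reported in the Results, where each concept is traced to the source of the item that covers it.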

Cognitive Debriefing and Item Completion

Cognitive debriefing was done with both patients and caregivers, but only patients supplied quantitative data by providing answers. Both patients and caregivers were asked about their understanding of the items, response option clarity, and relevance to NUTca (caregivers provided feedback as an observer for the patient).

Preliminary Psychometric Data

Given the small sample size, we were not able to conduct psychometric analyses, even using modern techniques equipped to handle small sample sizes (Rasch measurement theory). To increase the number of observations, the 10 patients who completed the questionnaire were asked to complete it again at a second time point; six responded a second time, so the total number of observations available for psychometric calculations was 16. As preliminary psychometric data, we constructed “heatmaps” that show response option frequency by item for the EORTC QLQ-C30 and the supplemental items (these items were drawn from the EORTC modules Head & Neck, Lung, Systemic Effects, Impacts, Pain, and Gastrointestinal).
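The matrix underlying such a heatmap can be computed as the proportion of responses falling in each response option, item by item. The sketch below is a minimal illustration rather than the analysis code used in the study: the function name `response_proportions` and the example responses are hypothetical, and it covers only items scored on the four-point scale (“Not at all” to “Very much”).

```python
from collections import Counter

# Four-point response scale, ordered from lowest to highest.
RESPONSE_OPTIONS = ["Not at all", "A little", "Quite a bit", "Very much"]


def response_proportions(responses_by_item):
    """Map each item to its list of response-option proportions.

    Missing responses (None) are excluded from the denominator.
    An item with no observed responses gets all-zero proportions.
    """
    table = {}
    for item, responses in responses_by_item.items():
        observed = [r for r in responses if r is not None]
        counts = Counter(observed)
        n = len(observed)
        table[item] = [counts[opt] / n if n else 0.0 for opt in RESPONSE_OPTIONS]
    return table


# Hypothetical responses, for illustration only (not study data).
example = {
    "pain": ["Not at all", "A little", "Quite a bit", "Very much"],
    "vomiting": ["Not at all", "Not at all", "Not at all", None],
}
proportions = response_proportions(example)
```

Each row of the resulting table can then be shaded by proportion in a plotting tool of choice, with one row per item and one column per response option.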

Results

Participant Sample

Characteristics of the 20 participants who took part in debriefing interviews can be found in Table 1 (n = 10 patients) and Table 2 (n = 10 caregiver responses as proxy for patients). Across the 20 interviews conducted, tumors were most often located in the head (n = 12), lungs (n = 10), and neck (n = 6). Some participants noted tumors in other areas, such as the spine and lymph nodes in the chest (n = 5). Of the ten patients with NUTca (Table 1), the time since diagnosis ranged from within the past 12 months (n = 2), 1–2 years (n = 3), to over 2 years (n = 4).

Table 1 Debriefing patient characteristics
Table 2 Debriefing caregiver characteristics (health information is by proxy for patient)

Conceptualization and Subsequent PRO and Item Selection, Item Generation

The results of the conceptual comparison of the EORTC QLQ-C30 and the conceptual framework can be found in Table 3. For specific concepts related to each domain, refer to Table 3 in our previous publication [5].

Table 3 Item-tracking matrix for item generation from conceptual framework

Results from Cognitive Debriefing

Full results from debriefing the EORTC QLQ-C30 and supplemental items can be found in the Supplementary Material. The cognitive debriefing analysis revealed that the EORTC QLQ-C30 items were generally well understood by participants; the only items that were hard to interpret for more than one participant were: item 02 (“long walk”—what constitutes “long”), item 05 (help with “eating, dressing, washing, toilet”—unsure if before or after treatment), item 06 (limited “daily activities”—which activities, cause of limitation [cancer, COVID-19]), and item 09 (“pain”—need more details [time frame, type]).

The additional items from the EORTC Item Library were generally well understood by participants. Three or more participants reported that for systemic effects, “dizziness,” “fever or chills,” “sweated excessively,” and/or “night sweats” were not relevant or observed. However, when asked which items were most important, one participant endorsed “dizziness” and two endorsed “fever or chills.” Three participants believed there was a conceptual overlap between “pain in your head” and “headaches.” Five participants believed there was a conceptual overlap between “aches around your sinuses” and “pressure around your sinuses” (“pressure around sinuses” was a newly developed item). Three participants found “blurred” vision irrelevant, and three found “double” vision irrelevant. Two participants found “trouble swallowing” and “trouble drinking liquids” to overlap conceptually; however, three participants reported having trouble swallowing after treatment but did not report this for “trouble drinking liquids.” Two participants found the impacts item related to “motivated to continue with normal hobbies and activities” confusing, given that it was positively worded whereas the items around it were negatively worded.

The newly developed items were generally well understood by participants. Some items were reported as not relevant, such as “runny nose” (n = 5) and “blocked nose” (n = 4). Three participants reported that “tunnel vision” was not a relevant question, and one named it as one of the least relevant questions. For the impact item related to “caring for another,” two participants suggested adding a pet as an option. Three participants found “pain in your face” not relevant and four found “aches around your sinuses” not relevant.

Item Performance

Data from digestive items were missing, as none of the patients in the sample reported digestive problems. In the figures below, darker purple indicates a higher proportion of responses; darker purple at either extreme of the response scale means that there is either an observed floor effect (i.e., most patients reporting “not at all”) or ceiling effect (i.e., most patients reporting “very much”). These heatmaps should be interpreted with caution, since the total sample size was small and the number of patients endorsing specific tumor locations/symptoms was even smaller (e.g., lung n = 4/16).
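The floor and ceiling effects described above can be expressed as a simple rule on an item's response-option proportions. The sketch below is illustrative only: the function name `flag_extreme_effect` is hypothetical, and the 0.5 cut-off is an assumed operationalization of “most patients,” not a threshold stated in this study.

```python
def flag_extreme_effect(proportions, threshold=0.5):
    """Classify an item from its response-option proportions.

    `proportions` is ordered from the lowest option ("not at all")
    to the highest ("very much"). The default threshold of 0.5 is an
    assumption standing in for "most patients".
    """
    if proportions[0] > threshold:
        return "floor"
    if proportions[-1] > threshold:
        return "ceiling"
    return None


# Hypothetical proportions, for illustration only (not study data).
floor_case = flag_extreme_effect([0.75, 0.25, 0.0, 0.0])
ceiling_case = flag_extreme_effect([0.0, 0.125, 0.25, 0.625])
varied_case = flag_extreme_effect([0.25, 0.25, 0.25, 0.25])
```

With samples this small, any such flag is descriptive only; a single additional response can move an item across the cut-off.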

Figure 1 shows variability in response options. For items with more variability, such as “pain” or “need rest,” we see similar shades of purple across response categories. As noted, for items with less variability, dark purple is seen at one end of the scale. This was the case for some functioning items, such as “taking a short walk,” “staying in bed,” and “help with daily activities.” Most participants did not endorse having these experiences at all, so a floor effect is observed, indicating a basic level of functioning in this sample. A floor effect was also seen for “vomiting.” Figure 2 (items 29 and 30 of the QLQ-C30) demonstrates greater variability in patients’ reports of overall health than of quality of life; the quality of life responses show that patients mostly reported higher quality of life.

Fig. 1 Heatmap of the EORTC QLQ-C30 items 1–28 (n = 16)

Fig. 2 Heatmap of the EORTC QLQ-C30 items 29 and 30 (n = 16)

Figure 3 for the Head & Neck module demonstrates good variability of response frequencies for most items. This finding supports the relevance of adding these items. Items regarding “blurred” and “double” vision are not endorsed as frequently occurring; this is supported by qualitative data that described these as potentially less frequently occurring or clinically relevant items.

Fig. 3 Heatmap of the EORTC Item Library supplemental items for Head & Neck (n = 12)

Figure 4 for Lung shows good response frequency variability for “cough,” but otherwise low endorsement frequencies for items overall. This could be due to the small number of participants completing the questionnaire (n = 4). The cognitive debriefing data suggest these items are well understood by participants.

Fig. 4 Heatmap of the EORTC Item Library supplemental items for Lung (n = 4)

Figure 5 for Systemic Effects shows good variation of response frequencies for most items. The items “been dizzy,” “night sweats,” “excessive sweating,” and particularly “fever or chills” were less frequently endorsed by participants, which is in line with qualitative feedback stating that these were potentially less frequently occurring or less clinically relevant items (notably, though, other participants stated that “fever or chills” was one of the most important items, despite it being the least frequently endorsed here).

Fig. 5 Heatmap of the EORTC Item Library supplemental items for Systemic Effects (n = 15)

Figure 6 for Impacts demonstrates good response variability for “concerned for caring about others,” but otherwise demonstrates ceiling effects (“motivated in hobbies”) or floor effects (“pain while lying down,” “limited in household repairs,” and “limited in light recreation”).

Fig. 6 Heatmap of the EORTC Item Library supplemental items for Impacts (n = 15)

Figure 7 for Pain demonstrates good response option frequency variability for most items, supporting their relevance in this sample; “abdominal pain” and “pain in chest” were both items that had observed floor effects. The low endorsement for “abdominal pain” makes sense given that no patients in the sample endorsed GI symptoms.

Fig. 7 Heatmap of the EORTC Item Library supplemental items for Pain (n = 9)

Discussion

Our main objective for this study was to address the paucity of research describing the patient experience of NUTca and the absence of specific PRO tools for this rare disease. The published literature on NUTca is scarce, and the limited number of case reports focus primarily on the clinical development of disease and presentation of tumors. We conducted mixed-methods research, including conceptualization and item selection/generation with subsequent cognitive debriefing and quantitative data analyses, to fill this evidence gap. Previous work conducted using a similar approach demonstrated that increasing conceptual coverage of an existing instrument using evidence reported by patients is a pragmatic approach to improve the quality of that existing instrument’s measurement capability [6, 7].

This study highlights the challenges in implementing patient-centric research to inform and develop PRO measures in rare diseases. Our mixed-methods research used pragmatic solutions to collect patient experience data and provides an evidence base to inform clinical programs in a rapidly progressing rare cancer with high unmet need.

A limitation of this study is that the sample size is extremely small and may not be representative of the larger population. In our sample, the time since diagnosis for patients interviewed was longer than in previous studies, suggesting longer survival. By nature of this non-interventional, observational study, patients who are experiencing severe symptoms or are declining rapidly may be less likely to participate in an interview. Caregivers provided valuable insights into the patient experience by proxy, and this included early stages post-diagnosis. Although we collected time since diagnosis from caregivers, we did not collect data on how long the patients they cared for survived after diagnosis, so survival in this group is unclear and may be more representative of figures cited in the literature. However, this highlights the need for the dissemination of any data for this type of rare cancer, as limited available data affects the accuracy of determining survival rates.

A further limitation is the heterogeneity of the sample, in which signs and symptoms can be very different across individuals. This is compounded by the difficulty in separating symptoms associated with the disease from those due to treatment. At the early stages of the disease, most patients are impacted more by aggressive treatment than by tumor development itself. However, if the disease is not treated at all, the odds of survival are even smaller. As it is impossible to predict responders, all patients are likely treated, leaving little opportunity to develop a “true” disease-related PRO in such a rare indication. This means that (1) PROs should assess both disease- and treatment-related symptoms, and (2) with more research, the differentiation between the two will become clearer. Hence, more research needs to be published to build an evidence base.

Pragmatic and creative solutions are necessary for overcoming limitations in understanding the patient and caregiver experience of living with a rare disease [8, 9]. Our mixed-methods study enabled the creation of a content valid PRO with preliminary item performance results. Given recruitment challenges and limited psychometric analyses, further evidence should be collected to document the scoring and psychometric performance of the final PRO tool. In this rare disease context, within-trial interviews should be considered as a source of additional data on the patient experience, including meaningful within-patient changes.

Conclusions

Further psychometric validation evidence is needed to support the use of this bespoke PRO, but identification of items (qualitative) and preliminary data (quantitative) in this population provide initial solutions to the research and clinical community attempting to assess treatments for NUT carcinoma patients.