Background

For patients with advanced breast cancer, both the disease process and its treatment give rise to numerous signs and symptoms that have negative impacts on their day-to-day lives (e.g., pain and skin irritation) [1]. Endocrine therapies are most commonly used in the treatment of hormone receptor-positive (HR+) advanced breast cancers, although recent studies suggest that quality of life is preserved longer in patients receiving combination therapies [2,3,4,5]. Common disease- and treatment-related symptoms experienced by patients with advanced breast cancer include pain and fatigue [6,7,8,9,10], which may be associated with more distal impacts on physical function such as decreased mobility and ability to carry out daily activities [6, 9, 11]. Few patient-reported outcome (PRO) questionnaires have been used to measure patient experience in advanced breast cancer subtypes.

To evaluate treatment benefits in advanced breast cancer, it is necessary to first understand the disease and treatment experience from the perspective of the patient. PRO measures can provide valuable insight into the patient experience, and their use in evaluating treatments in oncology has increased rapidly over the last several years. In considering the selection of a suitable PRO instrument to measure study endpoints, attention should be given to whether the instrument is content valid in the target patient population. Content validity refers to evidence that the PRO instrument measures concepts that are relevant to a disease and important to patients with that disease, and that its items are constructed so that respondents can easily read and comprehend them and provide meaningful responses [12, 13]. As stated in the United States Food and Drug Administration (FDA) Guidance for Industry – Patient-Reported Outcomes Measures: Use in Medical Product Development to Support Labeling Claims, such evidence can be established by conducting concept elicitation interviews with the target patient population to identify and describe the relevant and important concepts of a disease, and by conducting cognitive debriefing interviews with the target patient population to evaluate the comprehensibility, readability, and relevance of a PRO instrument [14].

In the current study, systematic reviews of the literature in advanced breast cancer and of available PRO instruments were conducted to identify potentially suitable PRO measures for use in an HR+/human epidermal growth factor receptor 2 negative (HER2-) advanced breast cancer population. Characteristics of the instruments of interest (including an evaluation of development histories and psychometric properties) and concepts that are considered directly related to disease status were assessed during the review [14]. As a result of these research activities, the National Comprehensive Cancer Network – Functional Assessment of Cancer Therapy – Breast Cancer Symptom Index (NFBSI-16) and the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function Short Form 10b were selected as being most suitable to measure the important and relevant concepts of interest related to disease symptoms, treatment side effects, and physical functioning impacts in this patient population. The content validity of both the NFBSI-16 and PROMIS Physical Function Short Form 10b has been evaluated previously in breast cancer and cancer populations more generally [15,16,17,18,19], but not in an HR+/HER2- advanced breast cancer population specifically. Due to differences in disease trajectory and treatments among HR and HER2 subgroups, it is important to examine content validity in this specific subtype.

The purpose of this article is to describe the content evaluation of the PRO questionnaires (NFBSI-16 and PROMIS Physical Function Short Form 10b) through cognitive debriefing interviews with patients with HR+/HER2- advanced breast cancer.

Methods

Development history of measures

Cognitive debriefing interviews sought to evaluate patients’ ability to read, understand, and meaningfully respond to the questionnaires, as well as to evaluate the questionnaires’ overall relevance and ease of completion in the HR+/HER2- advanced breast cancer patient population. Prior to describing the cognitive debriefing methods for this study, we describe the development history (including any prior evaluation of the content validity or psychometric properties) of each instrument.

The National Comprehensive Cancer Network – Functional Assessment of Cancer Therapy – Breast Cancer Symptom Index (NFBSI-16)

The NFBSI-16 is a 16-item assessment of disease-related symptoms, treatment side effects, and general function and well-being. The instrument has three subscales: Disease-Related Symptom (DRS) – nine items; Treatment Side-Effect (TSE) – four items; and General Function and Well-Being (F/WB) – three items. All items have a seven-day recall period and a five-point verbal descriptive response scale [15].

The NFBSI-16 was developed as part of a larger project to create patient-reported symptom indexes for 11 different cancer types and builds upon the original Functional Assessment of Cancer Therapy (FACT) Breast Cancer Symptom Index (FBSI), and other components of the Functional Assessment of Chronic Illness Therapy (FACIT) measurement system [16]. Open-ended concept elicitation interviews were conducted with patients diagnosed with stage III or stage IV breast cancer (N = 52) to identify symptoms or concerns most important and relevant to them in their assessment of treatment value for their breast cancer [16]. Specifically, patients were first asked to generate a list of up to 10 symptoms and rank the importance of each symptom from 0 (“not important”) to 10 (“extremely important”) [16]. Additional concept elicitation interviews were conducted with patients diagnosed with stage III or stage IV breast cancer who had experience with chemotherapy (N = 52) to identify important and relevant concepts of their advanced breast cancer experience [16]. Patients also completed the FACT-Breast questionnaire [16]. Final item selection for the NFBSI-16 was driven by quantitative (i.e., frequency counts and measure of chance endorsement) and qualitative (i.e., review of the open-ended patient interviews) evaluation of the data [15]. Psychometric evaluations of the NFBSI-16 in advanced breast cancer populations, including convergent validity analyses and known-groups methods analyses, demonstrated acceptable measurement properties [15].

The Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function Short Form 10b

The PROMIS Physical Function Short Form 10b is a 10-item measure developed from the larger PROMIS physical function item bank of 124 items [20]. PROMIS defines the physical function latent trait as the ability to perform activities of daily living (ADLs; both general and instrumental) [21]. Items align with four subcategories: the functioning of one’s upper extremities (dexterity), lower extremities (walking or mobility), and central regions (neck, back), as well as instrumental ADLs (such as running errands) [22]. Items can assess more than one of these subcategories, but generally can be assigned to one predominant category [23]. Items are meant to be answered using the present tense and have a five-point verbal rating scale.

The PROMIS Physical Function Short Form 10b was developed consistent with all PROMIS item banks and for use across disease areas [24, 25]. Further research activities were conducted to examine applicability to cancer populations. Content analysis of data from a diverse sample of patients with cancer in focus groups (N = 21) and cognitive interviews (N = 40) was used to inform domain experts’ qualitative item review. Item modifications were made to reflect cancer-specific concerns (for example, adding items regarding neuropathic pain, which was supported by data analysis and expert consensus) [17]. Similar versions of the PROMIS Physical Function Short Form were shown to have acceptable measurement properties (including internal consistency reliability, test-retest reliability, and construct-related validity) among advanced solid tumor populations (including breast cancer) [18, 19].

Patient recruitment

The patients who were recruited for this study, following independent review board approval, were identified at four clinical sites located in New Orleans, Louisiana (n = 10); St. Louis, Missouri (n = 3); Chicago, Illinois (n = 1); and Detroit, Michigan (n = 1). Clinicians confirmed patients’ diagnosis of HR+/HER2- advanced breast cancer, among other inclusion and exclusion criteria (Table 1), prior to patients being included in the study.

Table 1 Study inclusion/exclusion criteria

Conduct of interviews

The cognitive debriefing interviews were conducted with patients one-on-one, over the telephone, by trained researchers. Each interview lasted approximately 90 min and was audio recorded with the subject’s prior written and verbal consent. The questionnaires were cognitively debriefed with patients following a concept elicitation exercise; the concept elicitation portion of the interviews has been reported elsewhere. The cognitive debriefing portion of the interview lasted approximately one hour.

Patients were provided electronic copies of both the NFBSI-16 and PROMIS just prior to the interview. While patients were cognitively debriefed on both questionnaires during the same interview, the order in which they were debriefed was rotated so that the same questionnaire was not debriefed first in all interviews. During the interview, patients were first asked to complete the questionnaires using a “think aloud” method. This method allowed patients the opportunity to complete each questionnaire without any interruption from the interviewer, and to describe aloud how they arrived at each answer, which helps to identify words, terms, or concepts that they may not understand or might interpret differently than intended [26]. After the patient completed the first questionnaire using the “think aloud” method, the interviewer then followed a semi-structured cognitive debriefing interview guide to elicit specific feedback on that questionnaire that might not have been covered during the think-aloud process. For example, the following types of questions were asked by the interviewer to assess patient understanding after the think-aloud exercise:

  • What does this [instruction/question/response option] mean to you?

  • Can you put this [instruction/question/response option] into your own words?

  • What made you choose [response selected by subject]?

In addition, the interviewer asked about the questionnaires’ relevance and overall ease or difficulty of completion. After cognitive debriefing was completed on the first questionnaire, the same process was followed for the second questionnaire.
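The rotation of debriefing order described above can be illustrated with a minimal sketch. The rotation scheme is not specified here, so strict alternation between consecutive interviews is an assumption, and the patient identifiers (P01 to P15) and the function name `rotate_order` are hypothetical.

```python
from itertools import cycle

QUESTIONNAIRES = ("NFBSI-16", "PROMIS Physical Function Short Form 10b")


def rotate_order(patient_ids):
    """Alternate which questionnaire is debriefed first across interviews.

    Assumption: strict alternation; any scheme that prevents one
    questionnaire from always going first would serve the same purpose.
    """
    orders = cycle([QUESTIONNAIRES, QUESTIONNAIRES[::-1]])
    return {pid: order for pid, order in zip(patient_ids, orders)}


# Hypothetical identifiers for a sample of 15 interviews.
schedule = rotate_order([f"P{i:02d}" for i in range(1, 16)])

# Count how often each questionnaire is debriefed first.
first_counts = {}
for order in schedule.values():
    first_counts[order[0]] = first_counts.get(order[0], 0) + 1
print(first_counts)
```

With an odd number of interviews, one questionnaire is necessarily debriefed first once more than the other.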

Coding and analysis

Interviews were audio-recorded, transcribed, and anonymized (i.e., identifying information was removed). Transcripts were imported into a computerized qualitative data analysis package to facilitate the storing, coding, and retrieval of qualitative data using Boolean operators [27]. A codebook was developed based on the components of the questionnaires and interview questions. For example, codes were created to tag patient quotes representing their interpretation of Item 1 of the NFBSI-16 and to distinguish those patients who interpreted the item as intended from those who did not: “NFBSI::Item 1::Interpreted as intended::Yes” or “NFBSI::Item 1::Interpreted as intended::No.” A similar coding structure was followed for each component of the questionnaires and for each type of interview question.
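The hierarchical code labels lend themselves to simple programmatic handling. The sketch below is illustrative only and is not the qualitative analysis package used in the study: the helper names (`make_code`, `parse_code`, `tally`) and the coded quotes are hypothetical, and only the “::”-separated label structure is taken from the codebook example above.

```python
from collections import Counter

SEP = "::"


def make_code(instrument, component, question, answer):
    """Join the four levels of a code label with the '::' separator."""
    return SEP.join([instrument, component, question, answer])


def parse_code(label):
    """Split a label into its levels, tolerating stray spaces around separators."""
    return tuple(part.strip() for part in label.split(SEP))


# Hypothetical coded quotes as (patient id, code label); not study data.
coded_quotes = [
    ("P01", "NFBSI::Item 1::Interpreted as intended::Yes"),
    ("P02", "NFBSI::Item 1::Interpreted as intended::Yes"),
    ("P03", "NFBSI::Item 1:: Interpreted as intended::No"),
]


def tally(quotes, instrument, component, question):
    """Count answers for one instrument/component/question combination."""
    counts = Counter()
    for _patient, label in quotes:
        inst, comp, quest, answer = parse_code(label)
        if (inst, comp, quest) == (instrument, component, question):
            counts[answer] += 1
    return counts


item1 = tally(coded_quotes, "NFBSI", "Item 1", "Interpreted as intended")
print(dict(item1))
```

Keeping each level of the label as a separate field makes it straightforward to retrieve all quotes for a given item or to compare interpretation across items.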

The research team reviewed patient quotes as they related to the study objectives. Patients’ interpretations of each instruction, item, and response option in the questionnaires were evaluated, and a determination was made by the research team as to whether each patient demonstrated understanding of each component of the questionnaire per the criteria described in the study protocol and consistent with recommended practices [13].

Results were tabulated, summarized, and presented to support the content validity of the questionnaires and to identify potential areas for improvement in measurement consistent with recommended practices [13].

All frequencies and percentages reported were based on the number of patients who provided sufficient data that could be used in analysis; some data were not collected from every patient, or responses may have been insufficient or uninterpretable as determined by the research team. Therefore, total frequency and percent calculations are reported based on the number of patients for whom data were available, and not necessarily the total sample of 15.
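As a minimal sketch of this convention, the helper below computes a percentage against the number of patients with usable data rather than the full sample. The function name `summarize` and its validation rules are assumptions for illustration; the two example calls use the counts reported for NFBSI-16 Items 3 and 10.

```python
def summarize(n_understood, n_available):
    """Format a count as 'n = a/b, p%' using only patients with usable data."""
    if n_available <= 0:
        raise ValueError("no usable data for this component")
    if not 0 <= n_understood <= n_available:
        raise ValueError("count must lie between 0 and the available denominator")
    pct = 100.0 * n_understood / n_available
    return f"n = {n_understood}/{n_available}, {pct:.1f}%"


print(summarize(14, 15))  # n = 14/15, 93.3%
print(summarize(11, 12))  # n = 11/12, 91.7%
```

Because the denominator varies by component, percentages for different items or response options are not directly comparable unless the denominators are reported alongside them.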

Results

Study sample

All 15 patients were women, with an age range from 45.3 to 87.6 years (mean = 66.0, standard deviation = 12.4). The majority were white (n = 8/15, 53.3%), and the greatest number had attended some college or certificate program (n = 7/15, 46.7%) and were retired (n = 7/15, 46.7%) at the time of the interview. Clinically, the majority of patients were post-menopausal (n = 12/15, 80%), had breast cancer metastasized to the bone (n = 13, 80.0%), and had an Eastern Cooperative Oncology Group (ECOG) score of 1 (n = 11, 73.3%). Full patient demographic and health information can be found in Tables 2 and 3.

Table 2 Patient-reported demographic and health information
Table 3 Clinician-reported health information

Cognitive debriefing interview results

National Comprehensive Cancer Network – Functional Assessment of Cancer Therapy – Breast Cancer Symptom Index (NFBSI-16)

All patients for whom data were available (n = 14/14, 100.0%) demonstrated understanding of the instructions and the recall period of the NFBSI-16. Overall, greater than 90% of patients demonstrated understanding of each of the item questions. Specifically, all patients who provided sufficient data (n ≥ 12) demonstrated complete (100%) understanding of 14 of the 16 items of the NFBSI-16. For the remaining two items, 14 out of 15 patients (93.3%) demonstrated understanding of Item 3 (feeling ill), and 11 out of 12 patients (91.7%) demonstrated understanding of Item 10 (nausea). Additionally, greater than 70% of patients for whom data were available demonstrated understanding of all of the response options: “Not at all” (n = 12/13, 92.3%), “A little bit” (n = 10/14, 71.4%), “Somewhat” (n = 11/11, 100.0%), “Quite a bit” (n = 11/13, 84.6%), and “Very much” (n = 9/11, 81.8%). In terms of item relevance and comprehensiveness, ≥80% of patients reported experiencing at least 11 out of the 16 concepts as a part of the HR+/HER2- advanced breast cancer experience (lack of energy, pain, feeling ill, shortness of breath, family role, fatigue, pain, worry, ability to work, ability to enjoy life, and quality of life), while between 42 and 67% of patients endorsed the remaining five concepts (bone pain, nausea, side effects, hair loss, mouth sores). Overall, the majority of patients reported that they did not believe additional concepts should be added to the questionnaire and that the questionnaire was easy to complete (n = 11/14, 78.6% and n = 12/13, 92.3%, respectively) (Table 4).

Table 4 NFBSI-16 cognitive debriefing summary table (N = 15)

Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function Short Form 10b

All patients for whom data were available (n = 14/14, 100.0%) interpreted the instructions of the PROMIS as intended. Overall, greater than 90.0% of patients demonstrated understanding of each item as intended. Specifically, all patients who provided sufficient data (n ≥ 11) demonstrated complete understanding (100%) of eight of the 10 PROMIS items. For the remaining two items, 13 out of 14 patients (92.9%) demonstrated understanding of Item 7 (vigorous physical activities), and 11 out of 12 patients (91.7%) demonstrated understanding of Item 9 (putting trash outside).

Additionally, two sets of response options were debriefed. The first set of response options, “Without any difficulty, With a little difficulty, With some difficulty, With much difficulty, Unable to do,” used for Items 1–6, was understood by >90% of patients for whom data were available (n ≥ 13). The second set of response options was less well understood by patients; four of the five response options were interpreted as intended by ≥50% of patients for whom data were available (n ≥ 10) (“Not at all, Somewhat, Quite a lot, and Cannot do”), and only 25.0% of patients for whom data were available (n = 12) demonstrated understanding of the response option “Very little.” In terms of item relevance, ≥75.0% of patients reported at least nine out of the 10 concepts as part of their HR+/HER2- advanced breast cancer experience (ability to do chores, ability to go up and down stairs, ability to run errands or shop, ability to bend down, ability to lift weight above shoulders, vigorous physical activities, bathing or dressing, moderate physical activities, and ability to get in and out of car), while 57.1% of patients endorsed the remaining concept (putting trash outside). Additionally, while the PROMIS has no stated recall period, almost all patients (n = 13, 86.7%) answered the items based on their current status. Overall, the majority of patients (n = 11/13, 84.6%) did not believe additional concepts should be added to the questionnaire. All of the patients (n = 15/15, 100.0%) indicated that the questionnaire was easy to complete (Table 5).

Table 5 PROMIS Physical Function Short Form 10b cognitive debriefing summary table (N = 15)

Discussion

For the NFBSI-16, cognitive debriefing results showed strong support for the content validity of the measure in an HR+/HER2- advanced breast cancer population; all of the patients in the sample for whom data were available understood the instructions and the majority of the items (14 out of 16), while the remaining two items were understood by at least 90% of the patient sample. Further, the majority of the concepts measured by the NFBSI-16 were relevant to at least 80% of the sample, and all of the concepts were relevant to at least 40%. Similarly, at least 90% of the patient sample demonstrated understanding of the instructions and items of the PROMIS Physical Function Short Form 10b. Almost all of the concepts were relevant to at least 75% of the sample. The cognitive debriefing results for the NFBSI-16 and PROMIS Physical Function Short Form 10b demonstrated that HR+/HER2- advanced breast cancer patients were able to complete and comprehend the questionnaires in ways consistent with developer and researcher expectations, and that overall, the concepts covered by the questionnaires are relevant to the patient experience of HR+/HER2- advanced breast cancer.

While some patients did not utilize the response options as expected for some items in the PROMIS, they demonstrated understanding of the item-concepts and the response scale overall. For example, Item 7 asked “Does your health now limit you in doing vigorous activities…?” and a subject selected “Not at all” with the rationale that she could not do vigorous activities at all, though the correct response option for her ability would have been “Cannot do.” The misunderstanding of the response scale may potentially have been a result of the negatively worded items, which switched direction from the previous items in the questionnaire and may be more prone to response error. This may also explain why the response options “Very little” and “Quite a lot” were less well understood by patients (n = 3/12, 25.0% and n = 6/10, 60.0%, respectively); however, the response option “Somewhat” was well understood (n = 8/10, 80.0%), as middle response options may be understood regardless of response scale direction. Another possible explanation could be that both ends of the response scale have negatively worded anchors (i.e., “Not at all” and “Cannot do”), which may have contributed to some of the misinterpretation.

In general, if a majority of patients understand the content as it is intended, there is justification to leave the questionnaire as is. Although cognitive interviews are a widely used method, there are various approaches to interpreting the data. For instance, there are no established and agreed-upon thresholds for determining whether modifications are needed to a PRO instrument based on cognitive interview results. Some studies, such as the current one, do not employ any a priori threshold, and modifications might be deemed beneficial and necessary even though the majority demonstrated understanding. In its paper on establishing content validity, the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Task Force acknowledges the lack of clarity around whether modifications to an instrument are warranted if only a minority of patients misinterpret the content [13]. Ultimately, the decision to make any modifications should be made in consideration of whether the proposed change would increase the overall content validity of the instrument. In the current study, adding instructions before the second set of response options in the PROMIS Physical Function Short Form 10b is one suggestion that could help orient the respondent to the change in the directional nature of the items and response scale, and potentially help alleviate response error.

The limitations of the study center on the clinical and demographic characteristics of the patient sample, which may impact generalizability of results. Findings on the content validity of the questionnaires may not be generalizable to other subtypes of breast cancer or be comprehensive across all treatment experiences or tumor types. Patients who are more severely ill may also have been less likely to participate in the study. In addition, all patients in this sample had completed some college, and most were treated at a single center in Louisiana.

Conclusions

The results of the cognitive debriefing interviews provide evidence that the NFBSI-16 and PROMIS Physical Function Short Form 10b assess the disease-related symptoms, treatment-related side effects, and physical functioning impact concepts that are important and relevant to HR+/HER2- advanced breast cancer patients, and do so in ways that patients can understand and to which they can meaningfully respond. Additional instructions on the PROMIS Physical Function Short Form 10b to orient the respondent to the response scale direction may increase respondent understanding. These findings add to the evidence of content validity of the NFBSI-16 and PROMIS Physical Function Short Form 10b in an advanced breast cancer patient population. Future research should further confirm the questionnaires as being “fit for purpose” in the target patient population via psychometric evaluation and score interpretation in regulated clinical trials.