Introduction

Early clinical trials and treatment with anticancer drugs have become increasingly personalized. This approach involves molecular screening for aberrations and matching them with a targeted drug [1]. Tyrosine kinase inhibitors (TKIs) are prescribed at a fixed dose even though drug exposure is known to differ among patients because of differences in bioavailability [2]. Personalized treatment and/or dose individualization is becoming more important because cancer is increasingly considered a chronic disease for which TKIs should be administered continuously until disease progression rather than for a predefined number of cycles [2,3,4].

During long-term treatment, patients experience treatment and/or disease-related symptoms. In oncology trials, adverse events (AEs) are graded on the basis of the National Cancer Institute Common Terminology Criteria for Adverse Events (CTCAE) version 4.03 [5].

Symptom intensity and burden differ among patients and affect health-related quality of life (HRQL), particularly when patients experience multiple CTC grade 1–2 AEs at the same time and when symptoms are not managed effectively. A decreased HRQL is a serious risk factor for patient non-adherence and dose modifications (DMs), such as early discontinuation [6,7,8,9].

The application of patient-reported outcome measures (PROMs) enhances early recognition of symptoms and improves clinician-patient communication and quality of care [6, 10,11,12]. Meaningful and feasible clinical use of PROMs requires a brief, precise, and accurate symptom assessment system [13]. The Utrecht Symptom Diary (USD) is such a PROM that also measures patient experiences [6]. The USD is a translated and adapted Dutch version of the Edmonton Symptom Assessment System (ESAS), which has proven to be a strong and highly sensitive tool for assessing symptom experience for more than 25 years [14, 15].

When patients benefit from personalized and/or individualized matched treatment with TKIs for a long period of time, early recognition of symptoms and prompt symptom management become more important. We hypothesized that application of PROMs in an early clinical trial provides early recognition of symptoms and symptom intensity in the individual patient, thereby enhancing pro-active symptom management and evaluation of the effect of interventions performed. This approach might help to maintain HRQL and, as a result, avoid DMs, whereby response to treatment might increase.

Sunitinib, an anti-angiogenic TKI, is available as a long-term, standard treatment at a fixed dose for renal cell cancer (RCC), pancreatic neuroendocrine tumors (pNETs), and gastrointestinal stromal tumors (GISTs) [16,17,18]. To gain insight into patient-reported symptoms in addition to healthcare professional (HCP)-reported AEs, we conducted a substudy in an ongoing pharmacokinetic (PK) study that was being performed to assess whether PK-guided dosing could be performed without causing additional toxicities [19].

Methods

Patient population and setting

This prospective study was performed in two Dutch cancer centers. Patients (≥ 18 years) for whom sunitinib was considered standard therapy or patients with advanced or metastatic tumors for whom no standard therapy was available were able to participate in the NCT01286896 trial. The main purpose of this trial was to assess whether PK-guided dosing could be performed without causing additional toxicities [19].

Other inclusion criteria were as follows: Eastern Cooperative Oncology Group (ECOG) performance status ≤ 1; measurable or evaluable disease according to Response Evaluation Criteria in Solid Tumours (RECIST) 1.1; estimated life expectancy > 12 weeks; adequate hematologic, hepatic, and renal function; no cardiac instability within the previous 6 months. Patients had to be willing to undergo blood sampling and able to swallow oral medication. In addition, for this substudy, patients had to be able and willing to complete the USD sunitinib at several time points. Inclusion commenced in April 2011 and was closed in June 2012.

The study protocol was approved by local independent ethics committees, and the study was conducted in accordance with the Declaration of Helsinki. All patients received information regarding the purpose and conduct of this study and provided written informed consent.

Objectives

The primary objective of this substudy was to describe patient-reported symptoms and symptom intensity as well as HCP-reported AEs and severity at different time points.

Endpoints for the primary objective were (i) prevalence and intensity of symptoms and well-being from the patient’s point of view, (ii) prevalence and severity of signs and symptoms from a professional point of view, and (iii) differences in proportions of patient-reported versus HCP-reported symptoms.

The secondary objective was to explore therapy decisions. Endpoints for the secondary objectives were (i) duration of initial dose in weeks, (ii) frequency of dose modification due to AEs, and (iii) description of used variants of DMs due to AEs and their frequencies.

Measurements and definitions

Intervention and tool

The ESAS is a nine-item monitoring tool focused on advanced cancer patients without active treatment [14, 20]. For many years, the USD has been a standard of care in daily practice of the in- and outpatient clinic and the early clinical trial unit of the department of Medical Oncology (MO) of the University Medical Center (UMC) Utrecht, Cancer Center [6, 9]. In this study, symptom assessment was performed by using a treatment-specific module of the USD, the USD sunitinib (Appendix A). In 2007, the USD sunitinib was developed, based on generic disease-related complaints and all AEs with a prevalence ≥ 10% and all grade 3–4 AEs as mentioned in the investigator’s brochure on sunitinib, to measure symptom prevalence and intensity in patients on sunitinib treatment [6].

Patients at two Dutch hospitals completed all 22 items of the USD sunitinib, including an item on well-being, every 2 weeks without help, on a 0–10 numeric scale (0 = no burden; 10 = worst possible burden) in about 5 min. Patients could add experienced symptoms that were not listed. The concluding USD item is an overall score of the influence of treatment-related symptoms on HRQL. Nurses discussed the USD scores with the patients to bring about early recognition of disease and/or treatment-related symptoms and to assess the effect of interventions performed. USD scores were entered into a database.

For the analysis, the USD scores were categorized as ≥ 1 (symptom present), 1–2 (mild intensity), 3–5 (moderate intensity), and ≥ 6 (severe intensity). These cut points were used for three reasons: unlike the ESAS, the USD sunitinib is focused on advanced cancer patients on active treatment; there is no clear evidence as to what the optimal cut points for symptoms in the ESAS should be; and patients on sunitinib treatment are known to experience multiple mild and moderate symptoms in particular [6, 14, 21, 22].
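The categorization described above can be illustrated with a minimal sketch. The function and variable names are hypothetical and are not part of the study's analysis, which was carried out in SPSS; the sketch assumes integer item scores on the 0–10 scale.

```python
from collections import Counter


def usd_category(score: int) -> str:
    """Map a 0-10 USD item score to the intensity band used in the analysis."""
    if not 0 <= score <= 10:
        raise ValueError("USD scores range from 0 to 10")
    if score == 0:
        return "none"
    if score <= 2:
        return "mild"      # USD score 1-2
    if score <= 5:
        return "moderate"  # USD score 3-5
    return "severe"        # USD score >= 6


def summarize_item(scores: list[int]) -> dict[str, float]:
    """Return the proportion of patients per band, plus prevalence (score >= 1)."""
    if not scores:
        raise ValueError("at least one score is required")
    counts = Counter(usd_category(s) for s in scores)
    n = len(scores)
    summary = {band: counts[band] / n for band in ("mild", "moderate", "severe")}
    summary["prevalence"] = (n - counts["none"]) / n
    return summary


# Hypothetical scores for one symptom at one time point (not study data).
example = summarize_item([0, 1, 2, 3, 6, 10])
```

Prevalence (score ≥ 1) is simply the sum of the mild, moderate, and severe proportions, which is why the four reported categories overlap.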

Symptom burden is defined as the impact of (multiple) symptoms on physical, emotional, and social functioning, reported by patients themselves [9, 21].

An AE, reported and graded by HCPs, is considered to be any unfavorable and unintended sign, symptom, or disease temporally associated with the use of a medicinal product, whether or not considered to be related to the medicinal product [23].

Dose and dose modifications

Sunitinib exposure differs substantially between patients and within patients at different time points due to, e.g., patient nonadherence, drug interactions with comedication, and variability in the oral bioavailability of sunitinib [2, 19]. In this PK-guided dosing study, patients commenced sunitinib treatment at 37.5 mg once a day continuously. The dose was escalated when the target total plasma concentration of sunitinib of > 50 ng ml−1 was not achieved. If the patient suffered from a grade ≥ 3 toxicity or an intolerable grade 2 toxicity despite supportive care at any moment during the study, sunitinib treatment was interrupted until adequate recovery (CTC grade < 2) was achieved. Subsequently, sunitinib treatment was resumed at the next lowest dose level [19].
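The dosing rule as summarized in this section can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the trial protocol [19]: the names are hypothetical, the dose levels themselves are not modeled because they are not given here, and the sketch assumes that a qualifying toxicity takes precedence over a low plasma concentration.

```python
from enum import Enum

# Target total plasma concentration stated in the text (> 50 ng/ml).
TARGET_NG_PER_ML = 50.0


class Action(Enum):
    CONTINUE = "continue current dose"
    ESCALATE = "escalate dose"
    INTERRUPT_THEN_REDUCE = (
        "interrupt until CTC grade < 2, then resume at next lowest dose level"
    )


def next_action(
    plasma_conc_ng_ml: float,
    worst_ctc_grade: int,
    grade2_intolerable: bool = False,
) -> Action:
    """Return the dosing action for one evaluation (illustrative only)."""
    # Grade >= 3 toxicity, or grade 2 toxicity that is intolerable despite
    # supportive care, triggers interruption followed by dose reduction.
    # Assumption: this check takes precedence over the PK target.
    if worst_ctc_grade >= 3 or (worst_ctc_grade == 2 and grade2_intolerable):
        return Action.INTERRUPT_THEN_REDUCE
    # The target is "> 50 ng/ml", so a value of exactly 50 does not meet it.
    if plasma_conc_ng_ml <= TARGET_NG_PER_ML:
        return Action.ESCALATE
    return Action.CONTINUE
```

Returning an action rather than a dose keeps the sketch independent of the dose levels defined in the protocol.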

Statistical analysis

Statistical analysis was performed using SPSS 20.0 for Windows software (©2011 SPSS Inc.). Descriptive statistics were calculated for patient characteristics, reasons for and variants of DMs, and symptoms (severity and intensity). Categorical data were described using contingency tables, including counts and percentages. Continuously scaled measures were summarized by median and minimum/maximum. To analyze total treatment duration (TTD), time until dose modification, reasons for and used variants of DMs, the χ2 (Fisher exact test) and t test were used to compare variables among groups. The adjusted-Wald statistics and the Bonett and Price method were used to compute a 95% confidence interval (CI) for differences in proportions and medians [24, 25]. A two-tailed P value < 0.05 was considered significant in all tests. A 95% CI that contains zero means that the difference in proportion is not considered significant at the 0.05 level [26].
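As an illustration of how a 95% CI for a difference in proportions is read, the sketch below uses one common adjusted-Wald interval for two independent proportions, the Agresti-Caffo interval, which adds one success and one failure to each group. Whether this is the exact variant applied in the analysis [24] is an assumption, and the counts in the example are hypothetical. The difference is taken as baseline minus follow-up, the sign convention used in the Results.

```python
import math


def adjusted_wald_diff_ci(
    x1: int, n1: int, x2: int, n2: int, z: float = 1.959964
) -> tuple[float, float]:
    """Agresti-Caffo interval for p1 - p2 (z = 1.959964 gives a 95% CI)."""
    # Add one success and one failure to each group before computing Wald.
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    se = math.sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    diff = p1 - p2
    return diff - z * se, diff + z * se


def excludes_zero(ci: tuple[float, float]) -> bool:
    """A CI that does not contain zero indicates significance at the 0.05 level."""
    lower, upper = ci
    return lower > 0 or upper < 0


# Hypothetical counts: 5 of 29 patients at baseline versus 14 of 24 at follow-up.
ci = adjusted_wald_diff_ci(5, 29, 14, 24)
significant = excludes_zero(ci)  # interval lies entirely below zero
```

With this sign convention, an interval entirely below zero corresponds to a higher proportion at follow-up than at baseline, and an interval entirely above zero to a lower proportion.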

Results

Patient population

In total, 29 patients completed at least one USD and were included for analysis; see Table 1 for patient characteristics. Most patients in our sample were men (69%) and had an ECOG performance status of 1 (72%); the mean age was 58 years. All patients had metastatic disease. Tumor type was a neuroendocrine tumor in 28%, a colorectal carcinoma in 21% and miscellaneous in 31% of the patients. Prior treatments for cancer were systemic treatment (72%), surgery (62%), and/or radiotherapy (38%).

Table 1 Baseline characteristics

Patient-reported symptoms

Frequency and symptom intensity of most relevant disease and/or treatment-related symptoms and well-being at most relevant time points are shown in Fig. 1 (for full table, see Appendix B).

Fig. 1 Most relevant patient-reported disease and/or treatment-related symptoms

In general, patients experienced mainly a mild and moderate symptom intensity.

When looking at baseline scores, all 21 measured symptoms, with the exception of hair changes, pruritus, and nose bleeds, were already present in at least 20% of the patients, and pain (31%), anorexia, cough, and inactivity (all 23%) even caused a severe symptom burden in at least 20% of the patients.

When focusing on severe symptom burden reported by at least 20% of the patients, in week 2, a severe symptom burden was caused by inactivity (29%), dry skin (21%), and diarrhea (36%). At time of dose modification, which occurred at a median of week 5, a severe symptom burden occurred due to fatigue (20%), pain (33%), dry skin (20%), and inactivity (33%), and in week 6 due to fatigue (25%) only. In weeks 8 and 12, a severe symptom burden was not experienced.

Symptom prevalence at baseline was compared to symptom prevalence at the other time points. A 95% CI below zero means that the proportion of patients with the symptom increased statistically significantly at that time point. When looking at the differences in proportions of USD scores of week 2, diarrhea USD score ≥ 1 [48%; 95% CI − 74 to − 10%] and pain USD score 3–5 [43%; 95% CI − 43 to − 10%] increased statistically significantly. The same applies when the baseline proportions of moderate pain [33%; 95% CI − 55 to − 3%] and other skin changes USD score ≥ 1 [44%; 95% CI − 70 to − 6%] were compared to time of dose modification, and when the baseline proportions of moderate pain [30%; 95% CI − 48 to − 2%], mild pruritus [40%; 95% CI − 58 to − 10%], and mild sleeping problems [32%; 95% CI − 32 to − 1%] were compared to week 6.

A 95% CI above zero means that, compared to baseline, the proportion of patients with the symptom decreased statistically significantly at that time point. This occurred at week 8 for USD scores ≥ 1 of anorexia [52%; 95% CI 17 to 75%] and pain [43%; 95% CI 5 to 70%]. At week 12, only six patients completed a USD and no differences in proportions were found at this time point (data not shown).

Well-being

The percentage of patient-reported decreased well-being differed between weeks (55 to 87%) and was mainly mild to moderate in intensity. A severe decrease in well-being was reported the most in week 2 (21%). When the proportions of baseline were compared to the other time points, only the difference in proportion of mild decreased well-being (USD score 1–2) was increased statistically significantly at week 2 [39%; 95% CI 4 to 64%].

Influence of AEs on HRQL

Figure 2 shows the influence of AEs on HRQL. USD scores were categorized in 0, 1–2, and ≥ 3. Because treatment had not yet started at baseline, patients reported no impact of AEs on HRQL at that point. When baseline scores were compared to the other weeks, statistically significantly increased differences of proportions were found at time of dose modification for USD scores ≥ 1 [47%; 95% CI − 67 to − 14%] and USD scores 3–5 [3%; 95% CI − 55 to − 3%] and for USD scores ≥ 1 at weeks 6 and 8, respectively [40%; 95% CI − 58 to − 10%] and [40%; 95% CI − 61 to − 8%].

Fig. 2 Influence of adverse events on health-related quality of life

Healthcare professional-reported adverse events

Table 2 shows the HCP-reported AEs that occurred in ≥ 10% of the patients. Severity of AEs was mostly grade 1–2. HCP-reported pain (76%), fatigue (55%), cough (28%), diarrhea (24%), and peripheral neuropathy (21%) were already observed in at least 20% of the patients at baseline. When the baseline proportions of all grades were compared to the other weeks, statistically significant differences in proportions were found for dysgeusia week 8 [23%; 95% CI − 43 to − 1%]; hand foot skin reaction (HFSR) week 6 [25%; 95% CI − 42 to − 6%], week 8 [21%; 95% CI − 40 to − 1%] and week 12 [25%; 95% CI − 46 to − 3%]; skin toxicities week 6 [48%; 95% CI − 67 to − 23%], week 8 [32%; 95% CI − 54 to − 6%], and week 12 [21%; 95% CI − 59 to − 1%]; and finally for hypertension week 6 [21%; 95% CI − 37 to − 3%] and week 8 [26%; 95% CI − 46 to − 5%]. All of these AEs increased during treatment.

Table 2 Healthcare professional reported adverse events

Patient-reported versus healthcare professional-reported symptoms

Patient-reported USD scores ≥ 1 were compared to HCP-reported all grade symptom prevalence. All measured symptoms, with the exception of fatigue and vomiting, were reported statistically significantly more often by patients themselves in one or more weeks. Anorexia, dry skin, sleeping problems, and shortness of breath were reported statistically significantly more often by patients during all measured weeks. All statistically significant differences in proportions are printed in bold in Table 3.

Table 3 Patient-reported versus healthcare professional-reported symptoms

Treatment duration and dose modifications

In 2/29 patients (7%), one RCC and one pancreas carcinoma patient, a dose modification did not occur and both were still on treatment at time of analysis. Table 4 focuses on the 27 patients in whom a DM occurred; their median total treatment duration was 16 weeks. Median time until DM was 5 weeks, and the most common reason for a DM was AEs (63%). A dose reduction occurred in 13/27 patients (48%) and a dose discontinuation due to AEs in 4/27 patients (15%). In 3/27 patients (11%), the dose was escalated per protocol because the target total plasma concentration of sunitinib of > 50 ng ml−1 was not achieved.

Table 4 Treatment duration and dose modifications

Discussion

To the best of our knowledge, this is the first study that describes patient-reported symptoms in parallel with HCP-reported AEs in an early clinical trial with an individualized dosing schedule for the TKI sunitinib.

Early recognition of symptoms showed that even patients with an ECOG performance score of 0–1 already reported almost all 21 measured symptoms at baseline. Symptoms, mainly mild and moderate in intensity, occurred in various combinations, and intensity differed among patients and time points. HCPs reported fewer signs and symptoms than patients themselves: only 2–3 symptoms were recorded in ≥ 50% of the patients.

At time of DM, which was median week 5, in addition to a decreased well-being, 11 symptoms were reported by more than 50% of the patients: fatigue, inactivity, anorexia, pain, dry skin, sleeping problems, hand foot skin reaction, other skin changes, oral changes, dyspnea, and diarrhea. AEs were the most common reason for DMs.

Especially for highly subjective AEs such as fatigue and dyspnea, it is known that clinicians tend to assign a lower severity than patients themselves do [32]. In our cohort, however, nearly all measured symptoms (all except fatigue and vomiting), and not only the highly subjective AEs, were reported significantly more often by patients than by their HCPs in one or more weeks. In other words, subjective symptom intensity is not the same as objective severity graded by the CTCAE; rather, they complement each other.

When patient-reported symptoms of this cohort were compared to our previous findings in patients on standard sunitinib treatment at a standard dose of 50 mg in a schedule of 4 weeks on and 2 weeks off treatment, we found a comparable prevalence (difference of < 10%) for 9/21 measured symptoms and for decreased well-being [6]. However, fatigue, anorexia, pain, dry skin, HFSR, vomiting, hair changes, skin changes, sleeping problems, diarrhea, gastric complaints, and inactivity were reported more often by patients in this cohort.

The most probable reason for this finding is the disease stage of patients participating in this early clinical trial, since not only patients for whom sunitinib was considered standard therapy were able to participate in the study but also patients with advanced or metastatic tumors for whom no standard therapy was available.

We found a median treatment duration of 4 months. Only 15% of the patients stopped treatment due to AEs, which is comparable to the percentage of dose discontinuations in patients on standard sunitinib treatment [6, 27]. Furthermore, although severe AEs seldom occur, persistent multiple symptoms that cause mild and moderate symptom burden are common. Since HRQL is thereby hampered, we believe that the CTCAE grading system should be considered unsuitable for measurement of symptom intensity or burden in the individual because the impact of long-term sustained mild symptoms during long-term treatment is not fully taken into account. These findings endorse the outcomes by Edgerly and Fojo, who state that the CTCAE was developed in an era when most anticancer agents were administered intermittently. As a result, a grade 1–2 diarrhea that lasts for a short period of time might be tolerated by most patients, while a grade 1–2 diarrhea for months is not [28]. We therefore agree with Trotti et al. that using patient-reported symptoms improves the completeness and fidelity of the subjective domain of the CTCAE or may serve as a source of stand-alone complementary toxicity endpoints focusing on study-specific questions [29].

In phase 1 trials, tumor aberrations are more often matched with a targeted drug, so-called matched studies [1]. Patients in matched phase 1 clinical trials might benefit more from treatment than patients in nonmatched phase 1 trials [1]. Furthermore, it is suggested that dose individualization for TKIs might help to personalize therapeutics to the individual needs of each patient [2, 19]. Because of this changed landscape, insight into symptoms and symptom intensity in the individual patient reported by patients themselves becomes more important when evaluating drug safety and determining a safe dosage range or an optimized, individualized dosage. Multiple grade 1–2 toxicities in particular need to be taken into account, which endorses our earlier findings in patients on standard sunitinib treatment [6]. Since patient self-report is the most reliable indicator of symptom presence and intensity, the accuracy and efficiency of subjective AE data in early clinical trials and in studies of individualized doses of TKIs might be improved by routinely collecting symptoms and symptom intensity directly from the patient, rather than relying on HCP-reported AEs graded by the CTCAE only [6, 29, 30, 32]. Additionally, when speaking of individualized doses, it would be challenging to allow patients and their medical teams to use the USD scores to formulate target scores for symptom intensity, which Hui et al. call personalized symptom goals [31]. A maximum tolerance level score, for example, might be useful in making informed therapy decisions regarding DMs for the individual patient.

This study is limited by its sample size, which results in wide 95% confidence intervals for differences in proportions. Furthermore, the USD is a Dutch translation of the Edmonton Symptom Assessment System, which has been proven to be a strong and highly sensitive tool for symptom experience for more than 25 years [15]. We developed a sunitinib-specific USD module, based on evidence-based guidelines and experts’ consensus, to allow more differentiation. Although the added symptoms are regarded as distinctive at the level of patient self-report, a validation study should be performed in the future to confirm this assumption.

The strength of this study is that it shows the added value of PROMs, like the USD, in an early clinical trial on TKI dose individualization. Early recognition and prompt symptom management, especially in the first 6 weeks of treatment, proved to be crucial, as did insight into the effect of interventions performed. Structured clinical application of patient-reported symptoms makes clear that, after 6 weeks of treatment, symptom intensity and well-being had recovered.

Since, with the passage of time, more angiogenic inhibitors such as sunitinib have become available for standard treatment, we have merged these treatment-specific USD modules into one: the USD angiogenic inhibitors. Because AE profiles may vary among tumor types, sets of disease-specific items will be developed [33].

In conclusion, the USD stimulates patients, doctors, and nurses in early recognition of symptom prevalence and symptom intensity in early clinical trials and/or in dose individualization studies. Since response rates in these trials may increase, insight into symptom burden caused by multiple grade 1–2 toxicities and pro-active symptom management is important to avoid DMs due to a decreased HRQL. Application of the USD in early clinical trials is feasible and enhances early recognition of symptom intensity, while pro-active symptom management and evaluation of interventions performed help to maintain well-being. Thus, application of PROMs in early clinical trials is clinically relevant and a challenging factor in obtaining dose-limiting toxicities and individualized doses.