Cytoreductive surgery (CRS) with or without hyperthermic intraperitoneal chemotherapy (HIPEC) has been recommended for selected patients with resectable colorectal peritoneal metastases in the majority of (inter)national guidelines.1 Little is known about the value of perioperative systemic therapy for resectable colorectal peritoneal metastases in the absence of randomized trials,2 leading to a wide variety in its administration among countries and hospitals.1,2,3 To address this evidence gap, the CAIRO6 trial randomizes these patients to perioperative (i.e., neoadjuvant and adjuvant) systemic therapy and CRS–HIPEC or CRS–HIPEC alone.4 Although superior survival of perioperative systemic therapy is hypothesized,4 it prolongs and intensifies treatment, may lead to (sometimes severe) toxicity,5 could increase postoperative morbidity (especially when including bevacizumab6), and may result in preoperative intraperitoneal progression and consequent inoperability given its assumed relative inefficacy for colorectal peritoneal metastases.7 Altogether, this could also worsen patient-reported outcomes (PROs). To address these issues, CAIRO6 incorporated a randomized phase II trial to assess the feasibility, safety, and PROs of perioperative systemic therapy in this setting.4,8 As part of this phase II trial, the present study aimed to compare PROs between both treatment arms. A secondary aim was to longitudinally explore PROs of patients receiving perioperative systemic therapy.

Patients and Methods

Design

CAIRO6 is an investigator-initiated, parallel-group, open-label, phase II–III, randomized, superiority trial conducted in all nine Dutch tertiary hospitals for the surgical treatment of colorectal peritoneal metastases. The trial is approved by a central ethics committee (MEC-U, Nieuwegein, the Netherlands, R16.056) and the institutional review boards of all participating hospitals. The trial is registered (ClinicalTrials.gov: NCT02758951). The trial protocol4 and the feasibility and safety data of the phase II trial (i.e., mortality, morbidity, surgical details, hospital stay)8 have been previously published. Therefore, only brief descriptions of eligible patients, randomization procedures, and interventions are provided here.

Patients

Eligible patients were adults with a World Health Organization performance status of 0–1, pathologically proven isolated resectable colorectal peritoneal metastases, no systemic therapy for colorectal cancer within 6 months prior to enrolment, and no previous CRS–HIPEC.4,8 All patients gave written informed consent.

Randomization

Patients were randomized 1:1 to perioperative systemic therapy (experimental arm) or CRS–HIPEC alone (control arm) using minimization stratified by previous systemic therapy for colorectal cancer (yes, no), onset of peritoneal metastases (synchronous, metachronous), peritoneal cancer index (≤ 10, > 10), and planned HIPEC regimen (mitomycin C, oxaliplatin).
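The allocation by minimization described above can be illustrated with a minimal sketch. It assumes a purely deterministic variant with equal weights for the four stratification factors and a random tie-breaker; the trial's actual algorithm, weights, and any random element are not specified here, and the class name, field names, and level labels are hypothetical.

```python
import random
from collections import defaultdict

ARMS = ("experimental", "control")
# Stratification factors named in the text; key names and level labels are illustrative.
FACTORS = ("previous_systemic_therapy", "onset", "pci", "hipec_regimen")


class Minimizer:
    """Deterministic minimization with equal factor weights (an assumption)."""

    def __init__(self, seed=0):
        # counts[arm][(factor, level)] = patients already in that arm with that level
        self.counts = {arm: defaultdict(int) for arm in ARMS}
        self.rng = random.Random(seed)  # used only to break ties

    def imbalance(self, arm, patient):
        # Number of earlier patients in `arm` sharing each of this patient's levels.
        return sum(self.counts[arm][(f, patient[f])] for f in FACTORS)

    def assign(self, patient):
        scores = {arm: self.imbalance(arm, patient) for arm in ARMS}
        lowest = min(scores.values())
        candidates = [arm for arm in ARMS if scores[arm] == lowest]
        arm = self.rng.choice(candidates)  # random choice only when arms are tied
        for f in FACTORS:
            self.counts[arm][(f, patient[f])] += 1
        return arm


# Two patients with an identical (hypothetical) profile end up in different arms.
profile = {
    "previous_systemic_therapy": "no",
    "onset": "synchronous",
    "pci": "<=10",
    "hipec_regimen": "mitomycin C",
}
minimizer = Minimizer(seed=1)
first = minimizer.assign(profile)
second = minimizer.assign(profile)
```

In this sketch each new patient goes to the arm that currently contains fewer patients with the same factor levels, which keeps the arms balanced on every stratification factor at once.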

Interventions

Perioperative Systemic Therapy

At the physician’s discretion, perioperative systemic therapy comprised either six two-weekly neoadjuvant and six two-weekly adjuvant cycles of FOLFOX (5-fluorouracil, leucovorin, oxaliplatin), four three-weekly neoadjuvant and four three-weekly adjuvant cycles of CAPOX (capecitabine, oxaliplatin), or six two-weekly neoadjuvant cycles of FOLFIRI (5-fluorouracil, leucovorin, irinotecan) followed by either six two-weekly adjuvant cycles of 5-fluorouracil with leucovorin or four three-weekly adjuvant cycles of capecitabine.4,8 Bevacizumab was added to the first three (CAPOX) or four (FOLFOX or FOLFIRI) neoadjuvant cycles.4,8 In case of unacceptable toxicity, switching from CAPOX or FOLFOX to FOLFIRI (and vice versa) was allowed during neoadjuvant treatment, as was switching to fluoropyrimidine monotherapy during adjuvant treatment. Perioperative systemic therapy was terminated in case of disease progression, unacceptable toxicity, the patient’s request, or the physician’s decision.

Surgery

CRS–HIPEC was performed according to the standardized Dutch protocol.9 CRS was performed only if macroscopic complete CRS was deemed achievable after explorative laparotomy. HIPEC was performed only if macroscopic complete CRS was achieved, using mitomycin C or oxaliplatin according to local protocol.9 In case of unresectable disease or macroscopic incomplete CRS, trial treatment was stopped and patients were offered off-protocol palliative treatment.

PRO Assessment

Patients were asked to give separate informed consent for PRO assessment. PROs were assessed using three validated questionnaires (EORTC QLQ-C30,10 EORTC QLQ-CR29,11 EuroQol EQ-5D-5L12) before trial treatment, after completion of neoadjuvant treatment (experimental arm only), and 3 and 6 months after (intended) surgery. At the patient’s preference, questionnaires were sent on paper or electronically using certified software (Research Manager, Deventer, the Netherlands). Supplementary Table S1 presents the PROs of each questionnaire. The manuals of EORTC and EuroQol were used to calculate scores for all PROs.13,14,15 In general, PROs can be divided into function scales (with higher scores indicating better functioning) and symptom scales (with higher scores indicating worse symptoms). For the primary study aim (i.e., comparison of PROs between both arms), five PROs were predefined by the investigators as the most appropriate to assess overall health and treatment tolerability: visual analog scale, global health status, physical functioning, fatigue, and C30 summary score. For the secondary study aim (i.e., longitudinal exploration of PROs of patients receiving perioperative systemic therapy in the experimental arm), all PROs were analyzed.
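To make the direction of function and symptom scales concrete, the following minimal sketch assumes the standard linear transformation to a 0–100 scale described in the EORTC scoring manual, with items answered on a 1–4 scale (range 3) and a scale scored only if at least half of its items are answered. The function names and example answers are hypothetical, and the scoring actually applied in the trial followed the cited manuals.

```python
def raw_score(items):
    """Mean of the answered items; None if fewer than half are answered."""
    answered = [x for x in items if x is not None]
    if not items or len(answered) * 2 < len(items):
        return None
    return sum(answered) / len(answered)


def function_scale(items, item_range=3):
    """0-100 score where a higher value means better functioning."""
    rs = raw_score(items)
    return None if rs is None else (1 - (rs - 1) / item_range) * 100


def symptom_scale(items, item_range=3):
    """0-100 score where a higher value means worse symptoms."""
    rs = raw_score(items)
    return None if rs is None else ((rs - 1) / item_range) * 100


# Hypothetical answers (1 = "not at all", 4 = "very much").
best_functioning = function_scale([1, 1, 1, 1, 1])   # no limitations reported
moderate_symptom = symptom_scale([2, 3, None])       # one of three items missing
too_many_missing = symptom_scale([2, None, None])    # fewer than half answered
```

Because the two transformations mirror each other, the same raw answers yield a high function score and a low symptom score, which is why the direction of each scale must be stated when differences are reported.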

Statistical Analysis

The investigators and the ethics committee agreed upon an a priori determined sample size of 80 patients for the phase II trial as a sufficient number to assess the feasibility and safety of perioperative systemic therapy.4,8 Given the explorative nature of PRO analyses, no PRO hypothesis was defined a priori. As the present study aimed to assess PROs of actual treatment rather than treatment assignment, analyses were done in a modified intention-to-treat PRO population of all patients starting neoadjuvant treatment (experimental arm) or undergoing upfront surgery (control arm). Statistical tests were performed two-sided using IBM SPSS Statistics (v25.0, IBM Corp, Armonk, NY, USA). Baseline characteristics of the modified intention-to-treat PRO population were compared between both arms using Student’s t-test or Mann–Whitney U test for continuous variables and chi-square test or Fisher’s exact test for categorical variables, with p < 0.05 being considered statistically significant for these comparisons.

For the primary study aim (i.e., comparison of five predefined PROs between both arms), all patients who completed questionnaires at two or more comparative timepoints (i.e., baseline and 3 and 6 months postoperatively) were included. In these patients, differential effects in scores over time and scores at each timepoint were compared between both arms using linear mixed modeling (LMM) with the use of maximum likelihood estimation and an unstructured covariance matrix with a two-level structure [i.e., repeated timepoints (lower level), patients (higher level)]. If there were no statistically significant differences in differential effects in scores over time and in scores at each timepoint, scores of both arms were merged to longitudinally compare baseline scores with scores at subsequent timepoints using LMM. To account for multiple testing in primary comparative analyses, p < 0.01 was considered statistically significant (Bonferroni correction: p < 0.05 divided by five main comparisons).
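The Bonferroni-corrected threshold used for the primary comparisons follows from simple arithmetic, sketched below. The function names and the example p-values are illustrative assumptions and are not taken from the trial analysis.

```python
def bonferroni_threshold(alpha=0.05, n_comparisons=5):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


def is_significant(p_value, alpha=0.05, n_comparisons=5):
    """True if the p-value falls below the corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_comparisons)


threshold = bonferroni_threshold()  # 0.05 divided by five main comparisons

# Hypothetical p-values: significant at 0.05 does not imply significant at 0.01.
borderline = is_significant(0.033)
clear = is_significant(0.001)
```

A p-value between 0.01 and 0.05 is therefore treated as not statistically significant in the primary analyses.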

For the secondary study aim (i.e., longitudinal exploration of all PROs of patients receiving perioperative systemic therapy in the experimental arm), all patients who completed questionnaires at baseline and after neoadjuvant treatment were included. In these patients, baseline scores were compared with scores measured after neoadjuvant treatment using LMM. All PROs with a statistically significant difference in scores between these timepoints were further analyzed and (graphically) presented. To account for multiple testing in secondary explorative analyses, statistical significance was pragmatically set at p < 0.01.

For each statistically significant difference, a Cohen’s d (CD) effect size was calculated to assess its clinical relevance, with CD ≥ 0.5 being considered clinically relevant.16 Since means were used to determine effect sizes and to present differences, all PRO scores were presented as mean (standard deviation) regardless of distribution.
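As an illustration of the effect size calculation, the sketch below assumes the common form of Cohen's d in which the absolute mean difference is divided by the pooled standard deviation of the two sets of scores. The exact variant used in the analysis is not stated here, and the input values are hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Absolute mean difference divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / pooled_sd


def clinically_relevant(d, threshold=0.5):
    """Apply the CD >= 0.5 cut-off for clinical relevance."""
    return d >= threshold


# Hypothetical scores: baseline mean 70 (SD 20) versus follow-up mean 60 (SD 20).
d = cohens_d(70, 20, 60, 20)
relevant = clinically_relevant(d)
```

With equal standard deviations the pooled value equals that standard deviation, so a 10-point difference on a scale with a standard deviation of 20 sits exactly at the relevance cut-off.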

Results

Between 15 June 2017 and 9 January 2019, 233 patients were eligible for trial participation, 80 were randomized (40 to each arm, baseline characteristics of the intention-to-treat population in Supplementary Table S2), and 79 gave informed consent for PRO assessment (Fig. 1). The modified intention-to-treat PRO population comprised all these 79 patients, of whom 37 started neoadjuvant treatment (experimental arm) and 42 underwent upfront surgery (control arm) (Fig. 1). Table 1 presents the baseline characteristics of the modified intention-to-treat PRO population. The intention-to-treat population and the modified intention-to-treat PRO population had comparable distributions of baseline characteristics (Table 1, Supplementary Table S2).

Fig. 1

Patient pathway and response rates (including reasons for non-response) at all timepoints. CRS–HIPEC cytoreductive surgery and hyperthermic intraperitoneal chemotherapy, PRO patient-reported outcome. (a) Reasons for discontinuation to CRS–HIPEC (experimental): four unresectable peritoneal metastases. (b) Reasons for discontinuation to CRS–HIPEC (control): five unresectable peritoneal metastases, one unexpected liver metastases. (c) All due to progressive disease (experimental + control). (d) Baseline and 3 and 6 months postoperatively

Table 1 Baseline characteristics of the modified intention-to-treat PRO population

Figure 1 presents the patient pathway and questionnaire response rates (including reasons for nonresponse) at each timepoint. Overall response rates were 99% (78 of 79 patients) at baseline, 95% (35 of 37 patients) after completion of neoadjuvant treatment (experimental arm), 84% (66 of 79 patients) at 3 months postoperatively, 76% (60 of 79 patients) at 6 months postoperatively, and 87% (239 of 274 timepoints) in the entire phase II trial. Response rates were comparable between both arms at all timepoints (data not shown). PRO scores of all patients at all timepoints are presented in Table 2. Primary comparative analyses and secondary explorative analyses were performed in 68 and 35 patients, respectively (Fig. 1). PRO scores of patients included in primary comparative analyses and secondary explorative analyses are presented in Supplementary Table S3 and Supplementary Table S4, respectively.
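The reported response rates can be reproduced from the counts given above. The short sketch below uses only those counts; the variable names are illustrative.

```python
# (questionnaires returned, questionnaires expected) per timepoint, as reported in the text
responses = {
    "baseline": (78, 79),
    "after neoadjuvant treatment": (35, 37),
    "3 months postoperatively": (66, 79),
    "6 months postoperatively": (60, 79),
}

# Response rate per timepoint, rounded to whole percentages
rates = {name: round(100 * done / expected) for name, (done, expected) in responses.items()}

# Overall response rate across all timepoints
total_done = sum(done for done, _ in responses.values())
total_expected = sum(expected for _, expected in responses.values())
overall_rate = round(100 * total_done / total_expected)
```

The overall denominator of 274 timepoints is the sum of the three timepoints shared by all 79 patients and the additional post-neoadjuvant timepoint for the 37 patients in the experimental arm.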

Table 2 PRO scores of all patients at all timepoints

Primary Comparative Analyses

Figure 2 shows the primary comparisons of five predefined PROs between both arms, with corresponding LMM presented in Supplementary Table S5.

Fig. 2

Primary comparison of five predefined PROs between both arms. Lines represent mean scores; dashed lines represent standard deviations

Visual analog scale. Differential effects over time (p = 0.315) and scores at each timepoint were comparable between both arms (Fig. 2A, Supplementary Table S5). Overall, compared with baseline, visual analog scale worsened at 3 months postoperatively [mean difference (MD) − 10, 95% confidence interval (CI) − 15 to − 4, p = 0.001, CD 0.42] and returned to baseline at 6 months postoperatively (p = 0.932) (Supplementary Table S5).

Global health status. Differential effects over time (p = 0.444) and scores at each timepoint were comparable between both arms (Fig. 2B, Supplementary Table S5). Overall, compared with baseline, global health status remained stable at 3 months postoperatively (p = 0.017) and at 6 months postoperatively (p = 0.479) (Supplementary Table S5).

Physical functioning. Differential effects over time (p = 0.460) and scores at each timepoint were comparable between both arms (Fig. 2C, Supplementary Table S5). Overall, compared with baseline, physical functioning worsened at 3 months postoperatively (MD − 9, 95% CI − 13 to − 6, p < 0.001, CD 0.50) and returned to baseline at 6 months postoperatively (p = 0.039) (Supplementary Table S5).

Fatigue. Differential effects over time (p = 0.642) and scores at each timepoint were comparable between both arms (Fig. 2D, Supplementary Table S5). Overall, compared with baseline, fatigue worsened at 3 months postoperatively (MD + 15, 95% CI 9 to 20, p < 0.001, CD 0.71) and returned to baseline at 6 months postoperatively (p = 0.345) (Supplementary Table S5).

C30 summary score. Differential effects over time (p = 0.033) and scores at each timepoint were comparable between both arms (Fig. 2E, Supplementary Table S5). Overall, compared with baseline, C30 summary score worsened at 3 months postoperatively (MD − 7, 95% CI − 10 to − 4, p < 0.001, CD 0.56) and returned to baseline at 6 months postoperatively (p = 0.482) (Supplementary Table S5).

Secondary Explorative Analyses

Explorative LMM in the experimental arm showed that four PROs had a statistically significant difference in scores between baseline and after neoadjuvant treatment: fatigue, loss of appetite, hair loss, and loss of taste. Figure 3 shows these PROs, with corresponding LMM shown in Supplementary Table S6. All other PROs had no statistically significant difference in scores between baseline and after neoadjuvant treatment.

Fig. 3

PROs with a statistically significant difference in scores between baseline and after neoadjuvant treatment in secondary explorative analyses in the experimental arm. Lines represent mean scores; dashed lines represent standard deviations; hollow dots indicate a statistically significant difference compared with baseline

Fatigue. Fatigue differed over time (p < 0.001, Fig. 3A, Supplementary Table S6): compared with baseline, it worsened after neoadjuvant treatment (MD + 14, 95% CI 6–23, p = 0.001, CD 0.61), was still worse at 3 months postoperatively (MD + 17, 95% CI 9–26, p < 0.001, CD 0.85), and returned to baseline at 6 months postoperatively (p = 0.931).

Loss of appetite. Loss of appetite differed over time (p < 0.001, Fig. 3B, Supplementary Table S6): compared with baseline, it worsened after neoadjuvant treatment (MD + 15, 95% CI 5–25, p = 0.003, CD 0.67) and was still worse at 3 months postoperatively (MD + 16, 95% CI 6–29, p = 0.003, CD 0.66) and at 6 months postoperatively (MD + 14, 95% CI 4–25, p = 0.007, CD 0.55).

Hair loss. Hair loss differed over time (p = 0.002, Fig. 3C, Supplementary Table S6): compared with baseline, it worsened after neoadjuvant treatment (MD + 18, 95% CI 9–27, p < 0.001, CD 0.84) and had returned to baseline at 3 months postoperatively (p = 0.047) and at 6 months postoperatively (p = 0.105).

Loss of taste. Loss of taste differed over time (p < 0.001, Fig. 3D, Supplementary Table S6): compared with baseline, it worsened after neoadjuvant treatment (MD + 27, 95% CI 19–36, p < 0.001, CD 1.03), was still worse at 3 months postoperatively (MD + 16, 95% CI 7–25, p = 0.001, CD 0.90), and returned to baseline at 6 months postoperatively (p = 0.074).

Discussion

In patients with resectable colorectal peritoneal metastases, randomized to perioperative systemic therapy or CRS–HIPEC alone, all predefined PROs (i.e., visual analog scale, global health status, physical functioning, fatigue, C30 summary score) were comparable between both arms at baseline and 3 and 6 months postoperatively. These PROs returned to baseline at 3 or 6 months postoperatively in both arms. Secondary explorative analyses in the experimental arm showed statistically significant and clinically relevant worsening of fatigue, hair loss, loss of taste, and loss of appetite after neoadjuvant treatment. Except for loss of appetite, these PROs returned to baseline at 3 or 6 months postoperatively.

To the knowledge of the authors, the present study is the first to compare PROs between perioperative systemic therapy and CRS–HIPEC alone for resectable colorectal peritoneal metastases. Findings of the present study provide relevant insight into the burden of perioperative systemic therapy in this setting and show acceptable treatment tolerability. Together with the previously demonstrated safety and feasibility of perioperative systemic therapy in patients with resectable colorectal peritoneal metastases,8 results of the present study justify a phase III trial and may facilitate its informed consent. To the knowledge of the authors, PROs have also never been compared between perioperative systemic therapy and surgery alone in patients with other malignancies. As a result, findings of the present study may also be valuable for physicians administering similar perioperative systemic regimens to patients with other malignancies that require extensive surgery.

A recent systematic review identified 14 other studies reporting PROs in patients undergoing CRS–HIPEC.17 However, none of these studies specifically focused on perioperative systemic therapy and its possible effect on PROs.17 Nevertheless, the only two studies specifically focusing on PROs after CRS–HIPEC for colorectal peritoneal metastases reported postoperative recovery times of PROs similar to the present study.18,19

The present study showed worsening of fatigue, loss of appetite, hair loss, and loss of taste after neoadjuvant treatment. Although these symptoms are generally recognized as common observer-reported side effects of systemic therapy for colorectal cancer in clinical trials,20 PROs after neoadjuvant treatment for (potentially) resectable metastatic colorectal cancer have never been reported. While all the worsening PROs after neoadjuvant treatment (except for loss of appetite) returned to baseline levels at 3 or 6 months postoperatively, patients in the experimental arm underwent treatment for a considerably longer period than patients in the control arm. They may therefore have experienced a longer period of worsened PROs. Nevertheless, the present study suggests that these worsening PROs after neoadjuvant treatment did not translate into a postoperative difference in the five predefined general PROs between both arms, even though many patients in the experimental arm were receiving adjuvant treatment at the time of the first postoperative PRO measurement at 3 months postoperatively. Several factors may explain the absence of differences in these predefined postoperative PROs between both arms. First, patients’ psychological adaptation to their changing health status over time, a phenomenon called response shift, could have contributed to the lack of worsening of postoperative PROs in the experimental arm despite toxicity of perioperative systemic therapy.21 Second, patients receiving perioperative systemic therapy may have an increased belief in cure,22 as superior survival with perioperative systemic therapy is the hypothesis of the CAIRO6 trial. Third, patients could have had the perception that side effects of perioperative systemic therapy are a sign of treatment efficacy.22 The latter two factors may have counteracted the possible negative effects of perioperative systemic therapy and its toxicity on PROs in the experimental arm.

The main strength of the present study is the overall response rate of 87%, which is high compared with other PRO studies: 65% in a randomized trial of neoadjuvant chemoradiotherapy and surgery versus surgery alone in esophageal cancer, and 65% in a systematic review of metastatic colorectal cancer trials.21,23 Given the severity of the disease and the treatment intensity, the authors expected a higher chance of bias due to drop-outs. Nevertheless, unavoidable drop-out of the most severely ill patients during the trial could have led to overestimation of PRO scores at 3 and 6 months postoperatively in both groups. As the drop-out percentages did not differ between both groups (chi-square p = 0.307, data not shown), the authors conclude that a comparison between both groups can still be made and well interpreted. The main limitation of the present study is the relatively small sample size of 80 patients. Although LMM allowed detection of both statistically significant and clinically relevant differences, a larger sample size could have detected additional statistically significant fluctuations in PROs that may have been clinically relevant.

Conclusions

In patients with resectable colorectal peritoneal metastases randomized to perioperative systemic therapy or CRS–HIPEC alone, all predefined PROs were comparable between both arms and returned to baseline at 3 or 6 months postoperatively. Though several PROs worsened after neoadjuvant treatment, all of these (except for loss of appetite) returned to baseline at 3 or 6 months postoperatively. Together with the trial’s previously reported feasibility and safety data, these findings show acceptable tolerability of perioperative systemic therapy in this setting and justify a phase III trial.