Multicenter Prospective Cohort Study of the Patient-Reported Outcome Measures PRO-CTCAE and CAT EORTC QLQ-C30 in Major Abdominal Cancer Surgery (PATRONUS): A Student-Initiated German Medical Audit (SIGMA) Study

Background The patient-reported outcomes (PRO) version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) and the computerized adaptive testing (CAT) version of the EORTC quality-of-life questionnaire QLQ-C30 have been proposed as new PRO measures in oncology; however, their implementation in patients undergoing cancer surgery has not yet been evaluated. Methods Patients undergoing elective abdominal cancer surgery were enrolled in a prospective multicenter study, and postoperative complications were recorded according to the Dindo–Clavien classification. Patients reported PRO data using the CAT EORTC QLQ-C30 and the PRO-CTCAE to measure 12 core cancer symptoms. Patients were followed up for 6 months postoperatively. The study was carried out by medical students of the CHIR-Net SIGMA study network. Results Data of 303 patients from 15 sites were obtained and analyzed. PRO-CTCAE symptoms ‘poor appetite’, ‘fatigue’, ‘exhaustion’ and ‘sleeping problems’ increased after surgery and peaked 10–30 days postoperatively. At 3–6 months postoperatively, no PRO-CTCAE symptom differed significantly from baseline. Patients reported higher ‘social functioning’ (p = 0.021) and overall quality-of-life scores (p < 0.05) 6 months after cancer surgery compared with the baseline level. There was a lack of correlation between postoperative complications or death and any of the PRO items evaluated. Feasibility endpoints for student-led research were met. Conclusion The two novel PRO questionnaires were successfully applied in surgical oncology. Postoperative complications do not affect health-related quality of life or common cancer symptoms following major cancer surgery. The feasibility of student-led multicenter clinical research was demonstrated, but might be enhanced by improved student training. Supplementary Information The online version contains supplementary material available at 10.1245/s10434-021-09646-z.

"… outcomes collected directly from the patient without interpretation by clinicians or others". 8 With patients growing older and increasingly comorbid, the implementation of PRO measures helps to complement efficacy endpoints such as survival or morbidity and safety data, thus adding the patients' perspective to clinical trials. 9 The application of PRO measures in surgical oncology is of special interest not only as a clinical outcome parameter in routine care and clinical trials; PRO measures might also be used to manage symptoms and complications. 10 Thousands of PRO measures have been developed in medicine; 11 however, in cancer patients, health-related quality of life (HRQoL) and symptom scores are arguably the most important PROs. 12 In contrast to symptom measures, HRQoL is a multidimensional construct "… encompassing physical and occupational function, psychological state, social interaction and somatic sensation". 13 HRQoL and symptom PRO measures can be categorized into generic, cancer-specific, or disease-specific questionnaires. 12 Generic measures allow comparison with healthy individuals, while cancer- and disease-type-specific tools aim to measure symptoms and HRQoL in all cancer patients or in patients with a specific cancer disease, respectively.
A National Cancer Institute (NCI) consensus conference has proposed 12 cancer-specific symptoms that should be evaluated in all cancer trials (i.e. cancer-specific), but has left open which PRO measures should be used. 14 The recently developed PRO version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE™) could be a possible candidate, but no data from patients undergoing abdominal cancer surgery have yet been published. 15 Therefore, the feasibility and absolute values of PRO-CTCAE™ assessment remain unclear in this population. Similarly, the European Organisation for Research and Treatment of Cancer (EORTC) has recently developed a computerized adaptive testing (CAT) version of their HRQoL questionnaire QLQ-C30 (CAT EORTC QLQ-C30). 16 The CAT EORTC QLQ-C30 is a cancer-specific questionnaire, i.e. it allows the comparison of HRQoL across multiple cancer types. Again, no data have yet been published for patients undergoing abdominal cancer surgery for this tool. In addition, understanding whether PROs overlap with or diverge from clinical outcomes such as survival and postoperative complications is critical to their application in quality assessment and improvement strategies. Therefore, the objectives of this cohort study were as follows.
1. Describe the absolute values of (a) CAT EORTC QLQ-C30 and (b) a set of 12 cancer-specific symptoms 14 measured via the PRO-CTCAE™ 15 for patients undergoing major abdominal cancer surgery.
2. Describe the relationship between short-term clinical outcomes (morbidity) and PRO-CTCAE™ measurements in the short-term (within 30 postoperative days [POD]) and long-term (after 3 and 6 months).
3. Correlate short-term clinical outcomes (morbidity) with the long-term HRQoL at 3 and 6 months.
4. Describe the relationship between long-term clinical outcome (overall survival) and both PRO-CTCAE™ and HRQoL according to the CAT version of the EORTC QLQ-C30.
The PATRONUS study was conducted by medical students under the supervision of academic surgeons (CHIR-Net SIGMA study group; see the Methods section). Student-led multicenter clinical research is a relatively novel field; it is therefore unclear how well medical students can motivate patients to submit PRO data and whether a large group of students remains dedicated to completing a multicenter prospective study. Hence, the following two additional educational objectives were evaluated.
1. To evaluate the rate of missing PRO data in a student-led research network.
2. To train and interest medical students for clinical research and surgery, defined as number of participating trial sites including patients compared with initiated trial sites (feasibility endpoint).

Study Design
PATRONUS is a multicenter, prospective, single-arm, observational cohort study conducted according to the published study protocol. 17 The study report was written according to the current STROBE cohort guidelines. 18

Setting
The PATRONUS study was initiated, conducted, analyzed, and reported by the Student-Initiated German Medical Audit (SIGMA) study group. SIGMA is a Germany-wide, student-led clinical research network affiliated with the CHIR-Net, the clinical trial network of the German Surgical Society. 19 SIGMA offers medical students the opportunity to participate in student-led clinical research under the supervision of academic surgeons. Participating medical students are trained in workshops to acquire the theoretical and practical know-how to conduct substantial parts of clinical studies independently and to act as peer teachers for fellow students. 20 Data analyses, interpretation, and reporting are performed by student members under the auspices of statisticians and CHIR-Net facilitators. PATRONUS is the first clinical study of the SIGMA study group. 17 The study was conducted at the following sites: University Hospitals of Heidelberg, Berlin, Dresden, Frankfurt, Freiburg, Hamburg, Kiel, Lübeck, Mannheim, Münster, Mainz, Munich (Ludwig-Maximilians University and Technische Universität München), and two non-university academic hospitals: Evangelical Hospital Herne and Asklepios Hospital Langen. The PATRONUS study was approved by the responsible independent Ethics Committees (Heidelberg: 11 September 2017, reference S-466/2017) and was registered with the German Clinical Trials Register (DRKS00013035) on 26 October 2017. Patients were recruited between February 2018 and March 2019 and were followed up until 6 months after surgery. 17 Patient data were obtained using electronic case report forms (eCRF) entered into the REDCap electronic data capture system. 21 Data security was assured by restricting data access to authorized and trained study members only.
Based on a study-specific data validation plan, queries were created in case of missing or implausible data entries, which had to be clarified by the study investigators and medical students to enhance the validity of data collection. Data were either obtained from patients or their records, or entered directly by patients themselves (for PRO measures).

Participants
Adult (≥18 years) patients were screened preoperatively. Patients scheduled for elective abdominal surgery for confirmed or suspected malignancy were approached for informed consent. Inclusion criteria were 17 (1) patient age ≥18 years; (2) patient was scheduled for elective abdominal surgery for confirmed or suspected malignancy; (3) patient's ability to understand the character of the study; (4) planned laparoscopic or open surgery or any variant (i.e. laparoscopic-assisted, laparoscopic-thoracoscopic); and (5) written informed consent. Exclusion criteria were (1) language barrier that impedes follow-up or informed consent; and (2) American Society of Anesthesiologists (ASA) grade 4 or higher.

Variables and Data Sources/Measurement
Both PROs and clinical outcomes were gathered.
1. The German translation of the PRO-CTCAE™ 15,22 was used to assess a core set of 12 cancer-associated symptoms as recommended by the NCI (fatigue, insomnia, pain, anorexia, dyspnea, cognitive problems, anxiety, nausea, depression, sensory neuropathy, constipation, and diarrhea). 14
2. The CAT version of the quality-of-life questionnaire EORTC QLQ-C30 was used to evaluate HRQoL, 23 and comprises five functional scales (physical, role, cognitive, emotional, and social functioning), three symptom scales (fatigue, pain, and nausea and vomiting), and a global health and quality-of-life scale. The CAT EORTC QLQ-C30 questionnaire takes patients' individual priorities into account to increase precision. 16 An online survey tool of the EORTC CAT group was used that was linked to the REDCap study database. 21

3. Postoperative complications of grades II-V within 30 days according to the Dindo-Clavien classification (DCC) 24 (short-term clinical outcome). Complications were defined as minor (grade II) and major (grade III-V).
4. Long-term clinical outcome, defined as overall survival within 6 months postoperatively.
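
The minor/major grouping of complication grades described above can be illustrated with a minimal Python sketch. The function name `dcc_category`, the string input format (Roman numerals with optional sub-grade letters such as "IIIa"), and the treatment of grade I or a missing grade as "none" are assumptions made for illustration; the study itself recorded only grades II-V and performed its analyses in SAS.

```python
from typing import Optional

# Roman-numeral Dindo-Clavien grades mapped to integers.
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}


def dcc_category(grade: Optional[str]) -> str:
    """Map a Dindo-Clavien grade to 'none', 'minor' (II) or 'major' (III-V)."""
    if grade is None:
        return "none"
    # Assumed input format: drop sub-grade letters, e.g. 'IIIa' -> 'III'.
    base = grade.strip().upper().rstrip("AB")
    if base not in ROMAN:
        raise ValueError(f"unknown Dindo-Clavien grade: {grade!r}")
    value = ROMAN[base]
    if value >= 3:
        return "major"
    if value == 2:
        return "minor"
    # Assumption: grade I was not recorded in the study, so it counts as 'none'.
    return "none"


if __name__ == "__main__":
    for g in [None, "I", "II", "IIIa", "IVb", "V"]:
        print(g, "->", dcc_category(g))
```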

Sample Size
This was a cohort study with an explorative nature, thus no formal sample size calculation was performed. The initial goal was to achieve an average recruitment rate of 30 patients per center. 17

Statistical Methods
All evaluations were carried out using the SAS version 9.4 software package (SAS Institute Inc., Cary, NC, USA). As this was an exploratory study, the analysis is descriptive, and all p-values have to be interpreted in a descriptive sense. p-values <0.05 were considered significant (in a descriptive sense). Missing values were described by relative and absolute frequencies, but were not imputed and were therefore dropped from the respective analyses. The number of recruited patients was smaller than planned, 17 which is why most of the analyses were only performed as univariate analyses. For the analysis of the correlation between PRO and short- and long-term clinical outcome, group comparisons of the subscales of CAT EORTC QLQ-C30 and PRO-CTCAE™ were performed by analysis of variance and Chi-square tests. Continuous variables were described using the number of non-missing values, mean, standard deviation, median, Q1, Q3, minimum and maximum. Moreover, for continuous variables, analyses of variance between subgroups defined by tumor entity (upper gastrointestinal, pancreatic, hepatobiliary, colorectal, others) and t-tests for all pairwise comparisons of these entities were performed. For binary or categorical variables, absolute and relative frequencies were provided. In addition, Chi-square tests were calculated for binary or categorical variables between the tumor entity subgroups. All objectives were analyzed for every visit where the required data were collected (visits 3-8). PRO-CTCAE™ scores and EORTC QLQ-C30 scores were adjusted for baseline (visit 1), i.e. for correlation analyses the differences between visit and baseline values were calculated. For regression analyses, baseline values were included as an independent variable, and for the final analysis, missing values in the items of PRO-CTCAE™ and CAT EORTC QLQ-C30 were handled as described in the scoring manuals of the two QoL measures.
Further missing values were documented, and their frequencies were reported descriptively. Since there is no official scoring method for PRO-CTCAE™, the mean of the item values for each of the 12 symptoms was used for the analysis. If all values for a subscale were missing, the mean of this subscale was also set to missing. In addition, Kaplan-Meier graphs and log-rank tests between different tumor entities were used to analyze overall survival. Pairwise log-rank tests were performed using the Tukey method for adjustment for multiple testing. Cox regression analyses to evaluate possible relationships between time to death and different baseline covariates were performed as univariate analyses due to the small number of events. Spearman's rank correlations between short-term clinical outcomes (Comprehensive Complication Index [CCI] value) 25 and the set of PRO-CTCAE™ symptoms (difference between visit and baseline score) in the short- and long-term were calculated, and the long-term HRQoL PRO measure (difference between visit and baseline score) was analyzed for all subscales. Correlation strengths were defined as 0.00-0.19, 'very weak'; 0.20-0.39, 'weak'; 0.40-0.59, 'moderate'; 0.60-0.79, 'strong'; and 0.80-1.0, 'very strong'. The entire statistical analysis was predetermined in a statistical analysis plan (SAP) written before closure of the database.
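
The scoring and correlation steps described in this section can be sketched in a few lines of plain Python. This is an illustration only, not the SAS code used in the study: all function names are hypothetical, averaging over the available (non-missing) items is an assumption, and the cut-offs between the published strength bands (for example, a coefficient of 0.195) are resolved here with "less than" comparisons, which is also an assumption.

```python
from statistics import mean
from typing import List, Optional, Sequence


def symptom_score(items: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the non-missing item values; None if every item is missing."""
    present = [v for v in items if v is not None]
    return mean(present) if present else None


def change_from_baseline(visit: Optional[float],
                         baseline: Optional[float]) -> Optional[float]:
    """Difference between a visit score and the baseline (visit 1) score."""
    if visit is None or baseline is None:
        return None
    return visit - baseline


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[Optional[float]],
                 y: Sequence[Optional[float]]) -> float:
    """Spearman correlation on complete pairs (pairs with a missing value are dropped).

    Raises ZeroDivisionError if either variable is constant.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    rx = average_ranks([a for a, _ in pairs])
    ry = average_ranks([b for _, b in pairs])
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def strength_label(rho: float) -> str:
    """Verbal label for the absolute correlation coefficient."""
    r = abs(rho)
    if r < 0.20:
        return "very weak"
    if r < 0.40:
        return "weak"
    if r < 0.60:
        return "moderate"
    if r < 0.80:
        return "strong"
    return "very strong"
```

For example, a symptom with item values `[1, None, 3]` would be scored as the mean of the two available items, and a symptom with only missing items would stay missing rather than being imputed.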

Data Sharing
Requests for data sharing will be reviewed on an individual basis by the Steering Committee or the coordinating investigator. The data sharing process will comply with the good practice principles for sharing individual participant data, and data sharing will be undertaken in accordance with the applicable regulatory requirements. In particular, the privacy of the patients will be protected throughout (i.e. only anonymous data will be shared).

Participants
From February 2018 to March 2019, 347 patients were enrolled in the study at 15 German centers (Fig. 1). A total of 21 patients were excluded for the following reasons: 11 patients did not undergo surgery, 8 patients were ineligible, and 2 patients retracted their informed consent. The remaining 326 patients underwent surgery for confirmed or suspected malignancy. A total of 23 patients were either lost to follow-up after enrollment (n = 19) or terminated the study earlier due to other reasons (n = 4). Thus, 303 patients (87%) of the 347 enrolled patients were considered for subsequent analysis (Fig. 1).
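
The participant flow reported above can be re-derived with a short arithmetic check. The following Python sketch uses only the counts stated in the text; the variable names and dictionary labels are illustrative.

```python
# Counts taken from the participant flow described in the text.
enrolled = 347
exclusions = {"no surgery": 11, "ineligible": 8, "consent withdrawn": 2}
dropouts = {"lost to follow-up": 19, "terminated early for other reasons": 4}

# Patients who underwent surgery, then patients available for analysis.
operated = enrolled - sum(exclusions.values())
analyzed = operated - sum(dropouts.values())

# Share of enrolled patients that entered the analysis, in percent.
analyzed_share = round(100 * analyzed / enrolled, 1)

print(f"operated: {operated}, analyzed: {analyzed} ({analyzed_share}% of enrolled)")
```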

Baseline Data
The most frequent cancer type was hepatobiliary (n = 85, 28.05%), while the least frequent cancer type was upper gastrointestinal malignancies (n = 48, 15.8%) (Table 1). Participants undergoing pancreatic cancer surgery were significantly older than patients with hepatobiliary tumors, with the latter having more ASA III scores. Weight loss was frequent, especially in patients with upper gastrointestinal and pancreatic tumors (68.75% and 56.92%, respectively). Only 15% of pancreatic cancer patients had received neoadjuvant treatment, while almost half of the upper gastrointestinal patients had undergone neoadjuvant therapy (47.92%). BMI and medical comorbidities as common risk factors for postoperative complications did not differ significantly between tumor entities. Medical comorbidities were frequent (82.84%).

Surgery
The duration of operations varied between entities, ranging from 212.0 ± 113.0 min for hepatobiliary surgeries to 301.2 ± 134.2 min for pancreatic surgeries (Table 2). The estimated blood loss was more than twice as high in pancreatic surgery as in colorectal surgery (748 ± 708 mL vs. 353.8 ± 502.7 mL). In over 95% of cases, colorectal cancers were resected, while only 80% of hepatobiliary tumors were resectable.
According to the PRO-CTCAE™ (Fig. 2), 'poor appetite', 'fatigue', 'exhaustion or missing energy', and 'sleeping problems' increased postoperatively, peaked between POD 10-30 (visit 5 or 6), and decreased 3-6 months after surgery (visits 7 and 8). In contrast, 'diarrhea' increased postoperatively and remained constant over time. 'Nausea' was significantly increased at visits 3, 5, and 6 compared with baseline, but normalized 3 months (visit 7) after surgery. Similarly, 'anxiety' only decreased during long-term follow-up (3 and 6 months after surgery; visits 7 and 8). Two weeks after surgery (visit 5), elevated levels were measured for 'fatigue' and 'exhaustion or missing energy'. Even 1 month after surgery (visit 6), patients stated 'concentration problems' significantly more often than before surgery. Six months after surgery (visit 8), no PRO-CTCAE™ symptoms differed significantly compared with baseline, except 'diarrhea'.
The CAT EORTC HRQoL questionnaire (Fig. 3) revealed that patients at visit 8 were more often confronted with 'dyspnea' and 'financial difficulties' compared with baseline (p = 0.032); however, patients stated higher 'social functioning' at this timepoint (p = 0.021) compared with the preoperative level. Overall HRQoL scores differed significantly between baseline and visits 6, 7, or 8 (p < 0.05).

Perioperative Complications and Quality of Life (Objectives 2 and 3)
Of 303 patients, 302 could be analyzed regarding postoperative complications; 112 patients had no complications, 86 patients had minor complications (grade II), and 104 patients had major complications (grade III-IV) according to the DCC. All PRO-CTCAE™ symptoms correlated only (very) weakly with complications (minor or major) (Fig. 3a). Looking at major complications only, the PRO-CTCAE™ symptoms of 'constipation' (blockage) at visit 3 (POD 3-5; p = 0.027) and 'poor appetite' at visit 6 (1 month postoperatively; p = 0.002) were significantly correlated with major complications.
In total, 37 hospitals were contacted by the CHIR-Net SIGMA study group (electronic supplementary Fig. 2). Of 37 contacted sites, 15 trial sites were unable to participate because no students and/or supervising surgeons were able to form joint teams within the recruitment period. The remaining 22 sites initiated the study; however, 7 sites dropped out due to problems regarding trial infrastructure or approval from local Ethics Committees, resulting in 15 of 22 initiated sites (68.18%) enrolling patients in the study.
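
The feasibility endpoint (participating sites relative to initiated sites) follows directly from the site counts above. A minimal Python check, with illustrative variable names, is:

```python
# Site counts taken from the text.
contacted_sites = 37
unable_to_participate = 15
dropped_after_initiation = 7

# Sites that initiated the study and sites that actually enrolled patients.
initiated_sites = contacted_sites - unable_to_participate
enrolling_sites = initiated_sites - dropped_after_initiation

# Feasibility endpoint: enrolling sites as a percentage of initiated sites.
feasibility_pct = round(100 * enrolling_sites / initiated_sites, 2)

print(f"{enrolling_sites} of {initiated_sites} initiated sites enrolled patients "
      f"({feasibility_pct}%)")
```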

Further Analysis
Overall survival was analyzed via Kaplan-Meier graphs (electronic supplementary Fig. 4) and showed significant differences between the tumor entities.

DISCUSSION
PATRONUS was the first multicenter, student-led clinical study in Germany. It combined proof-of-feasibility of student-led clinical research on the one hand and evaluation of clinical research questions on the other hand. Study conception, planning, acquisition, and analysis of data were performed by more than 100 medical students at 15 sites across Germany under the supervision of academic surgeons. The study is thus a model for research-based learning, a concept in higher education that gives students the opportunity to gain knowledge by conducting their own scientific inquiries or investigations that are of interest to the scientific or … 26 More than 60% of the centers that were initiated finally enrolled patients (Fig. 4), indicating that this concept is accepted and feasible. Several findings are noteworthy. First, measurements of cancer-associated symptoms via the newly developed PRO-CTCAE™, as well as assessment of HRQoL via the new CAT version of the EORTC QLQ-C30, were technically possible via an electronic data capture system. Data completeness was >90% at baseline and >80% in the immediate postoperative period (POD 3-5); we therefore concluded that PRO measurement was accepted by patients undergoing major abdominal surgery. The two new PRO tools are attractive for a wider application in surgical oncology: because both tools have also been employed in medical oncology and palliative care, 16,27,28 they allow a standardized symptom assessment across multiple domains of cancer treatment. They would therefore allow assessment of the total cancer and treatment burden from the perspective of individual patients or patient groups along an entire healthcare pathway. 29
As both PRO measures can be tailored to specific needs and situations, either by offering a wide set of standardized symptom scores or by using a CAT approach, these tools allow a more personalized PRO assessment, thus alleviating the burden of answering unsuitable, lengthy standard questionnaires. 7,30 This also points to one of the limitations of our study. In order to capture PRO symptoms across multiple cancer types, we used cancer-specific, but not disease-specific, PRO measurements, i.e. the CAT version of the EORTC QLQ-C30 and a general set of 12 cancer symptoms as recommended by the NCI. 14 Therefore, depending on cancer type and intervention, patients might additionally consider other items important that were not covered by our set of measures. Future studies using disease-specific PRO measures will need to fill this gap.
To our knowledge, this is the first report of PRO-CTCAE™ and CAT EORTC QLQ-C30 data in surgical oncology. The data provided in this publication and its supplements can thus be used for sample size calculation in future trials or for standardization and quality measurements in regular care. Although symptoms are directly affected by oncological surgery postoperatively (Fig. 2a), overall HRQoL, as well as functional subscales of the CAT EORTC QLQ-C30, either normalized or even improved compared with baseline within 6 months after surgery. This has been previously reported for other PRO measures 31,32 and confirms the multidimensional construct of HRQoL, which extends beyond symptom assessment and can be stable even under severely adverse conditions. 33 Overall, these results confirm major abdominal surgery as an adequate intervention from the patient's HRQoL point of view.
There was a lack of moderate or strong correlation between PRO measures and postoperative morbidity. Consequently, we see a challenge for the use of patient-reported core cancer symptoms in predicting postoperative complications. Although this has been reported in other trials using different symptom and HRQoL measures, 33 other studies have reported the opposite. 34 In addition, Dumitra et al. reported that correlations also depend on the complication grading system and not only on the complication itself. 35 Therefore, future studies with larger sample sizes will need to elucidate this association more clearly. Given the small sample size in some of the subgroups, larger cohorts are needed to identify specific PRO symptoms that can be used as early detection markers for looming surgical complications. To this end, disease-specific PRO measures and symptom scores could be used for specific tumor types. The idea of using automated symptom reporting in conjunction with clinical examination and laboratory findings to predict postoperative complications is attractive in surgical oncology, especially in settings where major complications are frequent.
There are several limitations to our study. First, although 15 sites enrolled 303 patients in the PATRONUS study (average 20.2 patients per site), we fell short of the intended 30 patients per site. Furthermore, we were unable to recruit the planned 30 trial sites, mostly because of delays in patient recruitment due to the lengthy process of obtaining ethics approval for each individual site. During this time, some mini-teams consisting of students and academic surgeons broke apart. Furthermore, it needs to be pointed out that not all patients underwent resection and that a small subgroup of patients had benign histologies on their final pathology report (Table 2). We kept these patients in the final analyses in line with our prespecified inclusion criteria (preoperative confirmed or suspected malignancy). Another shortcoming is that average data completeness of PRO measures dropped to 55-58% at 3 and 6 months postoperatively. A recent systematic review of US FDA cancer trials reported a median PRO data completion rate of 89%, ranging from 33 to 100%. 36 The reason for the drop in data completeness during follow-up was the relatively long follow-up period, which put a considerable strain on the already busy schedule of most medical students. Accordingly, other successful student-led studies in the UK focused on shorter data capture periods or were cross-sectional audits rather than prospective studies. 37-39 Data completeness might be increased by strengthening the centralized automated monitoring via the electronic capture system. This would give immediate feedback and would avoid a delayed query process. In addition, query management and response need to be part of the pre-study training workshop. 20 Moreover, most participating hospitals were large tertiary university centers, which limits the external validity of our results. Therefore, patient groups and surgeries performed (Tables 1, 2) might not reflect surgical practice in other hospitals.
Finally, although the study was intended to increase the knowledge, skills, and competencies of participating medical students in clinical research, we did not measure educational goals in the current study. However, evaluation of our clinical investigator training prior to study participation showed an increase in clinical research knowledge in a pre/post test. 20 Several studies have shown that exposure to research during medical school correlates with engagement in research later on. 40,41

CONCLUSION
Despite the fact that only low correlations between patient-reported symptoms and complications were found, PRO-CTCAE™ and CAT EORTC QLQ-C30 are promising PRO tools for surgical oncology as they elucidate the patients' perspective on surgical treatment and can be implemented electronically in the postoperative setting and after discharge. Furthermore, patients undergoing major abdominal surgery exhibit HRQoL scores similar to or better than their preoperative scores. Student-led multicenter clinical research is feasible. Currently, the CHIR-Net SIGMA study group is conducting a randomized controlled trial investigating the effect of fitness trackers and enhanced postoperative mobilization on postoperative complications. 42 This trial (EXPELLIARMUS; UTN: U1111-1228-3320) takes into account the lessons learned from PATRONUS.

ACKNOWLEDGEMENTS
The authors would like to thank Morten Aa. Petersen (University of Copenhagen, Denmark) and the EORTC CAT group for their kind support in using the CAT EORTC QLQ-C30. They would also like to acknowledge the IT support of Stefan Gram (University of Copenhagen) and Matthias Ganzinger (University of Heidelberg) in linking the REDCap study database to the online CAT EORTC survey tool.

OPEN ACCESS
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.