The availability of high-quality studies reporting patient-reported outcome (PRO) data utilising robustly developed patient-reported outcome measures (PROMs) offers several advantages to patient care, including their utility within shared decision-making discussions. Baseline PRO data has been shown to act as a prognostic factor for overall survival in cancer patients,1 including those with advanced malignancy.2,3 Integrating PROs into clinical care to monitor adverse effects of cancer treatment can also enhance patient quality of life,4 and has even been reported to improve survival.5,6 The interest in utilising PROMs from both a clinical and academic standpoint continues to grow given the potential utility of these outcome measures, including in patients with locally recurrent rectal cancer (LRRC). The inclusion of PROs is particularly important in the context of advanced malignancy such as LRRC. LRRC can lead to debilitating symptoms such as pain, bleeding/discharge from the rectum, pelvic sepsis, urinary symptoms, lower limb symptoms and impaired sexual function. Surgical resection represents the only curative treatment option for patients with LRRC, with 5-year survival rates of 42.4% - 63% reported by specialist tertiary centres.7,8,9,10,11 Exenterative surgery has evolved, with ultra-radical techniques such as high sacrectomy and extended lateral pelvic sidewall excision (ELSiE) developed in recent years. These techniques can offer potential cure to patients with LRRC but are generally accompanied by significant morbidity.12,13,14 In this context, balancing the patients’ existing symptoms, the potential survival benefits to be gained from treatment and their impact on PROs is essential to enabling patients to make informed decisions regarding their care.

However, it is crucial that the methodological quality of the studies reporting PROs and the PROMs used are sufficient to produce valid and reliable results, particularly in complex disease settings. Validity is the degree to which a PROM measures the construct it purports to measure.15 In a clinical context, such as in measuring health-related quality of life (HrQoL) in patients with LRRC, a PROM can only be considered valid if there is evidence that it has been developed with input from patients with LRRC and provides a comprehensive assessment of HrQoL as the construct of interest, meaning that all aspects of HrQoL that are relevant to patients with LRRC are included. PROMs can be designed as disease-specific or generic: a generic PROM measures concepts which are broadly relevant to the population, whereas a disease-specific PROM measures concepts specific to a group of patients with a particular condition. To be considered valid in a specific group of patients, both disease-specific and generic PROMs should be shown to have content validity in the group of patients they have been designed for.

The existing evidence concerning PROs in LRRC has several methodological limitations. These include heterogeneity in the groups of patients included, with outcomes frequently reported in combined cohorts of patients with primary and recurrent disease,16,17,18,19 heterogeneity in comparator groups, and significant variability in the PROMs used and the timing of PROM assessment.16,17,18,19 The majority of existing studies are retrospective in nature18 and the evidence is generally low in quality.16,17,18,19,20 Denys et al.’s review, which focused on patient-centred outcomes following pelvic exenteration for colorectal cancer, including both primary and recurrent disease, also found that the impact of urinary complications, discomfort or pain on sitting and functional disability are inadequately represented in the PROMs currently being used.19

This review sought to evaluate the methodological quality of the existing evidence concerning PROs in LRRC, utilising a systematic approach. The specific aims of the review were to identify the PROMs currently being used to report outcomes in patients with LRRC, to examine the methodological quality of the studies against criteria informed by the Consolidated Standards of Reporting Trials – Patient-Reported Outcome (CONSORT-PRO) extension,21,22 and to evaluate the psychometric properties of the PROMs identified using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) Risk of Bias checklist.23,24

Methods

This systematic review was conducted using a pre-specified protocol in keeping with Cochrane guidelines,25 and reported in line with the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) checklist.26 The review was registered on the international prospective register of systematic reviews, PROSPERO (reference: CRD42022332577).

Eligibility Criteria

Studies in adults (aged ≥ 18) with LRRC that included PROMs as a primary or secondary outcome measure were included. Studies in patients with LRRC undergoing any form of treatment, with curative or palliative intent, were eligible for inclusion. Studies in patients with a history of only local excision for primary rectal cancer who developed a regrowth or recurrence were excluded. Only studies published in the English language were considered. Case reports, conference abstracts, study protocols, reviews and letters were excluded.

Information Sources

The search was undertaken using the PubMed, Embase and CINAHL databases, including studies published from 1966 (PubMed), 1980 (Embase) and 1981 (CINAHL) up until 14th September 2022. The search strategy can be found in the supplementary material. Reference searching was also undertaken to identify additional studies. Studies describing the psychometric properties of the PROMs identified from this search were retrieved from citations and through manual searching to enable evaluation of the psychometric properties of the PROMs identified.

Selection Process

Titles and abstracts of studies retrieved were exported to EndNote X9 (Clarivate Analytics, Philadelphia, USA) and duplicates removed. The titles and abstracts were uploaded to Rayyan online software and screened for relevance by two authors (NM and ER). The full texts of potentially eligible studies were retrieved and assessed; any queries regarding the eligibility of a study were resolved through discussion with senior authors.

Data collection process

Data concerning the characteristics of the studies included and the quality of the reporting of PROMs against criteria informed by the CONSORT-PRO checklist were extracted independently by authors NM and ER into Excel®. The COSMIN Risk of Bias checklist23 was completed using the Excel® template available from the COSMIN website27 independently by authors NM and FH. Any differences in data extraction or ratings were discussed with senior authors to reach consensus.

Data Items

Quality of Reporting of PROMs

There are currently no checklists available via the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) network regarding the inclusion of PRO data for observational studies. The CONSORT-PRO extension was developed to promote transparent reporting of trials including PROs as primary or secondary outcomes, facilitating the interpretation of PRO results for use in clinical practice.22 The CONSORT-PRO checklist was used to inform the evaluation of studies identified in relation to how the findings were reported and whether the methodology of the study and the PROMs used were sufficient to capture significant and meaningful findings.

PROM Psychometric Properties

The psychometric properties of the PROMs identified were evaluated using the COSMIN Risk of Bias checklist. The COSMIN Risk of Bias checklist for systematic reviews was developed to assess risk of bias of studies on measurement properties of PROMs;23 this information can be used to identify the most appropriate PROM for a specific purpose or study. There are ten criteria (see Figure 1). PROM development and content validity are the first to be assessed; if a PROM is deemed to have insufficient content validity, it should not undergo further assessment. Once sufficient evidence for content validity has been identified, the internal structure and remaining measurement properties are assessed. Studies are qualitatively summarised to give an overall rating of sufficient (+), insufficient (-), inconsistent (±), or indeterminate (?) for each measurement property.28 The quality of the evidence is rated using a modified Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach.29
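The gating step described above can be sketched in a few lines of Python. This is an illustrative simplification and not the COSMIN procedure itself: the `Rating` enum, the function name and the example property names are assumptions introduced here, and the sketch treats anything other than a sufficient (+) content validity rating as a reason to stop.

```python
from enum import Enum


class Rating(Enum):
    """Overall ratings used to summarise each measurement property."""
    SUFFICIENT = "+"
    INSUFFICIENT = "-"
    INCONSISTENT = "±"
    INDETERMINATE = "?"


def properties_to_assess(content_validity: Rating, remaining: list[str]) -> list[str]:
    """Return the measurement properties that proceed to further assessment.

    Assumption for this sketch: only a sufficient (+) content validity
    rating opens the gate to the internal structure and remaining properties.
    """
    if content_validity is not Rating.SUFFICIENT:
        return []
    return list(remaining)


# Hypothetical property names, used only to exercise the function.
remaining_properties = ["structural validity", "internal consistency", "reliability"]

print(properties_to_assess(Rating.INDETERMINATE, remaining_properties))
print(properties_to_assess(Rating.SUFFICIENT, remaining_properties))
```

Under this sketch, a PROM with no evidence of content validity in the target population returns an empty list, so no further measurement properties would be reviewed for it.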

Fig. 1

Summary of the COSMIN Risk of Bias Checklist. *Cross-cultural validity was not assessed in this review as the search strategy was not deemed suitable for identifying all studies describing this psychometric property. **The COSMIN panel determined that no gold standard exists for PROMs30 and therefore criterion validity was not assessed in this review.

Risk of Bias Assessment

Risk of bias was assessed using the Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool,31 and the revised tool to assess Risk of Bias in randomised trials (RoB 2).32

Data Synthesis

A basic descriptive analysis was undertaken to report the number of patients with LRRC included in the studies identified and the proportion of these patients who contributed to PROM assessments.

Results

Study Selection

A total of 1475 references were identified; 147 duplicates and 5 animal studies were removed. Abstracts were screened for 1323 references and the full texts of 56 references were retrieved. Thirty-one eligible references were included from the search strategy in addition to 4 references identified through manual searching (see Figure 2).
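As a quick arithmetic check of the selection counts reported above, the flow can be reproduced as follows. The variable names are illustrative; the figures are those stated in the text.

```python
# Counts taken from the study selection paragraph.
identified = 1475
duplicates_removed = 147
animal_studies_removed = 5

# References remaining for title and abstract screening.
screened = identified - duplicates_removed - animal_studies_removed

# Eligible references from the search plus those found by manual searching.
included_from_search = 31
included_from_manual_search = 4
included_total = included_from_search + included_from_manual_search

print(f"Screened: {screened}")              # 1323
print(f"Included in review: {included_total}")  # 35
```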

Fig. 2

PRISMA flow diagram

Study Characteristics

A summary of the characteristics of the studies is presented in Table 1. The 35 studies included a total of 1914 patients with LRRC, of whom PROM data was reported for 1104 (57.7%). Twenty-one (63.6%) of the studies identified were published in the last decade. The studies were conducted mostly in Europe (n = 18, 51.4%), Australia (n = 13, 37.1%) or the USA (n = 4, 11.4%), with one study conducted in China (2.9%). Twenty-six (74.3%) studies recruited patients from a single centre. The majority were prospective cohort studies (n = 19, 54.3%), in addition to cross-sectional (n = 7, 20.0%), case-control (n = 5, 14.3%), retrospective cohort (n = 2, 5.7%) and randomised studies (n = 2, 5.7%). Eight (22.9%) of the studies identified included only patients with LRRC, in addition to two (5.7%) case-control studies comparing patients with LRRC to other cohorts, with sample sizes of patients with LRRC ranging from 12 to 117 patients. The other 23 (69.7%) studies included combined cohorts of patients with primary and recurrent pelvic disease including LRRC, with sample sizes ranging from 12 to 710 patients in total. The median number of PROM assessments was two (IQR 1). In the 19 prospective, longitudinal studies identified, median follow-up was 12 months (IQR 15); the longest follow-up time point was 8 years.33
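The descriptive proportions for PROM data completeness and study design can be recomputed from the counts given above. This is a minimal sketch: the helper `pct` and the dictionary layout are assumptions made for illustration, while the counts are those reported in the text.

```python
def pct(count: int, total: int) -> float:
    """Percentage of `total` represented by `count`, to one decimal place."""
    return round(100 * count / total, 1)


TOTAL_STUDIES = 35

# Study designs and the number of studies of each type, as reported.
study_designs = {
    "prospective cohort": 19,
    "cross-sectional": 7,
    "case-control": 5,
    "retrospective cohort": 2,
    "randomised": 2,
}

# The design counts should account for every included study.
assert sum(study_designs.values()) == TOTAL_STUDIES

# Proportion of patients with LRRC for whom PROM data was reported.
print(f"PROM data reported: {pct(1104, 1914)}%")

for design, n in study_designs.items():
    print(f"{design}: n = {n}, {pct(n, TOTAL_STUDIES)}%")
```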

Table 1 Summary of studies identified

Risk of Bias

Risk of bias was high overall, with 32 (91.4%) studies at high or serious risk of bias (see supplementary Figures 1 and 2).

Results of Individual Studies

Quality of Reporting of PROMs

The assessment of the studies identified against criteria informed by the CONSORT-PRO checklist is illustrated in Figure 3. None of the studies included in the review met all eleven criteria for the quality of reporting of PROMs, with an overall median of 5.8 (58.3%) criteria. The least reported criteria were defining the PROM of interest (n = 3, 8.6%), describing the statistical approach to missing PRO data (n = 6, 17.1%), and detailing a PRO hypothesis (n = 6, 17.1%). The most commonly met criterion was the identification of a PRO as a primary or secondary outcome (n = 35, 100.0%).

Fig. 3

Quality of Reporting of PROMs in LRRC

Characteristics of the PROMs Identified

Seventeen PROMs and two clinician-reported outcome measures (MSTS and Spitzer) were identified. The most commonly reported PROMs were the EORTC QLQ-C30 (n = 12, 34.3%),34,36,45,46,48,51,52,53,56,57,61,63 the SF-36 (n = 11, 31.4%),7,35,36,37,38,40,41,43,44,48,55 the FACT-C (n = 10, 28.6%)7,33,35,37,40,41,43,49,55,60 and the EORTC QLQ-CR29 (formerly CR38) (n = 8, 22.9%).34,36,48,51,52,57,61,63

Four of the PROMs identified were specific to patients with cancer (see Table 2); however, there were no disease-specific PROMs for patients with LRRC. The cancer-specific measures included the EORTC QLQ-C30, which is a measure of QoL in patients with cancer, and the Functional Living Index – Cancer (FLIC), which is a measure of functional state in adult patients with cancer. Two measures which are cancer-site specific were also identified: the EORTC QLQ-CR29 and FACT-C, which are both measures of QoL in patients with primary colorectal cancer.

Table 2 Summary of cancer-specific measures identified

Seven PROMs which relate to forms of function or functional limitations were identified (Table 3), covering bowel function, physical function, and sexual function. The Low Anterior Resection Syndrome (LARS) score is a measure of bowel dysfunction following low anterior resection for rectal cancer, and the St. Mark’s Faecal Incontinence Score is a measure for adult patients with faecal incontinence. The Lower Extremity Functional Scale (LEFS) is a measure of lower extremity physical function designed for patients with lower extremity orthopaedic conditions. Four of the measures identified were measures of sexual function: the Sexual Health Inventory for Men (SHIM) and the International Index of Erectile Function (IIEF), which are measures of erectile dysfunction developed for use in male patients with a history of erectile dysfunction; the Female Sexual Function Index (FSFI), a measure of sexual function for female patients with a history of sexual arousal disorder; and the Sexual function – Vaginal changes Questionnaire (SVQ), a measure of sexual and vaginal problems developed for patients with a history of gynaecological cancer.

Table 3 Summary of measures related to function or functional limitations

Six of the PROMs identified were generic measures (see Table 4), including three measures of QoL for use in adult patients, the 36-Item Short Form Survey (SF-36), EuroQoL (EQ-5D) and Assessment of Quality of Life (AQOL-4D); two measures of pain intensity, the Verbal Numerical Rating Scale (VNRS) and Visual Analogue Scale (VAS); and finally one measure of pain, the Brief Pain Inventory (BPI).

Table 4 Summary of generic measures identified

The three remaining measures included (see Table 5) were clinician-reported rather than patient-reported. These included the Late Effects of Normal Tissue – Subjective, Objective, Management, and Analytic (LENT-SOMA) scoring system for late effects of radiotherapy, which includes a subjective scale to be completed by patients, with the remainder completed by clinicians. The Spitzer is a clinician-reported measure of QoL for patients with cancer or other chronic diseases and the Musculoskeletal Tumour Society Score (MSTS) is a clinician-reported measure of physical function for patients with musculoskeletal neoplasms.

Table 5 Summary of other measures identified

PROM Psychometric Properties

The psychometric properties were only assessed for PROMs and not the LENT-SOMA or the clinician-reported outcome measures, Spitzer and MSTS.

Content Validity

None of the PROMs identified were developed specifically for patients with LRRC (Tables 2, 3, 4 and 5) and no studies were identified in which the psychometric properties of these PROMs were evaluated in patients with LRRC.

Internal Structure and Remaining Measurement Properties

Content validity is the most important measurement property of a PROM and therefore full review is not advised if a PROM does not meet criteria for content validity.

Discussion

There has been an expansion in PROM reporting in LRRC, with several papers (n = 21, 63.6%) published in the last decade. However, despite this increase, these studies are methodologically limited by the use of non-validated measures to assess PROs in this cohort of patients. This systematic review did not identify a disease-specific PROM available for use in LRRC and none of the PROMs identified met the COSMIN criteria for content validity in the context of LRRC. The most used PROMs in LRRC were the FACT-C (n = 10, 28.6%), SF-36 (n = 11, 31.4%), EORTC QLQ-C30 (n = 12, 34.3%) and EORTC QLQ-CR29 (n = 8, 22.9%), none of which have demonstrated content validity specifically for patients with LRRC.

Overall, the findings build on the existing evidence16,17,18,19 of variable methodological quality of reporting of PROMs within small sample sizes and mixed disease cohorts. This review focuses specifically on the methodological quality of PRO reporting using criteria informed by the CONSORT-PRO checklist; common weaknesses were identified in several domains, including defining the PRO of interest, describing the statistical approach to missing data and stating PRO-specific limitations and implications for generalisability. These results were comparable to those reported in Efficace et al.’s pooled analysis of randomised cancer trials utilising CONSORT-PRO,76 though methods of PRO data collection had higher levels of reporting in this current review. Ultimately, the key limitation identified is the lack of input from patients with LRRC in the PROMs currently being used, with none demonstrating content validity for use in this context. Content validity is the most important measurement property of a PROM; for PROMs to give meaningful results in LRRC, it is essential that they are relevant to patients with LRRC and present a comprehensive assessment of the construct of interest. Without addressing the lack of an appropriate PROM for use in patients with LRRC, the impact of addressing issues such as heterogeneity in the groups of patients included, the comparator groups used, and the timing of PROM assessment, is likely to be limited.

Harji et al. reported the development of the Locally Recurrent Rectal Cancer – Quality of Life (LRRC-QoL) conceptual framework through undertaking a systematic review and qualitative focus groups to identify the HrQoL issues relevant to patients with LRRC.18,77 The themes identified were symptoms, sexual function, psychological impact, role and social functioning, future perspective and healthcare service utilisation and delivery. Nineteen (54.3%) of the studies identified in this review have been published since this work,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51 using a median of two PROMs, with the EORTC QLQ-CR29 and FACT-C most used. The EORTC QLQ-CR29 and FACT-C have also both demonstrated robust psychometric properties, including content validity, in patients with primary colorectal cancer.78,79 When compared with the LRRC-QoL conceptual framework,77 the EORTC QLQ-CR29 covers 50% of the LRRC-specific domains, including symptoms, sexual function, and psychological impact. It does not, however, cover the domains of role functioning or future perspective. The FACT-C covers 66.7% of the LRRC-specific domains identified in the LRRC-QoL conceptual framework, including symptoms, psychological impact, role functioning, and future perspective; it does not cover sexual function. Neither the EORTC QLQ-CR29 nor the FACT-C covers issues relating to healthcare services, self-efficacy and body image, future plans, disease re-recurrence, gynaecological or locomotor symptoms. The evidence identified reporting outcomes utilising these PROMs should not be completely disregarded, as the EORTC QLQ-CR29 and FACT-C capture a proportion of the issues relevant to patients with LRRC. However, it should be interpreted with caution, as they are unlikely to capture the full scope and complexity of the range of issues patients with LRRC experience.18,77
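The domain coverage figures above follow from counting how many of the six framework themes each measure addresses: three of six for the EORTC QLQ-CR29 and four of six (about two thirds) for the FACT-C. The sketch below reproduces that arithmetic. The data structure and function name are assumptions for illustration, and "role functioning" in the text is mapped here to the "role and social functioning" theme.

```python
# The six themes of the LRRC-QoL conceptual framework, as listed in the text.
FRAMEWORK_DOMAINS = {
    "symptoms",
    "sexual function",
    "psychological impact",
    "role and social functioning",
    "future perspective",
    "healthcare service utilisation and delivery",
}

# Domains each PROM is described as covering.
COVERED = {
    "EORTC QLQ-CR29": {"symptoms", "sexual function", "psychological impact"},
    "FACT-C": {
        "symptoms",
        "psychological impact",
        "role and social functioning",
        "future perspective",
    },
}


def coverage_pct(covered: set[str]) -> float:
    """Percentage of framework domains covered, to one decimal place."""
    return round(100 * len(covered & FRAMEWORK_DOMAINS) / len(FRAMEWORK_DOMAINS), 1)


for prom, domains in COVERED.items():
    print(f"{prom}: {coverage_pct(domains)}% of framework domains")

# Domains that neither measure covers.
uncovered_by_both = FRAMEWORK_DOMAINS - set().union(*COVERED.values())
print(f"Covered by neither: {sorted(uncovered_by_both)}")
```

At the level of the six themes, only healthcare service utilisation and delivery is missed by both measures; the further gaps named in the text (for example body image or locomotor symptoms) sit below this level of granularity and are not modelled here.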

A number of PROMs which measure issues relevant to patients with LRRC were identified in this review. Urinary and sexual function were evaluated using specific questionnaires for this purpose by two studies;36,53 however, other questionnaires, such as the EORTC QLQ-CR29, also contain items concerning sexual and urinary function. No specific PROMs concerning stoma-related quality of life were used in the studies identified, despite being relevant to patients with LRRC.77 However, PROMs such as the EORTC QLQ-CR29 and FACT-C contain items specifically for patients with stomas. The increasing number of PROMs currently being used in LRRC reflects the lack of an existing disease-specific measure which adequately reports all the PROs relevant to this cohort of patients. The trend to include several PROMs is likely to reflect the greater understanding of the wider issues which affect patients with LRRC. However, the measures identified in this review are not valid for use in patients with LRRC and therefore this is not a psychometrically robust approach to addressing the lack of an LRRC disease-specific measure. Additionally, this approach potentially increases the burden of participation for patients, without sufficient methodological justification.

There are limitations related to the evidence included in this review. Notably, most of the studies identified have a high risk of bias (n = 32, 91.4%) and their findings should generally be interpreted with caution. They also present a predominantly Western perspective of PROs in LRRC and demonstrate a lack of multi-centre, international reporting of PROs in LRRC. Furthermore, 13 (37.1%) of the studies identified were conducted within a single centre, reporting cohorts of patients which may potentially overlap. It was not possible to assess the availability and quality of translated PROMs in this review. However, to further the success of initiatives such as the PelvEx collaborative in advancing international outcome reporting in this cohort of patients80 and integrating PRO data, it is essential that PROMs undergo a rigorous process of cross-cultural adaptation.

There are several approaches which could be employed to address the lack of PROMs with content validity for patients with LRRC. It is possible to demonstrate the content validity of existing PROMs specifically for LRRC; however, given the narrow breadth of relevant HrQoL issues captured by existing measures, this approach will require significant revision to make these measures applicable to LRRC.77 Employing a modular approach to PROM assessment in LRRC is an alternative, provided both the core cancer and site-specific measures are appropriately revised and validated for use in LRRC. Development of a new disease-specific PROM for use in patients with LRRC, capturing concerns that are specific to these patients and allowing the impact of particular treatments on PROs such as HrQoL to be monitored more accurately, is likely to be the most realistic and valid approach.81 The development of the LRRC-QoL PROM will build on the development of the LRRC-QoL conceptual framework.77 The LRRC-QoL is the first disease-specific PROM developed for use in patients with LRRC82 and has been designed to be used in combination with the EORTC QLQ-C30, in a modular fashion, which would allow comparison across patient groups. Recruitment to a study to externally validate the LRRC-QoL for use internationally is currently underway (ISRCTN13692671) and includes a robust cross-cultural adaptation process to produce versions of the LRRC-QoL for use in several countries.

Conclusion

This systematic review highlights key methodological issues in the current state of reporting of PROs in LRRC, finding that none of the PROMs currently being used in LRRC are able to provide meaningful results within this context. Future studies in this disease area should focus on utilising PROMs that have undergone a robust development process with the inclusion of patients with LRRC, to ensure high quality, accurate results which are relevant to this patient group. The development of a disease-specific PROM for patients with LRRC or undertaking content validity studies of existing PROMs are approaches which could be employed to enable this, in addition to undertaking cross-cultural adaptation to enable international reporting of outcomes. Greater emphasis should also be placed on the way in which PROMs data are reported and analysed, particularly in defining the PRO of interest and in handling missing PROM data, to ensure that results are reliable.