Background

Colorectal cancer (CRC) is a leading cause of cancer incidence and mortality worldwide [1, 2]. Patient survival is generally well predicted by the TNM stage; however, stage-independent variability remains because of the molecular heterogeneity of the tumor [3]. Molecular characteristics of the tumor that also affect patient survival, such as microsatellite instability (MSI) and mutations in BRAF and KRAS, have only recently been included in the 8th edition of the UICC recommendations as additional indicators for clinical practice guidance.

A deficiency in the mismatch repair (MMR) system, represented by the high microsatellite instability (MSI-high) phenotype, can be found in approximately 15% of CRC cases, and is associated with both sporadic (~ 12%) and hereditary (~ 3%) cancers. Patients with the MSI-high phenotype seem to have better survival than patients with microsatellite stable (MSS) tumors [4, 5]. The MSI status is closely associated with other characteristics of the tumor, such as BRAF mutations, early stage, proximal location, and a higher degree of immune infiltration. Some clinical trials focusing on stage III patients receiving chemotherapy have reported conflicting results regarding the prognosis of MSI-high patients [6,7,8], which could be explained by the varying tumor characteristics. BRAF V600E and KRAS exon 2 mutations were also associated with worse patient prognosis in previous studies; however, these associations seem to be restricted to MSS tumors [9, 10]. In combination with the TNM stage information, KRAS mutations and the MSI status of the tumor are used in clinical practice to determine treatment options [11, 12].

According to UICC recommendations, before a modified staging system can be endorsed for use in clinical practice, a classification must be validated in various external cohorts using different patient populations in several settings [13]. In a previous systematic review, multiple studies were identified that proposed molecular subtype classifications of CRC (including markers such as MSI, BRAF and KRAS) and determined their associations with patient survival [14]. None of the classifications showed significant associations for all proposed subtypes, and only two performed external validation of their results [15, 16]. Given the importance of external validation of such classifications, the aim of this study was to validate CRC molecular subtype classifications previously published in the literature and to extend their application to additional patient subgroups not previously investigated, using an independent cohort of patients from a population-based study.

Methods

Study population and design

The DACHS study is a population-based case-control study conducted in the Rhine-Neckar region in southern Germany, in which incident CRC cases are followed up as a cohort. Details of the study design have been reported previously [17]. Briefly, patients over 30 years of age, with a histologically confirmed diagnosis of CRC, and able to participate in an interview in German, were recruited between 2003 and 2010 from 22 hospitals in the region. Patients with available tumor tissue samples, molecular characterization of MSI, BRAF, KRAS and CIMP, and follow-up data were included in this analysis.

Baseline characteristics, and medical and family history of CRC were collected by trained interviewers at the time of diagnosis using a standardized questionnaire. Tumor characteristics and stage of disease (6th edition of the TNM staging manual) were obtained from medical records and pathology reports. Follow-up assessment at 3 and 5 years after diagnosis included information on the type of treatment, comorbidities, cancer recurrence, and vital status. Vital status and date of death were obtained from population registries, and determination of cause of death was based on death certificates obtained from the pertinent health authorities.

Molecular marker determination

Tumor tissue analyses were performed on formalin-fixed, paraffin-embedded (FFPE) samples. As previously described, MSI status was determined using a mononucleotide marker panel (BAT25, BAT26 and CAT25) [18]. BRAF V600E was determined independently by two experienced pathologists (HBl, MK) using immunohistochemistry (IHC) analyses in tissue microarray blocks (52%) and by mutational analysis (48%) using Sanger sequencing (exon 15). No significant differences were observed in the proportion of aberrant cases identified by the two techniques. KRAS mutations (exon 2) were determined using DNA samples by the single-stranded conformational polymorphism technique (48%) or by Sanger sequencing (52%). CIMP status was defined using a five-marker methylation panel (MLH1, MINT1, MINT2, MINT31 and MGMT) and classified according to the number of hypermethylated loci: CIMP-negative (none), CIMP-low (1 or 2 loci), or CIMP-high (3 or more loci).
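The CIMP categorization described above amounts to counting hypermethylated loci in the five-marker panel. The following Python sketch illustrates that rule only; the function name, the dictionary input format, and the handling of missing loci are assumptions made for illustration and are not taken from the study's analysis code.

```python
# Loci named in the text; the boolean-dict input format is an assumption.
CIMP_PANEL = ("MLH1", "MINT1", "MINT2", "MINT31", "MGMT")


def classify_cimp(methylated):
    """Return the CIMP category from per-locus hypermethylation calls.

    `methylated` maps each panel locus to True (hypermethylated) or False.
    """
    missing = [locus for locus in CIMP_PANEL if locus not in methylated]
    if missing:
        # Assumption: incomplete panels are rejected rather than imputed.
        raise ValueError("missing loci: " + ", ".join(missing))
    n_loci = sum(bool(methylated[locus]) for locus in CIMP_PANEL)
    if n_loci == 0:
        return "CIMP-negative"
    if n_loci <= 2:
        return "CIMP-low"
    return "CIMP-high"
```

For example, a tumor with hypermethylation at MLH1 and MINT31 only would be returned as "CIMP-low" under this rule.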

Studies for comparison

Studies that proposed molecular subtypes of colorectal or colon cancers and provided an estimate of the association with survival for each subtype were identified in a previous systematic review [14] and used as the basis for this external validation analysis. Studies were included in the systematic review if they suggested a classification system based on at least three markers and reported results for survival by each subtype. In this validation analysis we were able to include classifications of CRC based on the molecular subtypes previously suggested by Jass [19], including MSI and CIMP status, and mutation status of BRAF and KRAS genes [16, 20, 21]. Table 1 presents a summary of the population and study characteristics of the studies included in this analysis.

Table 1 Characteristics of studies for validation

Validation and statistical analysis

Demographic and clinico-pathological characteristics were analyzed for the entire study population using descriptive statistics. Patients were categorized into the same subtypes proposed by the three studies. Cox proportional hazards regression models were used to estimate hazard ratios (HRs) for cancer-specific survival (CSS), relapse-free survival (RFS), or overall survival (OS) for each subtype. CSS was defined as time from diagnosis until death from CRC, RFS as time until reappearance of disease, metastases or death from CRC, and OS as time until death from any cause. All models were adjusted using the same set of variables reported by the original studies. Classifications that were developed only among women or only in stage III colon cancer patients were validated in both the selected sub-population and the entire patient cohort, to investigate whether the classification would yield similar results in an unselected patient cohort.
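The three endpoint definitions can be made concrete as (time, event) pairs derived from a patient's follow-up record. The Python sketch below is illustrative only: the record fields are hypothetical, times are assumed to be in years since diagnosis, and deaths from causes other than CRC are assumed to be censored at the time of death for CSS and RFS, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class FollowUp:
    # Hypothetical record; all times are years since diagnosis.
    last_contact: float
    death: Optional[float] = None        # time of death, if observed
    death_from_crc: bool = False         # cause of death was CRC
    recurrence: Optional[float] = None   # first recurrence or metastasis


def endpoints(p: FollowUp) -> Dict[str, Tuple[float, bool]]:
    """Return (time, event observed) for OS, CSS and RFS."""
    died = p.death is not None
    end = p.death if died else p.last_contact
    crc_death = died and p.death_from_crc
    overall = (end, died)
    # Assumption: deaths from other causes are treated as censored.
    cancer_specific = (end, crc_death)
    if p.recurrence is not None and p.recurrence <= end:
        relapse_free = (p.recurrence, True)
    else:
        relapse_free = (end, crc_death)
    return {"OS": overall, "CSS": cancer_specific, "RFS": relapse_free}
```

A patient with a recurrence at 2 years who died of another cause at 4 years would thus contribute an event to OS (4 years) and RFS (2 years), but a censored observation to CSS (4 years).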

For the validation of the proposed classifications, the hazard ratios reported by the original studies were compared with those obtained in the DACHS cohort after creating subtypes equal to those proposed by the original studies. As suggested by Royston and Altman [22], Kaplan-Meier plots were created to evaluate the discrimination between subtypes and to allow a visual comparison of discrimination with the originally published plots. Additionally, Uno’s c-statistic was calculated for each of the models [23]. This method provides a measure of discrimination that accounts for time-to-event data and censoring in survival analyses [24]. All statistical analyses were performed in SAS version 9.4 (SAS Institute Inc., Cary, NC, USA).
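To illustrate the two discrimination tools mentioned above, the following Python sketch computes a Kaplan-Meier (product-limit) curve and a concordance statistic from scratch. It is not the SAS procedure used in this study. In particular, the concordance function implements Harrell's unweighted c-index as a simpler stand-in; Uno's c-statistic additionally weights pairs by the inverse probability of censoring, which is omitted here. All function and argument names are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(t)) at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects leave the risk set
    return curve


def harrell_c(times, events, risk):
    """Harrell's c-index (NOT Uno's): share of comparable pairs in which
    the subject with the earlier event has the higher risk score."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 from the concordance function corresponds to no discrimination and 1.0 to perfect ranking of event times by the risk score, which is the same scale on which the c-statistics in this study are reported.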

Results

DACHS population characteristics

Among 1915 cases with complete tumor marker data, 355 (19%) were classified according to the TNM system as stage I, 644 (34%) as stage II, 652 (34%) as stage III, and 264 (14%) as stage IV. The MSI and CIMP status, and mutations in BRAF were significantly associated with age, sex, location and stage of disease in descriptive analyses (Table 2). During a median follow-up of 5.3 years, 548 (29%) patients experienced disease recurrence, and 624 (33%) patients died, 414 (66%) of whom died from CRC. CRC-specific deaths occurred in 20 (9%) MSI-high cases, 44 (28%) BRAF mutated cases, 144 (23%) KRAS mutated cases, and 66 (19%) CIMP-high cases.
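The percentages above follow directly from the reported counts, as the short Python check below shows. The variable names are illustrative. Note that the stage percentages sum to 101% because each is rounded to the nearest integer, and that the 66% refers to the share of CRC deaths among all deaths rather than among all patients.

```python
total = 1915
stage_counts = {"I": 355, "II": 644, "III": 652, "IV": 264}


def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


stage_pct = {stage: pct(n, total) for stage, n in stage_counts.items()}
recurrence_pct = pct(548, total)      # patients with disease recurrence
death_pct = pct(624, total)           # deaths from any cause
crc_share_of_deaths = pct(414, 624)   # CRC deaths among all deaths
```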

Table 2 Descriptive characteristics of CRC patients according to molecular markers in the DACHS cohort

External validation results

Table 3 presents the molecular classification, original results, and validation results obtained with the DACHS cohort. In general, the distribution of patients from the DACHS study within each classification was similar to the one reported by the original studies. Larger differences between the original studies and our study were observed for the ‘Traditional’ (45.9% vs 56.8%) and ‘Alternate’ (38.4% vs 32.0%) subtypes proposed by Samadder et al. [20], and the ‘Type 4’ subtype proposed by Phipps et al. [21] (4.6% vs 1.7%). Due to the selective recruitment of stage III patients in the study by Sinicrope et al. [16], numbers of patients were lower for all subtypes in the DACHS study.

Table 3 Comparison of patient survival according to the identified classifications with external validation and additional subgroups in the DACHS cohort

In a study restricted to female CRC patients, Samadder et al. [20] proposed three molecular subtypes and two additional ‘unassigned’ subgroups. The two unassigned groups were not clearly defined and were therefore not included in this validation. Both BRAF and KRAS mutated subtypes showed worse CSS compared to the all-negative baseline; however, none of the associations were statistically significant. In the DACHS cohort, the magnitudes and directions of the measures of effect were similar in all subtypes for CSS and OS, but no significant associations were observed either. The c-statistic for the CSS and OS models was 0.828 and 0.780, respectively.

Phipps et al. [21] proposed five molecular subtypes and found significantly worse CSS for the MSS subtypes (types 2 and 3), and better CSS for the type 5 (MSI-high, BRAF non-mutated, KRAS non-mutated, CIMP-negative) subtype in comparison to the type 4 (MSS, BRAF non-mutated, KRAS non-mutated, CIMP-negative) subtype [21]. In the DACHS cohort, the magnitude and direction of the effect for CSS and OS were similar for the MSS subtypes (types 2 and 3). The MSI-high subtypes were not significantly associated with better survival (type 1: HR = 0.93 [0.5–1.8]; type 5: HR = 1.04 [0.4–2.5]), regardless of the BRAF or KRAS mutational status. The c-statistic for the CSS and OS models was 0.797 and 0.744, respectively.

Sinicrope et al. [16] described five molecular subtypes in a cohort of stage III colon cancer patients recruited in the N0147 clinical trial. Significantly worse CSS was observed for both KRAS mutated and BRAF mutated MSS subtypes, and no significant associations were found for either of the MSI-high subtypes compared to the MSS non-mutated subtype [16]. These findings were similar in the DACHS cohort, where HRs for RFS showed worse survival in the MSS groups and no significant associations in the MSI-high subtypes. The c-statistic for this model was 0.723.

Extended analyses in patient subgroups not available from the original studies

In exploratory analyses, the classifications were validated using additional subgroups of DACHS patients (see Table 3). Similar to the results in the main analysis, no significant associations were found for the classification proposed by Samadder et al. [20] in analyses restricted to men, although a tendency towards worse CSS and OS in the MSS, KRAS mutated subgroup was observed. This tendency was also observed for the subgroup analyses including both sexes, where HRs for CSS were similar to those reported for women by the original study. Stage-specific analyses for the classification proposed by Phipps et al. [21] suggested a better survival for MSI subtypes only in early stage patients (data not shown). Extended analyses of Sinicrope’s [16] classification including patients with all stages of colon and colorectal cancers showed similar associations for the MSS subtypes, and a borderline significant association of the MSI subgroups with RFS.

Visual assessment of agreement with original studies

Figure 1 presents Kaplan-Meier plots for survival in the DACHS cohort after categorizing patients according to each of the proposed classifications. In the DACHS cohort (Fig. 1a), the Kaplan-Meier plot suggested a better survival for the ‘Traditional’ pathway curve compared to the one published by Samadder et al. [20] (Fig. 3B in the original study). No differences were observed between the ‘Alternate’ and ‘Serrated’ pathways. For the classification proposed by Phipps et al. [21], a visual comparison with the originally published Kaplan-Meier plots suggested good agreement for the type 2 subtype, which corresponds to MSS, BRAF mutated, CIMP-positive tumors. The other subtypes in this classification also showed similar patterns of survival (Fig. 1b). However, type 1 and type 5 (MSI-high subtypes) showed better survival compared to the all-negative type 4 subtype in the original study (Fig. 1 in the original study [21]). The survival curves based on the classification by Sinicrope et al. [16] (Fig. 1c) showed a worse pattern of RFS for the MSS/BRAF mutated subtype. Even though the outcomes were different (disease-free survival [DFS] in the original study and RFS in ours), the patterns observed in the Kaplan-Meier plots were similar to the ones published by the authors in their own external validation cohort (Fig. 3 in the original study [16]).

Fig. 1 Kaplan-Meier curves for each classification within the DACHS patient cohort. a Cancer-specific survival for the classification by Samadder et al. [20]. b Cancer-specific survival for the classification by Phipps et al. [21]. c Relapse-free survival for the classification by Sinicrope et al. [16]

Discussion

In this external validation study of previously proposed molecular subtype classifications for the prediction of survival among CRC patients, we found that MSS cancers with BRAF or KRAS mutations generally conferred a worse prognosis compared to tumors with no such mutations. This finding was reported by two previous studies, and we were able to validate their results both in similar patient populations and additional subgroups that were not included in the original studies. Overall, all subtypes of the three proposed classifications showed similar hazard ratios and levels of significance compared to the ones reported by the original studies, except for two MSI-high subtypes proposed by Phipps et al. [21] for which we found no statistically significant associations of better survival.

In their study, Phipps et al. [21] described that patients with MSI-high tumors with or without mutations in BRAF or KRAS (types 1 and 5) had better survival than those with MSS tumors. This reflects the findings of several previous studies and meta-analyses where MSI-high tumors had a better prognosis than MSS tumors [5]. In our study, we found better survival for MSI-high tumors compared to MSS tumors but no statistically significant associations for the specific MSI subtypes (1 and 5). We attribute this difference to the different way in which patients were distributed among the subtypes: i) the proportion of stage I patients differs between the studies for type 1 (47% in original study vs 13% in our study) and type 5 (50% in original study vs 12% in our study); ii) Phipps et al. [21] included a large proportion of patients from a cohort of postmenopausal women and a second recruitment round of patients diagnosed before 50 years of age, whereas patients in our study reflect an older patient population with equal distribution of men and women. These differences in population characteristics might also be responsible for the observed survival in the different subtypes.

None of the other studies included here provided an estimate of survival for MSI-high subtypes with no mutations in BRAF and KRAS. For example, Samadder et al. [20] provided only one subtype where MSI status could be either negative or positive, and found no significant associations with survival. Sinicrope et al. [16] described two MSI-high subtypes without specifying the KRAS mutation status and found no significant associations with DFS. Both subtypes were also not associated with RFS in the validation analyses. This underlines the need to not only discriminate the MSI status of a tumor in any classification [3], but to also include information on the BRAF and KRAS mutational status. When conducting the analyses in different sets of patients (e.g. including all stages and locations) the findings were similar.

Only one of the three studies (Sinicrope et al. [16]) had performed a validation analysis in an external patient population. No measures of effect were provided for the validation cohort; however, the Kaplan-Meier plot created in our validation analysis showed similar patterns to the one provided in this study, especially for the BRAF mutated subtype. Survival curves for the other two studies also showed good agreement with the ones generated in our analysis, particularly the MSS/BRAF mutated subtype described by Phipps et al. [21].

Other CRC classifications have been published in recent years [15, 25,26,27]. The consensus molecular subtype (CMS) group included gene expression information in their classification, and proposed four CRC subtypes for which prediction of survival was only significant for one [25]. The CMS classification system was not included in the present validation analysis, because some information required for the definition of the CMS subtypes was not available from our study. The CMS1 subtype, however, which corresponds to the type 1 tumors reported by Phipps et al. [21], was not significantly associated with better survival, similar to our validation results for type 1 tumors. Even though the proposed CMS subtypes were derived from a complex methodology including information from several international studies, the information required to classify a patient into the subtypes may not be readily available in every clinical practice.

Other studies provided analyses stratifying for the MSS status of the tumor, instead of including MSS as a part of the classification and thus, were not included in this validation analysis [9, 28]. These studies included stage III colon cancer patients recruited in clinical trials for chemotherapy treatment and found significant associations only for MSS tumors with time to relapse and survival after relapse [9, 28]. A recent analysis showed that including molecular information as well as clinical and pathological characteristics in survival models improved their ability to predict overall survival in stage II and III colon cancer patients [29]. This study represents an important first step in optimizing the existing prognostic classification system of CRC and will allow for additional efforts to achieve an ideal classification. Additionally, the Immunoscore showed independent prognostic value for CRC survival after adjusting for the TNM stage and shows promising ability to complement the current system [30,31,32]. These results might be influenced by the close relation between immune infiltration and the MSI status of a tumor [33, 34], but could add value to a prognostic classification that is to complement the traditional staging system. All these diverse studies reflect the increasing international interest in the development of a more comprehensive classification that could help clinicians provide a more personalized treatment to CRC patients [35].

Due to the exploratory character of newly proposed molecular CRC classification systems, validation studies in external patient cohorts are essential before the usefulness of the proposed classification can be judged. In this large cohort study, we were able to perform validation of three previously proposed classification systems by incorporating the same set of molecular markers. We attempted to imitate the original analyses by adjusting for the same set of confounders in each case; however, there might still be differences in the assessment and definition of the variables leading to differences in the calculated estimates. On the other hand, not all hitherto proposed classifications could be validated in our study [14], since not all information required to construct the molecular subtypes was available. Also, although it is a large study, sample size decreases when stratifying patients into several subtypes and restricting the populations to match the ones reported by the original studies, which limits the power for the statistical analyses. Finally, the time periods when the patients were recruited span different decades and this could mean the chemotherapy regimens used were different between the studies.

Conclusions

In conclusion, the results of this validation analysis contribute to the evolving interest in the development of an extended, clinically meaningful classification for CRC. None of the published classifications has so far provided a definitive subtyping that allows patient survival to be predicted in all groups. Our results support the conclusion that MSS subtypes including BRAF or KRAS mutations have a worse prognosis compared to those without these mutations and that those subtypes can be readily generalized to most other patient populations. The role of CIMP status in the prediction of survival was less clear, and CIMP status is known to be highly associated with MSI. Our extended analysis indicated that the observed associations were similar in patient subgroups that were not included in the original studies. However, the MSI subtypes, for which other characteristics such as stage, location and sex are highly correlated, may require more careful evaluation before their potential survival benefit is generalized. Also, further information from methylation, gene expression, and immune response analyses in tumor tissue may help to further improve the definition of clinically relevant molecular subtypes. The present knowledge about molecular subtypes of colorectal cancer suggests that the stage of disease remains the most important predictor of survival, and that more research is needed to find molecular tumor markers or combinations that help to complement this system.