Abstract
Background
Colon cancer patients with the same stage show diverse clinical behavior due to tumor heterogeneity. We aimed to discover distinct classes of tumors based on microarray expression patterns, to analyze whether the molecular classification correlated with the histopathological stages or other clinical parameters, and to study differences in survival.
Methods
Hierarchical clustering was performed for class discovery in 88 colon tumors (stages I to IV). Pathway analysis was carried out, and correlations between clinical parameters and our classification were analyzed. Tumor subtypes were validated using an external set of 78 patients. A 167-gene signature associated with the main subtype was generated using the 3-Nearest-Neighbor method. Coincidences with other prognostic predictors were assessed.
Results
Hierarchical clustering identified four robust tumor subtypes with biologically and clinically distinct behavior. Stromal components (p < 0.001), nuclear β-catenin (p = 0.021), mucinous histology (p = 0.001), microsatellite instability (p = 0.039) and BRAF mutations (p < 0.001) were associated with this classification, but it was independent of Dukes stages (p = 0.646). Molecular subtypes were established from stage I. The High-stroma-subtype showed increased levels of genes and altered pathways distinctive of tumor-associated stroma and components of the extracellular matrix, in contrast to the Low-stroma-subtype. The Mucinous-subtype was reflected by the increased expression of trefoil factors and mucins as well as by a higher proportion of MSI and BRAF mutations. Tumor subtypes were validated using an external set of 78 patients. A 167-gene signature associated with the Low-stroma-subtype distinguished low-risk patients from high-risk patients in the external cohort (Dukes B and C: HR = 8.56 (2.53-29.01); Dukes B, C and D: HR = 1.87 (1.07-3.25)). Eight different reported survival gene signatures segregated our tumors into two groups: the Low-stroma-subtype and the other tumor subtypes.
Conclusions
We have identified novel molecular subtypes in colon cancer with distinct biological and clinical behavior that are established from the initiation of the tumor. Tumor microenvironment is important for the classification and for the malignant power of the tumor. Differential gene sets and biological pathways characterize each tumor subtype, reflecting underlying mechanisms of carcinogenesis that may be used for the selection of targeted therapeutic procedures. This classification may contribute to an improvement in the management of patients with CRC and to a more comprehensive prognosis.
Background
Colorectal cancer is one of the most common malignancies in the western world and accounts for about 10% of all cancer deaths in both Europe and the USA. Traditionally, colorectal cancer classification (Dukes, AJCC (American Joint Committee on Cancer)) is based on the extent of the cancer: depth of tumor invasion into the wall of the intestine, number of nearby affected lymph nodes and whether the cancer has metastasized to other organs of the body. Surgery is curative for a large proportion of patients at early stages, but is not enough for many patients at advanced stages. Most of these patients need adjuvant chemotherapy in order to avoid relapse or to increase survival. Unfortunately, only a small portion of them show an objective response to chemotherapy, which makes it difficult to correctly predict patients' clinical outcome [1].
Microarray gene expression profiling is a powerful tool for the identification of prognostic gene signatures. Supervised analysis of gene expression has been used to discover gene signatures that identify patients at risk of recurrence in colon cancer [2–8]. Recently, two extensively validated gene signatures have been reported, Oncotype-DX and ColoPrint [9, 10].
A different approach is to use unsupervised analysis. Clustering methods group together samples with similar expression profiles. With this strategy, new subtypes of tumors can emerge or the existing classification may be redefined, resulting in more uniform groups of tumors. Molecular homogeneity may be essential in order to identify the specific biological pathways affected, to discover precise drug targets in each subgroup or to obtain individual survival classifiers. Previous attempts to subdivide colon tumors into sub-classes or to correlate gene expression with Dukes stages using unsupervised analysis have not been conclusive. Some authors were able to correctly classify normal colon, Dukes B and C, but not Dukes A and D, and no new subgroups were identified [11]. Others were able to classify normal tissue with Dukes A in one group, B with C in another cluster, and D clustered separately [12]. Other authors were unable to find differences between stages B, C and D [13]. Some reports found differences between normal and tumor tissue and genes differentially expressed between metastatic and nonmetastatic samples [14–16], or segregated normal tissue from primary carcinomas and from liver metastasis and carcinomatoses [14, 17]. Other authors, using class comparison between Dukes A and D, identified a gene signature that could be used for the classification of low- and high-risk patients in Dukes B and C [7]. Another interesting approach described the identification of a gene expression profile generated from an experimental model of colon cancer metastasis that was able to predict cancer recurrence in patients with colon cancer [18]. Other authors reported that the epidermal growth factor receptor pathway was up-regulated in metachronous liver metastasis while angiogenesis was up-regulated in synchronous liver metastasis [19].
Unfortunately, even though there is an ample selection of gene signatures reported in the literature, almost none of them have reached clinical practice. There is a need for prognostic and predictive factors that provide authoritative information for medical decisions in routine clinical practice. Our study was mainly aimed at obtaining more homogeneous groups of tumors in colorectal adenocarcinomas, hypothesizing that molecularly more uniform groups of tumors would also be likely to discriminate patients with different clinical outcomes. In addition, understanding the biological pathways underlying each tumor subtype could help in the future to find the appropriate treatment regime.
Methods
Patients
Patients from all stages were selected, keeping approximately equal proportions of each stage (24 Dukes A or AJCC (6th edition) stage I; 26 B or II; 19 C or III; and 19 D or IV). Tumor samples were taken from the Bank of Tumors of the Hospital Clinico San Carlos between 2001 and 2006. The Bank of Tumors follows the rules established by the hospital, including the patient consent approved by the Ethical Committee of the Hospital Clinico San Carlos.
Histological analysis of tumor samples
Many reports have shown the importance of tumor-associated stroma in the development of cancer; therefore, for our study we did not use laser microdissection to isolate just the transformed epithelial cells. We wanted to analyze both the tumor cells from the malignant epithelia and the altered surrounding stroma. We took a representative fragment of the complete tumor and carried out a very detailed pathological analysis of the frozen tumor fragments used to extract the RNA and of the corresponding paraffin blocks of the tumor. Only samples with more than 80% of tumor component were included, considering tumor stroma as part of the tumor component.
RNA extraction and quality control
RNA was extracted directly from the frozen samples using TRIZOL (Invitrogen, Carlsbad, CA) and a homogenizer (Ultraturrax T8-S8N-5 G, Rose Scientific Ltd, Canada). Afterwards, RNA was treated with DNAse using the RNeasy Micro kit (Qiagen GmbH, Germany). RNA quality was measured with an Agilent Bioanalyzer 2100 (Agilent Technologies, Palo Alto, U.S.A.) and only good-quality samples, with a RIN (RNA Integrity Number) [20, 21] higher than 7.5, were selected for the analysis.
Microarray analysis
Agilent G4112 microarrays were used to analyze gene expression in 88 colon tumors and 7 normal colon tissues. A reference RNA preparation (a pool of normal colon tissue RNAs obtained from 68 individuals) was used for double hybridization: tumor-Cy5/pool-Cy3 and normal-Cy5/pool-Cy3. Agilent recommended protocols were followed. Fluorescence was measured and normalized (LOWESS) using the Agilent microarray scanner and Feature Extraction software. A Quality Control Report was generated to discard the microarrays that did not fulfill good quality criteria. From the original 44 K features microarray, a total of 28462 spots without flags in 90% of the microarrays were used. Only probes that were significantly (p < 0.01) up- or down-regulated vs. the reference pool in at least 7 samples (considering the 7 normal tissue samples as the smallest group) were selected, giving 17392 spots. Probes with the same gene identification were averaged to obtain a total of 14764 genes. For classification purposes we chose the genes that showed the highest variation between tumors, selecting the genes that in more than 7 samples had at least a 2.5-fold change from the gene median value. This resulted in 1722 genes that were used for the unsupervised analysis of the 89 samples (tumor CT102 was replicated). Cluster reproducibility was measured by the robustness index (R-index) and by the discrepancy index (D-index) [22]; analyses were performed using BRB-ArrayTools developed by Dr. Richard Simon and the BRB-ArrayTools Development Team. Transcript profiling: ArrayExpress E-TABM-723.
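The variability filter described above (at least a 2.5-fold change from the gene median in more than 7 samples) can be illustrated with a minimal sketch. It assumes the expression values are log2 ratios against the reference pool, which the text does not state explicitly. The function name `select_variable_genes` and the toy data are hypothetical, and the sketch is not the BRB-ArrayTools implementation.

```python
import math
from statistics import median

def select_variable_genes(log2_ratios, fold=2.5, min_samples=7):
    """Return genes that deviate from their own median by at least `fold`
    (up or down) in more than `min_samples` samples.

    log2_ratios: dict mapping gene name -> list of log2 ratios (assumed scale).
    """
    threshold = math.log2(fold)  # a 2.5-fold change on the log2 scale
    selected = []
    for gene, values in log2_ratios.items():
        gene_median = median(values)
        n_far = sum(1 for v in values if abs(v - gene_median) >= threshold)
        if n_far > min_samples:
            selected.append(gene)
    return selected

# Toy data (made up for illustration): 20 samples per gene.
toy = {
    "GENE_A": [0.0] * 12 + [2.0] * 8,         # 8 samples far from the median
    "GENE_B": [0.0] * 15 + [2.0] * 5,         # only 5 samples far from the median
    "GENE_C": [0.1 * i for i in range(20)],   # smooth, low variation
}
variable_genes = select_variable_genes(toy)
```

With this toy input only `GENE_A` passes, because it is the only gene with more than 7 samples at or beyond the 2.5-fold threshold.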
Functional analysis of KEGG pathways
A functional analysis of KEGG pathways using class comparison tools (Goeman's global test, the LS test, the KS test and Efron-Tibshirani's test) was carried out to find differentially affected pathways between the four tumor subtypes. A total of 164 gene sets were studied and the threshold used was set at p = 0.005. Multiple comparisons were corrected using resampling and gene permutations. Goeman's method tests the null hypothesis that no genes within a given gene set are differentially expressed, whereas the LS test, the KS test and Efron-Tibshirani's method test whether the average degree of differential expression is greater than expected from a random sample of genes (BRB-ArrayTools). For this reason, the KEGG pathways selected had to be significant in at least two tests: Goeman's test and any of the other three tests carried out.
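The selection rule above (significant in Goeman's test and in at least one of the other three tests) can be written as a short filter. This is only a sketch of the decision rule: the p-values are made up, the dictionary keys are hypothetical labels, and a strict "less than 0.005" comparison is assumed because the text does not say whether the threshold is inclusive.

```python
def select_pathways(pvalues, alpha=0.005):
    """Keep pathways significant in Goeman's test AND in at least one of
    the LS, KS or Efron-Tibshirani tests.

    pvalues: dict mapping pathway name -> dict of test label -> p-value.
    """
    other_tests = ("ls", "ks", "efron_tibshirani")
    return [
        name
        for name, p in pvalues.items()
        if p["goeman"] < alpha and any(p[t] < alpha for t in other_tests)
    ]

# Made-up p-values, for illustration only.
toy_pvalues = {
    "ECM-receptor interaction": {"goeman": 0.001, "ls": 0.002, "ks": 0.20, "efron_tibshirani": 0.30},
    "Pathway X": {"goeman": 0.001, "ls": 0.05, "ks": 0.20, "efron_tibshirani": 0.30},
    "Pathway Y": {"goeman": 0.04, "ls": 0.001, "ks": 0.001, "efron_tibshirani": 0.001},
}
selected_pathways = select_pathways(toy_pvalues)
```

In the toy input, "Pathway X" fails because only Goeman's test is significant, and "Pathway Y" fails because Goeman's test is not.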
Tissue microarrays (TMA), IHC and mutation analysis
Tissue microarrays were assembled as in [23] for immunohistochemical analysis of β-catenin (clone 17c2, Novocastra Laboratories Ltd., Newcastle upon Tyne, UK), M30 (M30 CytoDEATH, Roche Diagnostics GmbH, Mannheim, Germany) for apoptosis and KI67 (clone M1B1, Dako, Glostrup, Denmark) for proliferation. The presence of mutations in KRAS, BRAF and PI3K as well as microsatellite instability (MSI) were also assessed. See Additional file 1: Supplementary Information for more information about the protocols followed for antibody staining and analysis of MSI and gene mutations.
Identification of tumor subgroups in an independent data set
The data set of Eschrich et al. [2] was used as an external patient collection. Data were combined using the method published by Hu et al. [24]. In our data set, the genes that had the same UniGene Cluster ID were averaged and the genes that did not have a UniGene Cluster ID were eliminated, resulting in 11017 genes out of the 14764 genes and 96 samples (normal and tumor samples). The Eschrich data set consists of 78 samples (23 B, 22 C, 30 D and 3 adenomas) and 32208 normalized transcripts. Spots without IDs or with more than 25% missing values were eliminated and spots with the same UniGene Cluster ID were averaged. Genes with 90% of data were selected to obtain a total of 9229 genes. Combination of data sets: both data sets were combined using the software “Distance Weighted Discrimination” (https://genome.unc.edu/pubsup/dwd/) to obtain a collection of 174 samples (166 tumors) and 5319 common genes. Classification of the external data set: a Nearest Centroid predictor was built in our data set including only genes differentially expressed between classes at p < 0.001. LOOCV (Leave-One-Out Cross-Validation) and 100 random permutations were used to compute the misclassification rate. This predictor was subsequently used to classify the external samples into the four novel clusters. Hierarchical clustering: to analyze whether the external patient set clustered with our patients in the same tumor subtypes, centered Pearson correlation and average-linkage hierarchical clustering of the combined set (159 tumor samples, excluding samples from cluster-5, normal tissues and adenomas) was carried out using the 461 genes, out of the 1722 originally selected genes, that were common to both data sets.
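As an illustration of the Nearest Centroid idea used to assign external samples to the clusters, the following minimal sketch computes one mean profile per class and assigns a new sample to the closest one. Euclidean distance is an assumption made here for simplicity, the function names and toy profiles are hypothetical, and gene selection at p < 0.001 is not included. It is not the BRB-ArrayTools predictor.

```python
import math

def class_centroids(samples, labels):
    """Average the expression profiles of each class, gene by gene."""
    sums, counts = {}, {}
    for profile, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(profile)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], profile)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict_nearest_centroid(centroids, profile):
    """Assign the profile to the class whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], profile))

# Toy training set: three "genes", two hypothetical classes.
train = [[0.1, 0.2, 0.0], [0.0, 0.1, 0.1], [2.0, 1.9, 2.1], [2.2, 2.0, 1.8]]
train_labels = ["low_stroma", "low_stroma", "high_stroma", "high_stroma"]
centroids = class_centroids(train, train_labels)
predicted = predict_nearest_centroid(centroids, [1.8, 2.1, 2.0])
```

In practice the predictor would be trained on one data set and applied to the other only after the two have been placed on a common scale, which is the role of the Distance Weighted Discrimination step described in the text.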
Generation of a low-stroma-subtype predictor
Eschrich samples were classified as belonging to the Low-stroma-subtype or to the other tumor subtypes using the K-nearest-neighbor prediction method with K = 3 (KNN3). A predictor was generated in our data set using the 461 genes, out of the 1722 originally selected genes, that were common to both data sets. Genes included in the predictor were differentially expressed between classes at p < 0.001. LOOCV and 100 random permutations were used to compute the misclassification rate.
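A minimal sketch of a 3-nearest-neighbor classifier with leave-one-out cross-validation is given below to make the procedure concrete. It assumes Euclidean distance and omits the gene-selection step (p < 0.001) and the permutation test; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, profile, k=3):
    """Majority vote among the k training samples closest to `profile`."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], profile))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def loocv_error(xs, ys, k=3):
    """Leave each sample out in turn, predict it from the rest,
    and return the fraction of misclassified samples."""
    errors = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(rest_x, rest_y, xs[i], k) != ys[i]:
            errors += 1
    return errors / len(xs)

# Toy data: two well-separated hypothetical classes, two "genes".
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.0, 0.0],
      [2.0, 2.1], [2.2, 2.0], [2.1, 2.2], [2.0, 2.0]]
ys = ["low_stroma"] * 4 + ["other"] * 4
error_rate = loocv_error(xs, ys)
new_label = knn_predict(xs, ys, [1.9, 1.9])
```

With two classes and K = 3 the vote can never tie, which is one practical reason to choose an odd K for a two-class problem such as Low-stroma-subtype versus the rest.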
Statistical analysis and correlations with clinical parameters and survival analysis
Qualitative variables are given with their frequency distribution. Quantitative variables are given with their mean and standard deviation (SD). Means were compared with the Kruskal-Wallis test. Proportions were compared by the chi-square test for independent groups. Survival functions were estimated by the actuarial method. Cumulative risks over time and their corresponding standard errors (SE) are provided along with the number of patients at risk (n). The likelihood exact test was used to compare survival functions for the different subgroups. A Cox proportional hazards regression model was fitted. Significance was taken as a drop in the likelihood estimator of the models compared. Adjusted hazard ratios (HR) and their 95% confidence intervals (95% CI) are provided in the results. In each hypothesis contrast the assumption of rate proportionality was verified. In all hypothesis contrasts (survival analysis and clinical parameter correlations) the null hypothesis of no difference was rejected with a type I or α-error of less than 0.05. Correction of p-values was not performed. Statistical analysis was performed with SPSS 15.0 for Windows (SPSS Inc., Chicago, USA).
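To make the notion of an estimated survival function concrete, the sketch below implements the Kaplan-Meier product-limit estimator, the method named later for the overall survival curves. It is not the actuarial (life-table) procedure run in SPSS, which groups follow-up into intervals. The function name and the follow-up times are made up for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient.
    events: True if the event (death) was observed, False if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        leaving = [e for tt, e in data if tt == t]   # everyone leaving at time t
        deaths = sum(1 for e in leaving if e)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(leaving)                      # deaths and censored leave the risk set
        i += len(leaving)
    return curve

# Made-up follow-up in months; False marks a censored patient.
toy_times = [5, 8, 8, 12, 20]
toy_events = [True, True, False, True, False]
km_curve = kaplan_meier(toy_times, toy_events)
```

For the toy data the estimate drops to about 0.8 at 5 months, 0.6 at 8 months and 0.3 at 12 months; censored patients lower the number at risk without producing a step.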
Results
Identification of tumor subtypes by hierarchical clustering
Centered Pearson correlation and average-linkage hierarchical clustering with the 1722 selected genes was used to group both tumors and genes. The 89 tumor samples (tumor CT102 is duplicated) were arranged primarily in two main groups (Figure 1A). The first group contains just one class, cluster-1, with 36 tumors (40% of the total number of samples). The second main group holds the rest of the tumors, which were classified in three smaller reproducible subgroups, clusters-2, -3 and -4, containing 12, 22 and 14 tumors respectively (Figure 1A). The robustness and reproducibility of the four new clusters was high, mainly for clusters-1, -3 and -4, with robustness close to 0.9 and a low number of sample additions or omissions. Cluster-2 was the weakest of the four clusters, with a robustness of 0.75 (Table 1). Hierarchical clustering identified a fifth cluster with five elements and a lower robustness; we did not regard this fifth cluster as a group and considered those samples as unclassified tumors.
The dendrogram of the 89 tumor samples and the 1722 selected genes is shown in Figure 1B. The principal sections of the graph that distinguish between clusters are indicated with a color bar on the right (for the list of 1722 genes, see Additional file 2 in the supplementary material). The green bar corresponds to a group of genes with the lowest expression in cluster-1 and the highest expression in cluster-3; it is enriched in genes that have been reported to be specifically activated in tumor-associated stroma and in molecules implicated in extracellular matrix remodeling and cell migration. Since these two clusters were mainly characterized by differences in the abundance of stromal genes, cluster-1 and cluster-3 were defined as the Low-stroma-subtype and the High-stroma-subtype respectively. The red bar localizes genes with lower levels in cluster-1 whose expression is not higher in cluster-3 than in clusters-2 and -4. Distinctive genes of this section are metallothioneins, metallopeptidases and SPP1. The blue bar shows the location of genes up-regulated in cluster-4. These are molecules typically associated with the mucinous type of adenocarcinomas, such as trefoil factors and mucins; for these characteristics cluster-4 was defined as the Mucinous-subtype. The yellow bar localizes the genes up-regulated in cluster-2. This cluster is mainly characterized by a collection of immunoglobulin-related molecules; for this reason cluster-2 was named the Immunoglobulin-related-subtype.
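The clustering scheme named in this section, centered Pearson correlation as the similarity measure combined with average linkage, can be sketched in a few lines. The distance is taken here as 1 minus the Pearson correlation, which is a common convention but an assumption with respect to the software actually used. The function names and toy profiles are hypothetical, and the naive implementation is meant for reading rather than for 89 samples by 1722 genes.

```python
import math

def pearson_distance(a, b):
    """1 - centered Pearson correlation between two profiles.
    Assumes neither profile is constant (non-zero variance)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return 1.0 - num / den

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest mean pairwise distance until `n_clusters` remain."""
    n = len(profiles)
    dist = {(i, j): pearson_distance(profiles[i], profiles[j])
            for i in range(n) for j in range(i + 1, n)}

    def between(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda xy: between(clusters[xy[0]], clusters[xy[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Toy profiles: samples 0 and 1 rise across genes, samples 2 and 3 fall.
toy_profiles = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1], [9, 6, 4, 2]]
toy_clusters = average_linkage(toy_profiles, 2)
```

Because the correlation is centered and scale-free, samples 0 and 1 group together despite their different magnitudes; only the shape of the profile across genes matters.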
Functional analysis of KEGG pathways
KEGG pathway analysis supports the implication of the tumor microenvironment in the identified tumor subgroups (see Additional file 1: Table S1 for the list of deregulated KEGG pathways). Pathways corresponding to cell communication, ECM-receptor interaction, focal adhesion and cell adhesion molecules showed differences between clusters; elements of these pathways showed significantly lower values in the Low-stroma-subtype than in the other tumor subtypes. Other significantly deregulated pathways showing differences among clusters were the WNT and TGF pathways.
Correlations of tumor subtypes with clinical parameters
The patients' characteristics are summarized in Table 2. Correlation analysis was performed to find associations between the four novel clusters and clinical parameters (Table 3 and Additional file 1: Table S2). Dukes stages did not show any association (p = 0.646) with the identified tumor subgroups (Figure 1A, Table 3). Parameters that showed a clear correlation with the identified tumor subtypes were: proportion of stroma in the tumors, mucinous histology, the extent of nuclear β-catenin staining, MSI and the V600E BRAF mutation. The proportion of tumor stroma in the frozen fragments and in the corresponding paraffin blocks was compared and no differences were found (intraclass correlation coefficient, ICC = 0.781; 95% CI 0.681-0.852; p < 0.001). The amount of stroma was always the lowest in cluster-1 and the highest in cluster-3; differences between these two clusters were significant in both frozen samples (p < 0.001) and paraffin blocks (p = 0.005). Tumors in cluster-2 had an amount of stroma similar to that of tumors in cluster-4. The quantity of stroma was significantly lower in cluster-1 than in cluster-2 (p = 0.02) and cluster-4 (p = 0.013) in the frozen samples. Mucinous histology was significantly correlated with this classification (p = 0.001). Pairwise comparisons showed a significant association of mucinous histology, MSI tumors and B-Raf mutations with cluster-4 (Table 3). Nuclear β-catenin was also associated with the clusters (Table 3; Figure 2). Pairwise comparisons showed an increased proportion of epithelial cells with nuclear β-catenin in clusters-1 and -3 and a low proportion in clusters-2 and -4. The presence of mutations in K-Ras (codons 12/13) and PI3K (exons 9/20), proliferation (Ki67), apoptosis (M30) and other histological parameters did not show association with the molecular subtypes (see Additional file 1: Table S2 for all parameters studied).
Recognition of tumor subtypes in an external clinical cohort
To classify Eschrich samples into the four novel clusters, a classifier of 1039 genes was generated using the 5319 common genes between both data sets and the Nearest Centroid method (85% correct classification). Afterwards, unsupervised analysis of the combined set of 159 tumor samples was carried out. Hierarchical clustering, using 461 genes, associated Eschrich's samples classified as Low-stroma-subtype with our Low-stroma-subtype samples, Eschrich's samples classified as High-stroma-subtype with our High-stroma-subtype samples, and his tumors in the Mucinous-subtype with our samples in the Mucinous-subtype. Samples of the Immunoglobulin-related-subtype did not show a good association (see Additional file 3: Figure S1). We also used other prediction methods, such as K-Nearest-Neighbor (K = 1 and K = 3) and Diagonal Linear Discriminant, obtaining similar results (not shown).
Differences in survival time between the identified tumor subtypes
Our set of patients was too heterogeneous to analyze survival, since it was mainly aimed at obtaining a comprehensive classification of colon cancer. Of the total of 88 patients, 26 did not have at least 36 months of follow-up and another 23 patients were under different treatment schemes. Among the 39 untreated patients there were just one death and four relapses. Under these circumstances, the number of events was not enough to obtain reliable results in the survival analysis.
Reported survival predictors identified the patients of the Low-stroma-subtype
To analyze differences in survival of the novel clusters, we first took advantage of the survival predictors already published. We analyzed whether the 43-gene survival predictor of Eschrich et al. [2] specifically recognized any of our tumor subtypes. Hierarchical clustering of our 84 tumor samples (excluding the samples from cluster-5), using the 17 common genes out of the 43 genes of the predictor, segregated the samples into two clusters: the first was composed mainly of tumors of the Low-stroma-subtype and the second was composed of tumors of the other subtypes. We also used the predictors of Garman et al. [8]; Wang et al. [3]; Lin et al. [4]; Jorissen et al. [7]; Smith et al. [18]; O'Connell et al. (Oncotype-DX) [9] and Salazar et al. (ColoPrint) [10], obtaining similar results (Figure 3). However, other reported predictors, such as those of Barrier et al. [5] and Arango et al. [6], were unable to specifically recognize any of our molecular subtypes (not shown).
The external patients classified as belonging to the Low-stroma-subtype showed better survival
Since Eschrich's and other published predictors mainly segregated the samples of the Low-stroma-subtype, the next step was to address whether our Low-stroma-subtype predictor was able to identify the patients with good prognosis in Eschrich's data set. Using the KNN3 classification method and the 461 genes, out of the 1722 selected genes, that were common to both data sets, a predictor of 167 genes was generated (see Additional file 4: supplemental information for the list of 167 genes); 96% correct classification was obtained (see Additional file 1: Table S3 for classification performance). Kaplan-Meier overall survival analysis showed that Eschrich's patients classified as belonging to the Low-stroma-subtype had better survival than the patients belonging to the other tumor subtypes. Low-stroma-subtype patients showed better survival both when analyzing stages B and C only (Figure 4A) and when analyzing stages B, C and D (Figure 4B). We also used the Nearest Centroid method, finding similar results (not shown).
Coincidence among predictors
Usually there is minimal overlap among reported high-risk gene signatures [25]. Our Low-stroma-subtype predictor showed some overlap (12 genes in common) with the predictor of Jorissen et al. [7]; two genes in common with the predictors of Eschrich et al. [2] and Oncotype-DX [9]; and one gene in common with the predictors of Garman et al. [8], Wang et al. [3] and ColoPrint [10]. There were no genes in common with the predictors of Lin et al. [4] and Smith et al. [18] (see Additional file 1: Table S4 for the list of overlapping genes). Even though there was little or no overlap, all eight of these reported predictors recognized the tumors of the Low-stroma-subtype.
Discussion
A general approach to finding prognostic markers in colon cancer is supervised analysis of gene expression. Class comparison between patients with good and bad prognosis has been carried out, and gene signatures that discriminate between high- and low-risk patients have been reported [2–10]. In this study, we have used a different strategy, hypothesizing that the identification of distinct molecular tumor subtypes would also be likely to discriminate patients with different clinical outcomes. In addition, understanding the biological pathways underlying each tumor subtype would likely help to find the appropriate treatment scheme.
We report a molecular classification of colon adenocarcinomas in four novel tumor subtypes identified by unsupervised analysis of gene expression. Tumor-associated stroma was clearly associated with this classification, characterizing a Low-stroma-subtype and a High-stroma-subtype. Mucinous histology, MSI, BRAF mutations and lower levels of nuclear β-catenin characterize the Mucinous-subtype. Tumor subtypes were independent of the histopathological stages. The lack of association with histopathological staging is important, because it implies that tumor subtypes are established from the initial stages of the tumor and could therefore contribute to the selection of patients at early stages. It also explains why many studies were unable to reliably associate molecular classification with Dukes stages [11, 12, 14, 17]. The nature of the genes expressed in each cluster and the biological pathways affected supported the association of the molecular and pathological parameters with the tumor subgroups. The Low-stroma-subtype, High-stroma-subtype and Mucinous-subtype were robust, associated with biological characteristics and validated in an external patient set. The combination of two different microarray studies in one data set is challenging; many of the important genes in each data set may be lost in the merged spreadsheet. Even so, when we combined our data set with the external data set we still kept important features in the combined data set. The novel molecular subtypes were also identified in the external data set (at least three of the four clusters).
Relevant reports identified stroma gene signatures associated with survival in diffuse large-B-cell lymphoma [26] and in breast cancer [27], reflecting the importance of the tumor microenvironment in the aggressive progression of the disease [28]. Moreover, a report in colon cancer showed that the presence of a high amount of stroma predicts worse survival for stage I-II colon cancer patients [29]. Stroma was highly associated with our molecular classification. Genes corresponding to pathways related to cell communication, ECM-receptor interaction, focal adhesion and CAMs were down-regulated in the Low-stroma-subtype and up-regulated in the High-stroma-subtype and in the Mucinous-subtype. The High-stroma-subtype had the highest percentage of stroma in the tumors and the highest level of stromal components. The Mucinous-subtype also had high levels of stroma-associated genes and its proportion of stroma was not significantly lower than in the High-stroma-subtype. Although clusters-3 and -4 share similar expression patterns for some of these stromal genes, other important genes clearly differ between these two subtypes: genes characteristic of goblet cells, trefoil factors and mucins, as well as other genes such as REGIV, COX2 or CD55, are specifically up-regulated in cluster-4, the Mucinous-subtype.
The microenvironment is important for tumor development and, more interestingly, may be the target of novel treatments. Along this line, promising studies are underway. Although initial studies using antibodies against activated fibroblast proteins, such as FAP, did not obtain objective tumor responses [30], new developments are taking advantage of the enzymatic activity of FAP. With this strategy, a prodrug is administered in an inactive form that is proteolytically activated by the FAP present in cancer-activated fibroblasts localized in the tumor microenvironment. Once activated, the drug targets any cell contained in the tumor [30, 31]. Other anti-stroma therapies under development target integrins-extracellular membrane interactions [32, 33] or target tumor stroma using T cells [34] or human mesenchymal stem cells [35, 36]. Consequently, this is an active field of research, and the identification of a high-stroma subtype group of patients may be essential to obtain benefit from these treatments by administering anti-stroma therapies just to this group of patients.
Since our survival results showed that the Low-stroma-subtype identified lower-risk patients and the High-stroma-subtype and Mucinous-subtype identified higher-risk patients, our results appear to contradict many reports indicating that MSI tumors have a better clinical outcome than MSS/L tumors [37, 38]. However, the Mucinous-subtype retains important factors usually found in poor-prognosis tumors: a) mucinous tumors have a worse clinical outcome, and in our study mucinous and MSI tumors clustered together; b) high levels of SPP1, FAP, GREMLIN1, CD55 or REGIV, among others, have been reported to be associated with cancer invasion, metastasis and poor prognosis in colon cancer [39–45], and these genes are up-regulated in clusters-3 and -4; c) the increased levels of TFF2 and MUC1, characteristic of the Mucinous-subtype, have also been associated with a poor clinical outcome [46]; d) BRAF mutations have been shown to be a worse prognostic factor [47, 48], and four out of the five MSI tumors in the Mucinous-subtype harbor BRAF mutations. For all these reasons, we could expect patients of the High-stroma-subtype and Mucinous-subtype to have a worse clinical outcome.
The largest cluster was the Low-stroma-subtype, which shows key clinical properties that especially distinguish it from the other tumor subtypes. First, a 167-gene signature associated with this group of tumors distinguished low-risk patients in an external clinical cohort. Second, eight different reported gene signatures, including the extensively validated Oncotype-DX and ColoPrint [2–4, 7–10, 18], classified the Low-stroma-subtype patients in one group and the other tumor subtypes in a second group. Comparing microarray analyses across different studies and platforms is challenging. In general there is little or no overlap among different gene signatures. In our study we found that eight different reported survival predictors and our 167-gene Low-stroma-subtype predictor, with almost no overlap among them, recognized the same group of patients in our data, the Low-stroma-subtype. Furthermore, our 167-gene Low-stroma-subtype predictor was able to identify the patients with better clinical outcome in the external data set. What is important and relevant for clinical application is recognizing the same type of patients, not demonstrating overlap among different gene lists. This coincidence is important to confirm the potential of microarray gene expression for the identification of low-risk patients. Nevertheless, it should be noted that survival outcomes have not been confirmed with our own survival data or in the setting of a multivariable analysis. A larger sample of homogeneous groups of patients will be necessary to establish the prognostic value of this molecular classification.
Conclusions
With these findings, we propose a colon cancer classification in intrinsic molecular subtypes based on expression patterns. The novel colon tumor subtypes are associated with important clinicopathological features and show different survival times, but are not correlated with the histopathological stages. Tumor subtypes are established from initial tumor stages and were validated in an external clinical cohort. Tumor microenvironment is important for the classification and for the malignant power of the tumor. Differential gene sets and biological pathways characterize each tumor subtype, reflecting underlying mechanisms of carcinogenesis that may be used for the selection of targeted therapeutic procedures. The novel molecular classification reported in this study may contribute to an improvement in the management of patients with colorectal carcinoma and to a more comprehensive prognosis.
References
Diaz-Rubio E, Tabernero J, Gomez-Espana A, Massuti B, Sastre J, Chaves M, Abad A, Carrato A, Queralt B, Reina JJ, et al: Phase III study of capecitabine plus oxaliplatin compared with continuous-infusion fluorouracil plus oxaliplatin as first-line therapy in metastatic colorectal cancer: final report of the Spanish Cooperative Group for the Treatment of Digestive Tumors Trial. J Clin Oncol. 2007, 25 (27): 4224-4230. 10.1200/JCO.2006.09.8467.
Eschrich S, Yang I, Bloom G, Kwong KY, Boulware D, Cantor A, Coppola D, Kruhoffer M, Aaltonen L, Orntoft TF, et al: Molecular staging for survival prediction of colorectal cancer patients. J Clin Oncol. 2005, 23 (15): 3526-3535. 10.1200/JCO.2005.00.695.
Wang Y, Jatkoe T, Zhang Y, Mutch MG, Talantov D, Jiang J, McLeod HL, Atkins D: Gene expression profiles and molecular markers to predict recurrence of Dukes' B colon cancer. J Clin Oncol. 2004, 22 (9): 1564-1571. 10.1200/JCO.2004.08.186.
Lin YH, Friederichs J, Black MA, Mages J, Rosenberg R, Guilford PJ, Phillips V, Thompson-Fawcett M, Kasabov N, Toro T, et al: Multiple gene expression classifiers from different array platforms predict poor prognosis of colorectal cancer. Clin Cancer Res. 2007, 13 (2 Pt 1): 498-507.
Barrier A, Boelle PY, Roser F, Gregg J, Tse C, Brault D, Lacaine F, Houry S, Huguier M, Franc B, et al: Stage II colon cancer prognosis prediction by tumor gene expression profiling. J Clin Oncol. 2006, 24 (29): 4685-4691. 10.1200/JCO.2005.05.0229.
Arango D, Laiho P, Kokko A, Alhopuro P, Sammalkorpi H, Salovaara R, Nicorici D, Hautaniemi S, Alazzouzi H, Mecklin JP, et al: Gene-expression profiling predicts recurrence in Dukes' C colorectal cancer. Gastroenterology. 2005, 129 (3): 874-884. 10.1053/j.gastro.2005.06.066.
Jorissen RN, Gibbs P, Christie M, Prakash S, Lipton L, Desai J, Kerr D, Aaltonen LA, Arango D, Kruhoffer M, et al: Metastasis-Associated Gene Expression Changes Predict Poor Outcomes in Patients with Dukes Stage B and C Colorectal Cancer. Clin Cancer Res. 2009, 15 (24): 7642-7651. 10.1158/1078-0432.CCR-09-1431.
Garman KS, Acharya CR, Edelman E, Grade M, Gaedcke J, Sud S, Barry W, Diehl AM, Provenzale D, Ginsburg GS, et al: A genomic approach to colon cancer risk stratification yields biologic insights into therapeutic opportunities. Proc Natl Acad Sci U S A. 2008, 105 (49): 19432-19437. 10.1073/pnas.0806674105.
O'Connell MJ, Lavery I, Yothers G, Paik S, Clark-Langone KM, Lopatin M, Watson D, Baehner FL, Shak S, Baker J, et al: Relationship between tumor gene expression and recurrence in four independent studies of patients with stage II/III colon cancer treated with surgery alone or surgery plus adjuvant fluorouracil plus leucovorin. J Clin Oncol. 2010, 28 (25): 3937-3944. 10.1200/JCO.2010.28.9538.
Salazar R, Roepman P, Capella G, Moreno V, Simon I, Dreezen C, Lopez-Doriga A, Santos C, Marijnen C, Westerga J, et al: Gene expression signature to improve prognosis prediction of stage II and III colorectal cancer. J Clin Oncol. 2011, 29 (1): 17-24. 10.1200/JCO.2010.30.1077.
Frederiksen CM, Knudsen S, Laurberg S, Orntoft TF: Classification of Dukes' B and C colorectal cancers using expression arrays. J Cancer Res Clin Oncol. 2003, 129 (5): 263-271.
Birkenkamp-Demtroder K, Christensen LL, Olesen SH, Frederiksen CM, Laiho P, Aaltonen LA, Laurberg S, Sorensen FB, Hagemann R: Gene expression in colorectal cancer. Cancer Res. 2002, 62 (15): 4352-4363.
Kwong KY, Bloom GC, Yang I, Boulware D, Coppola D, Haseman J, Chen E, McGrath A, Makusky AJ, Taylor J, et al: Synchronous global assessment of gene and protein expression in colorectal cancer progression. Genomics. 2005, 86 (2): 142-158. 10.1016/j.ygeno.2005.03.012.
Bertucci F, Salas S, Eysteries S, Nasser V, Finetti P, Ginestier C, Charafe-Jauffret E, Loriod B, Bachelart L, Montfort J, et al: Gene expression profiling of colon cancer by DNA microarrays and correlation with histoclinical parameters. Oncogene. 2004, 23 (7): 1377-1391. 10.1038/sj.onc.1207262.
Yamasaki M, Takemasa I, Komori T, Watanabe S, Sekimoto M, Doki Y, Matsubara K, Monden M: The gene expression profile represents the molecular nature of liver metastasis in colorectal cancer. Int J Oncol. 2007, 30 (1): 129-138.
Koehler A, Bataille F, Schmid C, Ruemmele P, Waldeck A, Blaszyk H, Hartmann A, Hofstaedter F, Dietmaier W: Gene expression profiling of colorectal cancer and metastases divides tumours according to their clinicopathological stage. J Pathol. 2004, 204 (1): 65-74. 10.1002/path.1606.
Kleivi K, Lind GE, Diep CB, Meling GI, Brandal LT, Nesland JM, Myklebost O, Rognum TO, Giercksky KE, Skotheim RI, et al: Gene expression profiles of primary colorectal carcinomas, liver metastases, and carcinomatoses. Mol Cancer. 2007, 6: 2. 10.1186/1476-4598-6-2.
Smith JJ, Deane NG, Wu F, Merchant NB, Zhang B, Jiang A, Lu P, Johnson JC, Schmidt C, Bailey CE, et al: Experimentally derived metastasis gene expression profile predicts recurrence and death in patients with colon cancer. Gastroenterology. 2009, 138 (3): 958-968.
Pantaleo MA, Astolfi A, Nannini M, Paterini P, Piazzi G, Ercolani G, Brandi G, Martinelli G, Pession A, Pinna AD, et al: Gene expression profiling of liver metastases from colorectal cancer as potential basis for treatment choice. Br J Cancer. 2008, 99 (10): 1729-1734. 10.1038/sj.bjc.6604681.
Imbeaud S, Graudens E, Boulanger V, Barlet X, Zaborski P, Eveno E, Mueller O, Schroeder A, Auffray C: Towards standardization of RNA quality assessment using user-independent classifiers of microcapillary electrophoresis traces. Nucleic Acids Res. 2005, 33 (6): e56. 10.1093/nar/gni054.
Strand C, Enell J, Hedenfalk I, Ferno M: RNA quality in frozen breast cancer samples and the influence on gene expression analysis–a comparison of three evaluation methods using microcapillary electrophoresis traces. BMC Mol Biol. 2007, 8: 38. 10.1186/1471-2199-8-38.
McShane LM, Radmacher MD, Freidlin B, Yu R, Li MC, Simon R: Methods for assessing reproducibility of clustering patterns observed in analyses of microarray data. Bioinformatics. 2002, 18 (11): 1462-1469. 10.1093/bioinformatics/18.11.1462.
Ortega P, Moran A, de Juan C, Frias C, Hernandez S, Lopez-Asenjo JA, Sanchez-Pernaute A, Torres A, Iniesta P, Benito M: Differential Wnt pathway gene expression and E-cadherin truncation in sporadic colorectal cancers with and without microsatellite instability. Clin Cancer Res. 2008, 14 (4): 995-1001. 10.1158/1078-0432.CCR-07-1588.
Hu Z, Fan C, Oh DS, Marron JS, He X, Qaqish BF, Livasy C, Carey LA, Reynolds E, Dressler L, et al: The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics. 2006, 7: 96. 10.1186/1471-2164-7-96.
Kopetz S, Abbruzzese JL: Barriers to Integrating Gene Profiling for Stage II Colon Cancer. Clin Cancer Res. 2009, 15 (24): 7451-7452. 10.1158/1078-0432.CCR-09-2523.
Lenz G, Wright G, Dave SS, Xiao W, Powell J, Zhao H, Xu W, Tan B, Goldschmidt N, Iqbal J, et al: Stromal gene signatures in large-B-cell lymphomas. N Engl J Med. 2008, 359 (22): 2313-2323. 10.1056/NEJMoa0802885.
Finak G, Bertos N, Pepin F, Sadekova S, Souleimanova M, Zhao H, Chen H, Omeroglu G, Meterissian S, Omeroglu A, et al: Stromal gene expression predicts clinical outcome in breast cancer. Nat Med. 2008, 14 (5): 518-527. 10.1038/nm1764.
De Wever O, Demetter P, Mareel M, Bracke M: Stromal myofibroblasts are drivers of invasive cancer growth. Int J Cancer. 2008, 123 (10): 2229-2238. 10.1002/ijc.23925.
Mesker WE, Liefers GJ, Junggeburt JM, van Pelt GW, Alberici P, Kuppen PJ, Miranda NF, van Leeuwen KA, Morreau H, Szuhai K, et al: Presence of a high amount of stroma and downregulation of SMAD4 predict for worse survival for stage I-II colon cancer patients. Cell Oncol. 2009, 31 (3): 169-178.
Brennen WN, Isaacs JT, Denmeade SR: Rationale behind targeting fibroblast activation protein-expressing carcinoma-associated fibroblasts as a novel chemotherapeutic strategy. Mol Cancer Ther. 2012, 11 (2): 257-266. 10.1158/1535-7163.MCT-11-0340.
LeBeau AM, Brennen WN, Aggarwal S, Denmeade SR: Targeting the cancer stroma with a fibroblast activation protein-activated promelittin protoxin. Mol Cancer Ther. 2009, 8 (5): 1378-1386. 10.1158/1535-7163.MCT-08-1170.
Ma WW, Adjei AA: Novel agents on the horizon for cancer therapy. CA Cancer J Clin. 2009, 59 (2): 111-137. 10.3322/caac.20003.
Burvenich I, Schoonooghe S, Vervoort L, Dumolyn C, Coene E, Vanwalleghem L, Van Huysse J, Praet M, Cuvelier C, Mertens N, et al: Monoclonal antibody 14 C5 targets integrin alphavbeta5. Mol Cancer Ther. 2008, 7 (12): 3771-3779. 10.1158/1535-7163.MCT-08-0600.
Zhang B: Targeting the stroma by T cells to limit tumor growth. Cancer Res. 2008, 68 (23): 9570-9573. 10.1158/0008-5472.CAN-08-2414.
Serakinci N, Christensen R, Fahrioglu U, Sorensen FB, Dagnaes-Hansen F, Hajek M, Jensen TH, Kolvraa S, Keith NW: Mesenchymal stem cells as therapeutic delivery vehicles targeting tumor stroma. Cancer Biother Radiopharm. 2011, 26 (6): 767-773. 10.1089/cbr.2011.1024.
Grisendi G, Bussolari R, Veronesi E, Piccinno S, Burns JS, De Santis G, Loschi P, Pignatti M, Di Benedetto F, Ballarin R, et al: Understanding tumor-stroma interplays for targeted therapies by armed mesenchymal stromal progenitors: the Mesenkillers. Am J Cancer Res. 2011, 1 (6): 787-805.
Bertagnolli MM, Niedzwiecki D, Compton CC, Hahn HP, Hall M, Damas B, Jewell SD, Mayer RJ, Goldberg RM, Saltz LB, et al: Microsatellite Instability Predicts Improved Response to Adjuvant Therapy With Irinotecan, Fluorouracil, and Leucovorin in Stage III Colon Cancer: Cancer and Leukemia Group B Protocol 89803. J Clin Oncol. 2009, JCO (2008): 2018-2071.
di Pietro M, Sabates Bellver J, Menigatti M, Bannwart F, Schnider A, Russell A, Truninger K, Jiricny J, Marra G: Defective DNA mismatch repair determines a characteristic transcriptional profile in proximal colon cancers. Gastroenterology. 2005, 129 (3): 1047-1059. 10.1053/j.gastro.2005.06.028.
Henry LR, Lee HO, Lee JS, Klein-Szanto A, Watts P, Ross EA, Chen WT, Cheng JD: Clinical implications of fibroblast activation protein in patients with colon cancer. Clin Cancer Res. 2007, 13 (6): 1736-1741. 10.1158/1078-0432.CCR-06-1746.
Sneddon JB, Zhen HH, Montgomery K, van de Rijn M, Tward AD, West R, Gladstone H, Chang HY, Morganroth GS, Oro AE, et al: Bone morphogenetic protein antagonist gremlin 1 is widely expressed by cancer-associated stromal cells and can promote tumor cell proliferation. Proc Natl Acad Sci U S A. 2006, 103 (40): 14842-14847. 10.1073/pnas.0606857103.
Oue N, Kuniyasu H, Noguchi T, Sentani K, Ito M, Tanaka S, Setoyama T, Sakakura C, Natsugoe S, Yasui W: Serum concentration of Reg IV in patients with colorectal cancer: overexpression and high serum levels of Reg IV are associated with liver metastasis. Oncology. 2007, 72 (5–6): 371-380.
Rohde F, Rimkus C, Friederichs J, Rosenberg R, Marthen C, Doll D, Holzmann B, Siewert JR, Janssen KP: Expression of osteopontin, a target gene of de-regulated Wnt signaling, predicts survival in colon cancer. Int J Cancer. 2007, 121 (8): 1717-1723. 10.1002/ijc.22868.
Durrant LG, Chapman MA, Buckley DJ, Spendlove I, Robins RA, Armitage NC: Enhanced expression of the complement regulatory protein CD55 predicts a poor prognosis in colorectal cancer patients. Cancer Immunol Immunother. 2003, 52 (10): 638-642. 10.1007/s00262-003-0402-y.
Christiansen VJ, Jackson KW, Lee KN, McKee PA: Effect of fibroblast activation protein and α2-antiplasmin cleaving enzyme on collagen Types I, III, and IV. Arch Biochem Biophys. 2007, 457 (2): 177-186. 10.1016/j.abb.2006.11.006.
Mitani Y, Oue N, Matsumura S, Yoshida K, Noguchi T, Ito M, Tanaka S, Kuniyasu H, Kamata N, Yasui W: Reg IV is a serum biomarker for gastric cancer patients and predicts response to 5-fluorouracil-based chemotherapy. Oncogene. 2007, 26 (30): 4383-4393. 10.1038/sj.onc.1210215.
Duncan TJ, Watson NF, Al-Attar AH, Scholefield JH, Durrant LG: The role of MUC1 and MUC3 in the biology and prognosis of colorectal cancer. World J Surg Oncol. 2007, 5: 31. 10.1186/1477-7819-5-31.
Zlobec I, Bihl MP, Schwarb H, Terracciano L, Lugli A: Clinicopathological and protein characterization of BRAF- and K-RAS-mutated colorectal cancer and implications for prognosis. Int J Cancer. 2010, 127 (2): 367-380.
Tran B, Kopetz S, Tie J, Gibbs P, Jiang ZQ, Lieu CH, Agarwal A, Maru DM, Sieber O, Desai J: Impact of BRAF mutation and microsatellite instability on the pattern of metastatic spread and prognosis in metastatic colorectal cancer. Cancer. 2011, 10.1002/cncr.26086.
Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2407/12/260/prepub
Acknowledgments
This work was supported by the Fundacion Mutua Madrileña (EDR), Fundacion Cientifica de la Asociacion Española Contra el Cancer (BPV), Fundacion Genoma España (BPV), Sociedad Española de Oncologia Medica (EDR), RTICC-ISCIII ref 06/0020/0021 (EDR), and Accion Transversal del Cancer and Infraestructuras del FIS (IF063747) (EDR). We thank the Biobank of the IdISSC of the HCSC for kindly providing the frozen tumor samples. We thank Jesus Garcia Aranda for his kind help with the figures.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
BPV, FMS and EDR supervised and designed the original study with the collaboration of TC. SHP, JALA and JSA performed the pathological study. ARL and BPV carried out the microarray experiments. ARL, BPV, GLC and FMS analyzed the data. CFP and ARL performed the statistical analysis. AC, JS and RA performed the selection and clinical study of the patients. All authors contributed to revising the article. BPV wrote the paper. All authors read and approved the final manuscript.
Beatriz Perez Villamil, Alejandro Romera Lopez, Susana Hernandez Prieto, Guillermo Lopez Campos and Eduardo Diaz Rubio contributed equally to this work.
Rights and permissions
Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Perez Villamil, B., Romera Lopez, A., Hernandez Prieto, S. et al. Colon cancer molecular subtypes identified by expression profiling and associated to stroma, mucinous type and different clinical behavior. BMC Cancer 12, 260 (2012). https://doi.org/10.1186/1471-2407-12-260
DOI: https://doi.org/10.1186/1471-2407-12-260