Background

Colorectal cancer (CRC) is the third most common cause of cancer deaths in the USA [1, 2]. Disease mortality has significantly decreased, predominantly due to improvements in screening [2]. Despite these improvements, there are still approximately 50,000 CRC-related deaths per year in the USA [1]. Current estimates indicate that 20–30% of those who undergo treatment will experience recurrence and 35% of all patients will die within 5 years [3–5]. Identifying methods to assess patients’ risk of recurrence is of great importance for reducing mortality and healthcare costs.

There is growing evidence that the gut microbiota is involved in the progression of CRC. Mouse-based studies have identified populations of Bacteroides fragilis, Escherichia coli, and Fusobacterium nucleatum that alter disease progression [6–10]. Furthermore, studies that shift the structure of the microbiota through the use of antibiotics or inoculation of germ-free mice with human feces have shown that varying community compositions can result in varied tumor burden [11–13]. Collectively, these studies support the hypothesis that the microbiota can alter the amount of inflammation in the colon and with it the rate of tumorigenesis [14].

Building upon this evidence, several human studies have identified unique signatures of colonic lesions [15–20]. One line of research has identified community-level differences between those bacteria that are found on and adjacent to colonic lesions and has supported a role for B. fragilis, E. coli, and F. nucleatum in tumorigenesis [21–23]. Others have proposed feces-based biomarkers that could be used to diagnose the presence of colonic adenomas and carcinomas [24–26]. These studies have associated F. nucleatum and other oral pathogens with colonic lesions (adenoma, advanced adenoma, and carcinoma). They have also noted that the loss of bacteria generally thought to produce short-chain fatty acids, which can suppress inflammation, is associated with colonic lesions. This suggests that gut bacteria have a role in tumorigenesis and have potential as biomarkers for aiding in the early detection of disease [21–26].

Despite advances in understanding the relationship between the gut microbiota and colonic tumorigenesis, we still do not understand how treatments including resection, chemotherapy, and/or radiation affect the composition of the gut microbiota. If the microbial community drives tumorigenesis, then one would hypothesize that treatment would remove not only the lesion but also the microbiota that promoted the tumorigenesis, and hence reduce the risk of recurrence. To test this hypothesis, we addressed two related questions: does treatment affect the colonic microbiota in a predictable manner? If so, does the treatment alter the community to more closely resemble that of individuals with normal colons?

We answered these questions by sequencing the V4 region of 16S rRNA genes amplified from fecal samples of individuals with adenoma, advanced adenoma, and carcinomas pre- and post-treatment. We used classical community analysis to compare the alpha- and beta-diversity of communities pre- and post-treatment. Next, we generated random forest models to identify bacterial populations that were indicative of treatment for each diagnosis group. Finally, we measured the predictive probabilities to assess whether treatment yielded bacterial communities similar to those of individuals with normal colons. We found that treatment alters the composition of the gut microbiota and that, for those with carcinomas, the gut microbiota shifted more toward that of a normal colon after treatment. In the individuals with carcinomas, no difference was found by the type of treatment (surgery alone, surgery with chemotherapy, or surgery with chemotherapy and radiation). Understanding how the community responds to these treatments could be a valuable tool for identifying biomarkers to quantify the risk of recurrence and the likelihood of survival.

Results

Treatment for colonic lesions alters the bacterial community structure

Within our 67-person cohort, we tested whether the microbiota of patients with adenoma (N = 22), advanced adenoma (N = 19), or carcinoma (N = 26) had any broad differences between pre- and post-treatment samples [Table 1]. None of the individuals in this study had any recorded antibiotic usage that was not associated with surgical treatment of their respective lesion. The structure of the microbial communities of the pre- and post-treatment samples differed, as measured by the θYC beta diversity metric [Fig. 1 a]. We found that the communities obtained pre- and post-treatment among the patients with carcinomas changed significantly more than those of patients with adenoma (P value < 0.001). There were no significant differences in the amount of change observed between the patients with adenoma and advanced adenoma or between the patients with advanced adenoma and carcinoma (P value > 0.05). Next, we tested whether there was a consistent direction in the change in the community structure between the pre- and post-treatment samples for each of the diagnosis groups [Fig. 1 b–d]. We only observed a consistent shift in community structure for the patients with carcinoma when using a PERMANOVA test (adenoma P value = 0.999, advanced adenoma P value = 0.945, and carcinoma P value = 0.005). Finally, we measured the number of observed OTUs, Shannon evenness, and Shannon diversity in the pre- and post-treatment samples and did not observe a significant change for any of the diagnosis groups (P value > 0.05) [Additional file 1: Table S1].
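The θYC comparison above can be illustrated with a short sketch. It is written in Python for illustration only (the study itself used mothur and R) and assumes the usual Yue and Clayton formulation, in which the distance is one minus the ratio of the summed products of relative abundances to the sum of squared differences plus those products. The function name and the example counts are hypothetical.

```python
def theta_yc_distance(counts_a, counts_b):
    """Return 1 - theta_YC for two OTU count vectors listed in the same OTU order."""
    if len(counts_a) != len(counts_b):
        raise ValueError("count vectors must cover the same OTUs")
    total_a, total_b = sum(counts_a), sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each sample needs at least one sequence")
    # Convert raw counts to relative abundances.
    rel_a = [c / total_a for c in counts_a]
    rel_b = [c / total_b for c in counts_b]
    # Sum of products rewards OTUs that are abundant in both samples.
    shared = sum(a * b for a, b in zip(rel_a, rel_b))
    # Sum of squared differences penalizes OTUs whose abundance changed.
    squared_diff = sum((a - b) ** 2 for a, b in zip(rel_a, rel_b))
    return 1.0 - shared / (squared_diff + shared)


# Hypothetical pre- and post-treatment counts for three OTUs in one patient.
pre_sample = [5, 5, 0]
post_sample = [5, 0, 5]
distance = theta_yc_distance(pre_sample, post_sample)
```

A distance of 0 means the two samples have identical relative abundances, and a distance of 1 means they share no OTUs.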

Fig. 1

General differences between adenoma, advanced adenoma, and carcinoma groups after treatment. a θYC distances from pre- versus post-sample within each individual. A significant difference was found between the adenoma and carcinoma group (P value = 5.36e–05). Solid black points represent the median value for each diagnosis group. b NMDS of the pre- and post-treatment samples for the adenoma group. c NMDS of the pre- and post-treatment samples for the advanced adenoma group. d NMDS of the pre- and post-treatment samples for the carcinoma group

Table 1 Demographic data of patients in the pre- and post-treatment cohort

The effects of treating lesions are not consistent across diagnosis groups

We used two approaches to identify those bacterial populations that change between the two samples for each diagnosis group. First, we sought to identify individual OTUs that could account for the change in overall community structure. However, using a paired Wilcoxon test, we were unable to identify any OTUs that were significantly different in the pre- and post-treatment groups (P value > 0.05). It is likely that high inter-individual variation and the irregular distribution of OTUs across individuals limited the statistical power of the test. We attempted to overcome these problems by using random forest models to identify collections of OTUs that would allow us to differentiate between pre- and post-treatment samples from each of the diagnosis groups. The adenoma and carcinoma models performed well (adenoma AUC range = 0.54 – 0.83 and carcinoma AUC range = 0.82 – 0.98); however, the model for patients treated for advanced adenomas was not able to reliably differentiate between the pre- and post-treatment samples (advanced adenoma AUC range = 0.34 – 0.65). Interestingly, the top 10 most important OTUs by mean decrease in accuracy (MDA) that were used for each model had little overlap with each other [Fig. 2]. Although treatment had an impact on the overall community structure, the effect of treatment was not consistent across patients and diagnosis groups. Both the adenoma and carcinoma treatment models had AUCs that were significantly higher than a random model permutation (P value < 0.0001).

Fig. 2

The top 10 most important OTUs used to classify treatment for adenoma, advanced adenoma, and carcinoma. a Adenoma OTUs. b Advanced adenoma OTUs. c Carcinoma OTUs. The darker circle highlights the median log10 MDA value obtained from 100 different 80/20 splits, while the lighter colored circles represent the value obtained for a specific run

Post-treatment samples from patients with carcinoma more closely resemble those of a normal colon

Next, we determined whether treatment changed the microbiota in a way that the post-treatment communities resembled those of patients with normal colons. To test this, we used an expanded cohort of 423 individuals that were diagnosed under the same protocol as having normal colons or colons with adenoma, advanced adenoma, or carcinoma [Table 2]. We then constructed three random forest models, each classifying samples as coming from patients in one of the diagnosis groups (adenoma, advanced adenoma, or carcinoma) or from patients with a normal colon. The models performed moderately, with the carcinoma model performing best (adenoma AUC range = 0.50 – 0.62, advanced adenoma AUC range = 0.53 – 0.67, carcinoma AUC range = 0.71 – 0.82; Additional file 2: Figure S1). The OTUs that were in the top 10% of importance for the adenoma and advanced adenoma models largely overlapped, and those OTUs that were used to classify the carcinoma samples were largely distinct from those of the other two models [Fig. 3 a]. Among the OTUs that were shared across the three models were those populations generally considered beneficial to their host (e.g., Faecalibacterium, Lachnospiraceae, Bacteroides, Dorea, Anaerostipes, and Roseburia) [Fig. 3 b]. Although many of the important OTUs in the top 10% were also included in the model differentiating between patients with normal colons and those with carcinoma, this model also included OTUs affiliated with populations that have previously been associated with carcinoma (Fusobacterium, Porphyromonas, Parvimonas) [24–26] [Additional file 3: Figure S2], with some individuals showing a marked decrease in relative abundance [Additional file 4: Figure S3]. Finally, we applied these three models to the pre- and post-treatment samples for each diagnosis group and quantified the change in the positive probability of the model. A decrease in the positive probability would indicate that the microbiota more closely resembled that of a patient with a normal colon. There was no significant change in the positive probability for the adenoma or advanced adenoma groups (P value > 0.05) [Fig. 4]. The positive probability for samples from patients diagnosed with carcinoma decreased significantly between the pre- and post-treatment samples, suggesting a shift toward a normal microbiota for most individuals (P value = 0.001). Only 7 of the 26 patients (26.92%) who were diagnosed with a carcinoma had a higher positive probability after treatment; one of those was re-diagnosed with carcinoma at the follow-up visit. These results indicate that, although there were changes in the microbiota associated with treatment, those experienced by patients with carcinoma after treatment yielded gut bacterial communities of greater similarity to those of a normal colon.
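The change in positive probability described above amounts to a per-patient difference between the post- and pre-treatment model outputs. The following is a minimal Python sketch of that bookkeeping, not the authors' R code. The function name and the example probabilities are hypothetical; only the 7-of-26 count comes from the text.

```python
def summarize_probability_change(pre, post):
    """Summarize per-patient change in positive probability.

    pre and post hold one model probability per patient, in the same order.
    Returns the list of changes, the number of patients whose probability
    increased, and that number as a percentage of all patients.
    """
    if len(pre) != len(post) or not pre:
        raise ValueError("need one pre and one post value per patient")
    # Negative change: the sample looks more like a normal colon after treatment.
    deltas = [after - before for before, after in zip(pre, post)]
    n_increased = sum(1 for d in deltas if d > 0)
    percent_increased = 100.0 * n_increased / len(deltas)
    return deltas, n_increased, percent_increased


# Hypothetical probabilities for four patients (not study data).
deltas, n_increased, percent_increased = summarize_probability_change(
    [0.9, 0.8, 0.7, 0.6], [0.4, 0.85, 0.2, 0.1]
)

# The percentage reported in the text: 7 of 26 patients.
reported_percent = round(100.0 * 7 / 26, 2)
```

In the study, the significance of the paired changes was assessed separately with a paired Wilcoxon test, which this sketch does not implement.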

Fig. 3

Top 10% most important OTUs common to those models used to differentiate between patients with normal colons and those with adenoma, advanced adenoma, and carcinoma. a Venn diagram showing the OTU overlap between each model. b For each common OTU the lowest taxonomic identification and importance rank for each model run is shown

Fig. 4

Treatment response based on models built for adenoma, advanced adenoma, or carcinoma. a Positive probability change from initial to follow-up sample in those with adenoma. b Positive probability change from initial to follow-up sample in those with advanced adenoma. c Positive probability change from initial to follow-up sample in those with carcinoma

Table 2 Demographic data of training cohort

Difficult to identify effects of specific treatments on the change in the microbiota

The type of treatment that the patients received varied across diagnosis groups. Those with adenomas and advanced adenomas received surgical resection (adenoma, N = 4; advanced adenoma, N = 4) or polyp removal during colonoscopy (adenoma, N = 18; advanced adenoma, N = 15), and those with carcinomas received surgical resection (N = 12), surgical resection with chemotherapy (N = 9), or surgical resection with chemotherapy and radiation (N = 5). Regardless of the treatment used, there was no significant difference in the effect of these treatments on the number of observed OTUs, Shannon diversity, or Shannon evenness (P value > 0.05). Furthermore, there was not a significant difference in the effect of the treatments on the amount of change in the community structure (P value = 0.375). Finally, the change in the positive probability was not significantly different between any of the treatment groups (P value = 0.375). Due to the relatively small number of samples in each treatment group, it was difficult to make a definitive statement regarding the effect of the specific type of treatment on the amount of change in the structure of the microbiota.

Discussion

Our study focused on comparing the microbiota of patients diagnosed with adenoma, advanced adenoma, and carcinoma before and after treatment. For all three groups of patients, we observed changes in their microbiota. Some of these changes, specifically for adenoma, may be due to normal temporal variation; however, those with advanced adenoma and carcinoma clearly had large microbiota changes. After treatment, the microbiota of patients with carcinoma changed significantly more than that of patients with adenoma. This change resulted in communities that more closely resembled those of patients with a normal colon. This may suggest that treatment for carcinoma is successful not only at removing the carcinoma but also at reducing the associated bacterial communities. Understanding the effect of treatment on the microbiota of those diagnosed with carcinomas may have important implications for reducing disease recurrence. It is intriguing that it may be possible to use microbiome-based biomarkers not only to predict the presence of lesions but also to assess the risk of recurrence due to these changes in the microbiota.

Patients diagnosed with adenoma and advanced adenoma, however, did not experience a shift toward a community structure that resembled that of individuals with normal colons. This may be due to fundamental differences between adenomas and advanced adenomas on the one hand and carcinomas on the other. Specifically, carcinomas may create an inflammatory milieu that would impact the structure of the community, and removal of that stimulus would alter said structure. It is possible that the difference between the microbiota of patients with adenoma and advanced adenoma and those with normal colons is subtle. This is supported by the reduced ability of our models to correctly classify patients with adenomas and advanced adenomas relative to those diagnosed with carcinomas [Additional file 2: Figure S1]. Given the irregular distribution of microbiota across patients in the different diagnosis groups, it is possible that we lacked the statistical power to adequately characterize the change in the communities following treatment.

There was a subset of patients (7 of the 26 with carcinomas) who demonstrated an elevated probability of carcinoma after treatment. This may reflect an elevated risk of recurrence. The 26.92% prevalence of increased carcinoma probability in our study is within the expected rate of recurrence (20–30% [3, 4]). We hypothesized that these individuals may have had more severe tumors; however, the tumor severity of these seven individuals (1 with stage I, 3 with stage II, and 3 with stage III) was similar to the distribution observed among the other 19 patients. We also hypothesized that we may have sampled these patients later than the rest, and their communities may have reverted to a carcinoma-associated state; however, there was not a statistically significant difference in the length of time between sample collections among those whose probabilities increased (331 (246–358) days) and those whose probabilities decreased (364 (301–434) days) (Wilcoxon test; P value = 0.39) (all days data displayed as median (IQR)). Finally, it is possible that these patients may not have responded to treatment as well as the other 19 patients diagnosed with carcinoma, and so the microbiota may not have been impacted in the same way. Again, further studies looking at the role of the microbiota in recurrence are needed to understand the dynamics following treatment.

Our final hypothesis was that the specific type of treatment altered the structure of the microbiome. The treatment to remove adenomas and advanced adenomas was either polyp removal or surgical resection, whereas for individuals with carcinoma it was surgical resection alone, in combination with chemotherapy, or with chemotherapy and radiation. Because chemotherapy and radiation target rapidly growing cells, these treatments would be more likely to cause a turnover of the colonic epithelium, driving a more significant change in the structure of the microbiota. Although we were able to test for an effect across these specific types of treatment, the number of patients in each treatment group was relatively small. Finally, those undergoing surgery would have received antibiotics, and this may be a potential confounder. However, our pre-treatment stool samples were obtained before the surgery, and the post-treatment samples were obtained long after any effects due to antibiotic administration on the microbiome would be expected to occur (344 (266–408) days). We also found no difference in the community structure between those who received surgery and those who did not as a treatment for adenoma or advanced adenoma.

Conclusion

This study expands upon existing research that has established a role for the microbiota in tumorigenesis and that demonstrated the utility of microbiome-based biomarkers to predict the presence of colonic lesions. We were surprised by the lack of a consistent signal that was associated with treatment of patients with adenomas or advanced adenomas. The lack of a large effect size may be due to differences in the role of bacteria in the formation of adenomas and carcinomas, or it could be due to differences in the behaviors and medications within these classes of patients. One of the most exciting future directions is the possibility that markers within the microbiota could be used to evaluate the effect of treatment and to predict recurrence for those diagnosed with carcinoma. If such an approach is effective, and if the biomarkers identified play a key role in the disease process, it might be possible to target the microbiota as part of adjuvant therapy. Our data provide additional evidence of the importance of the microbiota in tumorigenesis by addressing the recovery of the microbiota after treatment and open interesting avenues of research into how these changes may affect recurrence.

Methods

Study design and patient sampling

Sampling and design have been previously reported in Baxter et al. [24]. Briefly, samples were stored on ice for at least 24 h before freezing. Although we cannot exclude that this sampling protocol may have impacted the gut microbiota composition, all samples were subjected to the same methodology. We excluded individuals who had already undergone surgery, radiation, or chemotherapy; who had colorectal cancer before a baseline fecal sample could be obtained; or who had IBD, known hereditary non-polyposis colorectal cancer, or familial adenomatous polyposis. Samples used to build the models for prediction were collected either prior to a colonoscopy or between 1 and 2 weeks after the initial colonoscopy. The bacterial community has been shown to normalize back to a pre-colonoscopy community within this time period [27]. Our study cohort consisted of 67 individuals with an initial sample as described and a follow-up sample obtained between 188 and 546 days after treatment of the lesion [Table 1]. Patients were diagnosed by colonoscopic examination and histopathological review of any biopsies taken. Patients were classified as having advanced adenoma if they had an adenoma greater than 1 cm, more than three adenomas of any size, or an adenoma with villous histology. This study was approved by the University of Michigan Institutional Review Board. All study participants provided informed consent, and the study itself conformed to the guidelines set out by the Helsinki Declaration. The original protocol for the study did not provide for tracking patients after the follow-up samples, and so it was not possible for us to ascertain their diagnosis after the completion of the study.
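The advanced adenoma criteria stated above can be written as a simple rule. The sketch below is in Python and is an illustration only. The function name and input layout are hypothetical, and it assumes the patient has already been found to have at least one adenoma and no carcinoma.

```python
def classify_adenoma(adenoma_sizes_cm, has_villous_histology):
    """Apply the three advanced adenoma criteria described in the text.

    adenoma_sizes_cm: size in cm of each adenoma found (hypothetical input layout).
    has_villous_histology: True if any adenoma showed villous histology.
    """
    if not adenoma_sizes_cm:
        raise ValueError("expected at least one adenoma")
    any_over_1cm = max(adenoma_sizes_cm) > 1.0
    more_than_three = len(adenoma_sizes_cm) > 3
    # Meeting any one criterion is enough for the advanced adenoma label.
    if any_over_1cm or more_than_three or has_villous_histology:
        return "advanced adenoma"
    return "adenoma"


# Hypothetical examples, not patient data.
example_small = classify_adenoma([0.4, 0.6], False)
example_large = classify_adenoma([1.5], False)
```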

Treatment

For this study, treatment refers specifically to the removal of a lesion with or without chemotherapy and radiation. The majority of patients undergoing treatment for adenoma or advanced adenoma were not treated surgically [Table 1] but rather via colonoscopy. All patients diagnosed with carcinomas were treated with surgery alone, surgery and chemotherapy, or surgery, chemotherapy, and radiation. The type of chemotherapy used for patients with CRC included Oxaliplatin, Levicovorin, Folfox, Xeloda, Capecitabine, Avastin, Fluorouracil, and Glucovorin. These were used individually or in combination with others depending on the patient [Table 1]. If an individual was treated with radiation, they were also always treated with chemotherapy. Radiation therapy generally used 18 MV photons for treatment.

16S rRNA gene sequencing

Sequencing was completed as described by Kozich et al. [28]. DNA extraction used the 96-well Soil DNA isolation kit (MO BIO Laboratories) and an epMotion 5075 automated pipetting system (Eppendorf). The V4 variable region was amplified, and the resulting product was split between four sequencing runs with normal, adenoma, and carcinoma samples evenly represented on each run. Each group was randomly assigned to avoid biases based on sample collection location. The pre- and post-treatment samples were sequenced on the same run.

Sequence processing

The mothur software package (v1.37.5) was used to process the 16S rRNA gene sequences as previously described [28]. The general workflow using mothur included merging paired-end reads into contigs, filtering out low-quality contigs, aligning to the SILVA database [29], screening for chimeras using UCHIME [30], classifying with a naive Bayesian classifier using the Ribosomal Database Project (RDP) [31], and clustering into operational taxonomic units (OTUs) using a 97% similarity cutoff with an average neighbor clustering algorithm [32]. The number of sequences for each sample was rarefied to 10,523 to minimize the impacts of uneven sampling.
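Rarefying to a fixed depth, as in the last step above, can be sketched as random subsampling of sequences without replacement. The Python below is an illustration under that assumption and is not mothur's implementation. The function name, OTU labels, and counts are hypothetical; only the depth of 10,523 comes from the text.

```python
import random


def rarefy(otu_counts, depth, seed=None):
    """Subsample a sample's sequences without replacement to a fixed depth.

    otu_counts maps OTU label -> number of sequences in the sample.
    Returns a new mapping whose counts sum to depth, or None when the
    sample has fewer than depth sequences and cannot be rarefied.
    """
    total = sum(otu_counts.values())
    if total < depth:
        return None
    rng = random.Random(seed)
    # Expand counts into one entry per sequence, then draw depth of them.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    rarefied = {}
    for otu in rng.sample(pool, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


# Hypothetical sample with 13,012 sequences across three OTUs.
sample = {"Otu001": 8000, "Otu002": 5000, "Otu003": 12}
rarefied_sample = rarefy(sample, 10523, seed=42)
```

Rare OTUs can drop out entirely after subsampling, which is one reason the same depth is applied to every sample before diversity metrics are compared.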

Model building

The random forest [33] algorithm was used to create the three models used to classify pre- and post-treatment samples by diagnosis (adenoma, advanced adenoma, or carcinoma) as well as to assess the probability that a sample was more similar to the patient’s original diagnosis or that of a disease-free patient. All models included only OTU data obtained from 16S rRNA sequencing and were processed using the caret (v6.0.76) R package. For each model, we optimized the mtry hyperparameter, which defines the number of OTUs to investigate at each split before a new division of the data was created with the random forest model [33]. To ensure that our optimization did not result in over-fitting of the data, we made 100 different 80/20 (train/test) splits of the data such that the same proportion was present in both the whole data set and each 80/20 split. For each of the 100 splits, 20-times repeated 10-fold cross-validation was performed on the 80% component to optimize the mtry hyperparameter by maximizing the AUC (area under the curve of the receiver operating characteristic). The resulting model was then tested on the 20% of the data that were held out. A summary of the mtry hyperparameter values that were tried is available in Additional file 1: Table S5. The reported P values for each model relative to a random labeling were assessed by comparing the distribution of the 100 80/20 splits for the correctly labeled data to the distribution of randomly labeled data.
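The split-and-evaluate scheme above can be sketched without the random forest itself. The Python below is an illustration only, since the authors used the caret R package. It assumes that "the same proportion" refers to class labels and so stratifies each 80/20 split by class, and it computes the AUC from the rank-based (Mann-Whitney) formulation. All function names and example values are hypothetical.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=None):
    """Return (train_idx, test_idx) with class proportions preserved (assumed design)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for members in by_class.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        # Hold out the same fraction of every class.
        n_test = round(len(shuffled) * test_fraction)
        test_idx.extend(shuffled[:n_test])
        train_idx.extend(shuffled[n_test:])
    return sorted(train_idx), sorted(test_idx)


def auc(labels, scores, positive=1):
    """AUC as the probability that a positive outscores a negative; ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == positive]
    neg = [s for l, s in zip(labels, scores) if l != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels: 10 lesion samples (1) and 20 normal samples (0).
example_labels = [1] * 10 + [0] * 20
# One hundred different 80/20 splits, one per seed.
splits = [stratified_split(example_labels, seed=i) for i in range(100)]
```

In the full procedure, a model would be tuned by cross-validation on each training partition and scored with `auc` on the matching held-out partition, giving a distribution of 100 AUC values per model.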

The three diagnosis models were constructed by using the data from Baxter et al. [24], which was censored for the pre-treatment samples of the patients for whom we had post-treatment samples. The treatment models were then used to quantify the model probability that a patient with an initial diagnosis retained that diagnosis or a disease-free diagnosis.

Statistical analysis

The R software package (v3.4.1) was used for all statistical analysis. Comparisons between bacterial community structure utilized PERMANOVA [34] in the vegan package (v2.4.3). Comparisons between probabilities as well as overall differences in the median relative abundance of each OTU between pre- and post-treatment samples utilized a paired Wilcoxon signed-rank test. Comparisons between different treatments for lesions utilized a Kruskal-Wallis test. Where multiple comparison testing was appropriate, a Benjamini-Hochberg (BH) correction was applied [35] and a corrected P value of less than 0.05 was considered significant. The P values reported are those that were BH corrected. Model rank importance was determined by obtaining the median MDA from the 100 runs of 20-times repeated 10-fold cross-validation and then ranking from largest to smallest MDA.
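The Benjamini-Hochberg correction mentioned above can be illustrated with a short sketch. It is written in Python for illustration (the analysis itself was done in R) and follows the standard step-up adjustment, in which each P value is scaled by the number of tests divided by its rank and monotonicity is then enforced from the largest P value downward. The function name and example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that an adjusted
    # value never exceeds the adjusted value of a larger raw P value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values from four comparisons (not study data).
raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
significant = [p < 0.05 for p in corrected]
```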

Reproducible methods

A detailed and reproducible description of how the data were processed and analyzed can be found at https://github.com/SchlossLab/Sze_FollowUps_Microbiome_2017. Raw sequences have been deposited into the NCBI Sequence Read Archive (SRP062005 and SRP096978) and the necessary metadata can be found at https://www.ncbi.nlm.nih.gov/Traces/study/ and searching the respective SRA study accession.