Gut microbiome associations with breast cancer risk factors and tumor characteristics: a pilot study

Objective To investigate the association of the gut microbiome with breast tumor characteristics (receptor status, stage and grade) and known breast cancer risk factors. Methods In a pilot cross-sectional study of 37 incident breast cancer patients, fecal samples collected prior to chemotherapy were analyzed by a 16S ribosomal RNA (rRNA) gene-based sequencing protocol. Alpha diversity and specific taxa by tumor characteristics and breast cancer risk factors were tested by Wilcoxon rank sum test, and by differential abundance analysis, using a zero-inflated negative binomial regression model with adjustment for total counts, age and race/ethnicity. Results There were no significant alpha diversity or phyla differences by estrogen/progesterone receptor status, tumor grade, stage, parity and body mass index. However, women with human epidermal growth factor receptor 2 positive (HER2+) (n = 12) compared to HER2− (n = 25) breast cancer showed 12–23% lower alpha diversity [number of species (OTU) p = 0.033, Shannon index p = 0.034], lower abundance of Firmicutes (p = 0.005) and higher abundance of Bacteroidetes (p = 0.089). Early menarche (ages ≤ 11) (n = 11) compared with later menarche (ages ≥ 12) (n = 26) was associated with lower OTU (p = 0.036), Chao1 index (p = 0.020) and lower abundance of Firmicutes (p = 0.048). High total body fat (TBF) (> 46%) (n = 12) compared to lower (≤ 46%) TBF was also associated with lower Chao1 index (p = 0.011). There were other significant taxa abundance differences by HER2 status, menarche age, as well as other tumor and breast cancer risk factors. Conclusions and relevance Further studies are needed to identify characteristics of the human microbiome and the interrelationships between breast cancer hormone receptor status and established breast cancer risk factors.


Background
In the past decade numerous intriguing links between the gut microbiota and risk of obesity, metabolic diseases and inflammatory responses have been reported [1,2], but less is known about the gut microbiota of breast cancer patients [3,4]. A study of pretreatment samples from Kaiser Permanente health care members showed that, after adjusting for age, body mass index (BMI), and other factors, postmenopausal women diagnosed with incident breast cancer (n = 48) had significantly lower alpha diversity in fecal microbiota than control women (n = 48), and differing relative abundance of select taxa of Firmicutes (Clostridiaceae, Faecalibacterium, Ruminococcaceae, Dorea and Lachnospiraceae) [5]. Low gut microbial diversity has been associated with obesity, insulin resistance, and other factors, some of which are linked to risk of breast cancer [6]. In a case-only study of 31 women diagnosed with early stage breast cancer [7], the total number of unique species of Bacteroidetes and Firmicutes differed significantly by tumor stage, and abundance of Firmicutes was 16% lower among those with overweight BMI (≥ 25 kg/m²) than those with normal BMI (p = 0.06).
Breast cancer is a heterogeneous disease with multiple subtypes that display distinct risk factor patterns, with differences between estrogen receptor (ER)/progesterone receptor (PR) positive (ER+PR+) tumors and those that are negative for ER/PR [8][9][10]. Breast cancers that are positive for human epidermal growth factor receptor 2 (HER2+) also differ from those that are HER2−, and triple negative (ER−PR−HER2−) breast cancers are the most deadly [9,11]. It is not known whether different breast cancer subtypes are associated with distinct microbial signatures. Several studies have also explored the role of the breast tissue microbiome in modulating the risk of breast cancer [12][13][14][15][16][17]. We are aware of one study that applied a pan-pathogen microarray (PathoChip) strategy to formalin-fixed, paraffin-embedded samples of breast tissue to investigate microbial patterns by breast cancer subtype, but this study lacked information on tumor stage, grade, and breast cancer risk factors [18].
We describe below results from a cross-sectional analysis conducted among 37 women diagnosed with incident breast cancer in Los Angeles County to further investigate whether gut microbiome prior to breast cancer chemotherapy differs by receptor status (ER, PR, HER2) and stage and grade of breast cancer. We also investigated whether gut microbiome profile differed by well-established breast cancer risk factors including age at menarche, parity, baseline BMI, and physical activity.

Patient population and specimen collection
This study was conducted at the University of Southern California (USC) Norris Comprehensive Cancer Center and at the Los Angeles County + USC Medical Center. Women of all race/ethnicities, newly diagnosed with incident invasive breast cancer were considered potentially eligible. Exclusionary criteria included recurrent breast cancer, a history of other cancers (other than non-melanoma skin cancer), celiac disease, inflammatory bowel disease, bariatric surgery, pregnancy or nursing within past 12 months, past treatment with chemotherapy, antibiotic use (defined as 1 week or more during the month prior to baseline fecal sample collection), or use of probiotic supplements or prednisone. After signing informed consent, eligible and willing patients donated up to four fecal specimens and completed up to four clinical visits during an average of 9 months follow-up. Baseline specimens were collected before chemotherapy started for those who received neoadjuvant chemotherapy and were collected after surgery but before chemotherapy for those who received adjuvant chemotherapy or only had surgery (Fig. 1). The study protocol was approved by the USC Institutional Review Board. We used a fecal specimen collection kit with illustrated instructions that was designed and tested at the University of Maryland [19]. Participants were given collection kits and obtained samples using the provided pre-labeled collection devices and tubes containing the nucleic acid preservative RNAlater. All fecal samples were discreetly stored in the participants' home freezers, and were either picked up by the study staff or brought in to USC by the study participants. These stool samples were then stored in the -80 °C freezers of Preventive Medicine laboratory at USC until they were sent for measurement at the completion of the study. Body composition data obtained from the dual-energy x-ray absorptiometry (DEXA) scans at the first clinic visit (baseline) were included in our analysis. 
The DEXA scan was conducted at the USC Integrative Center for Oncology Research in Exercise. Participants also completed a baseline questionnaire to assess menstrual and reproductive history, medical history (e.g., hypertension, diabetes, benign breast diseases), family history of cancer, use of medications, and other lifestyle factors. Only the baseline fecal sample (i.e., the sample collected before chemotherapy) was included in the data analysis of this paper. Fecal samples collected during and after completion of breast cancer treatment are still under investigation.

Fecal specimen processing and microbiome analyses
Microbiome analyses were conducted in the laboratory of Dr. Jacques Ravel using his well-established methods, including DNA extraction, PCR amplification of the V3 and V4 hypervariable regions of the 16S rRNA gene with the two barcoded universal primers 319F and 806R, and sequencing of the amplicons on the Illumina MiSeq platform [5,19]. The 16S rRNA genes were amplified in 96-well microtiter plates. Negative controls without a template were processed for each primer pair. The laboratory performed taxonomic assignments and generated taxa abundance and read count tables for each of the 144 fecal samples we collected from 38 breast cancer patients. After we excluded 14 samples with low (< 100) read counts (referred to as failed), 130 samples remained from 37 patients; all 4 samples from one patient failed, and she was excluded from all subsequent analyses. Hence the current analysis comprises baseline samples from 37 women diagnosed with incident breast cancer (Table 1).

Statistical analyses
Microbiome alpha diversity was estimated after rarefaction using four measures: (a) counts of observed species (OTUs) unadjusted for relative abundances; (b) Chao1 as an estimate of the species richness; (c) Shannon index to measure both richness and evenness, and (d) phylogenetic distance (PD whole tree) in the diversity calculation. We used Wilcoxon rank sum test to examine differences in the alpha diversity between any two groups of interest (e.g., HER2+ vs HER2−) and Kruskal-Wallis to examine differences between any three groups of interest (e.g., age at menarche ≤ 11, 12, ≥ 13).
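The alpha diversity measures listed above can be illustrated with a short sketch. This is not the study's pipeline: the function names and the toy counts are hypothetical, the Chao1 formula shown is the bias-corrected variant, and the Shannon index uses log base 2; the text does not state which variants were used. PD whole tree is omitted because it requires a phylogenetic tree.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample reads without replacement to a common depth (rarefaction).

    counts: dict mapping taxon name -> read count (hypothetical input format).
    """
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer reads than the rarefaction depth")
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


def observed_otus(counts):
    """Number of taxa with at least one read, ignoring relative abundance."""
    return sum(1 for n in counts.values() if n > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed variant).

    Uses the numbers of singletons (f1) and doubletons (f2).
    """
    f1 = sum(1 for n in counts.values() if n == 1)
    f2 = sum(1 for n in counts.values() if n == 2)
    return observed_otus(counts) + f1 * (f1 - 1) / (2.0 * (f2 + 1))


def shannon(counts, base=2):
    """Shannon index, which reflects both richness and evenness."""
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log(n / total, base)
        for n in counts.values()
        if n > 0
    )


# Toy sample with made-up counts, for illustration only.
sample = {"taxon_a": 40, "taxon_b": 30, "taxon_c": 2, "taxon_d": 1, "taxon_e": 1}
rarefied = rarefy(sample, depth=50)
print(observed_otus(rarefied), chao1(rarefied), shannon(rarefied))
```

Rarefying every sample to the same depth before computing these indices keeps richness comparisons from being driven by differences in sequencing depth.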
The relationship of overall gut microbiome composition with personal factors (age, menopause status, race/ethnicity, age at menarche, parity, physical activity, BMI, TBF) and tumor characteristics was assessed by principal coordinate analysis (PCoA) based on the unweighted (qualitative) UniFrac distance matrix [20]. PCoA plots were generated using the first two principal coordinates, according to categories of personal and tumor characteristics.
Turning to taxonomy, we investigated the 201 specific genera that were present in at least 25% of our study samples. To accommodate the sparse, non-normally distributed count data, we conducted differential abundance analysis, using a zero-inflated negative binomial regression (NBR) model [21] provided by SAS proc genmod, to examine relationships of specific taxa to tumor characteristics and breast cancer risk factors. We investigated differences in taxa between groups with adjustment for total counts (Model 1), as well as age (< 50, 50-59, 60+) and race/ethnicity (Hispanic vs non-Hispanic) (Model 2). The presumed lower risk categories [e.g., HER2−, ER+, PR+, lower stage (0/I), lower grade (I/II), later age at menarche (≥ 12 years), parous, physically active, lower BMI (< 25 kg/m²), and lower TBF (≤ 46%)] were used as the reference groups in the NBR analysis. The mean estimate ratio (MER) under the NBR model represents the ratio of the estimate (modeled on the log scale) in one group versus the reference group, and the p value is the probability of obtaining a ratio at least this extreme under the null hypothesis. Thus, if the mean abundance of a taxon is higher in the HER2+ than in the HER2− group (reference group), we expect a MER greater than one. On the other hand, if the mean abundance of a taxon is lower among HER2+ than HER2− tumors, we expect a MER less than one. A probability of P ≤ 0.001 was accepted as significant in this study. Results were similar for Models 1 and 2, and we present statistically significant MERs in NBR from Model 2 (Tables 3, 4, 5 and 6). For this pilot study we did not adjust for multiple testing [22]. All data were analyzed using R (R Foundation for Statistical Computing, Vienna, Austria) or SAS version 9.4 (SAS, Cary, NC).
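To make the interpretation of the MER concrete, the sketch below assumes that the MER corresponds to the exponentiated group coefficient of a log-link count model with total reads as an offset. The text does not state this explicitly, so this is an assumption, and the coefficient values and function names are hypothetical. The zero-inflation component is left out for brevity.

```python
import math


def mean_estimate_ratio(beta_group):
    """Assumed definition: MER = exp(group coefficient) under a log link."""
    return math.exp(beta_group)


def expected_count(intercept, beta_group, in_comparison_group, total_reads):
    """Expected taxon count under log(mu) = intercept + beta * group + log(total_reads).

    in_comparison_group is 1 for the comparison group (e.g., HER2+)
    and 0 for the reference group (e.g., HER2-).
    """
    return total_reads * math.exp(intercept + beta_group * in_comparison_group)


# Hypothetical coefficients, for illustration only.
intercept, beta = -4.0, -0.7
mer = mean_estimate_ratio(beta)

# The ratio of expected counts equals the MER regardless of sequencing depth,
# which is why adjusting for total counts makes groups comparable.
ratio = expected_count(intercept, beta, 1, 20000) / expected_count(intercept, beta, 0, 20000)
print(mer, ratio)  # both below 1: lower abundance in the comparison group
```

Under this reading, a MER of 0.5 would mean the taxon's expected abundance in the comparison group is half that in the reference group at the same total read count.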

PERMANOVA analysis of personal and tumor characteristics with the unweighted UniFrac distance matrix
Beta diversity (between-subjects species diversity) was assessed using the unweighted and weighted UniFrac distance. BMI was associated with baseline gut microbiome composition. Axis 1 explained 20.9% of all variance while axis 2 explained 10.5% (Fig. 2). With the unweighted UniFrac distance matrix, the baseline microbiota of the BMI groups (< 25 vs ≥ 25 kg/m²) separated significantly along axis 2 (p = 0.024) but not along axis 1 (p = 0.20); no separation was seen with the weighted UniFrac distance (Fig. 2). Separation of baseline microbiota along axis 2 was also observed using cutpoints of < 30 vs ≥ 30 for BMI (axis 1 p = 0.16; axis 2 p = 0.009) and < 46% vs ≥ 46% for TBF (axis 1 p = 0.21; axis 2 p = 0.048). None of the other factors were associated with overall fecal composition (data not shown).

Alpha diversity by tumor characteristics and personal characteristics
There were no statistically significant baseline alpha diversity (within-subject species diversity) differences by tumor stage and grade, ER or PR status (Table 2). However, alpha diversity measures were 12% to 23% lower for HER2+ (n = 12) than HER2− (n = 25) breast cancer; including lower OTU (p = 0.033), Chao1 index (p = 0.073), and Shannon index (p = 0.035). High (> 46%) TBF compared to lower (≤ 46%) TBF was associated with lower Chao 1 index (p = 0.011) and OTU (p = 0.059). Similar patterns of differences were observed for those with normal BMI versus overweight or obese. Alpha diversity measures were lower among women with early (≤ 11) than later (≥ 12) age of menarche; these differences were statistically significant for OTU (p = 0.034), Chao 1 index (p = 0.020) and borderline statistically significant for Shannon index (p = 0.057) and PD whole tree (p = 0.073). Those who were physically active had higher Chao 1 index (p = 0.07) and OTU (p = 0.58) than those who were not physically active but Shannon index and PD tree were not higher. Alpha diversity measures did not differ between parous and nulliparous women.
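The two-group comparisons reported above rest on the Wilcoxon rank sum test. A minimal sketch of that test follows, using the normal approximation with a tie correction and no continuity correction. Statistical packages may instead use exact p values for small samples, so results can differ slightly from those in the tables. The function name and the example values are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum (Mann-Whitney) test, normal approximation.

    Returns (U statistic for group x, z score, two-sided p value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign average ranks to tied values and accumulate the tie correction.
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        raise ValueError("all observations are tied; the test is undefined")
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u_x, z, p


# Made-up Shannon index values for two groups, for illustration only.
group_a = [3.1, 3.4, 2.9, 3.0, 3.3]
group_b = [3.8, 4.1, 3.6, 3.9, 4.0, 3.7]
print(rank_sum_test(group_a, group_b))
```

Because the test uses ranks rather than raw values, it does not assume that the diversity indices are normally distributed, which suits a small pilot sample.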

Phyla abundance differences by tumor characteristics and breast cancer risk factors
There were no significant phyla differences by ER and PR status, stage, grade, parity, BMI, and TBF% (data not shown). However, median level of Firmicutes was lower among women with HER2+ than those with HER2− breast cancer (33.53 vs 51.75, p = 0.005), and also lower among women with early (≤ 11) than those with later (≥ 12) age of menarche (35.61 vs 50.17, p = 0.048) (Fig. 3). We explored differences in abundance by age at menarche and HER2 status combined (Fig. 4) Fig. 5, in support of the results shown by MER in Table 3.

Taxa abundance differences by stage and grade
Two taxa of Firmicutes (g_Clostridium, g_Veillonella) were more abundant (MER > 1) among women with higher grade (III) or higher stage breast cancers compared to lower grade (I/II) or lower stage breast cancers. In addition, higher grade was associated with higher abundance of Actinobacteria (g_Eggerthella) but lower abundance (MER < 1) of other taxa of Actinobacteria (f_Coriobacteriaceae) and Firmicutes (f_Lachnospiraceae, g_Anaerostipes, f_Ruminococcaceae) (Table 4). Higher stage breast cancer was also associated with higher abundance of Firmicutes (f_Clostridiaceae) and Proteobacteria (f_Enterobacteriaceae, g_Haemophilus) but lower abundance of Firmicutes (g_Acidaminococcus, g_Catenibacterium) (Table 4).

Taxa abundance differences and breast cancer risk factors
We also explored whether there are taxa differences by treating older age at diagnosis (≥ 50 years), later age at menarche, parous, BMI (< 25 kg/m²), TBF (≤ 46%), and physically active as the reference groups in the NBR analysis. When we examined differences in taxa by TBF, women with higher TBF (≥ 46%) compared to those with lower TBF (< 46%) also showed higher abundance (MER > 1) of Firmicutes (f_Clostridiaceae, g_Clostridium, g_Lachnospira) but lower abundance (MER < 1) of Actinobacteria (f_Coriobacteriaceae) and Firmicutes (g_Catenibacterium). There were some taxa differences between those who were physically active and those who were inactive, including lower abundance of some Firmicutes (f_Clostridiaceae; g_Lachnobacterium, g_Lactobacillus) but higher abundance of other Firmicutes (f_Veillonella).

Discussion
We investigated the gut microbiome profile in relation to ER/PR and HER2 status, tumor grade and stage, and select breast cancer risk factors in 37 women diagnosed with incident breast cancer, most of whom were Hispanic (73%) and overweight or obese (75%). Women with HER2+ compared with HER2− breast cancers displayed a less diverse microbiome and a distinct bacterial composition profile, including in abundance of Firmicutes (see below). Breast cancer patients with high (≥ 46%) TBF and earlier age at menarche (≤ 11) also had a less diverse gut microbiome. Abundance of Firmicutes was significantly lower among women with HER2+ breast cancer and early menarche than those with HER2− breast cancer and later menarche. Before we interpret these new results, we discuss our results on body size comparisons and tumor grade and stage in relation to published findings. Alpha diversity measures have been used as a hallmark of health habits including adherence to Mediterranean diets [23][24][25] and body composition [26]. Lower gut alpha diversity has been associated with human obesity in a meta-analysis, showing significant relationships between obesity and microbial richness, evenness, and diversity [26]. Chao1 index and OTU were 31% (p = 0.011) and 14% (p = 0.059) lower among women with > 46% TBF compared to those with ≤ 46% TBF; similar but weaker patterns were observed by BMI (Table 2). Associations between various bacterial groups and BMI have been reported but a consistent taxonomic signature of obesity has not been identified [27,28]. Women in this study with higher BMI or higher TBF displayed higher abundance of Firmicutes (f_Clostridiaceae). Additionally, those with higher BMI displayed higher abundance of g_Akkermansia; enrichment of this taxon has been related to body composition in other studies [29][30][31]. Several sub-taxa within Firmicutes (g_Streptococcus) associated with lower BMI [28,31,32] also appeared to differ by BMI in this study. 
However, the small number of women with BMI < 25 kg/m² (n = 9) may have limited our ability to identify other taxa that have been associated with lean/normal BMI (e.g., f_Christensenellaceae; g_Oscillospira) [23,33,34]. Interestingly, breast cancer patients without regular physical activity also showed lower Chao1 index (p = 0.07) and tended to have lower abundance of several taxa of Firmicutes (f_Clostridiaceae), in support of growing evidence that exercise favorably influences the function and composition of the gut microbiome [28]. Our findings on taxa differences by breast cancer grade and stage add to results from one previous study of mostly low grade (77% were grade I/II) and low stage (59% stage 0/I) breast cancers [7]. A higher abundance of g_Clostridium was found among those with higher tumor grade or stage in this study, similar to the finding on abundance of the Clostridium coccoides cluster in the previous study [7]. Moreover, women with higher grade or higher stage breast cancers also displayed higher abundance of f_Veillonella but lower abundance of f_Erysipelotrichaceae, which has been related to inflammation-related conditions [36]. The significance of our finding of high abundance of taxa in p_Proteobacteria (g_Haemophilus, f_Enterobacteriaceae) among those with higher tumor stage is not clear, but it is intriguing that g_Haemophilus appeared to be over-represented among individuals with impaired glucose regulation [36].
Reasons for the lower alpha diversity among women with HER2+ compared to those with HER2− breast cancer are not known. Menarche age, parity, BMI, and TBF did not differ by HER2 status. It is intriguing that women with HER2+ compared to those with HER2− breast cancer displayed lower abundance of select genera of Firmicutes (e.g., g_Clostridium, g_Blautia, g_Coprococcus, g_Ruminococcus, g_SMB53) and higher abundance of select genera of p_Bacteroidetes; thus a deficit of taxa that have often been linked with healthy body composition, body leanness and healthy metabolic profile [37,38]. Lower weight gain has been associated with taxa of the Ruminococcaceae family in studies of twins [27]. Another novel finding is that earlier menarche age was associated with lower alpha diversity; these findings were statistically significant for OTU and Chao1 index. Age at menarche is likely a marker of earlier life diet and nutrition [39]. Earlier age at menarche has been found to have a lasting effect [40], conferring higher circulating estradiol levels for those who started to menstruate at ages 11 or younger than at age 14 or older (p = 0.033) [41]. High gut microbial diversity has been associated with a profile of estrogen metabolites associated with reduced breast cancer risk [42]. Levels of urinary estrogen metabolites have been correlated with relative abundances of specific Clostridia taxa [42,43]. There are likely bidirectional influences between sex steroids and the gut microbiome. Various bacterial genes have been found to affect β-glucuronidase enzymatic activity, influencing deconjugation and reabsorption of estrogens. Levels of circulating estrogen, in turn, may influence the abundance of certain bacteria species [42][43][44][45][46][47].
Table 5 Mean estimate ratios (MER) (a) obtained by zero-inflated negative binomial model of taxa abundances by age group (b), menarche age and parity (c). (a) MER > 1 means higher taxa abundance in women aged < 50, early menarche age (≤ 11), nulliparous, high BMI (≥ 25), high TBF (> 46%) than age 50+, later menarche (≥ 12), parous, low BMI, and low TBF, respectively. (b) Adjustment for total counts and race/ethnicity. (c) Adjustment for total counts, age and race/ethnicity in analysis on age at menarche and parity (model 2, MER).
Strengths of this pilot study include our collection of detailed information on relevant breast cancer risk factors and tumor characteristics and our consideration of them in this analysis using two complementary methods: the Wilcoxon rank sum test and a zero-inflated NBR model with adjustment for select covariates. This study included mostly Hispanics in the catchment area of USC. However, we are limited by our cross-sectional analyses and modest sample size, so that we used only two categories in our comparisons of taxa differences by age at menarche, parity, physical activity, BMI and TBF%. Breastfeeding, a parity-related factor that has emerged as an important modifiable lifestyle factor for breast cancer, was not assessed in our study. Research regarding the association of specific microbiome taxa with disease or other conditions inherently involves studying the relationships of numerous taxa with multiple conditions, thus greatly increasing the possibility of type I errors. On the other hand, small sample sizes preclude the recognition of any but the strongest associations when very small alpha levels are used for statistical significance. Even with our conservative α-level of 0.001 we found far more statistically significant results than would be expected by chance alone, particularly with respect to HER2, grade, and age at menarche. Although some of these findings may be chance findings, while other important associations may have been missed due to the small alpha used, we feel that we have struck a reasonable balance, and that these findings are informative and warrant further consideration.
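The statement that more significant results were found than chance alone would produce can be put in rough numerical terms. The sketch below assumes that the 201 genera are tested once per factor, that the tests are independent, and that every null hypothesis is true; these are simplifying assumptions for illustration, not properties established by the study. The function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of one or more false positives, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_genera = 201   # genera present in at least 25% of samples (from the Methods)
alpha = 0.001    # significance threshold used in this study

# About 0.2 chance findings expected per factor under these assumptions.
print(expected_false_positives(n_genera, alpha))
print(prob_at_least_one_false_positive(n_genera, alpha))
```

Under these assumptions, a factor that yields several significant taxa at P ≤ 0.001 exceeds what chance alone would be expected to generate, although correlated taxa and additional family-level tests would change the exact figures.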

Conclusions
In summary, this pilot cross-sectional study of mostly Hispanic women found that HER2 status and age at menarche had significant associations with gut microbiome alpha diversity measures and specific microbial composition. These findings warrant confirmation in studies with larger sample sizes of diverse racial/ethnic groups and with repeated sample collections to determine how the microbiome is associated with breast cancer subtypes and specific risk factors.
Table 6 Mean estimate ratios (MER) (a) obtained by zero-inflated negative binomial model of taxa abundances by BMI, total body fat, and physical activity with adjustment for total counts, age and race/ethnicity (model 2, MER). (a) MER > 1 means higher taxa abundance in high BMI (≥ 25), high TBF (> 46%), and no regular physical activity than in low BMI, low TBF, and regular physical activity, respectively.