Introduction

More than 700 bacterial species and a range of other microorganisms (archaea, fungi, and viruses) colonize the human oral cavity, known collectively as the oral microbiome1,2. Oral microbiota are closely tied to oral diseases, such as periodontitis and dental caries3, and potentially to systemic diseases, including diabetes4,5, cardiovascular disease6,7, and several types of cancer8,9,10,11. The salivary microbiome is increasingly preferred in studies investigating the relationship between microbiome and health, due to its non-invasive accessibility, temporal stability12, and its interactions with external factors and with other microbiomes (e.g. dental plaques, gut)13,14. However, less is known about external modifiable dietary factors associated with salivary microbiome composition.

We hypothesized that frequent consumption of high-sugar beverages (HSB), such as fruit juices and carbonated beverages, may impact the salivary microbiota. Intake of HSBs, containing high amounts of acids and fermentable carbohydrates, contributes to caries development by facilitating fermentation and selection of bacteria that thrive in a low-pH environment15,16. Sugars, particularly sucrose, disrupt the balance of oral microbial systems by enhancing the proportion of acid-producing bacteria and reducing alkali-producing bacteria in oral biofilms17. Cariogenic microorganisms in the oral microbiome not only produce more acid when metabolizing fermentable sugars18 but are more likely to survive in these acidic conditions compared to beneficial oral microbiota19. Salivary microbiota assemblages may be affected by variations in biofilm bacterial composition and the subsequently altered structure of tooth surfaces on which oral bacteria are attached. Additionally, HSB consumption elevates salivary glucose concentration, and acids found in fruit juice and carbonated beverages can decrease the pH value after consumption20,21, both of which impact global salivary microbiome diversity and certain bacterial phylotypes22,23. Finally, frequent HSB consumption may lead to inflammatory processes24,25,26,27 that can reshape the salivary microbiome28. While oral dysbiosis is likely influenced by dietary factors29,30,31, thus far there have been no examinations of the impact of HSB consumption on overall oral microbiome composition, and few studies have investigated the relationships of broader dietary patterns with the oral microbiome, with none finding significant associations32,33.

We tested the association of HSB consumption with the oral microbiome in 989 adults from two large U.S. cohorts, the American Cancer Society (ACS) Cancer Prevention Study-II (CPS-II) Nutrition cohort34 and the National Cancer Institute (NCI) Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial cohort35. The oral microbiome was characterized by bacterial 16S rRNA gene sequencing. Comparisons of overall community structure and taxonomic abundance were conducted across HSB intake groups. Sensitivity analysis was conducted to examine potential interaction or confounding effects of diabetes and specific bacterial pathogens on overall microbiome-HSB association.

Methods

Study population

Participants were drawn from the ACS CPS-II and NCI PLCO cohort studies. Oral wash samples were collected by mail from 70,004 CPS-II Nutrition cohort participants between 2000 and 2002 and in the PLCO control arm (n = 77,445) at recruitment from 1993 to 2001. As previously described36, subjects included in the present analyses were originally selected from the CPS-II and PLCO cohorts as cases (i.e. subjects who subsequently developed cancer) or controls (i.e. subjects who subsequently did not develop cancer) for collaborative nested case–control studies of the oral microbiome in relation to head and neck cancer37 and pancreatic cancer9. The study participants were all healthy at the time of sample collection. After excluding participants with implausible total energy intake (greater than 3,500 kilocalorie (kcal) or less than 500 kcal per day, n = 149) and participants without information on body mass index (BMI, n = 26), 436 participants from CPS-II (n = 157 from the head and neck study and n = 279 from the pancreas study) and 553 participants from PLCO (n = 218 from the head and neck study and n = 335 from the pancreas study) were included in the current study (Supplementary Fig. 1). All participants provided informed consent, all protocols were approved by the New York University School of Medicine Institutional Review Board (IRB), and all research was conducted in accordance with the relevant IRB guidelines and regulations.

Assessment of HSB consumption and covariates

Dietary intake at baseline and follow-up was assessed in the PLCO cohort using a self-administered food frequency questionnaire (FFQ) that assessed usual dietary intake over the past year. This questionnaire has been validated in different studies38,40,40 and was successfully used to examine dietary intake and risk of cancer in pooled cohort studies41,43,43. We considered “orange juice or grapefruit juice,” “other 100% fruit juice or 100% fruit juice mixtures,” “other fruit drinks (such as cranberry cocktail, Hi-C, lemonade, or KoolAid, diet or regular),” and “soft drinks, soda, or pop” as HSBs for this study. CPS-II cohort participants completed a validated Willett FFQ44,46,47,47 in 1999 that assessed consumption of three types of regular (not sugar-free) carbonated beverages: cola-type, other caffeine-containing (e.g. Mt. Dew), and “other” (e.g. 7 Up). Participants were also asked about consumption of punch/lemonade/sugared iced tea. Frequency categories ranged from never to 4 or more servings per day. Fruit juices queried included apple juice or cider, orange juice, grapefruit juice, and other fruit juices; frequency categories ranged from never to 2 or more servings per day. The HSBs included in this study may contain a variety of sugar types, including sucrose- and fructose-based beverages, but may also include other natural and artificial sugar constituents. Comprehensive demographic and lifestyle information was collected by baseline questionnaires in both cohorts.

Oral microbiota characterization using 16S rRNA gene amplification and sequencing

Participants in both cohorts were asked to swish vigorously with 10 mL Scope mouthwash (Proctor & Gamble; P&G, Cincinnati, OH) for 30 s, and then to expectorate into a specimen tube. This sample collection method results in similar oral microbiome profiles to that of fresh frozen unstimulated saliva, collected by allowing saliva to accumulate on the floor of the mouth48. Samples were shipped to each cohort’s biorepository, pelleted by centrifugation, resuspended, and aliquoted for storage at −80 °C until use35,49. Bacterial genomic DNA was extracted from the samples using the MoBio PowerSoil DNA Isolation Kit (Carlsbad, CA) which lyses cells for DNA extraction using mechanical (bead-based homogenization) and proprietary chemical methods that can detect both Gram-positive and Gram-negative bacteria and actinomycetes. As reported previously50, 16S rRNA gene sequencing on the extracted DNA was performed. 16S rRNA amplicon libraries were generated using primers incorporating FLX Titanium adapters and a sample barcode sequence, allowing unidirectional sequencing covering variable regions V3 to V4 (Primers: 347F- 5′GGAGGCAGCAGTAAGGAAT-3′ and 803R- 5′CTACCGGGGTATCTAATCC-3′). Five ng genomic DNA was used as the template in 25 μL PCR reaction buffer for 16S rRNA amplicon preparation. Cycling conditions were one cycle of 94 °C for 3 min, followed by 25 cycles of 94 °C for 15 s, 52 °C 45 s, and 72 °C for 1 min, followed by a final extension of 72 °C for 8 min. The generated amplicons were then purified using the Agencourt AMPure XP kit (Beckman Coulter, CA). Purified amplicons were quantified by fluorometry using the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen, CA). Equimolar amounts (107 molecules/μL) of purified amplicons were pooled for sequencing. Pyrosequencing (Roche 454 GS FLX Titanium) was carried out according to the manufacturer’s instructions51.

Upstream sequence analysis of microbiome data

16S rRNA gene amplicon sequences were processed and analyzed using the QIIME pipeline52. Multiplexed libraries were demultiplexed based on the barcodes assigned to each sample. Poor-quality sequences were excluded using the default parameters of the QIIME script split_libraries.py (minimum average quality score = 25, minimum/maximum sequence length = 200/1000 base pairs, no ambiguous base calls, and no mismatches allowed in the primer sequence). Filtered sequence reads were clustered into operational taxonomic units (OTUs) and subsequently assigned to taxa by using the Human Oral Microbiome Database (HOMD) pre-defined taxonomy map of reference sequences with ≥ 97% identity53. From 989 pre-diagnostic oral wash samples, we obtained 9,117,268 high-quality 16S rRNA gene sequence reads (mean 9,219 [SD 2557] sequences per sample), with a similar number of reads in all cohorts36. Confidence scores for the assigned taxonomic identities were calculated by QIIME using the Ribosomal Database Project (RDP) naïve Bayesian classification method54, with a score of 1 representing the highest possible confidence. Taxa with confidence scores ≥ 0.8 are presented in Supplementary Table 1. A summary of sequence reads per sample that were assigned to the HOMD reference is shown in Supplementary Table 2.

Quality control

Blinded positive quality control (QC) subject specimens were used across all sequencing batches. We previously reported good agreement of microbiome parameters in replicates from these QC subjects (coefficient of variability across the four cohorts ranged from 0.45% to 8.28% for Shannon Index; 6.29% to 26% for relative abundances across various phyla)36. Negative control samples (with Scope mouthwash only) were used to detect possible reagent and environment contamination in all sequencing batches as well. No DNA was detected from the negative control samples. Comparing genus-level taxa detection between metataxonomics (the 16 s rRNA sequencing used in the present analysis) and metagenomics (whole-metagenome shotgun sequencing)55 for n = 197 ACS samples and n = 257 PLCO samples demonstrated high correlation between the two sequencing methods: Spearman correlation coefficient (mean [min–max]) for ACS cohort: (0.815 [0.319–0.983]) and PLCO cohort: (0.843 [0.228–0.998]).

Statistical analysis

HSB consumption

For individuals in both cohorts, HSB consumption was defined using a standard serving size (1 serving = 12 oz or 355 g), with usual weekly consumption characterized as non-drinkers (0 servings/week), low (< 1 serving/week), medium (1–3 servings/week), or high (> 3 servings/week). To assess potential between-cohort differences in consumption, we additionally computed total HSB consumption for each cohort by summing HSB intake as grams per day from each HSB-related beverage.

Community composition of microbiota

Α-diversity (within-subject diversity) was assessed using species richness and the Inverse Simpson’s index, which were calculated in 500 iterations of rarefied OTU tables of 2,707 sequence reads per sample, the lowest sequencing depth among the samples. We modeled richness and Inverse Simpson’s index as outcomes in mixed-effect linear regression models accounting for the differences in HSB consumption and correlations with the outcomes across cohorts56, allowing the association with HSB and the predictor variables (age, sex, race, BMI, cigarette smoking status [never, former, current], alcohol consumption status [never, ever], total caloric intake [kcal], and history of diabetes) to vary across cohorts. β-diversity (between-subject diversity) was assessed by using unweighted and weighted UniFrac phylogenetic distance matrices accounting for both presence or absence of observed OTUs and their relative abundance, respectively57. We performed Permutational Multivariate Analysis of Variance (PERMANOVA; adonis function, vegan package, R)58 and Principal Coordinate Analysis (PCoA) to examine statistically and visually whether bacterial community composition differed by HSB intake levels. Pair-wise comparisons among intake levels for each of the first three coordinates in PCoA were conducted using the Kruskal–Wallis post hoc test (Dunn's test). PERMANOVA models considered the random effect of cohort by constraining permutations within each study stratum (CPS-II and PLCO) and all models adjusted for age, sex, race, BMI, cigarette smoking status, alcohol consumption status, total caloric intake, and history of diabetes.

Carriage and abundance of OTUs

OTUs were classified into 10 phyla, 24 classes, 41 orders, 72 families, 150 genera, and 496 species, according to their alignment with the HOMD reference database. In the analysis of presence/absence (carriage), we included taxa carried by at least 5% and not greater than 95% of participants, leaving 4 phyla, 11 classes, 19 orders, 35 families, 73 genera, and 271 species. Logistic regression models were used to examine the difference in carriage rate of taxa by intake levels, separately in the CPS-II and PLCO cohorts. We calculated nominal p-values for the CPS-II and PLCO cohorts and also report meta-analysis p-values based on Z-score methods59. In the analysis of abundance, taxa (phylum to species) with greater than 2 sequence reads in at least 100 participants were included, resulting in 8 phyla, 14 classes, 20 orders, 33 families, 55 genera, and 176 species. We used the 'DESeq' function within the DESeq2 package60 in R to fit a negative binomial generalized linear model to test for differentially abundant taxa by HSB intake level at each taxonomic level. This function models raw counts using a negative binomial distribution, which is appropriate when analyzing zero-inflated or over-dispersed counts such as microbiome taxa abundance61, and adjusts internally for “size factors” which normalize for differences in sequencing depth between samples. We calculated nominal p-values for the CPS-II and PLCO cohorts and also report meta-analysis p-values based on Fisher’s method62. In both the carriage and abundance analysis, HSB intake was treated as continuous by assigning the numbers 0, 1, 2, 3 to non-drinkers, and low-, moderate-, and high-level consumers, respectively. All models were adjusted for the above-mentioned covariates. We considered taxa with individual nominal p-values less than 0.10 in both cohorts and meta-p-values less than 0.05 as significant.

Sensitivity analysis

To examine if oral health status has a confounding effect on the observed HSB-microbiome associations, we additionally adjusted models for abundance of Streptococcus mutans (a surrogate marker for dental caries63,64) and carriage of Porphyromonas gingivalis (a surrogate marker for periodontal disease65). We used these surrogate markers of oral disease as we lacked information on oral health conditions in our study. We conducted several stratified analyses according to history of diabetes, smoking status, median abundance of S. mutans, and carriage of P. gingivalis. We also performed case-only analyses (separately by cohort for each cancer type) to see if the observed associations may be driven by subclinical or undetectable disease in either population. Chi-square test of Cochran’s Q statistic was used to examine heterogeneity across strata in the sensitivity analyses.

OTU correlation network

Spearman’s correlation of carriage rate and abundance was used to assess relationships among OTUs that were associated with HSB intake, as well as the bacterial markers for caries and periodontal disease. OTU counts were normalized for DESeq2 size factors, to account for differences in library size in a manner consistent with our differential abundance analysis, before correlation analysis. Correlation coefficients with magnitude ≥ 0.30 were selected for visualization using the “igraph” package in R. Partial correlation test was used to examine if the selected correlations were statistically significant after controlling for covariates, and a false discovery rate (FDR)-adjusted p-value < 0.05 was considered statistically significant after multiple comparison adjustment. All statistical tests were two-sided, and all statistical analyses were carried out using R version 3.4.0.

Ethical approval and consent to participate

Written informed consent was obtained from all study participants, and all protocols were conducted in accordance with the U.S. Common Rule and approved by the New York University Grossman School of Medicine Institutional Review Board.

Results

Demographic characteristics of participants by level of HSB intake are shown in Table 1. Of the 989 participants in this study, 130 (29.8%) in CPS-II and 246 (44.5%) in PLCO were non-consumers of HSBs. Participants in the highest HSB consumer group (> 3 servings/week) in CPS-II and PLCO drank, respectively, 336 and 398 g per day on average, which is greater than 1 12 oz. can of soda or juice per day. Participants were predominantly white and above middle-age. Greater HSB intake was associated with male gender, history of smoking, no history of diabetes, and higher overall caloric intake in both cohorts, and higher BMI in CPS-II only.

Table 1 Demographic characteristics of the study participants by levels of high-sugar beverage consumption.

We first investigated how the overall community composition of oral microbiota varied according to HSB intake. When assessing α-diversity, we found that species richness decreased with higher HSB intake (Fig. 1a,c, p trend = 0.032), while evenness, measured by the inverse Simpson’s index, was not associated with HSB (Fig. 1b,d). When assessing β-diversity based on the unweighted UniFrac distance, which measures the presence or absence of bacterial lineages to define community composition, PCoA revealed that those with the highest HSB intake (Fig. 2a, large red triangle) separated from the rest on the first PCoA axis (Fig. 2a, non-drinkers, < 1/week, and 1–3/week; large gray inverted triangle, yellow square, and purple circle, respectively), and this separation was statistically significant (Fig. 2c, p = 0.003, 0.003, and 0.019 for high-, low-, and moderate-levels of intake, respectively, compared to non-drinkers, from Kruskal–Wallis post hoc test). This difference remained in PERMANOVA after controlling for covariates (p = 0.040 for high intake vs. non-consumer) (Supplementary Table 3). PCoA results were also similar when assessing Jaccard and Bray Curtis differences (data not shown). When assessing β-diversity based on the weighted UniFrac distance, which detects differences in relative taxa abundance, all HSB-consuming groups clustered together in PCoA (Fig. 2b), and showed no difference in the first three principal PCoA coordinates (Fig. 2d) or in PERMANOVA analysis (all p from post hoc test and PERMANOVA > 0.05).

Figure 1
figure 1

Richness and evenness of the oral microbiome by high-sugar beverage intake. (a,b) Violin plots of (a) number of observed OTUs (richness) and (b) inverse Simpson’s Index (evenness) by high-sugar beverage intake. These indices were calculated for 500 iterations of rarefied OTU tables of 2,027 sequence reads per sample, and the average over the iterations was taken for each participant. Plotted are median, interquartile ranges, and the probability density of the indices at different values. Mean values of the richness in non-drinker, low (< 1 can/week), moderate (1–3 cans/week), and high (> 3 cans/week) intake groups were 102.7, 106.0, 102.7, and 97.2; p = 0.51, 0.41, and 0.027 for each intake level compared to non-drinkers, and p = 0.032 for the trend test in linear regression model. Mean values of the inverse Simpson's Index in each group were 10.1, 10.5, 10.5, and 10.6; inverse Simpson’s index did not differ significantly by high-sugar beverage intake. (c,d) Rarefaction curves of (c) number of observed OTUs and (d) inverse Simpson’s Index according to the number of reads per sample, by high-sugar beverage intake group.

Figure 2
figure 2

Principal Coordinate Analysis (PCoA) showing β-diversity of oral bacterial communities by high-sugar beverage intake. (a-b) PCoA plots using (a) unweighted and (b) weighted UniFrac phylogenetic distance matrices in all study participants. The community structures in non-drinkers and low- (< 1 can/week), moderate- (1–3 cans/week), and high- (> 3 cans/week) high-sugar beverage intake groups are depicted using different colors. Larger filled shapes indicate centroids for each group. (c,d) Barplots of the mean of the first three coordinates of PCoA by high-sugar beverage intake, using (c) unweighted and (d) weighted UniFrac phylogenetic distance matrices. In Adonis analysis, p-values for each high-sugar beverage intake group compared to non-drinkers were 0.10, 0.83, 0.44, and 0.040 using unweighted UniFrac (Kruskal–Wallis p-values = 0.003, 0.003, and 0.019, respectively), and 0.86, 0.18, 0.91, and 0.94 using weighted UniFrac (Kruskal–Wallis p-values n.s.). One star (*) indicates p < 0.05 in the Kruskal–Wallis post-hoc test (Dunn's test).

In the taxon-level analysis by logistic regression (Table 2), we found that higher HSB intake was associated with an increase in relative abundance of several taxa (measured as the carriage rate percentage, calculated as the number of participants who carried the taxon divided by total number of participants in each drinking level). Specifically, we observed more frequent carriage of family Bifidobacteriaceae (meta-p = 0.017) and two species: Lactobacillus rhamnosus (meta-p = 0.002) and Streptococcus tigurinus (meta-p = 0.0001). Carriage of genera Lachnospiraceae_[G-2] (meta-p = 0.003) and Peptostreptococcaceae_[XI][G-1] (meta-p = 0.001), and species Lachnospiraceae_[G-2] sp. (meta-p = 0.003) and Alloprevotella rava (meta-p = 0.008) were less abundant with higher HSB intake, and these taxa were correlated in network analysis (all FDR adjusted p-values from partial correlation test < 0.05, Fig. 3a). Other species with depleted carriage in HSB consumers include Capnocytophaga sp., Mycoplasma faucium, Leptotrichia sp., and Campylobacter showae.

Table 2 Taxa* related to high-sugar beverage intake: carriage analysis.
Figure 3
figure 3

Co-occurrence and correlation networks in the oral bacterial community. (a,b) Co-occurrence and correlation network plots using (a) carriage rate and (b) abundance of taxa shown in Tables 23, as well as bacterial makers for caries and periodontitis. The nodes represent taxa, and edges with Spearman’s correlation coefficient |r|> 0.3 using all subjects are shown. Blue nodes represent taxa differentially carried by high-sugar beverage intake (Table 2); yellow nodes represent taxa differentially abundant by high-sugar beverage intake (Table 3); and purple nodes represent S. mutans and P. gingivalis, bacterial makers of caries and periodontitis. The thickness of edges corresponds to the coefficient values.

In the linear analysis of oral taxa abundance according to levels of HSB intake (Table 3), we found that order Fusobacteriales (meta-p = 0.038) and its major genus Leptotrichia (meta-p = 0.025), as well as Lachnoanaerobaculum (meta-p = 0.01) and its species L. saburreum (meta-p = 0.001), and Campylobacter (meta-p = 0.025) were less abundant with higher HSB intake. Network analysis showed that abundances of these taxa were also highly correlated (all FDR adjusted p-values from partial correlation test < 0.05, Fig. 3b).

Table 3 Taxa* related to high-sugar beverage intake: abundance analysis.

As diabetes and the oral diseases of dental caries and periodontitis are related to both HSB use and oral microbiome composition, we examined whether the observed associations of the oral microbiome and HSB intake were influenced by diabetes status (yes/no) and by oral bacterial markers for dental caries (S. mutans) and periodontitis (P. gingivalis) (Supplementary Tables 4–5). The finding of lower richness with higher HSB intake remained after further adjustment for abundance of S. mutans and carriage of P. gingivalis (p = 0.024 and 0.017, respectively) and history of diabetes (p = 0.090). No evidence of heterogeneity across strata was observed in stratified analyses for history of diabetes, smoking status, and markers of dental caries or periodontitis, and magnitude of the associations were attenuated in the case-only analyses. In taxon-based analysis, further adjustment for abundance of S. mutans and carriage of P. gingivalis did not affect the results shown in Tables 2, 3 (data not shown).

Discussion

In this study we observed, for the first time, that greater HSB intake is associated with lower bacterial richness and altered bacterial composition in the saliva. Some acidogenic bacteria were overrepresented in those with higher HSB intake, while certain commensal bacteria were significantly lower. These findings were consistent in two independent cohorts. Additionally, these associations were independent of history of diabetes and microbial markers of caries and periodontitis, suggesting that the associations are not mediated by diabetic and oral health conditions. The strength of the associations decreased in the case-only analyses, suggesting that the findings are not driven by underlying cancer risk factors.

We observed that greater HSB intake was associated with decreased richness of the salivary microbiota. A more diverse bacterial community typically results in higher stability and an enhanced capacity to respond to changes in the environment, while decreased diversity is often associated with disease states5,66,67,68,69,70. The lower richness we observed with HSB intake may result from a direct impact of HSBs on the oral microbiota, or simply reflect the poor oral health conditions in HSB consumers. Poor oral health conditions, including higher plaque index, presence of decayed teeth, and deeper periodontal pockets could be related to both HSB consumption and salivary microbial diversity71,73,73. However, sensitivity analyses using surrogate bacterial markers of oral disease showed no significant confounding effect.

We also identified certain phylotypes overrepresented in those with higher HSB intake. The increase of Bifidobacteriaceae and L. rhamnosus, taxa with high acidogenic capacity64,74,75,76, may result from ecological changes in the mouth due to HSB consumption. In addition, a member of the non-mutans S. mitis group, S. tigurinus, was increased in consumers. This species was recently isolated from a periodontitis patient77 and is a suspected component of pathogenic biofilms78. In contrast, the abundance of S. mutans, which is specifically linked to dental caries64, was not related to HSB intake in our study. This suggests that other dietary factors, including other sources of dietary sugars and carbohydrates, may play a larger role in promoting S. mutans growth64. This result is also consistent with the current evidence that, while salivary S. mutans is strongly associated with caries prevalence63, other bacteria able to produce substantial amounts of acid from fermentable carbohydrates may also contribute to caries development79.

Several commensal bacteria were underrepresented in those with higher HSB intake. Commensal bacteria play important roles in the homeostasis of microbiome and host, and their depletion may disturb the innate immunity of gingival tissue and lead to subsequent health conditions80. For example, butyric acid-producing Lachnospiraceae is important for both microbial and host epithelial cell growth81, and oral Fusobacteria and its major genus Leptotrichia are associated with decreased risk of pancreatic cancer9. We also identified two positive co-occurrence and correlation networks among some of the HSB-related phylotypes. The underlying mechanisms for this may include nutritional cross-feeding, co-aggregation, co-colonization, signaling pathways, and co-survival in similar environments82,84,85,85. Though confirmation of these bacterial networks is needed, our results provide some insights into potential probiotics for oral health, particularly targeting those with high HSB intake. Lactobacilli and Bifidobacteria are common intestinal probiotics and have been considered as potential probiotics for oral health as well, due to negligible pathogenicity, lack of toxic fermentation products, production of antimicrobial compounds, and stimulation of immunity86. However, the acidogenic properties of Lactobacilli and Bifidobacteria can lead to caries87,89,90,90, highlighting the complex bacterial interactions which may be responsible for oral disease. In this study, higher HSB intake was related to greater carriage of L. rhamnosus and Bifidobacteriaceae; these taxa were unrelated to other HSB-depleted commensal phylotypes, while the abundance of Bifodobacteriaceae was positively correlated with S. mutans in network analysis. These findings indicate that Lactobacilli and Bifidobacteria may not be good candidates for oral system probiotics91.

Our study has several strengths. First, the use of 16S rRNA gene sequencing for microbiome analysis allowed us to comprehensively assess overall oral bacterial community composition and specific taxon abundances. Second, our very large sample size provided excellent statistical power to detect variation in the oral microbiome with respect to HSB intake in two independent cohorts and allowed us to confirm our findings across sub-groups of interest. Finally, the detailed demographic and lifestyle information allowed us to adjust for potential confounding factors. A limitation of our study is that it is observational and cross-sectional, limiting the ability to establish a causal relationship. A longitudinal trial where those with moderate or high HSB intake were randomized to continue or stop HSB consumption could provide more definitive information. Further, the majority of study participants were White and above middle-age, limiting the generalizability of our findings to other races and age groups. Because the samples were collected by the participants outside of a controlled clinical environment, we do not have information on the specific conditions under which they were collected, such as the food or medications consumed that day or what time of day they were collected. We also lacked information on the oral health status of the study participants, though sensitivity analysis using bacterial markers as proxies of dental caries (dental decay) and periodontitis (gingival disease) suggest that the observed HSB-microbiota associations are independent of oral diseases. Using bacterial surrogates of two distinct oral health indicators affecting different tissues provides orthogonal evidence supporting this claim92. Given the evolving landscape of multi-omics technologies, future analyses using more advanced sequencing techniques are needed to validate these findings. In particular, confirming the species-level results using greater sequencing depth and increased sequence homology with the HOMD reference database (e.g. ≥ 98.5%), will be important, as not all species in our analysis met the 0.8 confidence threshold for taxa assignment. Lastly, 16S rRNA gene sequencing data alone limits our ability to understand the functional activities of the salivary microbial community related to HSB intake.

Conclusions

We found that HSB intake is related to overall salivary microbiome community composition and the abundance of specific oral taxa. Greater HSB intake was associated with greater prevalence of acidogenic bacteria and depletion of commensal bacteria. Such changes may potentially contribute to HSB-related diseases, including caries, periodontitis, oral cancer, and diabetes. The taxa we have identified can be further investigated to elucidate their potential role in HSB-related health conditions. Future studies should also investigate the impact of HSB intake on the metagenomic (functional) content of the oral microbiome. Improved understanding of the causes and health impacts of HSB consumption and oral microbiome composition can lead to diet and microbiome-targeted approaches for disease prevention, such as HSB substitution and probiotics for oral health.