Introduction

Gastric cancer (GC) is the third leading cause of cancer-related death worldwide, accounting for more than 720,000 deaths annually [1]. It is generally believed that GC develops via a multistep progression cascade from superficial gastritis (SG), atrophic gastritis, intestinal metaplasia (IM), dysplasia and subsequently to cancer. This cascade of pathological changes in gastric carcinogenesis, called Correa’s cascade, is often initiated by Helicobacter pylori (H. pylori) infection [2]. Although more than half of the global population is infected by H. pylori, only 1–2% of infected individuals develop GC [3]. It is believed that additional factors also contribute to susceptibility to GC, including pathogenicity of H. pylori strains, duration of infection, host genetic polymorphisms and environmental factors such as diet [4]. Similar to the influence of intestinal microbiota on human health, the microbial residents in the stomach also likely contribute to gastric immuno-biology and possibly gastric diseases [5].

Recent advances in high-throughput sequencing of the conserved 16S ribosomal RNA gene, together with newly developed computational methods, have uncovered a complex and distinct bacterial community that inhabits both the gastric mucosa (GM) and gastric fluid (GF) in addition to H. pylori, including members of the phyla Firmicutes, Proteobacteria and Bacteroidetes [6]. It remains unclear whether the presence of H. pylori shapes the microbiota composition in GF as it does in GM [7]. Additionally, some studies provide evidence of shifts in gastric bacteria from precancerous lesions to GC and highlight the potential involvement of microbes other than H. pylori in gastric carcinogenesis [8, 9]. However, studies that profile the GF microbiota and its association with the GM microbiota in gastric tumorigenesis remain scarce.

Here we performed 16S ribosomal RNA (rRNA) gene sequencing on 414 gastric samples, including GM and GF, from 180 patients with progressive histological stages (SG, IM and GC) during gastric tumorigenesis. Compositional alterations of the microbiota were observed not only in GM but also in GF from SG through IM to GC, and the predictive value of bacterial taxa as markers for GC at both sites was explored. We also performed paired comparisons of the community differences between GM and GF along gastric tumorigenesis. Ultimately, this study provides a better understanding of the global ecological changes in GC development and helps to define the potential crosstalk between GF and GM microbiota in oncogenesis.

Materials and methods

Patients and sample collection

A total of 318 gastric biopsy tissues were retrospectively sampled from 179 patients, including 61 with SG, 54 with IM and 64 with GC, at the Department of Gastroenterology, the First Affiliated Hospital of Nanchang University, Jiangxi, China. In patients with SG and IM, samples were obtained from the antrum, while in patients with GC, biopsies were obtained from cancer lesions and adjacent non-cancerous tissues. Meanwhile, 96 gastric fluid samples were aspirated from the same cohort, including 42 SG, 26 IM and 28 GC. The demographic characteristics of the patients are shown in Supplement Table 1. Exclusion criteria were as follows: age under 40 years; the presence of a serious illness, such as severe cardiopulmonary, renal, or metabolic disease; use of antibiotics, acid blockers (proton pump inhibitors and H2 receptor antagonists), anti-inflammatory agents (aspirin, nonsteroidal anti-inflammatory drugs, and steroids), or probiotics within the past month; prior history of any surgical gastric resection; and refusal of consent to the study. This study was approved by the institutional review board of the First Affiliated Hospital of Nanchang University (2016-034). Both mucosa and fluid samples were collected during endoscopy and frozen immediately at −80 °C. An antrum biopsy from each patient was tested for the presence of H. pylori by immunohistochemistry (Helicobacter pylori antibody, TALENT BIOMEDICAL, China), the sensitivity and specificity of which were validated in a previous study [10]. Details of the study patients are provided in Supplement Table 2.

Measurement of pH and nitrite in gastric fluid

After the endoscope entered the stomach, a double-lumen sphincterotome (CleverCut, Olympus) was passed through the suction biopsy channel of the endoscope, and approximately 5 ml of gastric juice was gently aspirated from the gastric fundus through the inner sterile catheter. The pH values of the samples were determined using a glass pH electrode (PH5S-E, SANXIN, Shanghai, China).

The concentrations of nitrite in gastric aspirate were analyzed according to a previous study [11]. Briefly, nitrite was determined spectrophotometrically by diazotization of sulphanilic acid followed by coupling to N-(1-naphthyl)-ethylenediamine, after removal of interfering substances on an anion exchange column.

DNA extraction and 16S rRNA gene sequencing

Total DNA was extracted using the QIAGEN DNeasy Kit (QIAGEN, California, USA). The concentration and integrity of the DNA were assessed using a Nanodrop (2000c) (Thermo Scientific, USA) and agarose gel electrophoresis, respectively. PCR-based library preparation targeting variable region 4 (V4) of the 16S rRNA gene was performed using the following primer pair: 515F (5ʹ-GTGCCAGCMGCCGCGGTAA-3ʹ), 806R (5ʹ-GGACTACHVGGGTWTCTAAT-3ʹ). The PCR products were then purified and sequenced on an Illumina MiSeq instrument at BGI (Shenzhen, China) using 250 bp paired-end (PE) sequencing.
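Both primers contain degenerate positions written in IUPAC nucleotide codes (M in 515F; H, V and W in 806R), so each primer stands for several concrete sequences. The following is a minimal sketch of how such a primer can be expanded into a pattern for locating or trimming it in a read. The function names `primer_to_regex` and `count_variants` are illustrative and are not part of the pipeline used in this study.

```python
import re

# IUPAC nucleotide codes mapped to the bases each code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"
PRIMER_806R = "GGACTACHVGGGTWTCTAAT"


def primer_to_regex(primer: str) -> str:
    """Translate a degenerate primer into a regular expression."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        # Unambiguous bases stay literal; degenerate ones become a character class.
        parts.append(options if len(options) == 1 else f"[{options}]")
    return "".join(parts)


def count_variants(primer: str) -> int:
    """Number of concrete sequences represented by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n


if __name__ == "__main__":
    print(primer_to_regex(PRIMER_515F))  # GTGCCAGC[AC]GCCGCGGTAA
    print(count_variants(PRIMER_806R))   # 18 (3 x 3 x 2)
```

In this sketch only exact matches are recognized; dedicated trimming tools usually also tolerate a small number of mismatches.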

Bioinformatic analysis

Microbiome bioinformatics was performed with QIIME2 (2020.11) [12]. Briefly, raw sequence data were demultiplexed and the primers were trimmed. Sequences were then quality filtered, denoised, merged and cleared of chimeras using the DADA2 method [13]. The resulting deduplicated sequences are amplicon sequence variants (ASVs), which correspond to clustering at approximately 100% identity. All representative sequences were taxonomically annotated against the Silva database (version 132) using a pre-trained naïve Bayes classifier.

Statistical analysis

The alpha diversity of the gastric microbiota was estimated using observed species and the Shannon index at the ASV level. Beta diversity of the microbial community was characterized by principal coordinate analysis based on Bray–Curtis and Weighted UniFrac distances. The Wilcoxon rank-sum test was used to compare alpha diversity between two groups, while multiple-group comparisons were made using the Kruskal–Wallis test. Demographic data were compared among multiple groups using one-way analysis of variance (ANOVA) for normally distributed variables or the Chi-squared test. Significant differences in beta diversity were evaluated by PERMANOVA. Differentially abundant taxa between groups were identified using linear discriminant analysis (LDA) effect size (LEfSe) [14]. Only taxa with an LDA score greater than 3 and a p value < 0.05 were considered significantly enriched. Multivariate association with linear models (MaAsLin) 2 was used to test associations between covariates and the abundance of microbial taxa, in order to control for the confounding effects of age, gender and BMI [15]. A tenfold cross-validation (10 trials) was performed on a random forest model to select the optimal set of genera (V4.6–14). The probability of disease (POD) was defined as the ratio of the number of randomly generated decision trees predicting samples as GC to that of SG [16]. The receiver operating characteristic (ROC) curve was drawn with the pROC package, and the area under the curve (AUC) was calculated to assess the diagnostic efficacy of the model (R v3.5.1). Pearson’s correlation analysis was used to analyze the relationship between Helicobacter abundance and the Shannon index. Spearman’s test was applied to estimate microbial correlations in SG and GC, and Cytoscape V.3.8.2 was used to visualize significant co-occurrence and co-exclusion interactions. Paired comparisons were performed by subtracting diversity indices, and by computing distances, between matched mucosa and fluid samples.
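For readers unfamiliar with the diversity measures named above, the following minimal sketch shows how observed species, the Shannon index and the Bray–Curtis dissimilarity are computed from ASV count vectors. It is an illustration only and not the code used in this study. The function names, the toy counts and the choice of the natural logarithm for the Shannon index are assumptions, and analysis software may use a different logarithm base.

```python
import math


def observed_species(counts):
    """Number of ASVs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts, base=math.e):
    """Shannon index H = -sum(p_i * log(p_i)) over ASVs with p_i > 0."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / (sum(u) + sum(v))."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(u) + sum(v)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical ASV counts for one mucosa and one fluid sample.
    mucosa = [90, 5, 5, 0, 0]
    fluid = [20, 20, 20, 20, 20]
    print(observed_species(mucosa), observed_species(fluid))
    print(shannon_index(mucosa), shannon_index(fluid))
    print(bray_curtis(mucosa, fluid))
```

A sample dominated by a single ASV, as in the hypothetical mucosa vector, yields a low Shannon index even when several ASVs are present, which is why richness and the Shannon index are reported side by side.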

Results

Different distribution of microbiome in gastric mucosa and fluid

In total, 9.82 G of clean data were generated. The proportion of high-quality reads among all raw reads from each sample was 85.12% on average, and a total of more than 15.5 million reads and 9553 ASVs were obtained, which corresponded to a mean of 37,497 reads and 125 ASVs per sample. The gastric microbiota was dominated by the bacterial phyla Firmicutes (42.2%), Epsilonbacteraeota (25.9%), Proteobacteria (18.0%), Bacteroidetes (8.9%) and Fusobacteria (2.7%). Overall, the GM had significantly lower alpha diversity than the GF, as revealed by observed species and the Shannon index at the ASV level (Wilcoxon rank-sum test, Fig. 1A, B), suggesting that only part of the passenger bacteria in the lumen colonize the mucosa. Principal coordinate analysis (PCoA) plots of beta diversity revealed that the overall microbial compositions in GM were significantly different from those in GF (PERMANOVA, p = 0.001 for Weighted UniFrac and Bray–Curtis distances, Fig. 1C, D).

Fig. 1

Differences in bacterial community structures between gastric mucosa and fluid. Alpha diversity illustrated by Observed species (A) and Shannon index (B) was lower in GM compared to GF. PCoA based on Weighted UniFrac (C) and Bray–Curtis distances (D) revealed different microbial structures between GM and GF. Relative proportions of microbiota at the phylum (E) and genus (F) level were displayed. G LDA scores for the bacterial taxa differentially abundant between GM and GF. Only the taxa having a p < 0.05 and LDA > 3 are shown. ***p < 0.001. GM, gastric mucosa; GF, gastric fluid; PCoA, principal coordinate analysis; LDA, linear discriminant analysis

At the phylum level, Bacteroidetes, Fusobacteria and Proteobacteria were significantly enriched in GF, while Firmicutes was enriched in GM (Fig. 1E). As shown in Fig. 1F, most taxa were distributed evenly in GF, while three genera, Helicobacter, Lactococcus and Bacillus, accounted for 60% of the total bacteria in GM. Differential comparison revealed that the abundance of Helicobacter, Lactobacillus, Lactococcus and Bacillus was higher in GM, while Neisseria, Haemophilus, Streptococcus and Gemella were more abundant in GF (Fig. 1G).

Impact of H. pylori infection on gastric microbiota community is greater in mucosa than fluid

Although H. pylori has coevolved with humans for many years, it is unclear whether its impact on GM and GF is equal [17]. First, we found that microbial richness and diversity, as revealed by observed species and the Shannon index, were markedly lower in the GM of H. pylori-positive patients than in that of H. pylori-negative patients (Fig. 2A, B). However, no significant difference was observed in GF samples (Fig. 2A, B). PCoA based on Weighted UniFrac and Bray–Curtis distances showed that the H. pylori-positive samples clustered separately from the negative samples, especially in GM (Fig. 2C, D, PERMANOVA, p = 0.001 for both GM and GF).

Fig. 2

Greater impact of H. pylori infection on gastric mucosa microbiota compared to fluid. Observed species (A) and Shannon index (B) were compared between patients with and without H. pylori infection in both GM and GF. PCoA based on Weighted UniFrac (C) and Bray–Curtis (D) distances showed that the gastric samples, especially GM, clustered separately according to H. pylori infection status. E Taxonomic profiles of microbiota at the genus level in GM and GF from SG to IM and GC. Microbial diversity of GM (F) and GF (G) was negatively correlated with H. pylori abundance. ***p < 0.001. GM, gastric mucosa; GF, gastric fluid; P, H. pylori positive; N, H. pylori negative; PCoA, principal coordinate analysis; SG, superficial gastritis; IM, intestinal metaplasia; GC, gastric cancer; ns, not significant

Further compositional analysis showed that Helicobacter was more abundant in both GM and GF samples from patients infected with H. pylori than in those from uninfected patients, and that, among H. pylori-positive patients, GM had a higher Helicobacter abundance than GF (62 vs 17%) (Fig. 2E). Helicobacter abundance varied considerably among individuals, ranging from 0 to 99.9%. Pearson’s analysis demonstrated that Helicobacter abundance was negatively correlated with the Shannon index. The correlation was stronger in GM than in GF, which indicates that the impact of Helicobacter is more profound on the GM microbiota than on the GF microbiota (Fig. 2F, G). Interestingly, the relative abundance of Helicobacter was significantly decreased in patients with GC compared to SG, and this alteration was observed in GM rather than GF, which suggests that the GM becomes less suitable for H. pylori colonization during gastric tumorigenesis (Fig. 2E).

Distinct characteristics of gastric mucosa and fluid microbiome during stomach carcinogenesis

Accumulating evidence suggests an association of specific mucosal microbiota with GC [8, 9]. In this study, we evaluated the alterations of microbiota in both GM and GF across disease stages. In GF samples, patients with GC had a lower Shannon index and fewer observed species than those with SG (Fig. 3A). In contrast, in GM samples, the Shannon index was higher in GC than in SG, while observed species did not differ significantly among disease stages (Fig. 3B). We further examined gastric acidity at different disease stages, as the acidic environment of the stomach critically determines the diversity of gastric microbes. We found that the pH value was increased in GC compared to SG, which suggests that the reduction of gastric acid might promote the diversity of mucosal bacteria (Supplement Fig. 1).

Fig. 3

Alterations of microbial diversity in gastric mucosa and fluid along the histopathological stages of gastric carcinogenesis. Comparison of Observed species and Shannon index among patients with SG, IM and GC in GF (A) and GM (B). PCoA of bacterial beta diversity based on Bray–Curtis distances (C) and Weighted UniFrac distances (D) demonstrated that the GM samples of GC patients approached the GF samples. LDA effect size analysis revealed the differentially abundant genera between SG and GC in both GM (E) and GF (F). *p < 0.05, **p < 0.01, ***p < 0.001. GM, gastric mucosa; GF, gastric fluid; SG, superficial gastritis; IM, intestinal metaplasia; GC, gastric cancer; PCoA, principal coordinate analysis; LDA, linear discriminant analysis

The overall microbial compositions were distinct across disease stages in both GM and GF (p = 0.001 for Bray–Curtis and Weighted UniFrac distances). Notably, the PCoA plot showed that there was some overlap between GM and GF samples in patients with GC while GM samples deviated from GF samples in SG and IM patients, which indicates the convergence of GM and GF microbiota in GC (Fig. 3C, D). To investigate the specific changes of bacterial taxa associated with GC, we performed LEfSe analysis between GC and SG. As shown in Fig. 3E, there were 13 GM genera enriched in GC patients, including Neisseria, Veillonella, Fusobacterium and Lactobacillus, which are reported to reduce nitrate to nitrite or to form N-nitroso compounds (NOC) [18]. Accordingly, we found that the level of nitrite was elevated in GC as compared to SG and IM (Supplement Fig. 2). In line with GM, the genus Lactobacillus in GF was also significantly increased in GC compared to SG (Fig. 3F).

To assess the diagnostic value of GM and GF microbial markers for GC, we constructed a random forest classifier to distinguish GC from SG. Tenfold cross-validation of the random forest model selected 7 GM genera (Lactobacillus, Gemella, Enterococcus, Helicobacter, etc.) and 13 GF genera (Lactobacillus, Filifactor, Staphylococcus, Dialister, etc.) as the optimal marker sets (Supplement Fig. 3). The probability of disease (POD) index was markedly higher in GC than in SG (Fig. 4A, B). ROC analysis showed that the GM markers achieved an AUC of 0.94 (95% CI 0.81 to 1), while the GF markers achieved an AUC of 0.83 (95% CI 0.34 to 1) (Fig. 4C, D). The classifying ability of the model was then validated in an independent cohort comprising 60 SG and 60 GC patients from Qingdao Municipal Hospital (PRJNA313391). In the validation cohort, the AUCs of the GM markers and GF markers were 0.84 (95% CI 0.58 to 1) and 0.89 (95% CI 0.54 to 1), respectively, supporting the ability of the gastric microbiome-based classifier to distinguish GC from SG (Fig. 4E, F).
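To make the relationship between tree votes, the POD index and the AUC concrete, the following minimal sketch may help. It is not the code used in this study, where the analysis was carried out in R. It assumes one common reading of POD, namely the fraction of decision trees that vote GC for a sample, and it computes the AUC through its rank interpretation: the probability that a randomly chosen GC sample receives a higher score than a randomly chosen SG sample. The function names and toy values are hypothetical.

```python
def pod_from_votes(votes):
    """Fraction of trees voting 'GC' for one sample (assumed POD definition)."""
    return sum(1 for v in votes if v == "GC") / len(votes)


def auc(scores_pos, scores_neg):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Hypothetical votes from a four-tree forest for a single sample.
    print(pod_from_votes(["GC", "GC", "SG", "GC"]))  # 0.75

    # Hypothetical POD values for GC (positive) and SG (negative) samples.
    pod_gc = [0.9, 0.8, 0.4]
    pod_sg = [0.5, 0.3, 0.2]
    print(auc(pod_gc, pod_sg))
```

An AUC of 0.5 corresponds to scores that carry no information about disease status, and wide confidence intervals such as those reported above indicate that the estimate rests on a limited number of samples.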

Fig. 4

Gastric microbial biomarkers for the prediction of gastric cancer. The POD values based on mucosal (A) and fluid (B) genus-level markers identified by the random forest model were significantly increased in GC compared to SG. ROC curves of the mucosal and fluid models showed good performance in both the discovery cohort (C and D) and an independent cohort (E and F). ***p < 0.001. POD, probability of disease; SG, superficial gastritis; GC, gastric cancer; ROC, receiver operating characteristic

Microbial dysbiosis emerges prior to pathological lesion

Previous studies reported that the structure of the gastric microbiota in tumors differs from that in tumor-adjacent tissues [19]. Thus, the GC samples used in the above analyses were tumoral tissues (GM_GC_T). We then determined the composition and diversity of the microbiota in peri-tumoral tissues (GM_GC_P). The Shannon index was higher in GM_GC_T than in GM_SG, while no significant difference was observed between GM_GC_T and GM_GC_P (Supplement Fig. 4A). Although PCoA based on Weighted UniFrac distance showed no apparent separation among GM_GC_T, GM_GC_P and GM_SG (Supplement Fig. 4B), LEfSe detected a significantly higher relative abundance of the genera Fusobacterium, Gemella, Veillonella and Neisseria in GM_GC_P than in GM_SG (Supplement Fig. 4C). Taken together, the overabundance of these tumor-enriched bacteria in the peri-tumoral tissues indicates that microbial dysbiosis, which trends toward the tumor microhabitat, emerges prior to histopathological lesions.

Convergence of microbial community between gastric mucosa and fluid in stomach cancer

Although the overall microbial community differed between GM and GF, we were surprised to find that the GM samples of GC trended toward the GF samples (Fig. 3C, D). Thus, we performed paired comparisons between GM and GF across disease stages. The comparison of alpha diversity, as revealed by observed species and the Shannon index, showed that the discrepancy between GM and GF was markedly decreased in GC compared to SG and IM (Fig. 5A, B). Likewise, the beta diversity analysis based on Bray–Curtis and Weighted UniFrac distances revealed that the bacterial compositions of GM and GF became more similar in GC than in SG and IM (Fig. 5C, D).
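The paired comparison described here can be sketched as follows: for each patient with both a mucosa and a fluid sample, the difference in a diversity index is computed, and the differences are then summarized per disease stage. This is a minimal illustration, not the code used in this study. The direction of subtraction (GF minus GM), the use of the median as the summary, the function names and the toy values are all assumptions.

```python
from statistics import median


def paired_differences(gm, gf):
    """Per-patient difference (GF minus GM) for patients sampled at both sites.

    gm and gf map a patient identifier to a diversity index value.
    """
    return {pid: gf[pid] - gm[pid] for pid in gm if pid in gf}


def median_gap_by_stage(diffs, stage):
    """Median paired difference within each disease stage."""
    grouped = {}
    for pid, d in diffs.items():
        grouped.setdefault(stage[pid], []).append(d)
    return {s: median(values) for s, values in grouped.items()}


if __name__ == "__main__":
    # Hypothetical Shannon index values; p4 has no fluid sample and is dropped.
    shannon_gm = {"p1": 1.5, "p2": 2.0, "p3": 3.0, "p4": 3.5}
    shannon_gf = {"p1": 3.5, "p2": 3.5, "p3": 3.5, "p5": 4.0}
    stage = {"p1": "SG", "p2": "SG", "p3": "GC", "p4": "GC"}

    diffs = paired_differences(shannon_gm, shannon_gf)
    print(median_gap_by_stage(diffs, stage))
```

Because each difference is taken within one patient, inter-individual variation largely cancels out, and a smaller gap in GC than in SG is the pattern described in the text as convergence.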

Fig. 5

Microbial distinction between gastric mucosa and fluid dwindled during the progression of stomach cancer. Paired comparison of alpha diversity was performed by subtraction of Observed species (A) and Shannon index (B) between GM and GF among SG, IM and GC. The Bray–Curtis (C) and Weighted UniFrac (D) distances between GM and GF samples were calculated at different disease stages. Heatmaps show the differentially abundant genera between GM and GF in SG (E) and GC (F), respectively. The representative microbes, including Veillonella (G), Haemophilus (H) and Peptostreptococcus (I), which differed in abundance between GM and GF in SG, showed similar abundance at the two sites in GC. *p < 0.05, **p < 0.01, ***p < 0.001. GM, gastric mucosa; GF, gastric fluid; SG, superficial gastritis; IM, intestinal metaplasia; GC, gastric cancer

At the genus level, fewer bacterial taxa differed between GM and GF in GC than in SG (p < 0.05, Wilcoxon rank-sum test after Bonferroni adjustment) (Fig. 5E, F). Furthermore, we investigated the specific bacteria that contribute to the dwindling distinction between GF and GM. The paired comparison showed that, in patients with GC, the abundances of some NOC-producing genera in GM, such as Veillonella, Haemophilus and Peptostreptococcus, approached those in GF, with no significant difference between the two sites, whereas in patients with SG their abundances were strikingly higher in GF than in GM (Fig. 5G, H, I).

Spearman’s correlation test was performed to evaluate the relationships among the top 50 most abundant genera in GM and GF (Supplement Table 3). We observed that the microbial interactions between GM and GF were significantly stronger in GC compared to SG (Fig. 6). In particular, some genera in GF exhibited significant positive correlations with their counterparts in GM, including Helicobacter (r = 0.5, p = 0.01), Streptococcus (r = 0.75, p < 0.001), Haemophilus (r = 0.48, p = 0.01), which were associated with GC. The intensified interplays between GM and GF in patients with GC suggest that there may be an interchange of bacteria between these two microhabitats.
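As a reminder of what Spearman’s coefficient measures, the following minimal sketch computes it as Pearson’s correlation applied to ranks, with tied values receiving their average rank. It is an illustration only and not the code used in this study. The function names and the toy abundances are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson's correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical relative abundances of one genus in matched GM and GF samples.
    abundance_gm = [0.01, 0.05, 0.10, 0.40, 0.02]
    abundance_gf = [0.02, 0.04, 0.30, 0.35, 0.01]
    print(spearman(abundance_gm, abundance_gf))
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed distributions typical of relative abundance data.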

Fig. 6

Microbial correlation strengths between gastric mucosa and fluid increased in gastric cancer. Spearman’s correlations were computed among the top 50 most abundant genera in GM and GF from patients with SG and GC. Stronger positive correlations were observed between GM (triangle) and GF (circle) in GC compared to SG. The size of each shape represents the abundance of the genus, and genera belong to different phyla. SparCC algorithm was used to estimate correlation coefficients and Cytoscape V.3.8.2 was used for network construction. A subset of significant correlations with strengths of at least 0.4 was selected for visualization. GM, gastric mucosa; GF, gastric fluid; SG, superficial gastritis; GC, gastric cancer

Discussion

Growing evidence links imbalance of the gastric microbiome to gastric carcinogenesis [20]. However, the overall understanding of the role of the gastric microbiome in both mucosa (GM) and fluid (GF), as well as of its alterations during disease progression, is limited. Herein, we found that GM and GF shared both common and specific characteristics of microbial dysbiosis across the stages of gastric cancer. Paired comparison demonstrated that the microbial compositions in GM and GF were similar to each other in GC, while they differed significantly in SG and IM. Additionally, we found that the tumor-free tissues of patients with GC harbored an aberrant microbiota that deviated from that of SG and trended toward the tumor microhabitat. Based on these microbial signatures, we established GM and GF models with discriminatory power for classifying GC. The microbial markers identified in our cohort were confirmed in an additional validation cohort downloaded from a database, indicating their generalizability.

The gastric microbiome consists of two adjacent but independent populations, the luminal microbiota in the fluid and the mucosa-associated microbiota. Although a large body of studies has addressed bacterial biodiversity in GM, there is a lack of knowledge regarding the bacterial community in GF as well as its association with GM. Our data showed significant differences in microbial composition between the two sites, with a lower bacterial diversity in GM than in GF. This result is consistent with a previous study reporting a lower bacterial diversity of stomach fluid than gastric biopsies from four subjects [21]. Compared to the fluid samples, the mucosal samples comprised relatively lower abundances of Bacteroidetes, Haemophilus, Neisseria and Clostridia, and relatively higher abundances of Bacillus. Similarly, one recent study that compared the communities of gastric mucosal biopsies and luminal samples in 24 patients also reported that phylotypes belonging to the Bacteroidetes were abundant in the lumen, while phylotypes belonging to the Firmicutes were abundant in the biopsies [22]. These lines of evidence indicate that GM and GF harbor distinct microbial communities, which may arise because (1) certain bacterial species from diet, air and drinking water are only transiently present in GF and (2) only a portion of the bacteria floating in the lumen can penetrate the mucous layers and inhabit the mucosa, owing to their optimal colonization niches.

Since H. pylori plays an important role in gastric diseases, we investigated its impact on gastric microbial structures in both GM and GF. In agreement with previous reports, Helicobacter spp. were shown to dominate the microbial community in the stomach [8, 23]. Moreover, we found that the relative abundance of Helicobacter was significantly higher in GM than in GF, indicating that the mucosal microhabitat is more favorable for its colonization. The predominance of Helicobacter was confirmed to be inversely correlated with the Shannon index, especially in GM, where the magnitude of the correlation coefficient reached 0.90. This finding is supported by several other studies that revealed lower mucosal microbial diversity in H. pylori-positive individuals than in non-infected subjects, which could be restored after H. pylori eradication [24, 25]. Additionally, a reduction in the abundance of Helicobacter was detected in patients with GC compared to SG and IM. Consistently, H. pylori has frequently been reported as undetectable in gastric adenocarcinoma, probably because persistent H. pylori infection modifies the microenvironment in a way that in turn causes its own decline [26, 27].

Recently, accumulating evidence suggests dysbiosis of the mucosa-associated microbiota in gastric carcinogenesis, yet studies of alterations in GF are scarce. This study demonstrated that both GM and GF microbiota changed significantly from SG to GC, although with different characteristics. Compared to SG, the microbial diversity in patients with GC was decreased in fluid but increased in mucosa. It is acknowledged that the diversity of gastric microbiota is mainly influenced by the surrounding acidic environment [28]. Accordingly, we examined gastric pH and found higher levels in GC than in SG and IM, which indicates that neutralization of the microenvironment facilitates the diversification of mucosal microbiota. Clinical studies have also shown that impairment of gastric acid secretion is associated with a significant increase in the risk of gastric cancer [29, 30]. In addition to the depletion of Helicobacter, we observed that several genera, including Lactobacillus, Neisseria, Fusobacterium and Veillonella, were more abundant in GC than in SG, which is consistent with previous studies [31, 32]. These bacteria have the capability to convert nitrogen compounds to potentially carcinogenic N-nitroso compounds, and nitrosating functions have been demonstrated to be increased in GC [8, 18]. As expected, our study revealed that the amount of nitrite was increased in GC compared to SG and IM. We noticed that these nitrosating bacteria are also over-represented in the tumor-adjacent mucosa, whose microbial structure shifted from that of SG toward that of tumor lesions. The dysbiosis of microbiota in adjacent non-cancerous mucosa suggests that the alterations of gastric microbiota occur prior to histological lesions.

A further notable finding of our study was that the microbial community profile in GM converged toward that in GF during disease progression. Both alpha and beta diversity analyses using paired samples showed that the compositional difference between GM and GF was decreased in GC compared to SG and IM. To date, only two studies have made paired comparisons between GM and GF using high-throughput sequencing, and these pioneering studies did not address how the microbial compositions change during gastric disease progression [21, 22]. Interestingly, our findings showed that the microbial compositions of the mucosal samples overlapped with those of the fluid samples in patients with GC, while they were significantly separated in SG. Specifically, we characterized the taxa that contribute to the dwindling distinction between GM and GF from SG to GC. In line with previous studies, these bacteria, including Veillonella, Haemophilus, Peptostreptococcus, Gemella and Streptococcus, were probably oral microbes, which have been associated with the development of GC [9, 33]. Saliva has been demonstrated to be the main source of the gastric microbiome, and the closest relationship was found between salivary and GF samples from the same individuals [34]. However, a large proportion of microbes could be either destroyed by gastric acid or prevented from invading the epithelium by the protective mucous-bicarbonate barrier [35]. Thus, as the present study showed, microbial diversity was significantly lower in GM than in GF. With the impairment of the mucosal layer in disease states, such as gastric ulcer or even GC, and the neutralization of gastric acid, it is plausible that these potential invaders may more easily pass from the GF to the GM and subsequently promote inflammation (Fig. 7).

Fig. 7

Schematic illustration of the proposed mechanism. The composition of the mucosal microbiota is significantly different from that of the fluid microbiota in patients with SG because of the acidic environment and the integrity of the mucosal barrier. The dissimilarity between mucosal and fluid microbiota dwindles in patients with GC. We speculate that impaired acid secretion and a damaged mucosal barrier facilitate colonization of the mucosa by fluid microbes

A major advantage of our study is the collection of both GM and GF across the stages of GC and the paired comparison between these two sites, which reduces the impact of inter-individual differences. Nevertheless, several limitations need to be noted. First, next-generation sequencing cannot determine whether the detected bacteria are dead or living. This could be compensated for by culture and biochemical testing, although not all bacterial species can be successfully cultured. Second, 16S rRNA gene sequencing rather than metagenomic sequencing was used in this study, which limits data interpretation at the species level and in terms of functional analysis. Third, this study provides evidence of association, not causality. Further studies using germ-free mice are warranted to assess the role of specific bacteria in GC development.

In conclusion, our data identified both GM and GF microbiota as biomarkers for GC, which were validated with good performance in an independent cohort. We also observed convergent microbial profiles and intensified bacterial interactions between GM and GF during the development of GC, which suggests an interchange of microbes between GF and GM. Future longitudinal human studies as well as germ-free mouse models are needed to elucidate the role of potentially genotoxic bacteria other than H. pylori in gastric tumorigenesis.