Background

Celiac disease (CeD) is defined as an autoimmune enteropathy triggered by gluten, affecting genetically predisposed individuals (HLA DQ2 and/or DQ8) [1, 2]. Recent data show that tolerance to gluten can be lost at any time in life [3]. These findings, together with the lack of complete CeD concordance among monozygotic twins, suggest that, while genetic predisposition and gluten intake are necessary for CeD development, they are insufficient to trigger the onset of the disease [4]. Thus, other contributing factors such as changes in microbiome composition and function have been suggested to be associated with CeD.

The microbiome of a healthy individual is relatively stable by 3 years of age; however, its composition can be modulated throughout the entire lifespan by different factors, such as lifestyle, dietary choices, antibiotic treatment, stress, and other environmental components. Intestinal dysbiosis induced by such factors has been reported to be associated with the development of CeD [5].

The Saudi population has been reported to have a high prevalence of CeD (1.5%). The high rate of CeD-predisposing HLA-DQ genotypes in the general population (52.7%) may partially account for this high prevalence, although additional external factors should also be taken into consideration [6]. The consumption of gluten-containing cereals in the diet of the Saudi population is reported to be very high, as recorded by the Food and Drug Organization [7]. This high intake of cereals may increase the prevalence of CeD directly, or indirectly by altering other factors such as the microbiome composition.

Most of the literature on the microbiome in CeD comes from Western populations. Cultural and dietary lifestyles in non-Western populations, mostly in developing countries, could affect the microbiota profile, and studies of the microbiome in CeD from these populations may increase our understanding of the pathogenesis of CeD. Therefore, our objective was to determine whether a different microbiota profile is associated with CeD in children in Saudi Arabia.

Results

Characteristics of the study population

A total of 40 children with CeD (20 provided tissue samples and 20 provided stool samples) and 39 controls were enrolled in this study. There were two types of controls: 20 healthy children who provided stool samples only (fecal controls), and 19 non-CeD children who provided mucosal samples only (mucosal controls). The latter had normal endoscopy and normal duodenal mucosal histopathology. In addition, all controls had normal anti-tissue transglutaminase A values. The demographic and clinical characteristics are presented in Table 1. Briefly, males accounted for 28%, 35%, and 42% of the children in the CeD, fecal control, and mucosal control groups, respectively. The median age at diagnosis was 10.3, 11.3, and 10.6 years in the CeD, fecal control, and mucosal control groups, respectively. The number of asymptomatic children with CeD was 15/40 (38%), whereas the remainder had various combinations of symptoms including anemia, growth impairment, and abdominal pain.

Table 1 Demographic and clinical characteristics

Alpha- and beta-diversities

Differences in alpha diversity between the CeD and non-CeD groups were measured in both fecal and duodenal samples using the Chao index, an abundance-based estimator of species richness, and the Shannon index. Although the difference was not statistically significant, our analysis showed greater bacterial richness and diversity in stools than in mucosal samples (Fig. 1A, B). Interestingly, alpha diversity did not differ between the CeD and non-CeD groups, although there was a trend toward lower diversity in CeD stools compared with non-CeD stools.

Fig. 1 Alpha diversity. Illustration of alpha diversity measured by the Chao index (A) and by the Shannon index (B) for bacterial communities in duodenal and fecal samples of CeD patients and non-CeD controls

For bacterial beta diversity, Bray–Curtis-based PCoA did not show any significant clustering patterns in samples from the duodenal mucosa or stools of the CeD and non-CeD groups (Fig. 2A, B). However, in the analysis of fecal samples, there were small clusters characteristic of either the CeD or the non-CeD group.

Fig. 2 Beta diversity. Bray–Curtis-based bacterial beta-diversity analysis of mucosal (A) and fecal samples (B) from patients with CeD (pink dots) or non-CeD controls (blue dots)

Overall bacterial composition

The overall bacterial composition of fecal and mucosal samples was analyzed in both the CeD and non-CeD groups and is represented as a heatmap (Fig. 3). As expected, the bacterial richness in stools was higher than that in duodenal samples, and in both sets of samples, Firmicutes and Bacteroidetes were the most abundant phyla. In duodenal samples, an increased percentage of Proteobacteria species was detected, whereas overall, the stools were characterized by an increased abundance of Verrucomicrobia species.

Fig. 3 Heatmap representing bacterial microbiome composition in duodenal and fecal samples of patients with CeD and non-CeD controls. The bacterial richness in stools was higher than that in duodenal samples, and Firmicutes and Bacteroidetes were the most abundant phyla. Actinobacteria abundance was reduced

LDA effect size

The LDA effect size (LEfSe) plot revealed statistically significant differences in bacterial composition in fecal samples between children with CeD and non-CeD controls. For example, at the genus level, there was an increase of Escherichia in the CeD group and an increase of Desulfovibrio in the non-CeD group (decreased in the CeD group) (Fig. 4A). Similarly, at the species level, there was a statistically significant difference between the CeD and non-CeD groups. For example, in the CeD group there was an increase of E. coli and Lachnospiraceae_bacterium_oral, whereas several species of Bacteroides were significantly increased in fecal samples of non-CeD controls (decreased in CeD) (Fig. 4B). In mucosal samples, although the differences were not statistically significant by standard criteria, there were important differences in the abundance of several taxa between CeD and non-CeD samples. For example, Lactobacillus acidophilus, Neisseria, and Coprococcus species were increased in the CeD group, whereas Roseburia and Lachnospiraceae species were increased in the non-CeD group (decreased in the CeD group) (Fig. 4C, D).

Fig. 4 LEfSe LDA scores. A and B show statistically significant abundance differences in stool samples from patients with CeD compared with those of non-CeD controls at the genus (A) and species (B) level. C and D illustrate the abundance differences, although not statistically significant, in mucosal samples from patients with CeD compared with those of non-CeD controls at the genus (C) and species (D) level. Bars with a positive LDA score (green) are higher in non-celiac samples, and bars with a negative LDA score (red) are higher in celiac samples. The extensions _u_g and _u_s mean unclassified genera and species, respectively

DESeq2 differential abundance analysis

DESeq2 differential abundance analysis revealed statistically significant differences in log2 fold change abundance between CeD and non-CeD samples. A log2 fold change > 0 indicates increased abundance in children with CeD, and a log2 fold change < 0 indicates increased abundance in non-CeD controls (that is, decreased abundance in children with CeD). Table 2 shows the log2 abundance change for the top 10 taxa (order, family, and genus) in mucosal and fecal samples, illustrating the different microbiota profiles of mucosa and stool. For example, in mucosal samples, Flavobacteriales (p = 0.0005), Flavobacteriaceae (unadjusted p = 8.11 × 10⁻⁸), and Clostridium (unadjusted p = 0.011) were the most significantly decreased bacterial order, family, and genus, respectively, whereas Micrococcales (unadjusted p = 0.018), Micrococcaceae (unadjusted p = 0.022), and Subdoligranulum (unadjusted p = 0.021) were the most significantly increased bacterial order, family, and genus, respectively. In fecal samples, however, Cardiobacteriales (p = 0.01), Leuconostocaceae (p = 0.003), and Tannerella (p = 1.17 × 10⁻⁵) were the most significantly decreased bacterial order, family, and genus, respectively, whereas Planctomycetaceae (p = 0.013) and Kocuria (p = 0.003) were the most significantly increased family and genus, respectively. The top 10 most significantly different species in mucosal samples are presented in Table 3. In these samples, Bifidobacterium angulatum (unadjusted p = 0.006) was an example of a species increased in children with CeD, and Roseburia intestinalis (unadjusted p = 0.031) was an example of a species increased in non-CeD controls (decreased in CeD).

Table 2 Log2 abundance change for bacterial order, family, and genera in children with CeD
Table 3 Log2 fold abundance change of the top 10 bacterial species in mucosal samples of children with CeD

The log2 fold change abundance of 169 significantly different bacterial species in fecal samples of children with CeD and non-CeD controls is presented in Table 4. Several species belonging to the Bifidobacterium genus were significantly decreased in children with CeD, such as B. breve (p = 0.0028), B. angulatum (p = 2.24 × 10⁻⁷), B. merycicum (p = 0.012), and B. thermophilum (p = 0.027). Among Lactobacillus species, L. plantarum (p = 0.0043) was significantly less abundant in CeD samples, whereas the abundance of other lactobacilli such as L. gasseri (p = 0.033) was significantly increased in children with CeD. Prevotella species (P. timonensis, p = 0.018; P. bergensis, p = 0.022) were significantly more abundant in stool samples of children with CeD, whereas Prevotella sp. P5-119 was significantly less abundant (p = 1.69 × 10⁻⁶). Finally, several Bacteroides species were less abundant in fecal samples from children with CeD. In contrast, different Clostridium species were increased in abundance among children with CeD.

Table 4 Log2 fold change abundance of bacterial species in fecal samples of children with CeD

Discussion

The association between CeD and intestinal dysbiosis has already been described in several studies [8,9,10,11]. However, the exact role of the microbiome in CeD pathogenesis has not yet been fully elucidated, and, given the fundamental role that the intestinal microbiota plays in regulating intestinal homeostasis, it has been suggested that specific changes in microbiome composition may contribute to CeD onset [12]. The intestinal microflora is functionally very diverse, and its composition can depend on the intestinal site considered [13, 14]. CeD is a duodenum-specific enteropathy, and changes in the small intestinal microbiome are therefore thought to be associated with its development [15]. However, several studies have also shown that patients with CeD present fecal microbiota dysbiosis [16]. These data suggest that, along with the small intestine, other parts of the gastrointestinal tract, such as the colon, may be a source of information on CeD pathogenesis.

This report, the first metagenomic analysis from a population of Saudi children, highlights several important differences between the mucosal and fecal microbiomes. Alpha-diversity analysis, for example, confirmed previously reported findings that fecal samples have increased bacterial richness and diversity compared with mucosal samples [16]. Interestingly, we did not see any differences in alpha diversity between the CeD and non-CeD groups. Microbial diversity in patients with CeD has been shown to be reduced compared with that in non-CeD controls [17], although another study found this was not the case [18]. Our analysis included a relatively small number of samples, which could account for the lack of significant differences in microbial diversity.

LEfSe and DESeq2 differential abundance analyses demonstrated significant differences between the CeD and non-CeD groups at both mucosal and fecal levels. Overall, samples from patients with CeD appeared to have a decreased abundance of the Actinobacteria phylum, which is mainly represented by bacteria belonging to the Bifidobacterium genus. Many Bifidobacteria have positive immunomodulatory functions and are therefore used as probiotics. However, the increased abundance of L. acidophilus and Coprococcus species in children with CeD contrasts with previous reports describing them as “good bacteria” [19, 20]. Samples from non-CeD controls appeared to have an increased abundance of “beneficial” bacteria (decreased in CeD) such as Roseburia and Lachnospiraceae species. Roseburia species are short-chain fatty acid-producing bacteria, which modulate intestinal motility and have anti-inflammatory properties. Changes in the abundance of Roseburia species have been correlated with several diseases such as irritable bowel syndrome, obesity, and type 2 diabetes [21, 22]. Similarly, Lachnospiraceae are often used as probiotics because of their “beneficial” impact on overall intestinal health [23]. Finally, increased levels of Subdoligranulum species have been found in CeD samples by several groups [18, 24]. Interestingly, a recent work by Leonard et al. demonstrated an increase in this specific genus in fecal samples from infants genetically predisposed to CeD even before the onset of the disease [24]. These findings are intriguing because they raise the possibility of a causative link between dysbiosis and CeD onset. Furthermore, they also raise the possibility that fecal microbiome markers could be representative of small intestinal dysbiosis.

While our findings partially confirm previously reported differences between patients with CeD and those without, the use of metagenomic technology in this study revealed many previously unreported species with significantly different abundance between children with CeD and non-CeD controls. Finally, it should be noted that the bacterial associations with CeD reported in this study do not imply causality, a limitation that is common to most microbiota studies.

Study limitations

The most important limitation of this study is the relatively small sample size. However, the use of shotgun metagenomic analysis and the finding of many previously unreported bacterial species in this population of Saudi children, in whom the prevalence of CeD is high, make the results unique. Other limitations included the unavailability of information on diet, growth, and the results of laboratory investigations.

Conclusions

Although preliminary, our data from Saudi Arabia report new bacterial species significantly associated with CeD. The fact that mucosal and fecal samples were collected from newly diagnosed children with CeD on a normal gluten-containing diet suggests a strong association between the identified bacteria and CeD. In addition, the identification of many previously unreported taxa associated with celiac disease indicates the need for further studies from different populations to expand our understanding of the role of bacteria in the pathogenesis of celiac disease, hopefully leading to new treatment options.

Methods

Study population

The participants were enrolled from King Khalid University Hospital, King Saud University (KSU), Al Mofarreh PolyClinic, and King Fahad Medical City Children’s Hospital, Ministry of Health. All institutions are in Riyadh, Kingdom of Saudi Arabia (KSA). The main inclusion criteria were age below 18 years, a normal gluten-containing diet, and no history of antibiotic intake for at least 6 months before presentation to the clinic. In addition, confirmation of CeD for cases and exclusion of CeD for controls were according to the European Society of Pediatric Gastroenterology Hepatology and Nutrition guidelines [25].

Sample collection, storage, and processing

Mucosal samples from 20 children with confirmed CeD and 19 non-CeD controls were collected from the second part of the duodenum (D2); these were then stored in cryovials without fixative or stabilizer and transported on ice to the Central Laboratory at the College of Medicine, KSU. Similarly, fecal samples were collected in cryovials from 20 children with CeD and 20 non-CeD controls and transported on ice to the same laboratory. All samples were stored at −80 °C. At the time of analysis, all samples were retrieved and dispatched by express mail in temperature-controlled dry ice containers for metagenomic analysis at CosmosID (Rockville, MD, USA).

DNA isolation and sequencing

DNA was isolated from mucosal samples using the ZymoBiomics miniprep kit and from stool samples using the QIAGEN DNeasy PowerSoil DNA kit, both according to the manufacturer’s instructions. Isolated DNA was quantified using the Qubit dsDNA HS assay kit (Thermo Fisher).

DNA libraries were prepared using the Illumina Nextera XT library preparation kit, according to the manufacturer’s protocol. Library quantity and quality were assessed using Qubit and TapeStation (Agilent Technologies, CA, USA). Libraries were then sequenced on an Illumina HiSeq platform (2 × 150 bp reads). The samples were sequenced relatively deeply, at an average of about 20 million total reads per sample.

Bioinformatic analysis

Unassembled sequencing reads were directly analyzed using the CosmosID bioinformatics platform (CosmosID Inc., Rockville, MD) for multi-kingdom microbiome analysis and quantification of the relative abundance of organisms [26,27,28,29]. Briefly, the system utilizes curated genome databases and a high-performance data mining algorithm that rapidly disambiguates hundreds of millions of metagenomic sequence reads into the discrete microorganisms from which the sequences originate.

Custom analysis

Alpha-diversity boxplots

Alpha-diversity boxplots were generated from the species-level abundance score matrices from the CosmosID taxonomic analysis. Chao’s and Shannon’s alpha-diversity metrics were calculated in R using the R package vegan. Further, t-tests were performed between each celiac and non-celiac group using the R package ggsignif. Boxplots with overlaid significance in p-value format were generated using the R package ggplot2 [30,31,32].
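
As an illustration of the two metrics named above, the following is a minimal Python sketch, not the R code used in this study. It assumes integer per-taxon counts for a single sample, the natural-logarithm form of the Shannon index, and the bias-corrected form of the Chao1 estimator; the function names and the example counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no observations")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of taxa observed exactly once and exactly twice.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    # Hypothetical counts for one sample (not data from this study).
    sample = [10, 5, 1, 1, 2, 0]
    print("Shannon:", round(shannon_index(sample), 4))
    print("Chao1:", chao1(sample))
```

Chao1 adds to the observed richness a correction driven by rare taxa (singletons and doubletons), whereas the Shannon index also reflects how evenly abundance is spread across taxa, which is why the two are reported side by side.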

Beta-diversity principal coordinate analyses (PCoA)

Beta-diversity PCoA was calculated from the species-level relative abundance matrices from the CosmosID taxonomic analysis. Bray–Curtis dissimilarity was calculated in R using the R package vegan with the function vegdist; then, PCoA tables were generated using Vegan’s function PCoA. Plots were visualized using the R package ggpubr [30, 33].
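
The Bray–Curtis dissimilarity underlying the PCoA can be sketched in a few lines of Python. This is an illustrative sketch only, not the R code used in this study; it covers the pairwise dissimilarity matrix and leaves out the ordination step. The function names and example abundance vectors are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


def bray_curtis_matrix(samples):
    """Symmetric matrix of pairwise dissimilarities (zeros on the diagonal)."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = bray_curtis(samples[i], samples[j])
    return matrix


if __name__ == "__main__":
    # Hypothetical abundance vectors for three samples (same taxa order).
    samples = [[6, 7, 4], [10, 0, 6], [6, 7, 4]]
    for row in bray_curtis_matrix(samples):
        print([round(x, 3) for x in row])
```

A value of 0 means two samples have identical abundance profiles and a value of 1 means they share no taxa; PCoA then places samples in a low-dimensional space so that distances between points approximate these dissimilarities.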

Linear discriminant analysis effect size (LEfSe)

The LEfSe figures were generated using the Galaxy web application, based on relative abundance tables from the CosmosID taxonomic analysis. Features shown in the figures met a Kruskal–Wallis P-value of < 0.05, a Wilcoxon P-value of < 0.05, and a logarithmic linear discriminant analysis (LDA) score of ≥ 2.0, and were therefore considered to differ significantly between groups. In addition, some organisms that do not show a significant difference may nevertheless be functionally important. To explore this possibility, the P-value thresholds were relaxed to < 0.2 for both the Wilcoxon and Kruskal–Wallis tests and the logarithmic LDA score threshold to ≥ 0.05, and additional figures were generated based on these thresholds. In the LEfSe figures, the red bars (negative bars) indicate that the organism is more abundant in the CeD group, whereas the green bars (positive bars) indicate greater organism abundance in the non-CeD group [34].
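
The two sets of thresholds described above amount to a simple filter over per-taxon statistics. The Python sketch below illustrates only that filtering step and the sign convention of the figures; it does not compute the Kruskal–Wallis, Wilcoxon, or LDA statistics, which were produced by the Galaxy LEfSe application. The class name, function names, and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LefseFeature:
    taxon: str
    kruskal_wallis_p: float
    wilcoxon_p: float
    lda_score: float  # signed log LDA score; negative = higher in CeD


def select_features(features, p_cutoff=0.05, lda_cutoff=2.0):
    """Keep taxa passing both P-value cutoffs and the absolute LDA cutoff."""
    return [
        f for f in features
        if f.kruskal_wallis_p < p_cutoff
        and f.wilcoxon_p < p_cutoff
        and abs(f.lda_score) >= lda_cutoff
    ]


def enriched_group(feature):
    """Figure convention: red (negative) = CeD, green (positive) = non-CeD."""
    return "CeD" if feature.lda_score < 0 else "non-CeD"


if __name__ == "__main__":
    # Hypothetical statistics, not results from this study.
    features = [
        LefseFeature("taxon_a", 0.01, 0.02, -3.1),
        LefseFeature("taxon_b", 0.15, 0.10, 2.4),
        LefseFeature("taxon_c", 0.50, 0.40, 1.0),
    ]
    strict = select_features(features)  # P < 0.05, LDA >= 2.0
    relaxed = select_features(features, p_cutoff=0.2, lda_cutoff=0.05)
    print([(f.taxon, enriched_group(f)) for f in strict])
    print([(f.taxon, enriched_group(f)) for f in relaxed])
```

With these hypothetical values, the strict thresholds retain a single taxon, while the relaxed thresholds additionally retain a taxon that would only be regarded as exploratory.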

DESeq2 differential abundance analysis

Differential abundance analysis used the abundance score matrices from the CosmosID taxonomic analysis. Differential abundance for organisms was calculated using DESeq2 from the R phyloseq package (R Foundation for Statistical Computing, Vienna, Austria). For the mucosal and stool samples separately, the log2 fold change and associated P-values for celiac vs. non-celiac samples are displayed [35, 36]. A log2 fold change > 0 indicates that the organism is more abundant in the CeD group, whereas a value < 0 indicates greater abundance in the non-CeD group. P-values were calculated using the t-test function in R and adjusted for false discovery rate. The difference in abundance was considered significant when the adjusted P-value was < 0.05. In addition, unadjusted P-values were reported to reveal taxa that did not reach the adjusted significance level but might have functional or biological importance.
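
To make the sign convention and the multiple-testing adjustment concrete, the following Python sketch computes a simple log2 fold change and adjusts a list of P-values. It is a simplified illustration, not the DESeq2 procedure: DESeq2 fits negative binomial models rather than taking a ratio of group means. The Benjamini–Hochberg method is assumed here as the false discovery rate adjustment because the text does not name one, and the pseudocount, function names, and example values are hypothetical.

```python
import math


def log2_fold_change(mean_ced, mean_non_ced, pseudocount=1.0):
    """log2(CeD / non-CeD); > 0 means more abundant in CeD, < 0 in non-CeD.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return math.log2((mean_ced + pseudocount) / (mean_non_ced + pseudocount))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical group means and P-values, not results from this study.
    print(log2_fold_change(7.0, 1.0))   # positive: higher in CeD
    print(log2_fold_change(0.0, 3.0))   # negative: higher in non-CeD
    raw = [0.01, 0.04, 0.03, 0.005]
    adjusted = benjamini_hochberg(raw)
    print([taxon_p < 0.05 for taxon_p in adjusted])
```

Reporting both adjusted and unadjusted P-values, as done in this study, separates taxa that survive the correction from those that are suggestive only.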