Introduction

The incidence of diabetes, in both developed and emerging countries, has reached epidemic proportions [1]. A body of evidence demonstrates that the intestinal microbiota, i.e. the overall bacterial community present in the intestine, play a role in the onset of metabolic disease [2–4]. The causal role of the intestinal microbiota in weight gain has been demonstrated in animal models [2]. In humans, obesity is associated with altered phylum levels in the gut microbiota [3]. In addition, a role for bacterial components in blood has been demonstrated: mice chronically infused with a low dose of lipopolysaccharides developed inflammation and diabetes [4]. Importantly, the plasma lipopolysaccharide concentration has been found to predict the onset of diabetes [5]. From these data, we hypothesised that the presence of a blood microbiome could be one of the initial steps leading to diabetes and obesity. To test this hypothesis, we investigated whether the 16S rDNA concentration in blood could be a marker of the risk of diabetes and obesity in a large general population. We studied this gene as it is highly conserved between different species of bacteria and is hence considered to be a marker of the overall microbiota [6]. Moreover, by pyrosequencing the overall 16S rDNA gene population in blood, we determined the profile of bacteria present in individuals who were destined to develop diabetes versus those present in controls.

Methods

Study overview

Data from an Epidemiological Study on the Insulin Resistance Syndrome (D.E.S.I.R.) is a longitudinal cohort study of 5,212 adults aged 30 to 65 years at baseline; the primary aim of the study was to describe the natural history of the metabolic syndrome [7]. Participants were recruited between 1994 and 1996 from volunteers insured by the French national social security system in ten social security health examination centres. All participants gave written informed consent and the study protocol was approved by the Comité Consultatif de Protection des Personnes pour la Recherche Biomédicale of the Hôpital Bicêtre (Paris, France). Participants were clinically and biologically evaluated at inclusion and at follow-up visits in years 3, 6 and 9. We measured the baseline 16S rDNA concentration in individuals without diabetes (defined by treatment for diabetes or fasting plasma glucose ≥ 7.0 mmol/l) or obesity (BMI ≥ 30 kg/m²) at baseline and in those who had a known diabetes status at the year 9 examination. We excluded those who were likely to have a current infection (C-reactive protein > 10 mg/l, abundant leucocyturia) or who were taking antiviral therapy.

Variables studied

Weight and height were measured in lightly clad participants and BMI was calculated. Waist circumference, i.e. the smallest circumference between the lower ribs and the iliac crests, was also measured. The examining physician noted the existence of a family history of diabetes, and treatments for diabetes and hypertension were recorded. Hypertension was defined as a systolic/diastolic blood pressure of at least 140/90 mmHg or taking antihypertensive medication. Smoking habits were documented in a questionnaire filled in by the participants. Central adiposity was defined as a waist circumference ≥102 cm in men and ≥88 cm in women, while high fasting glucose was defined as ≥ 6.1 mmol/l [8].

Biological analyses

Blood was drawn after a 12 h fast. The methods used for biological analyses, except for bacterial DNA preparation and analysis, have been described elsewhere [7].

Bacterial DNA preparation

DNA was extracted from peripheral blood leucocytes using a classical phenol/chloroform extraction method followed by alcohol precipitation (ice-cold 70% alcohol by vol.). Air-dried DNA was resuspended in TRIS EDTA and stored at −80°C before use at a concentration of 1 μg/μl [9, 10]. It is noteworthy that this method did not use glass microbeads to increase the efficiency of bacterial DNA extraction, as is commonly done in current methods. The reason for this was that isolation of bacterial DNA was not the goal of the work at the time of the sample preparation. However, despite this important technical difference, we succeeded in amplifying 16S rDNA from blood. To compare the two extraction procedures, we extracted total blood DNA in the presence or absence of microbeads. The efficacy of bacterial DNA extraction, as assessed by 16S rDNA quantitative PCR amplification, was ten times greater when using microbeads. Although the absolute value of the bacterial DNA concentration was higher with microbeads, the relative proportions between samples remained the same.

Quantification of 16S rDNA

The total DNA concentration was determined using a broad-range assay kit (Quant-iT dsDNA; Life Technologies, Villebon-sur-Yvette, France). The mean concentration ± SD was 121.1 ± 208.3 ng/μl. Each sample was diluted tenfold in TRIS-EDTA buffer. The DNA was amplified by real-time PCR (Life Technologies) in optical-grade 96-well plates. The PCR reaction was performed in a total volume of 25 μl using a master mix (Power SYBR Green PCR; Life Technologies) containing 300 nmol/l of each of the universal forward and reverse primers for Eubact: forward 5′-TCCTACGGGAGGCAGCAGT-3′ and reverse 5′-GGACTACCAGGGTATCTAATCCTGTT-3′. The reaction conditions for amplification of DNA were 95°C for 10 min, followed by 35 cycles of 95°C for 15 s and 60°C for 1 min. The amplification step was followed by a melting curve step according to the manufacturer’s instructions (from 60°C to 90°C) to determine the specificity of the amplification product obtained. The amount of amplified DNA was determined using a 16S rDNA standard curve obtained by real-time PCR from dilutions ranging from 0.001 to 10 ng/μl of E. coli BL21 DNA.
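
As an illustration of how a standard curve converts a cycle threshold (Ct) into a 16S rDNA concentration, the following minimal Python sketch fits Ct against log10 concentration by least squares and inverts the fit for an unknown sample. The Ct values, the function names and the assumption of a log-linear curve over the 0.001 to 10 ng/μl range are illustrative only and are not taken from the study.

```python
import math


def fit_standard_curve(concentrations, cts):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_ct(ct, slope, intercept):
    """Invert the fitted line to recover a concentration from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 would mean perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical standards: tenfold dilutions from 0.001 to 10 ng/ul.
standards = [0.001, 0.01, 0.1, 1.0, 10.0]
# Made-up Ct values lying on the line Ct = 20 - 3.3 * log10(concentration).
standard_cts = [29.9, 26.6, 23.3, 20.0, 16.7]

slope, intercept = fit_standard_curve(standards, standard_cts)
# A sample with Ct 23.3 sits on the 0.1 ng/ul standard for these made-up values.
sample_concentration = concentration_from_ct(23.3, slope, intercept)
efficiency = amplification_efficiency(slope)
```

Because each sample was diluted tenfold before amplification, a value read from such a curve would still need to be multiplied by the dilution factor to refer back to the undiluted extract.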

Identification of bacterial sequences

This sub-study included 14 participants with incident diabetes at year 9 who had been at low risk of diabetes at baseline, having been non-smokers with a waist circumference of < 85 cm for men and < 75 cm for women. These 14 participants were matched for age (±3 years), sex, waist circumference (±3 cm) and fasting blood glucose (±0.3 mmol/l) with 28 control participants who remained free of diabetes over the entire follow-up period. The V1–V2 variable region of the 16S rDNA gene was amplified by PCR for each participant, and amplicons from case and control participants were pooled separately after size verification by 2% agarose gel electrophoresis. Each pool was assigned to specific multiplex identifiers (MIDs) that were used as tags. These two pools of amplicons were pyrosequenced using a genome sequencer (454 Life Sciences; Branford, CT, USA) on a multiplexed pyrosequencing run (GS FLX-Ti; Life Technology). Sequences were binned for removal of pyrosequencing MIDs and PCR primers (forward: AG-AGT-TTG-ATC-MTG-GCT-CAG; reverse: GC-TGC-CTC-CCG-TAG-GAG-T) [2, 3]. Meta_RNA software, designed for the identification of 16S rRNA sequences based on hidden Markov models, was further applied to select prokaryotic 16S rRNA sequences [11]. The resulting 16S rRNA sequences were assigned to taxonomies using mothur software [12] and the SILVA reference taxonomic outline [13].
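
The matching rule described above (two controls per case, within fixed tolerances) can be sketched in Python as follows. This is a minimal greedy illustration: the study does not state which matching algorithm was used, the record keys (`id`, `age`, `sex`, `waist`, `glucose`) are hypothetical, and the participant records are invented.

```python
def is_match(case, control, age_tol=3, waist_tol=3, glucose_tol=0.3):
    """True if the control has the same sex and lies within all tolerances."""
    eps = 1e-9  # guards against floating-point rounding at the glucose boundary
    return (
        case["sex"] == control["sex"]
        and abs(case["age"] - control["age"]) <= age_tol
        and abs(case["waist"] - control["waist"]) <= waist_tol
        and abs(case["glucose"] - control["glucose"]) <= glucose_tol + eps
    )


def match_controls(cases, controls, per_case=2):
    """Greedily assign up to `per_case` unused controls to each case."""
    used = set()
    matches = {}
    for case in cases:
        chosen = []
        for control in controls:
            if control["id"] in used:
                continue
            if is_match(case, control):
                chosen.append(control["id"])
                used.add(control["id"])
                if len(chosen) == per_case:
                    break
        matches[case["id"]] = chosen
    return matches


# Invented records (age in years, waist in cm, fasting glucose in mmol/l).
cases = [{"id": "case1", "age": 50, "sex": "M", "waist": 82, "glucose": 5.2}]
controls = [
    {"id": "c1", "age": 52, "sex": "M", "waist": 84, "glucose": 5.4},
    {"id": "c2", "age": 55, "sex": "M", "waist": 82, "glucose": 5.2},  # age too far
    {"id": "c3", "age": 48, "sex": "F", "waist": 82, "glucose": 5.2},  # different sex
    {"id": "c4", "age": 49, "sex": "M", "waist": 80, "glucose": 5.0},
    {"id": "c5", "age": 50, "sex": "M", "waist": 82, "glucose": 5.2},  # not needed
]

matched = match_controls(cases, controls)
```

A greedy pass like this depends on the order of the control list; an optimal matching procedure could give different pairs when eligible controls are scarce.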

Bioinformatic analyses

Of the 919,095 and 768,245 sequence reads recovered for case and control participants respectively, ∼30% were removed by initial quality filters. A de-noising step was performed to remove most of the common sequencing errors encountered with the 454 platform by clustering chimeric sequences as well as reads that were most likely to have been derived from the same sequence.

Statistical analyses

For statistical analysis, in order to carry out parametric tests, the 16S rDNA concentrations were log-transformed, as the distribution was skewed (as were levels of triacylglycerol and insulin). The characteristics of the participants who did and did not have incident diabetes over follow-up are shown and compared by t and χ² tests. Note that log-transformed 16S rDNA concentrations were compared statistically. For incident diabetes and abdominal adiposity, logistic regression was used to calculate the standardised odds ratios and 95% CIs for an increase of one SD of baseline 16S rDNA concentration as a continuous variable (logarithm). Adjustments were made for sex, baseline age, family history of diabetes, hypertension, waist circumference, BMI, smoking status and fasting plasma glucose. The relationship with 16S rDNA concentration was linear, as an additional squared term was not significant. ORs were also calculated over risk factor strata. The C statistic was used to quantify the discriminative ability of 16S rDNA and that of various risk factors for diabetes. SAS versions 9.1 and 9.2 (Cary, NC, USA) were used for statistical analysis.
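
The C statistic can be read as the probability that a randomly chosen participant who developed diabetes has a higher marker value (or predicted risk) than a randomly chosen participant who did not. The following minimal Python sketch implements that pairwise definition; it is not the SAS procedure used in the study, the function name is hypothetical, and the concentrations and outcomes are invented.

```python
import math


def c_statistic(scores, outcomes):
    """Share of case/control pairs in which the case has the higher score.

    Ties count as half a concordant pair. `outcomes` holds 1 for participants
    with the event and 0 for those without.
    """
    case_scores = [s for s, y in zip(scores, outcomes) if y == 1]
    control_scores = [s for s, y in zip(scores, outcomes) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for cs in case_scores:
        for ns in control_scores:
            if cs > ns:
                concordant += 1.0
            elif cs == ns:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Invented marker concentrations and outcomes (1 = incident diabetes).
concentrations = [0.02, 0.05, 0.04, 0.11, 0.07, 0.03]
outcomes = [0, 0, 1, 1, 0, 1]

c_raw = c_statistic(concentrations, outcomes)
# The log transform is monotonic, so it leaves the ranking unchanged.
c_log = c_statistic([math.log(c) for c in concentrations], outcomes)
```

For a single marker the log transformation therefore matters for the parametric tests and the regression coefficients, but not for the C statistic itself. A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.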

Results

Characteristics of the population studied

At baseline, of the 5,212 participants in the D.E.S.I.R. study, 126 had diabetes, 474 were obese, 65 had biological signs of infection or were taking antiviral therapy, and 333 did not undergo 16S rDNA concentration determination. For 1,146 participants, the diabetes status was not known at the end of the 9 years, as they did not attend the year 9 examination. These volunteers were excluded from the analysis. By comparison, the participants analysed (n = 3,280) were older and fewer were current smokers. There was no significant difference in baseline 16S rDNA gene concentrations, waist circumference or fasting plasma glucose. The characteristics of the study population are shown in Table 1; 131 incident cases of diabetes were recorded. The mean 16S rDNA concentration was higher in those with incident diabetes (Table 2) and the distribution of log-transformed 16S rDNA concentration was shifted to the right (Fig. 1). No difference in baseline blood 16S rDNA concentration was observed in participants destined to become obese. In contrast, mean baseline 16S rDNA concentration (log-transformed) tended to be higher (mean ± standard deviation −2.73 ± 1.04 vs −2.63 ± 1.04; n = 485; p = 0.05) in those who were to present with abdominal adiposity after 9 years of follow-up. A bivariate analysis between blood bacterial DNA and other biological variables showed a modest negative correlation with fasting blood glucose, while a positive correlation with fibrinogen and leucocyte count was observed (Table 3).

Table 1 Socio-demographic and other characteristics of study population
Table 2 Laboratory and other characteristics of study population
Fig. 1 Distribution of participants destined to become diabetic (continuous line) and those who did not (dotted line) according to baseline blood 16S rDNA gene concentration

Table 3 Correlation coefficients between blood bacterial DNA gene concentration and other biological variables

Prediction of diabetes and abdominal adiposity

The 16S rDNA concentration predicted the onset of diabetes after adjustment for confounding factors, with a standardised OR of 1.29 (95% CI 1.08, 1.55) after adjustment for age, sex and fasting blood glucose, and of 1.35 (95% CI 1.11, 1.64) in a fully adjusted model (Table 4). While higher 16S rDNA concentrations appeared to carry more risk in women and in those with lower fasting glucose, the association between bacterial DNA and incident diabetes did not differ significantly between strata (Table 4). We also analysed the risk of developing diabetes by follow-up time. The OR was higher for participants who developed diabetes within the 6 to 9 year window (OR 1.61, 95% CI 1.23, 2.10; p = 0.0005) than for those who developed diabetes soon after the initial measurement, i.e. in the 0 to 3 year window (OR 1.10, 95% CI 0.87, 1.39; p = 0.45). On its own, 16S rDNA as a marker carried little discriminative ability between those who did and did not develop incident diabetes, with a C statistic of 0.564; however, when the level of 16S rDNA was added to a predictive score for diabetes derived in the D.E.S.I.R. population [14], the C statistic increased significantly from 0.862 to 0.871 (p = 0.02) and 16S rDNA was still predictive of diabetes after adjustment for this score (p = 0.004).

Table 4 Adjusted ORs (95% CI) of incident diabetes for 1 SD of log (16S rDNA gene concentration) in the overall population and in various strata

The 16S rDNA concentration also predicted the presence of abdominal adiposity at the end of 9 years, after adjustment for confounders, with a standardised OR of 1.18 (95% CI 1.03, 1.34; p = 0.01).
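
As a consistency check on the reported estimates, an approximate p value can be recovered from an odds ratio and its 95% CI. The following Python sketch assumes Wald-type intervals that are symmetric on the log-odds scale, which is an assumption about how the intervals were computed; the function name is hypothetical.

```python
import math


def wald_p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover the standard error, z score and two-sided p value from an OR and its CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Estimates quoted in the text above.
_, _, p_adiposity = wald_p_from_or_ci(1.18, 1.03, 1.34)
_, _, p_late = wald_p_from_or_ci(1.61, 1.23, 2.10)
_, _, p_early = wald_p_from_or_ci(1.10, 0.87, 1.39)
```

Under this assumption the recovered values lie close to the reported p values of 0.01, 0.0005 and 0.45; small discrepancies are expected because the published ORs and confidence limits are rounded to two decimal places.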

Identification of blood bacteria phyla in participants with incident diabetes

The 16S rDNA gene was sequenced in the pooled baseline DNA samples from participants destined to become diabetic and from controls. We first analysed the alpha diversity, estimated here from groups of sequences that differ from one another by less than 10%. Each group of similar sequences defines an operational taxonomic unit (OTU). The number of OTUs (at 10%) was 745 and 800 in cases and controls, respectively (see electronic supplementary material [ESM] Fig. 1). A common core of 443 OTUs was identified. At the phylum level, Proteobacteria represented 80% to 90% of all phyla in the blood, both in cases and controls (Fig. 2). Within the Proteobacteria phylum, at the genus level, the Ralstonia genus was the most prevalent (see ESM Fig. 2).
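
The OTU counts above imply how much of each pool was shared, which the following short Python sketch works out. It assumes that the "common core" refers to OTUs detected in both the case pool and the control pool; the function name is hypothetical.

```python
def overlap_summary(n_cases, n_controls, n_shared):
    """Pool-specific and combined OTU counts derived from two totals and their overlap."""
    union = n_cases + n_controls - n_shared
    return {
        "only_cases": n_cases - n_shared,
        "only_controls": n_controls - n_shared,
        "union": union,
        "jaccard": n_shared / union,  # shared OTUs as a fraction of all distinct OTUs
    }


summary = overlap_summary(n_cases=745, n_controls=800, n_shared=443)
```

Under that assumption, 302 OTUs would be specific to cases and 357 to controls, out of 1,102 distinct OTUs, so the shared core would make up about 40% of all OTUs observed.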

Fig. 2 Abundance of bacterial phyla for those who developed diabetes (black bars) and controls without diabetes over the entire follow-up period (white bars)

Discussion

We show, for the first time, that the blood concentration of a bacterial gene predicts the onset of diabetes, particularly 6 to 9 years after baseline, and also predicts abdominal adiposity in a large sample of non-obese participants from a general population. We found that more than 90% of this bacterial DNA belonged to the Proteobacteria phylum.

The involvement of inflammation in the development of insulin resistance [14–16] and the pro-inflammatory effects of bacterial components present within tissues [4] suggest that the role played by tissue bacteria in relation to diabetes and its complications should be explored. Indeed, experimental data have already linked the metabolic syndrome with gut microbiota and innate immunity against infectious diseases [3, 4]. We [4, 17, 18] and others [3] have demonstrated the influence of gut microbiota on metabolic disease in animal models. Consistently, in humans, periodontitis, a chronic gram-negative infectious disease of the oral cavity, has been associated with the metabolic syndrome [19–23]. In line with these data, the longitudinal study reported here provides for the first time evidence of the involvement of tissue microbiota in the development of diabetes in humans.

These epidemiological data also provide some insight into the mechanisms of action of the microbiota. Indeed, although no difference was observed in blood microbiota composition in participants destined to develop diabetes and in controls, possibly because of a lack of statistical power, we did establish the predominance of proteobacteria, which represent 90% of overall microbiota. In this respect, the role of lipopolysaccharides, a major component of proteobacteria, in the onset of diabetes has already been demonstrated [3, 5]. Interestingly, in faeces, this phylum is in the minority, since Bacteroidetes and Firmicutes are the most predominant phyla [24], suggesting selectivity of the filtering mechanism, which allows only some bacteria to be present in the blood [17]. From these data, it seems possible that proteobacteria play a role in the onset of metabolic diseases. Our results show at baseline a negative correlation between bacterial DNA levels and fasting blood glucose. At this stage, we cannot rule out a transitory improvement in insulin sensitivity in the early phase of metabolic infection, as shown in an experimental model of gram-negative sepsis in healthy humans [25, 26]. This could be due to the release of nitric oxide during latent infection, which would in turn improve insulin sensitivity [27, 28]. On the other hand, the exclusion of patients with treated or non-treated diabetes at baseline, i.e. participants in whom bacteraemia would already have had a deleterious influence on glucose metabolism, is more likely to have contributed to this association. Obviously, the present study is observational in nature and thus cannot demonstrate the causative role played by blood microbiota in the onset of diabetes. Indeed, the blood bacterial DNA burden may only represent an innocent bystander in the disease process. Moreover, these results are preliminary and need to be replicated in other longitudinal data sets.

In conclusion, our results show for the first time that blood microbiota are a marker of the risk of diabetes and abdominal adiposity in a general population. These findings establish evidence for the concept of an involvement of tissue bacteria in the onset of diabetes in humans.