Background

Inflammatory bowel disease (IBD) encompasses both Crohn's disease (CD) and ulcerative colitis (UC), chronic inflammatory disorders of the gastrointestinal tract with developed world predominance and an incidence that has risen dramatically in the post-war period [1]. IBD manifests with symptoms such as severe diarrhoea, weight loss and debilitating abdominal pain, resulting in substantial morbidity and impairment in quality of life [2]. In both diseases visibly inflamed and non-inflamed areas of intestine can be identified at assessment by colonoscopy.

The cause of both conditions is still speculative. Host genetics play a key role, with genetic factors more important for development of CD than UC [3, 4], but genetic defects cannot wholly explain the increasing prevalence of IBD in recent years, suggesting that environmental factors are also involved [5]. The current generally accepted disease hypothesis is that the chronic inflammation of IBD results from a genetically dysregulated host immune response directed at the gut microbiota [68].

The human gut microbiota is a highly diverse and abundant community of microbes that under normal circumstances is either commensal or beneficial to human health [9]. Bacteria in the gut contribute to host nutrition via production of short chain fatty acids and vitamins, and play integral roles in maintaining human health by preventing colonisation by pathogens and by shaping and maintaining normal mucosal immunity [10]. The microbiota is also, however, a major source of antigens, including lipopolysaccharides, peptidoglycan, lipoproteins, flagellin and unmethylated CpG-containing DNA, all of which can activate both innate and adaptive immune responses [11, 12]. A balanced relationship, therefore, must exist between bacteria and their human hosts. A disruption in this homeostasis threatens the state of immune tolerance and may result in gut inflammation.

Several lines of evidence suggest a role for gut bacteria in the pathogenesis of IBD. Faecal stream diversion induces remission in CD [13], animal models of colitis require the presence of gut bacteria to initiate inflammation (reviewed in [14]), an increased mucosal bacterial load is observed in IBD patients [15, 16], genome-wide IBD association studies have identified polymorphisms in genes involved in bacterial recognition and clearing (reviewed in [17]) and broad-spectrum antibiotics have some efficacy in the treatment of CD [18, 19].

With CD in particular, individual species such as Mycobacterium avium subspecies paratuberculosis or Escherichia coli have been implicated in disease aetiology [20, 21] while the emerging "dysbiosis" hypothesis implicates multi-species assemblages in an overall imbalance between harmful and protective bacteria [22, 23]. Numerous studies have attempted to characterise the microbial communities in IBD and to compare these with healthy individuals. Results indicate that individuals with IBD have reduced bacterial diversity, temporal stability and cluster separately when compared to healthy controls [2428]. Compositional comparisons have generated inconsistent results but have generally identified reductions in components of the Firmicutes phylum in IBD, often, but not always, with concurrent increases in Bacteroidetes and facultative anaerobes such as Enterobacteriaceae [12, 22, 2931].

Faecal/luminal bacterial communities have repeatedly been shown to be distinct from mucosal communities [3237], meaning that study of the IBD mucosa-associated microbiota and comparison with those from healthy individuals should provide the best insight into whether or not a particular microbial signature is disease specific. In addition, within IBD-affected intestines disease-causing agents might be enriched at sites of active inflammation relative to comparatively unaffected mucosa. We have therefore used in-depth bacterial 16S rRNA gene cloning and sequencing technology to compare the mucosa-associated microbiota from inflamed and non-inflamed sites of the colon in CD and UC patients and in non-IBD controls. Our findings indicate that mucosal microbial diversity and composition is disturbed in IBD and that there are significant differences in microbial community structure between inflamed and non-inflamed mucosa.

Results

Twenty-nine mucosal biopsies were collected from a total of seventeen patients, including paired biopsies of inflamed and non-inflamed tissue from six patients with active CD (n = 12), paired biopsies from six patients with active UC (n = 12) and five biopsies from non-IBD controls (n = 5). Demographic data, disease phenotype, biopsy site and histological scores are shown in Table 1. All biopsies from non-IBD controls were histologically normal. There was no age difference between CD and UC cases but, due to the indication for colonoscopy, the average age of the non-IBD control patients was higher. The median ages were 32 (25-51) years for the CD group, 26 (24-73) years for the UC group and 51 (45-73) years for the controls. Disease duration was similar.

Table 1 Characteristics of patients and biopsy tissue at time of sampling.

Quantification of bacterial populations

Using qPCR we measured the total bacterial load in the mucosal biopsy samples. The results showed high variability between samples but overall the biopsies from the inflamed intestinal regions of CD patients contained the lowest number of bacteria (Figure 1). The total number of bacteria detected in these inflamed CD samples was significantly lower than the bacterial load present in the inflamed regions of the UC patients' colons. While it appeared that within each disease cohort the bacterial load was generally lower in inflamed regions of the colon compared to non-inflamed regions the inter-individual variation meant that no other significant differences were detected.

Figure 1
figure 1

qPCR analysis of total bacterial load in mucosal biopsy samples. Figures are mean results for each patient cohort. Error bars denote standard deviation from the mean. Total bacterial load was significantly lower in the inflamed CD biopsies than the UC inflamed biopsies.

Overall phylogenetic classification of 16S rRNA gene sequences

We next analysed the bacterial diversity in the 29 mucosal biopsy samples by deep sequencing of 16S rRNA gene clone libraries. The final dataset of 10,010 chimera-checked, full-length sequences included an average of 620 clones per CD patient, 750 clones per UC patient and ~350 clones per healthy control. As a whole, the dataset contained an estimated 565 phylotypes (clustered at >99% sequence identity), which could be mapped to eight bacterial phyla. 93% of the sequences belonged to just two of these phyla; the Firmicutes (51.8% of clones) and the Bacteroidetes (41.1%). Within the Firmicutes phylum the vast majority of sequences grouped into two families, the Lachnospiraceae (51.2%) and the Ruminococcaceae (33.1%), which comprise clostridial clusters XIVa and IV respectively. The Bacteroidetes sequences were predominantly from the Bacteroidaceae family (62.6%) but also included Porphyromonadaceae, mainly Parabacteroides species, (13%) and Prevotellaceae (19%). Proteobacteria represented ~6% of the total sequences, the majority of which were β-proteobacterial species related to Sutterella spp. The remaining five phyla we detected each accounted for less than 1% of total bacteria: Actinobacteria (0.89%), Fusobacteria (0.14%), Verrucomicrobia (0.03%), Lentisphaera (0.01%) and TM7 bacteria (0.02%).

Comparison of bacterial composition in IBD and control biopsies

There was a large degree of inter-individual variation between patients at all taxonomic levels but, despite this, distributions could be significantly associated with disease. Samples from both the inflamed and non-inflamed sites from CD and UC patients contained proportionally less Firmicutes, and correspondingly more Bacteroidetes, than the non-IBD control samples (Figure 2). The decreased proportion of Firmicutes present in UC, but not CD, samples reached statistical significance when compared with the controls (Figure 2). Related to these shifts, the ratio between Firmicutes and Bacteroidetes was changed in IBD patients. In non-IBD controls there were significantly more Firmicutes than Bacteroidetes, but this difference was lost with disease (Figure 2). We also observed a slight increase in Enterobacteriaceae in CD samples. Enterobacteriaceae were detected in 2 out of the 5 control patients and accounted for 0.11% of the total pooled community from these samples; they were detected in samples from 2 out of 6 UC patients and accounted for 0.09% of the total pooled community from these samples. In contrast, Enterobacteriaceae were detected in the paired biopsy samples from 5 out of the 6 CD patients included in the study and accounted for a ten-fold increase in proportion of the total CD microbiota compared to the other sample types (1.05%). This increase was significant when compared to UC samples (p = 0.049) but did not reach significance when compared to the non-IBD control cohort (p = 0.069). We could find no significant association, however, between microbiota composition and the severity of inflammation or the site of mucosal biopsy.

Figure 2
figure 2

Compositional analysis of 16S rRNA gene clone libraries. Phylum-level classification of bacterial phylotypes in CD, UC and non-IBD control patients showing significant reduction in the proportion of Firmicutes sequences in UC samples relative to non-IBD controls (*a) and disruption in Firmicutes to Bacteroidetes ratio in IBD patients relative to non-IBD controls (*b).

Measurements of bacterial diversity

Using a number of different measures to explore the bacterial diversity within our samples we found that there was reduced diversity in biopsies from IBD patients compared to controls and that the reduction was particularly apparent in patients with CD (Figure 3). Rarefaction curves built from the cumulative dataset revealed that there were significant differences in species richness between control and CD samples (Figure 3A). The rarefaction curves also revealed a trend towards a slight increase in species richness in inflamed versus non-inflamed tissues, although these difference were not significant. In agreement with these findings, using the Shannon diversity index (SDI) to measure the richness and evenness of each sample, we found that the individual non-IBD control samples generally generated the highest SDI figures and that these were significantly higher (p < 0.05) than those from both the inflamed and non-inflamed CD samples and from the non-inflamed UC samples (Figure 3B).

Figure 3
figure 3

Measures of bacterial diversity in the mucosal biopsies. 3A) Rarefaction analysis showing number of phylotypes observed with increasing sequencing effort across all patient cohorts. Data points show the observed diversity after each individual biopsy sample was incorporated into the analysis. Colour-coded errors bars show 95% confidence intervals for each patient cohort. Note that, as each patient is incorporated into the analysis, the gap between the number of phylotypes observed in non-IBD patients compared to IBD patients grows larger. The reduction in species richness appeared to be particularly significant in CD patients. Number of sequences per sample: Non-IBD controls = 252-489, CD Inflamed = 248-342, CD Non-inflamed = 287-445, UC Inflamed = 267-469, UC Non-inflamed = 286-499. 3B) Mean Shannon diversity indices (SDI) calculated from the individual biopsies for each sample type. Significantly reduced SDI compared to non-IBD control samples are indicated by * (p = < 0.05). Error bars indicate standard deviation from the mean.

Bacterial community structure comparisons

We next wanted to test whether or not the biopsy samples grouped together by disease cohort, by individual or both. Cluster analysis using both the Jaccard coefficient and PCoA showed that the samples clustered together according to donor (Figures 4 and 5) and that there was no separation between the CD, UC and non-IBD cohorts. There was also no separation based upon the location of biopsy sampling. This suggests that, despite differences in bacterial community composition and diversity between IBD and non-IBD samples, inter-individual variation is a stronger determinant of overall gut bacterial composition than disease. Despite this, although the paired samples clustered together, the branch lengths in the dendrogram were longer than might be expected if the community structure was highly similar between paired biopsies, indicating that there were still significant differences between the inflamed and non-inflamed tissues.

Figure 4
figure 4

Cluster dendrogram generated using the Jaccard coefficient, illustrating relationship between bacterial species membership and biopsy type across all samples included in the study. Crohn's disease patients are indicated by numbers CD1-CD6. Ulcerative colitis patients are indicated by UC1-UC6. Samples marked with "I" are from inflamed intestinal regions, those marked with "N" are from non-inflamed regions. Non-IBD control samples are indicated with N1-N5. Adjacent bar charts show the Family level classification (as determined by the RDP classifier) for each of the sequences per sample. Families coloured in yellow/brown belong to the Firmicutes phylum, blue = Bacteroidetes, pink = Actinobacteria, green = Proteobacteria, black = all other sequences not belonging to the specified Families.

Figure 5
figure 5

Principal coordinates analysis of variation between the bacterial communities present in all biopsy samples. Each data point represents an individual sample. Blue circles denote non-IBD control samples, red squares are Crohn's disease samples, green triangles are ulcerative colitis samples. Numbers indicate the donor the samples were obtained from. The paired, inflamed and non-inflamed, biopsy samples from each donor can be seen to cluster together. Figure was calculated using unweighted Fast UniFrac [39].

Statistical comparisons between inflamed and non-inflamed tissue

We therefore sought to properly determine whether or not a characteristic localised dysbiosis between healthy and inflamed tissue within individual IBD patients exists. To test this we first performed whole community comparisons using ∫-LIBSHUFF [38], unweighted and weighted UniFrac [39] and the parsimony P-test [40] which all test whether or not two communities are significantly different overall without indicating which phylotypes cause the significance. We then used the Library Compare tool at the RDPII website [41], which pinpoints significant differences between two communities at all taxonomic designations from phylum to genus level to try and discover which bacterial groups were differentially abundant between the paired samples. Analyses with these tools indicated that in 11 out of the 12 IBD patients robust statistically significant differences between the inflamed and non-inflamed mucosal communities existed (Table 2).

Table 2 Comparison of bacterial composition from inflamed and non-inflamed tissue within individual IBD patients using ∫-LIBSHUFF, unweighted and weighted UniFrac, the parsimony P-test and RDP Library Compare.

∫-LIBSHUFF analysis indicated a significant difference in all of the UC patients and 4 out 6 CD patients. Library Compare analysis confirmed that there were statistically significant differences between inflamed and non-inflamed sites for most of these samples. However, no obvious pattern was apparent and the statistically significant differences were spread between a number of phylogenetic groups (Table 2). Three of the sample pairs that had significant comparisons with ∫-LIBSHUFF (CD3, UC1 and UC5) showed no significant differences with Library Compare. Interestingly, these discrepancies may be explained by the UniFrac analysis. Unweighted UniFrac does not take into account the relative abundances of different phylotypes when comparing communities, only the species overlap. Weighted UniFrac also takes into account the relative abundance of each species. For the three sample pairs with no significant Library Compare results the unweighted UniFrac comparison showed highly significant differences between the paired communities, while the weighted comparison did not (Table 2). This indicates that these paired samples had significantly different community membership but that the overlapping members of the bacterial community that were present in both samples had similar abundances, thus explaining the significant ∫-LIBSHUFF results and the non-significant Library Compare results. In contrast to this, the paired set of samples from CD patient 4 were highly significantly different when measured using weighted UniFrac but showed no significance when measured using the unweighted version. Further analysis revealed that a Prevotella species was 3.6 times more abundant in the inflamed than non-inflamed site and accounted for 25% of the total community in the inflamed sample, a difference that was found to be significant to p < 0.00000001 with Library Compare. As the two communities were not recognised as significantly different with ∫-LIBSHUFF and unweighted UniFrac it is possible that this was because, regardless of the differential abundance, overall community membership was similar across both samples. The only sample pair to show no significant differences between inflamed and non-inflamed tissue with either ∫-LIBSHUFF or Library Compare (patient CD6) was characterised by a very low overall diversity, indicating that the microbiota may have been particularly disturbed in this patient.

As Library Compare is only able to classify sequences down to the genus level we then sought to characterise whether or not there were differences at the species level. For this purpose we compared sequences that had been grouped into phylotypes using DOTUR (99% identity) and assigned identities with MegaBLAST (see Additional file 1 ). While we were often able to observe statistically significant differences between individual phylotypes in single patients (data not shown) we were unable to detect a specific or recurring pattern or identify disease-specific phylotypes. Recently, a reduction in Faecalibacterium prausnitzii has been implicated in CD aetiology [31, 42]. We did not observe a difference in F. prausnitzii proportional abundance between healthy and IBD patients but found that, when looking at paired biopsies from individual IBD patients, this species was almost always reduced in inflamed versus non-inflamed tissue. This trend did not reach statistical significance however. Species-level analysis also failed to identify any pathogenic species that have been previously associated with IBD such as Mycobacterium avium subspecies paratuberculosis, Yersinia spp or Listeria spp. [43]. We did recover E. coli/Shigella spp. from many CD samples but as 16S rRNA gene sequence data does not provide enough resolution to differentiate between commensal and pathogenic strains we could not determine whether or not these species were pathogenic. Sulphate-reducing bacteria (SRB) have also been implicated in the pathogenesis of IBD [44] but we recovered only one SRB sequence, which had greater than 99% identity to Desulfovibrio piger, and this was detected in one of the non-IBD control patients.

Discussion

To our knowledge, this is one of the largest clone library studies investigating the microbiota in IBD. In contrast to an earlier study by Frank et al., [30], which examined a smaller number of clones from a large number of patients, we sought instead to add to current knowledge by obtaining a higher resolution of the IBD-associated microbiota with particular emphasis placed on observing differences between inflamed and non-inflamed colon sites in the same patients. This was inevitably done in a smaller number of patients and samples because of the depth of molecular analysis required for each sample. Our in-depth clone library analysis, utilizing the resolving power of near full-length 16S rRNA gene sequences, revealed significant differences in diversity and composition between the mucosal microbiota of healthy patients and IBD sufferers. The results also suggest a tendency towards a reduction in Firmicutes and an increase in Bacteroidetes species in IBD patients compared to controls and also indicate that there is an increase in Enterobacteriaceae in CD. Similar shifts in composition, in either one or all of these groups, have been reported by other investigators using both culture [22] and a variety of molecular techniques [29, 31, 4555]. A previous large-scale clone library analysis by Frank et al., [30], however, reported a decrease in proportions of Bacteroidetes and the Firmicutes family Lachnospiraceae in a subset of, but not all, IBD patients and an increase in Proteobacteria. The observed discrepancies between these two large-scale clone library studies may in part be explained by different disease phenotypes, dietary or other environmental differences, the effect of inter-individual variation between patients or the differing number of samples studied and the depth of sequencing between each study.

We also demonstrated a reduction in bacterial diversity within IBD patients compared to controls and this is in agreement with several previous studies [2427, 56, 57]. Our data shows, however, that despite the differences between IBD and non-IBD patients in both bacterial composition and diversity that samples clustered predominantly by individual rather than disease.

Using both culture and molecular methods, many studies have demonstrated that the mucosal community along the length of the colon is largely stable, in healthy and IBD patients, and distinct from that recovered in faeces [3237]. Here we provide evidence instead for the development of localised differences in mucosal microbiota structure in IBD. Our community comparison results suggest that there may be differences between inflamed and non-inflamed tissue, with significant changes in the composition of the bacterial communities at these sites. A number of prior studies have also attempted to establish whether or not there is localised dysbiosis in IBD between inflamed and non-inflamed tissue. While two of these studies indicated that there is a dysbiosis [58, 59], the majority have suggested that this is not the case [29, 48, 6062]. Discrepancies between these results and ours may result from the use of differing molecular methodology and/or the greater sequencing depth we employed. DGGE/TGGE and FISH are useful tools but the resolving power of these methods is much lower than that for in-depth clone libraries covering the full length of the 16S rRNA gene [63]. In addition, DGGE/TGGE cannot accurately describe quantitative differences between dominant bands or describe qualitative differences in sub-dominant species and single bands on the gel may contain DNA from more than one species [64].

While our results suggest that localised changes in the mucosal microbiota do exist in IBD we were not able to identify a bacterial species or cluster that was consistently associated with the inflamed gut and therefore, potentially, with IBD aetiology. Other large-scale clone library analyses have also failed to identify specific pathogens [29, 30]. While their absence may indicate that potential pathogens may simply form a very minor component of the microbiota, these results do not support the hypothesis that a particular bacterial agent causes IBD. Clone libraries generate inherent biases, however, and it is possible that they are unable to detect certain species due to methodological artefacts. Indeed, this may be important with Mycobacterium avium subspecies paratuberculosis, a member of the often underrepresented Actinobacteria phylum [65, 66]. The absence of bifidobacteria from our dataset indicates that our clone libraries also suffer from this same bias against Actinobacteria. It is also worth noting that our analysis would not detect any viral, archaeal or eukaryotic aetiological agents. This may be important given recent evidence suggesting a role for viruses in the induction of at least some models of IBD [67].

Sequence-based microbiota comparisons such as ours can of course only demonstrate associations and do not provide information regarding mechanism or causation. It is also difficult to differentiate between compositional changes that may play a role in disease pathogenesis and those which may simply have occurred as a result of disease. However, given the absence of a specific and recurring aetiological agent in the cumulative data across all published IBD studies, which incorporate both culture- and molecular-based methodologies, it is possible that the alterations in bacterial composition and diversity seen between healthy and IBD patients and between inflamed and non-inflamed mucosa may be, to at least some extent, the result of the disturbed gut environment rather than the direct cause of disease. Indeed, there are a number of reasons why IBD is likely to result in altered conditions for bacterial growth. For example, the gut in IBD is likely to be a less stable environment than that of healthy individuals, with more exposure to antibiotics and other drug regimes, and alterations in transit time. Microscopy studies have suggested that there is a higher penetration of bacteria and a greater bacterial load in the mucosal layer in IBD patients [47, 68] and the resulting inflammation drives the localised release of antimicrobial compounds [69]. In addition, in UC there is a reduced mucus layer in inflamed relative to non-inflamed regions [70].

Despite proportional increases in Enterobacteriaceae and Bacteroidetes within IBD patients, if these organisms were directly responsible for disease we might expect them to be elevated at sites of inflammation and this was not shown in our analysis. Taking into account all of the above factors, the observed increases in these bacterial groups in IBD patients as a whole may therefore simply reflect the adaptation of the individual microbiota to the IBD gut environment. Bacteroides thetaiotaomicron, for example, can adapt to inflammation in an experimental mouse model by inducing genes that metabolise host oxidative products [71] and inflammation per se has also been shown to promote the growth of Enterobacteriaceae in mouse models [72, 73]. Clearly, further similar studies are required on a far greater range of gut bacterial species so that we can better understand the response of the gut microbiota to alterations in environmental conditions.

Conclusions

This work demonstrates a dysbiosis, or imbalance, in microbial community structure and composition in inflammatory bowel disease patients relative to non-IBD controls. It also indicates that inflamed tissue differs from non-inflamed tissue, but not in a consistent or predictable manner. Indeed, despite general trends such as a reduction in diversity, the response to IBD may be to some extent specific to the individual. This lends support to the emerging hypothesis that IBD is combinatorial in aetiology, with many different combinations of genetic and environmental causes leading to similar therapeutic responses [67], and highlights the importance of interconnection between the environment, the microbiota and the host in health and disease.

Despite this, even if particular bacteria are not the specific cause of IBD, altered immune responses may act to select particular bacterial species through creation of favourable microenvironments and might therefore cause the outgrowth of potentially pathogenic commensal species [74]. Shifts in the microbiota may therefore still impact gut health by altering the antigenic exposure to the gut mucosa or by reducing its exposure to beneficial microbes and/or their metabolic products, thereby initiating a cycle that favours recruitment and growth of more pro-inflammatory species [17, 75]. The observed reduction in Firmicutes proportions, for example, might lead to an undesirable affect on gut health. Recent work describing the anti-inflammatory properties of one Firmicutes species, Faecalibacterium prausnitzii [42] illustrates this point.

Finally, results from metagenomic studies indicate that, regardless of species composition, the collective genomes of each individual's microbiota appear to encode a remarkably conserved set of functions [28]. If similar, and potentially aggravating, factors are encoded by multiple species, it is possible that we will be better served in the future by looking at the complete gene complement of the microbial community as a whole, not just species composition. With this in mind, it is hoped that further analysis of the complex interplay between host and microbes will yield important insights into the pathogenesis of IBD.

Methods

Patients

Patients were selected from those undergoing routine colonoscopic assessment of IBD at Guy's and St. Thomas' Hospitals, London, UK. As controls, asymptomatic individuals undergoing colonoscopy for a family history of colorectal cancer or polyp surveillance were also invited to take part. Written informed consent was obtained from each patient and the study was granted ethical approval by the St. Thomas' Research Ethics Committee (Ref No. 06/Q0702/74). Patient information, including sex, age and the location of the colon that biopsies were taken from, is given in Table 1.

Colonoscopy was undertaken after prior preparation of the colon with two sachets of sodium picosulphate. No individuals received antibiotics in the preceding 2 months. For those with CD or UC, mucosal biopsies were taken from macroscopically inflamed and non-inflamed areas of the colon using standard gape forceps. Once taken, biopsy samples (approximately 1 × 2 mm) were placed in a cryovial without preservative, immediately snap frozen in liquid nitrogen, and stored at -70°C until analysis. Additional biopsy samples from the same area were also sent for histological analysis. These biopsies were scored independently for presence of ulceration, acute and chronic inflammation by a single gastrointestinal pathologist. Prior diagnosis of active CD or UC was determined by standard clinical, radiological, endoscopic and histopathological criteria. A modified Baron score with a range from 0-5, where a score of 5 represents the most severe disease, was used to grade the endoscopic severity of inflammation at the site of each biopsy used in the study [76].

DNA extraction and sequence analysis

DNA was extracted from each mucosal biopsy sample using the QIAamp® DNA Mini-Kit (Qiagen, UK) and the eluted DNA was stored at -20°C. 16S rRNA genes were amplified using the broad-range bacterial primers Bact-8F (5'-AGAGTTTGATCCTGGCTCAG-3') and Bact-1391R (5'-GACGGGCGGTGTGTRCA-3') [34]. Clone library construction and sequencing were carried out as described previously [72].

Sequences were aligned using the NAST aligner [77] and these alignments were subject to extensive manual curation using the ARB package [78] before further analysis. Sequences were tested for chimeras with Mallard [79], Bellerophon at Greengenes [77] and Pintail [80] and any that appeared to be chimeric were removed. The sequences (deposited in GenBank under accession numbers FJ503060-FJ513069) were initially given a broad classification to the phylum and family levels using the Classifier tool at the RDPII website [41]. To obtain more detailed taxonomic information the sequences were then divided into phylotypes. Distance matrices were generated in ARB with the Olsen correction and a 60% maximal-base frequency filter applied. This filter removed many ambiguously-aligned columns but was not so stringent that distinct species were commonly merged into single phylotypes. Distance matrices were then entered into the DOTUR program [81] set to the furthest neighbour and 99%-similarity setting. The resulting phylotypes were then assigned similarities to nearest neighbours using MegaBLAST [82].

To determine the depth of coverage in each of the clone libraries Good's coverage was calculated using the mothur software package [40]. Using this estimator the median coverage across all samples was found to be 94.35% (range of 83.73-97.3%).

Shannon diversity indices were calculated for each library by entering distance matrices generated in ARB, with the Olsen correction and a 60% maximal base-frequency filter applied, into DOTUR [81]. Rarefaction curves for each sample were calculated using mothur [40].

Community structure comparisons across the whole dataset, incorporating unweighted and weighted UniFrac, Parsimony testing and cluster analysis using the Jaccard coefficient, were performed using mothur and were based on an alignment created in mothur using the reference SILVA-alignment and with the 60% maximal-base filter and Olsen correction applied prior to distance matrix construction in ARB. Cluster dendrograms, with added bar charts showing the microbial composition of each sample, were visualised using the iTOL web package [83].

Paired (inflamed and non-inflamed) biopsy sample sequences from individual patients were aligned using the NAST aligner and were again extensively corrected in the ARB package [78] before further analysis. Olsen-corrected, 60% maximal-base frequency filtered distance matrices were subjected to ∫-LIBSHUFF analysis [38]. Unaligned paired-sample sequences were used as input for the Library Compare tool at the RDPII website [41].

Principal coordinates analysis (PCoA) plots were generated using the Fast UniFrac web application [39] based upon neighbour joining trees created in ARB, with 60% maximal-base frequency filter and Olsen correction applied, using the sequences aligned to the SILVA reference in mothur as initial input.

Quantitative PCR (qPCR)

Total bacteria were quantified in 25 of the 29 biopsies by qPCR (CD1 non-inflamed, CD5 inflamed, CD5 non-inflamed and UC4 non-inflamed were not included in the analysis due to a lack of DNA from these samples). All PCRs were performed using a Stratagene Mx3000P thermal cycler, in conjunction with Stratagene MxPro qPCR Software. Each reaction contained a total volume of 20 μl per well and was performed in triplicate. qPCR reactions contained 10 ng of forward and reverse primer, 10 μl Brilliant II SYBR Green qPCR Master Mix (Agilent Technologies, La Jolla, CA), ~ 900 pg of template DNA (1:100 dilutions of sample genomic DNA preparations) and were made up to 20 μl with RNase free water. A 466-bp fragment of the bacterial 16S rRNA gene was amplified using the forward primer 5'-TCCTACGGGAGGCAGCAGT-3' and the reverse primer 5' -GGACTACCAGGGTATCTAATCCTGTT-3' [84]. The thermal cycling conditions were 50°C for 2 minutes and 95°C for 5 minutes followed by 40 cycles of denaturing at 95°C for 15 seconds, primer annealing at 60°C for 30 seconds and DNA extension at 72°C for 90 seconds. Finally a dissociation step was added to qualitatively assess reaction product specificity (temperature raised to 95°C, cooled to 60°C then slowly heated back to 95°C) for melt curve analysis of the PCR products. Extracted DNA from a pure Bacteroides vulgatus (ATCC 8482) culture was prepared into a series of ten-fold dilutions in RNase free water ranging from 1 × 106 copies to one copy and used as a positive control in order to make a standard curve. Quantification of template concentrations was made by linear extrapolation of baseline-subtracted data from the bacterial dilution series standard curve. For each reaction a threshold of luminescence was determined and compared to the standard curve. Thus for each sample an equivalent concentration given in colony forming units could be established.

Statistical analysis

For the qPCR and compositional results the Mann-Whitney U test was used for comparisons between two groups and the Kruskall-Wallace method, analogous to one-way analysis of variance, to compare more than two groups. The levels of significance reported were not adjusted to take account of multiple comparisons. As these were multiple comparisons, p values <1% were considered significant to imply strong evidence of a difference.