Introduction

Due to the increasing interest in the microbiome and its relationship within the different ecosystems, it is crucial to establish the appropriate method that will allow for a precise assessment of the microorganisms’ composition in various ecological niches. Research on the microbiome includes its relationship with the natural environment (Cabello-Yeves et al. 2021) and in the context of animal breeding (Vasco et al. 2021) and agricultural crops (Zhang et al. 2021). As it was described in many reports using both animal and human models, the gut microbiome is the most complex, often studied and correlated with many diseases such as diabetes (Salamon et al. 2018, 2021), obesity (Sroka-Oleksiak et al. 2020), Crohn’s disease (Kowalska-Duplaga et al. 2019), and skin diseases or schizophrenia (Szeligowski et al. 2020). It is known that many microorganisms are difficult to culture, e.g., in the gut (Lau et al. 2016) or hydrothermal sources (Bellec et al. 2020), so they are hard to detect in different samples. Therefore, the optimal method in such studies seems to be next-generation sequencing (NGS), high-throughput sequencing. The most common taxa identification is based on the amplicon-based NGS of the 16S ribosomal RNA (rRNA) gene. The prokaryotic 16S rRNA gene is approximately 1500 bp long and has conserved and hypervariable nine regions (V1-V9), which are used for phylogenetic classification of genera and in diverse microbial populations (Gao et al. 2021; Nakao et al. 2021). Such metagenomic studies require specific infrastructure, trained personnel, appropriate sequencing platforms, and bioinformatics pipelines (Allali et al. 2017; Gao et al. 2021). Since 2011, many laboratories have used the Illumina NGS technology, especially the MiSeq system (San Diego, CA, USA), which is the most commonly used, enabling the evaluation of relatively short paired-end sequencing reads with high accuracy (https://www.illumina.com/content/dam/illumina/gcs/assembled-assets/marketing-literature/miseq-system-data-sheet-m-gl-00006/miseq-data-sheet-m-gl-00006.pdf; accessed on September 25, 2022). In 2018, the company introduced the new iSeq 100 system, which can be used for high-throughput sequencing of the 16S rRNA gene. The low cost and small size of the iSeq machine create a new opportunity for many laboratories, even being an alternative to the MiSeq platform (https://www.illumina.com/content/dam/illumina-marketing/documents/products/datasheets/iseq100-sequencing-system-spec-sheet-770-2017-020.pdf; accessed on September 25, 2022). Based on the data provided by Illumina, the most important characteristics of the two above-mentioned platforms are presented in Table 1.

Table 1 Key features of the 2 selected NGS (Illumina, San Diego, CA, USA) platforms in the field of 16S sequencing [based on references: (https://jornades.uab.cat/workshopmrama/sites/jornades.uab.cat.workshopmrama/files/16s_sequencing.pdf; accessed on September 25, 2022; Hu et al. 2021)] 

This study aimed to compare the accuracy and efficacy of two selected iSeq and MiSeq, Illumina machines in sequencing 16S rRNA (16S) in the same samples of the rat schizophrenia model. Furthermore, using the two different iSeq and MiSeq, Illumina technologies for sequencing 16S rRNA in the same schizophrenia rats’ feces samples, the potential differences in the most important characteristics (in the field of richness and evenness (alpha diversity) and differences in microbial structure (beta diversity) were compared.

Materials and methods

Samples and DNA isolation

We used 20 fecal samples taken from rats while running another project (“The effects of ligands of the alpha7 nicotinic acetylcholine receptors on complex cognitive processes and social behavior in the neurodevelopmental schizophrenia-like model”, project grant no 014/15/N/NZ7/02978 supported by National Science Centre, approval of the 2nd Local Institutional Animal Care and Use Committee in Krakow, no 1198/2015) on gut microbiota in a rat model of schizophrenia. Samples were collected in polypropylene tubes (FL Medical, Padova, Italy) and immediately frozen at − 80 °C pending further analysis, for no longer than 2 months. Bacterial DNA was isolated according to our previous works using mechanical, chemical, and enzymatic lysis of microbial cells and commercial Genomic Mini AX Stool Spin kit (A&A Biotechnology, Gdansk, Poland) (Gosiewski et al. 2014; Sroka-Oleksiak et al. 2020; Krawczyk et al. 2021). The concentration and purity of DNA isolates were determined spectrophotometrically for the A260 nm and A260nm/280 nm ratio using NanoDrop (Thermofisher, Waltham, MA USA).

Library preparation

The 16S libraries were amplified by PCR. Primers targeting the 16S rRNA gene hypervariable V3 and V4 regions with the overhang adapter sequences (Klindworth et al. 2013). The composition of the reaction mixture and the thermal amplification profiles were presented in Supplemental Table S1.

Further, the PCR product (550 bp) was verified using agarose gel electrophoresis.

Libraries were prepared following the protocol for Preparing 16S Ribosomal RNA Gene Amplicons for Illumina MiSeq System (Part#15044223Rev.B) (http://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/16s/16s-metagenomic-library-prep-guide-15044223-b.pdf; accessed on September 25, 2022). First, according to the manufacturer’s recommendations, the PCR-based amplification was performed using KAPA HiFi HotStart ReadyMix (Roche, Basel, Switzerland). Then amplicons were pooled in equimolar concentrations of 10 pM for MiSeq and 55 pM for iSeq sequencing. Finally, both pooled libraries were diluted with 10 mM Tris pH 8.5 to the concentration of 4 nM. The differences and the details in library preparations are described in Table 2.

Table 2 The differences and the details in library preparations in final stage for two sequencers

Next-generation sequencing

The libraries containing 20 pooled indexed samples with 30% spike-in PhiX (Illumina San Diego, CA, USA) control DNA were loaded into the libraries for MiSeq and 10% onto libraries for iSeq. Sequencing was performed using MiSeq Reagent Kit v3 (600-cycle) and iSeq 100 i1 Reagent v2 (300-cycle) respectively.

Bioinformatics and statistical analysis

The data generated from MiSeq and iSeq as raw reads in FASTQ formats were filtered using the Illumina 16S Metagenomics workflow. Then the high-quality sequences were clustered, and the operational taxonomic units (OTUs) with 99.9% identity were prepared based on the Greengenes Database and the algorithm with the high-performance implementation of the Ribosomal Database Project (RDP) classifier, described by Wang et al. (2007). The data filtering was performed at the beginning of the downstream statistical analysis. The features with identical or zero values and the artifacts across all samples were excluded from further analysis. Next, the data with low quality and uninformative features were removed. A low count filter was adjusted with a prevalence of at least 20%, and features with minimal counts (< 4) were excepted from the analysis to exclude sequencing errors. In the next step of downstream analysis, data were normalized, including the variability in sampling depth and the sparsity of the data. After the normalization, a meaningful biological comparison was performed, and species richness was compared. All samples were rarefied to sequencing depth based on the sample having the lowest sequencing depth to normalize the data. Then LEfSe (Segata et al. 2011) and Microbiome Analyst platform (Chong et al. 2020) was applied in the next clustering steps and further statistical analysis. The most characteristic features were determined using the linear discriminant analysis (LDA) effect size with LEfSe (Segata et al. 2011). Using LEfSe, we described the highest statistical and biological differences between samples sequenced with MiSeq and iSeq. The discovered statistical significance and biological relevance were described based on the normalized relative abundance matrix, the Kruskal–Wallis rank-sum test, the significant alpha at 0.05, and the effect size threshold of 2. The hierarchical structure of taxonomic classifications was characterized using the median abundance and the non-parametric Wilcoxon rank-sum test to show taxonomic differences between microbial communities and abundance profiles of two experimental groups (Foster et al. 2017). Alpha and beta diversity were calculated using QIIME 2.0 software with Python scripts (Schloss et al. 2009). Alpha diversity was calculated based on the sequence similarity at the 97% level. The richness was calculated as the amount of unique OTUs found in each sample and presented as observed OTUs, and the count of unobserved species based on low-abundance OTUs was visualized as ACE and Chao1 indices. Shannon, Simpson, and Fisher estimators were calculated to measure both the richness and evenness within individual samples and in the experimental groups of samples. Beta diversity was determined as the distance and dissimilarities in-between microbial communities based on Jaccard, Bray–Curtis, and Jensen-Shannon Divergence indices calculated by QIIME. The distances were visualized by principal coordinate analysis (PCoA). Differences based on beta diversity of the whole microbiome structure among groups were calculated using a permutational multivariate analysis of variance (PERMANOVA). The similarity and dissimilarity were measured based on Pearson’s Correlation Coefficient. The dissimilarities, showing the distances between samples, were calculated based on the Jensen-Shannon Divergence beta diversity metric.

The sequence data (fastq files) has been deposited in The Jagiellonian University Repository—online access: https://ruj.uj.edu.pl/xmlui/handle/item/298521.

Results

DNA samples with a purity ≥ 1.7 were selected for further testing and all DNA samples tested fell within this range. The twenty fecal samples of rats were sequences on iSeq, and then the same samples were sequenced on the MiSeq machine.

Table 3 demonstrates the detailed run characteristics for both iSeq and MiSeq sequencing. We showed the differences, especially in maximum read counts per sample and the number of identified species higher for MiSeq (Table 3).

Table 3 The detailed summary of reading counts calculated for each sample and detailed run characteristics for two sequencers

The statistically significant differences were found in species richness, with higher numbers of species detected using the MiSeq compared to iSeq (p < 0.01). Moreover, the most OTUs detected on iSeq platform represented the same species. The differences in species richness are shown in Fig. 1, demonstrating rarefaction curves for both iSeq and MiSeq sequencing results.

Fig. 1
figure 1

The rarefaction curves for each group in two separate plots show the species richness in samples sequenced in iSeq and MiSeq machines. First, the data and raw sequencing reads were mapped against the complete gene catalog to assess sampling depth and sparsity variability. Then the gene recovery for different numbers of reads was calculated and plotted in the form of a rarefaction curve. The rarefaction curves created for samples sequenced in two different machines show the increasing numbers of raw sequencing reads and much more variation in discovered gene content in samples analyzed by MiSeq than variation attributed to samples sequenced in iSeq. The rarefaction curves of iSeq results reach a plateau earlier than those of MiSeq, which means that only a few new sequences are detected with increasing sequencing depth in iSeq. There was a difference in unequal sequencing depths in MiSeq and iSeq, influencing the richness between microbial communities measured in the samples. Rarefaction curves were obtained based on the R statistical programming language run on a high-performance computing cluster

The OTU data were summarized and compared based on their abundance at different taxonomic levels (Fig. 2). The significant differences in detected OTUs were noticed at all taxonomic levels except L2 (Fig. 2A). At the class level–L3 (Fig. 2B), the results for the MiSeq showed a higher abundance of Clostridia compared to iSeq (p < 0.001), and from the iSeq platform a higher abundance of Erysipelotrichia than for MiSeq (p < 0.001). At L4 (order level), the abundance of Bifidobacterales (80,542 reads vs 44,994 reads, p = 0.04), Erysipelotrichales (5,73,282 reads vs 326,310 reads, p < 0.001), and Propionibacteriales (1848 reads vs 157 reads, p < 0.001) was higher for the iSeq than from MiSeq respectively. The abundance of Clostridiales was higher for MiSeq (364,839 reads) than iSeq (87,868 reads, p < 0.001). The statistically significant differences in the OTUs abundance at the levels family (L5), genus (L6) (Fig. 2C), and species (L7) were detailed in Supplemental Table S2.

Fig. 2
figure 2

Different taxonomic levels differences in the actual abundance (the taxa were merged based on the sum of their counts across all samples categorized to the experimental groups) of samples sequenced on MiSeq and iSeq at different taxonomic levels, A phylum, B class, C genus

The statistically significant differences in the abundance profiles were detected when comparing the results of the characteristics of the taxonomic profile of one randomly selected sample (ME9dkMiSeq/ME9dkiSeq) obtained from both tested sequencers. As visualized in Fig. 3AB, the results from MiSeq were characterized by a higher abundance of Firmicutes phylum (73%) (Fig. 3AB) than those from iSeq (60%) (Fig. 3AB). In contrast, the Actinobacteria phylum was less abundant (24%) in the sample sequenced in MiSeq (Fig. 3AB) compared to the iSeq (39%) (Fig. 3AB).

Fig. 3
figure 3

The pie charts show the abundance profiles of one randomly selected sample (ME9okMiSeq) at two taxonomic levels: phylum (A) and family (B) of the two selected phyla: Firmicutes and Actinobacteria sequenced with MiSeq and iSeq

The linear discriminant analysis (LDA) revealed statistical significance in the biomarkers discovery. Furthermore, using the Kruskal–Wallis rank sum test, the features with significant differential abundance were identified, showing the discrepancies in the interpretation of metagenomic data obtained for the same samples in iSeq or MiSeq systems. These significant differential abundances were detected with regard to the following taxonomic levels: class (Clostridia, p < 0.000; Erysipelotrichia, p < 0.000), order (Clostridiales, p < 0.000; Propionibacteriales, p < 0.000; Erysipelotrichales, p < 0.000; Bifidobacteriales, p < 0.004), family (Promicromonosporaceae, p < 0.000; Bacillaceae, p < 0.000; Peptostreptococcaceae, p < 0.000; Erysipelatoclostridiaceae, p < 0.000; Thermoactinomycetaceae, p < 0.000; Nocardioidaceae, p < 0.000; Barnesiellaceae, p < 0.000; Erysipelotrichaceae, p < 0.002; Christensenellaceae, p < 0.005; Tannerellaceae, p < 0.024; Bacteroidaceae, p < 0.039), genus (p < 0.05), and species (p < 0.05). For details, see Supplemental Table S3 and S4. The 15 genus and species with the highest significant differential abundance were selected to be shown in Fig. 4.

Fig. 4
figure 4

The statistically significant biomarkers discovery results with regard to genus – L6 (A), and species – L7 level (B) for the same samples sequenced using iSeq and MiSeq systems demonstrated by colored bar plots. The graphical output of the LDA score demonstrates both the negative and positive values, with the cutoff of 2.0 determining the most significant features

The alpha diversity profiling also showed the statistical significance in the diversity of the same samples sequenced in iSeq or MiSeq sequencers. The significance testing of alpha diversity within the same samples sequenced in iSeq and MiSeq machines showed statistically significant differences starting from the order level (Table 4). Moreover, the alpha diversity metrics for results obtained from MiSeq were higher when compared to iSeq, excluding phylum and class levels (Supplemental Figs. S1 to S5). The most common metrics which were used to calculate the diversity within the same samples sequenced in iSeq and MiSeq are shown in Table 4.

Table 4 The p-value of alpha diversity measurements using the most common metrics in the same samples sequenced in iSeq and MiSeq machines

The comparison of microbial communities (in-between) based on their compositions revealed statistically significant. Table 5 demonstrated the beta diversity of the same samples sequenced in both iSeq and MiSeq machines. Statistically significant differences in microbiota composition of the same samples sequenced in two different sequencers were detectable starting from the order level (Table 5, Supplemental Figs. S6 to S11).

Table 5 The F-values, R-squared values, and p-values of beta diversity significance testing in the same samples sequenced in iSeq and MiSeq machines

A phylogenetic comparison, using unweighted UniFrac distance measures, of the samples sequenced on both tested sequencers was also performed and demonstrated the differences (Fig. 5). A dendrogram was represented by a row graph showing the clades (as the branches) with the leaves. This analysis revealed the different distances between samples in the clustering input for the results obtained in two different sequencers. In addition, the clades show similarities and dissimilarities. For example, clades close to the same height are similar, and clades with different sizes are dissimilar — the more significant the height difference, the more dissimilarity is present.

Fig. 5
figure 5

The dendrograms demonstrate the concordance in beta diversity MiSeq and iSeq at phylum levels (A) and species levels (B). The x-axis shows the cluster distances. The x-axis shows the cluster distances. The distance between two clusters is the average of the distances between all the features in those clusters

Discussion

This study compared two short-read Illumina platforms: iSeq 100 and MiSeq systems. They enable the second-generation sequencing, relating to the NGS technologies after the first-generation Sanger sequencing (Das et al. 2020; Hu et al. 2021). Illumina’s sequencing systems are currently most popular used in microbiome studies (Gao et al. 2021; Nakao et al. 2021).

Due to the same techniques of genomic library preparation and bioinformatics evaluation of sequenced 16S RNA gene fragments, both platforms seem helpful to the same extent. However, if we consider the equipment’s size, cost and the sequencing’s time duration (Table 1), iSeq seems to be the platform of choice (Colman et al. 2019; Dohál et al. 2020; Kazantseva et al. 2021). Additionally, a close model of the cartridge-made iSeq platform may have a lower cross-contamination risk between sequencing runs than in MiSeq (Nakao et al. 2021), although future research should evaluate cross-contamination risks as additional system’s maintenance procedures were introduced.

Nakao et al. (2021) considered that both platforms were equally suitable for evaluating freshwater fish environmental microbiome, and the % PF value for iSeq was slightly lower compared to MiSeq: 80.8% vs. 95.05%, respectively. Uelze L. et al. compared the consistency, accuracy, and repeatability of whole-genome short read sequencing. They found that the four Illumina platforms (MiSeq, NextSeq, iSeq, NovaSeq) gave similar results with slight variation among the different Illumina instruments. However the researchers looked for a few selected bacterial strains Uelze et al. (2020). Our comparative analysis shows a more significant difference in the % PF (passing filter) value (74% for iSeq vs 92% for MiSeq—Table 1), which was more concordant with the data provided by (Brun et al. (2018) (67.6% for iSeq vs 74.9% for MiSeq). In our study, we used an increased concentration of PhiX at the level of 30%, which allowed us to optimize the sequencing process and which has already been used in our other study (Kowalska-Duplaga et al. 2019). Additionally, our study found a higher number of identified bacterial species when MiSeq was used rather than iSeq (9184 vs 7026 respectively), which may be necessary in the final evaluation of the investigated microbiomes. We showed the scale of this difference by assessing the technical aspects of iSeq and MiSeq, comparing the sequencing results on two platforms of one randomly selected sample (Fig. 2).

The results of Nakao et al. (2021) indicate that the freshwater fish microbiota exhibited differences in the number of species between iSeq and MiSeq — a higher number of species per sample in iSeq. However, for example, at L4, all species detected only by iSeq, after rarefaction, were below the cut-off value. The authors suggested that for the same sequence depth, the difference in the number of species between iSeq and MiSeq will be much smaller (Nakao et al. 2021). Our rarefaction curve analysis demonstrated that the species richness of iSeq technology was statistically lower than that of MiSeq (Fig. 1). That meant that most of the OTUs detected with iSeq were the same species, unlike MiSeq, where more species could be identified. This fact seemed to be confirmed by the abundance results at the different taxonomic levels (Fig. 2 and Supplemental Table S2). Furthermore, the discrepancies with a different substantial proportion of taxa at the species level (Fig. 4) suggested that the MiSeq platform would be more beneficial for detecting less known bacteria.

The results comparing alpha and beta diversity (Finotello et al. 2018) were also significantly different for both platforms, starting at the order level, and were most pronounced at the species level. Lower alpha diversity in the samples sequenced on the iSeq platform indicated less biodiversity and, therefore smaller number of species in a single sample (minor compositional complexity of a community within the site). In turn, the differences in beta diversity allowed us to assume that the identified taxa obtained from the iSeq platform were completely different from those sequenced on the MiSeq platform. It would be particularly evident at the species level, as shown in the dendrogram (Fig. 5b). Our comparative studies suggest that choosing a sequencer to evaluate the microbiome is essential and depends on various factors. When analyzing the technical aspect of a randomly selected sample, we noticed the differences in the obtained results due to the platform used. Received results indicated significant differences in relative abundance at the phylum and family levels (Fig. 2). In such a situation, it is impossible to assume that the two sequencers can be used interchangeably to evaluate the gut microbiome.

The mentioned observations suggest that selecting a specific sequencer to assess the microbiome is more critical than the authors postulate, who found no significant difference in the results obtained from iSeq and MiSeq (Uelze et al. 2020; Nakao et al. 2021).Moreover, the limitations of the iSeq system were highlighted by Román-Reyna et al. (2021). They used metagenomics to identify plant pathogenic bacterium Xylella fastidiosa down to the strain level in various plant samples. They could not recover the complete sequences of the seven genes of this bacterium and the sequence types were not determined for any field sample. The authors believed this was probably associated with low genome coverage during sequencing by this sequencer (Román-Reyna et al. 2021). It may be necessary for the interpretetion the other studies results where sequencing was carried out on different platforms and the data obtained were evaluated in aggregate without taking into account the type of sequencer (e.g., virological studies on COVID-19) (Lu et al. 2020; Bhoyar et al. 2021). The above data disclose that the iSeq 100 system may be used to evaluate the bacterial profile of the samples to create an overall picture. However, because of twice the read lengths (MiSeq – 2 × 300 base pairs vs iSeq – 2 × 150 bp) and more than six times the maximum reads per run (MiSeq – 25 million sequencing reads vs iSeq – 4 million sequencing reads) the MiSeq system allows for greater probability of a correct, more precise identification of microorganisms at the levels of genus and species, therefore it seems to be better for a detailed analysis of the differences in the microbiota composition of the studied samples.

The weakness of this analysis was a small number of samples and the fact that they come from only animals, which made it difficult to interpret the specific results for all bacterial taxa. However, this work aimed to assess the technical characteristics of the two sequencers and not to solve the investigation results of the particular microbiome. In addition, it was one of the first studies to evaluate the efficiency of the two recommended NGS sequencing platforms on the example of the gut microbiome. https://ruj.uj.edu.pl/xmlui/handle/item/298521.