Comparison of iSeq and MiSeq as the two platforms for 16S rRNA sequencing in the study of the rat gut microbiome

Abstract
Amplicon-based next-generation sequencing (NGS) of the 16S ribosomal RNA (16S) regions is a culture-free method used to identify and analyze prokaryotes present in a given sample. The prokaryotic 16S rRNA gene contains conserved regions and nine variable regions (V1-V9) frequently used for phylogenetic classification of genus or species in diverse microbial populations. This work compares the accuracy and efficacy of two Illumina platforms, iSeq and MiSeq, used for sequencing 16S rRNA. The most important similarities and differences of 16S microbiome sequencing in 20 rat fecal samples are described. Genetic libraries were prepared according to the 16S Metagenomic Sequencing Library Preparation protocol (Illumina) for the V3 and V4 regions of the 16S rRNA gene. The species richness obtained using iSeq technology was lower than that obtained with MiSeq. At the second taxonomic level (L2), the abundance of taxa was comparable for both platforms. At the species level (L7), the taxa abundance was significantly different, and the number of taxa was higher for the MiSeq. The alpha diversity was lower for iSeq than for MiSeq from the order level to the species level. The beta diversity estimation revealed statistically significant differences in microbiota diversity from the class level to the species level between samples sequenced on the two investigated platforms. This work shows that the iSeq platform can be used to characterize the overall bacterial profile of the samples. The MiSeq System seems to be better for a detailed analysis of the differences in the microbiota composition.
Key points
• The iSeq platform shortens the sequencing time threefold compared to the MiSeq.
• iSeq can only be used for an initial and quick microbiome assessment.
• MiSeq is better for a detailed analysis of the differences in the microbiota composition.
Supplementary information The online version contains supplementary material available at 10.1007/s00253-022-12251-z.


Introduction
Due to the increasing interest in the microbiome and its relationships within different ecosystems, it is crucial to establish an appropriate method that allows a precise assessment of the composition of microorganisms in various ecological niches. Research on the microbiome includes its relationship with the natural environment (Cabello-Yeves et al. 2021) and its role in the context of animal breeding (Vasco et al. 2021) and agricultural crops. As described in many reports using both animal and human models, the gut microbiome is the most complex and most often studied, and it has been correlated with many diseases such as diabetes (Salamon et al. 2018), obesity (Sroka-Oleksiak et al. 2020), Crohn's disease (Kowalska-Duplaga et al. 2019), and skin diseases or schizophrenia (Szeligowski et al. 2020). It is known that many microorganisms are difficult to culture, e.g., those from the gut (Lau et al. 2016) or hydrothermal sources (Bellec et al. 2020), so they are hard to detect in different samples. Therefore, the optimal method in such studies seems to be next-generation sequencing (NGS), a high-throughput sequencing approach. The most common taxa identification is based on amplicon-based NGS of the 16S ribosomal RNA (rRNA) gene. The prokaryotic 16S rRNA gene is approximately 1500 bp long and has conserved regions and nine hypervariable regions (V1-V9), which are used for phylogenetic classification of genera and species in diverse microbial populations (Gao et al. 2021; Nakao et al. 2021). Such metagenomic studies require specific infrastructure, trained personnel, appropriate sequencing platforms, and bioinformatics pipelines (Allali et al. 2017; Gao et al. 2021). Since 2011, many laboratories have used the Illumina NGS technology, especially the MiSeq system (San Diego, CA, USA), which is the most commonly used and enables the evaluation of relatively short paired-end sequencing reads with high accuracy (https://www.illumina.com/content/dam/illumina/gcs/assembled-assets/marketing-literature/miseq-system-data-sheet-m-gl-00006/miseq-data-sheet-m-gl-00006.pdf; accessed on September 25, 2022). In 2018, the company introduced the new iSeq 100 system, which can be used for high-throughput sequencing of the 16S rRNA gene. The low cost and small size of the iSeq machine create a new opportunity for many laboratories and may even make it an alternative to the MiSeq platform (https://www.illumina.com/content/dam/illumina-marketing/documents/products/datasheets/iseq100-sequencing-system-spec-sheet-770-2017-020.pdf; accessed on September 25, 2022). Based on the data provided by Illumina, the most important characteristics of the two abovementioned platforms are presented in Table 1.
Dominika Salamon and Barbara Zapała contributed equally to this work as first authors.
This study aimed to compare the accuracy and efficacy of two Illumina machines, iSeq and MiSeq, in sequencing 16S rRNA (16S) in the same samples from a rat model of schizophrenia. Furthermore, by sequencing 16S rRNA in the same fecal samples with both technologies, we compared the potential differences in the most important characteristics: richness and evenness (alpha diversity) and microbial community structure (beta diversity).

Samples and DNA isolation
We used 20 fecal samples taken from rats while running another project on gut microbiota in a rat model of schizophrenia ("The effects of ligands of the alpha7 nicotinic acetylcholine receptors on complex cognitive processes and social behavior in the neurodevelopmental schizophrenia-like model", project grant no 014/15/N/NZ7/02978 supported by National Science Centre, approval of the 2nd Local Institutional Animal Care and Use Committee in Krakow, no 1198/2015). Samples were collected in polypropylene tubes (FL Medical, Padova, Italy) and immediately frozen at −80 °C pending further analysis, for no longer than 2 months. Bacterial DNA was isolated according to our previous works using mechanical, chemical, and enzymatic lysis of microbial cells and the commercial Genomic Mini AX Stool Spin kit (A&A Biotechnology, Gdansk, Poland) (Gosiewski et al. 2014; Sroka-Oleksiak et al. 2020; Krawczyk et al. 2021). The concentration and purity of DNA isolates were determined spectrophotometrically from the absorbance at 260 nm and the A260/A280 ratio using a NanoDrop (Thermofisher, Waltham, MA, USA).

Library preparation
The 16S libraries were amplified by PCR using primers targeting the hypervariable V3 and V4 regions of the 16S rRNA gene with overhang adapter sequences (Klindworth et al. 2013). The composition of the reaction mixture and the thermal amplification profiles are presented in Supplemental Table S1. The PCR product (550 bp) was then verified using agarose gel electrophoresis.
Libraries were prepared following the protocol for Preparing 16S Ribosomal RNA Gene Amplicons for Illumina MiSeq System (Part#15044223Rev.B) (http://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/16s/16s-metagenomic-library-prep-guide-15044223-b.pdf; accessed on September 25, 2022). First, according to the manufacturer's recommendations, the PCR-based amplification was performed using KAPA HiFi HotStart ReadyMix (Roche, Basel, Switzerland). Then amplicons were pooled in equimolar concentrations of 10 pM for MiSeq and 55 pM for iSeq sequencing. Finally, both pooled libraries were diluted with 10 mM Tris pH 8.5 to the concentration of 4 nM. The differences and the details in library preparations are described in Table 2.

Next-generation sequencing
Each pooled library contained 20 indexed samples. PhiX control DNA (Illumina, San Diego, CA, USA) was spiked into the library at 30% for MiSeq and at 10% for iSeq. Sequencing was performed using the MiSeq Reagent Kit v3 (600-cycle) and the iSeq 100 i1 Reagent v2 (300-cycle), respectively.

Bioinformatics and statistical analysis
The raw reads generated by MiSeq and iSeq in FASTQ format were filtered using the Illumina 16S Metagenomics workflow. Then the high-quality sequences were clustered, and the operational taxonomic units (OTUs) with 99.9% identity were prepared based on the Greengenes Database and a high-performance implementation of the Ribosomal Database Project (RDP) classifier, described by Wang et al. (2007). Data filtering was performed at the beginning of the downstream statistical analysis. Features with identical or zero values across all samples, as well as artifacts, were excluded from further analysis. Next, low-quality data and uninformative features were removed. A low count filter was set with a prevalence of at least 20%, and features with minimal counts (< 4) were excluded from the analysis to remove sequencing errors. In the next step of the downstream analysis, data were normalized to account for the variability in sampling depth and the sparsity of the data. All samples were rarefied to the sequencing depth of the sample with the lowest depth. After the normalization, a meaningful biological comparison was performed, and species richness was compared. Then LEfSe (Segata et al. 2011) and the MicrobiomeAnalyst platform (Chong et al. 2020) were applied in the next clustering steps and in further statistical analysis. The most characteristic features were determined using the linear discriminant analysis (LDA) effect size with LEfSe. Using LEfSe, we described the highest statistical and biological differences between samples sequenced with MiSeq and iSeq. Statistical significance and biological relevance were assessed based on the normalized relative abundance matrix, the Kruskal-Wallis rank-sum test with a significance level (alpha) of 0.05, and an effect size threshold of 2.
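The filtering and rarefaction steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MicrobiomeAnalyst implementation: the function names, the toy count table, and the reading of the low count filter as "keep a feature if it has at least 4 reads in at least 20% of samples" are all hypothetical.

```python
import random


def low_count_filter(table, min_count=4, prevalence=0.20):
    """Keep features with >= min_count reads in at least `prevalence` of samples.

    `table` maps a feature name to a list of per-sample read counts.
    (Assumed interpretation of the low count filter; not the authors' code.)
    """
    kept = {}
    for feature, counts in table.items():
        n_ok = sum(1 for c in counts if c >= min_count)
        if n_ok / len(counts) >= prevalence:
            kept[feature] = counts
    return kept


def rarefy(counts, depth, rng):
    """Subsample one sample's counts to `depth` reads without replacement."""
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    out = [0] * len(counts)
    for i in rng.sample(pool, depth):
        out[i] += 1
    return out


def rarefy_to_min_depth(table, seed=0):
    """Rarefy every sample to the depth of the shallowest sample."""
    features = list(table)
    n_samples = len(table[features[0]])
    samples = [[table[f][j] for f in features] for j in range(n_samples)]
    depth = min(sum(s) for s in samples)
    rng = random.Random(seed)
    rarefied = [rarefy(s, depth, rng) for s in samples]
    result = {f: [rarefied[j][i] for j in range(n_samples)]
              for i, f in enumerate(features)}
    return result, depth


# Hypothetical toy table (5 samples); these are not data from the study.
toy = {
    "OTU_a": [120, 80, 95, 60, 150],
    "OTU_b": [0, 0, 0, 0, 0],    # all zeros: removed
    "OTU_c": [3, 2, 0, 1, 3],    # never reaches 4 reads: removed
    "OTU_d": [0, 10, 0, 0, 0],   # >= 4 reads in 1 of 5 samples (20%): kept
}
filtered = low_count_filter(toy)
rarefied, depth = rarefy_to_min_depth(filtered)
print(sorted(filtered), depth)
```

Subsampling without replacement means a rarefied count can never exceed the original count, and every sample ends up with the same total number of reads.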
The hierarchical structure of taxonomic classifications was characterized using the median abundance and the non-parametric Wilcoxon rank-sum test to show taxonomic differences between microbial communities and abundance profiles of two experimental groups (Foster et al. 2017). Alpha and beta diversity were calculated using QIIME 2.0 software with Python scripts (Schloss et al. 2009). Alpha diversity was calculated based on the sequence similarity at the 97% level.
The richness was calculated as the number of unique OTUs found in each sample and presented as observed OTUs, and the count of unobserved species based on low-abundance OTUs was estimated with the ACE and Chao1 indices. Shannon, Simpson, and Fisher estimators were calculated to measure both the richness and evenness within individual samples and in the experimental groups of samples. Beta diversity was determined as the distance and dissimilarity between microbial communities based on the Jaccard, Bray-Curtis, and Jensen-Shannon Divergence indices calculated by QIIME. The distances were visualized by principal coordinate analysis (PCoA). Differences in the whole microbiome structure among groups, based on beta diversity, were tested using a permutational multivariate analysis of variance (PERMANOVA). The similarity and dissimilarity were measured based on Pearson's Correlation Coefficient. The dissimilarities, showing the distances between samples, were calculated based on the Jensen-Shannon Divergence beta diversity metric. The sequence data (fastq files) have been deposited in The Jagiellonian University Repository; online access: https://ruj.uj.edu.pl/xmlui/handle/item/298521.
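To make the indices named above concrete, the following minimal Python sketch computes several of them from raw count vectors. It is not the QIIME code used in the study. The two count vectors are hypothetical, Chao1 is given in its bias-corrected form, and "Simpson" is taken here to mean the Gini-Simpson form (1 minus the sum of squared proportions), which is an assumption about the variant.

```python
import math


def observed_otus(counts):
    """Richness: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: observed richness plus a term from singletons/doubletons."""
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed_otus(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon index with natural logarithm."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def simpson(counts):
    """Gini-Simpson index: 1 - sum(p_i^2)."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)


def bray_curtis(a, b):
    """Abundance-based dissimilarity between two samples."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def jaccard(a, b):
    """Presence/absence dissimilarity between two samples."""
    pa = {i for i, x in enumerate(a) if x > 0}
    pb = {i for i, y in enumerate(b) if y > 0}
    return 1 - len(pa & pb) / len(pa | pb)


# Hypothetical count vectors over the same 7 OTUs (not data from the study).
rich_sample = [40, 30, 20, 5, 3, 1, 1]
poor_sample = [60, 35, 5, 0, 0, 0, 0]

for name, s in (("rich", rich_sample), ("poor", poor_sample)):
    print(name, observed_otus(s), chao1(s),
          round(shannon(s), 3), round(simpson(s), 3))
print("Bray-Curtis:", bray_curtis(rich_sample, poor_sample))
print("Jaccard:", round(jaccard(rich_sample, poor_sample), 3))
```

The richness measures respond only to which OTUs are present (and, for Chao1, to the rare ones), whereas Shannon and Simpson also depend on how evenly the reads are spread, which is why the two groups of indices can behave differently at different taxonomic levels.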

Results
DNA samples with a purity ≥ 1.7 were selected for further testing, and all DNA samples tested fell within this range. The twenty rat fecal samples were sequenced on the iSeq, and then the same samples were sequenced on the MiSeq machine. Table 3 demonstrates the detailed run characteristics for both iSeq and MiSeq sequencing. We showed the differences, especially in the maximum read counts per sample and the number of identified species, which were higher for MiSeq (Table 3).
Fig. 1 The rarefaction curves for each group in two separate plots show the species richness in samples sequenced on the iSeq and MiSeq machines. First, the data and raw sequencing reads were mapped against the complete gene catalog to assess sampling depth and sparsity variability. Then the gene recovery for different numbers of reads was calculated and plotted in the form of a rarefaction curve. The rarefaction curves created for samples sequenced on the two machines show increasing numbers of raw sequencing reads and much more variation in discovered gene content in samples analyzed by MiSeq than in samples sequenced on iSeq. The rarefaction curves of the iSeq results reach a plateau earlier than those of MiSeq, which means that only a few new sequences are detected with increasing sequencing depth on iSeq. The sequencing depths of MiSeq and iSeq were unequal, which influenced the richness of the microbial communities measured in the samples. Rarefaction curves were generated in the R statistical programming language run on a high-performance computing cluster
Statistically significant differences were found in species richness, with higher numbers of species detected using the MiSeq than the iSeq (p < 0.01). Moreover, most OTUs detected on the iSeq platform represented the same species. The differences in species richness are shown in Fig. 1, which presents rarefaction curves for both iSeq and MiSeq sequencing results.
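The idea behind a rarefaction curve can be illustrated with the classical analytic formula for the expected number of OTUs in a random subsample of n reads. The sketch below is an assumption-laden illustration in Python, not the R code used to draw Fig. 1, and the two count vectors are hypothetical.

```python
from math import comb


def expected_richness(counts, n):
    """Expected number of OTUs observed when n reads are drawn without replacement.

    Uses E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), where N is the total
    number of reads and N_i the reads assigned to OTU i.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total read count")
    denom = comb(total, n)
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)


# Hypothetical communities with 100 reads each (not data from the study).
many_rare = [40, 30, 20, 5, 3, 1, 1]
few_dominant = [60, 35, 5]

depths = [1, 10, 25, 50, 100]
curve_rare = [expected_richness(many_rare, n) for n in depths]
curve_dominant = [expected_richness(few_dominant, n) for n in depths]
for n, a, b in zip(depths, curve_rare, curve_dominant):
    print(n, round(a, 2), round(b, 2))
```

In this toy case the community with few dominant OTUs is almost fully recovered at half the depth, while the community with rare OTUs keeps gaining taxa, which mirrors the earlier plateau described for the iSeq curves.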
The OTU data were summarized and compared based on their abundance at different taxonomic levels (Fig. 2). Significant differences in detected OTUs were noticed at all taxonomic levels except L2 (Fig. 2A). At the class level, L3 (Fig. 2B), the MiSeq results showed a higher abundance of Clostridia than iSeq (p < 0.001), and the iSeq results showed a higher abundance of Erysipelotrichia than MiSeq (p < 0.001). At L4 (order level), the abundance of Bifidobacteriales (80,542 reads vs 44,994 reads, p = 0.04), Erysipelotrichales (573,282 reads vs 326,310 reads, p < 0.001), and Propionibacteriales (1848 reads vs 157 reads, p < 0.001) was higher for iSeq than for MiSeq, respectively. The abundance of Clostridiales was higher for MiSeq (364,839 reads) than for iSeq (87,868 reads, p < 0.001). The statistically significant differences in OTU abundance at the family (L5), genus (L6) (Fig. 2C), and species (L7) levels are detailed in Supplemental Table S2.
Statistically significant differences in the abundance profiles were detected when comparing the taxonomic profile of one randomly selected sample sequenced on both platforms. As shown in Fig. 3A, B, the results from MiSeq were characterized by a higher abundance of the Firmicutes phylum (73%) than those from iSeq (60%). In contrast, the Actinobacteria phylum was less abundant (24%) in the sample sequenced on MiSeq than on iSeq (39%).
Alpha diversity profiling also showed statistically significant differences in the diversity of the same samples sequenced on the iSeq and MiSeq sequencers. These differences were statistically significant starting from the order level (Table 4). Moreover, the alpha diversity metrics for results obtained from MiSeq were higher than those for iSeq, except at the phylum and class levels (Supplemental Figs. S1 to S5). The most common metrics used to calculate the diversity within the same samples sequenced on iSeq and MiSeq are shown in Table 4.
The comparison between microbial communities based on their compositions revealed statistically significant differences. Table 5 shows the beta diversity of the same samples sequenced on both the iSeq and MiSeq machines. Statistically significant differences in the microbiota composition of the same samples sequenced on the two sequencers were detectable starting from the order level (Table 5, Supplemental Figs. S6 to S11).
A phylogenetic comparison of the samples sequenced on both tested sequencers, using unweighted UniFrac distance measures, was also performed and demonstrated the differences (Fig. 5). The dendrogram is drawn as a row graph showing the clades (as the branches) with the leaves. This analysis revealed different distances between samples in the clustering input for the results obtained on the two different sequencers. In addition, the clades show similarities and dissimilarities. For example, clades close to the same height are similar, and clades with different heights are dissimilar: the greater the height difference, the greater the dissimilarity.
Fig. 3 The pie charts show the abundance profiles of one randomly selected sample (ME9okMiSeq) at two taxonomic levels: phylum (A) and family (B) of the two selected phyla, Firmicutes and Actinobacteria, sequenced with MiSeq and iSeq
Fig. 4 The statistically significant biomarker discovery results at the genus (L6) level (A) and species (L7) level (B) for the same samples sequenced using the iSeq and MiSeq systems, shown as colored bar plots. The graphical output of the LDA score shows both negative and positive values, with the cutoff of 2.0 determining the most significant features

Discussion
This study compared two short-read Illumina platforms: the iSeq 100 and MiSeq systems. They perform second-generation sequencing, that is, the NGS technologies that followed first-generation Sanger sequencing (Das et al. 2020; Hu et al. 2021). Illumina's sequencing systems are currently the most widely used in microbiome studies (Gao et al. 2021; Nakao et al. 2021).
Because the same techniques of genomic library preparation and bioinformatics evaluation of sequenced 16S rRNA gene fragments are used, both platforms seem helpful to the same extent. However, if we consider the size and cost of the equipment and the duration of sequencing (Table 1), iSeq seems to be the platform of choice (Colman et al. 2019; Dohál et al. 2020; Kazantseva et al. 2021). Additionally, the closed, cartridge-based design of the iSeq platform may carry a lower risk of cross-contamination between sequencing runs than MiSeq (Nakao et al. 2021), although future research should evaluate cross-contamination risks, as additional system maintenance procedures have been introduced. Nakao et al. (2021) considered both platforms equally suitable for evaluating the freshwater fish environmental microbiome, and the % PF value for iSeq was slightly lower than for MiSeq: 80.8% vs. 95.05%, respectively. Uelze et al. (2020) compared the consistency, accuracy, and repeatability of whole-genome short-read sequencing. They found that the four Illumina platforms (MiSeq, NextSeq, iSeq, NovaSeq) gave similar results with slight variation among the different instruments. However, those researchers examined only a few selected bacterial strains (Uelze et al. 2020). Our comparative analysis shows a larger difference in the % PF (passing filter) value (74% for iSeq vs 92% for MiSeq, Table 1), which was more concordant with the data provided by Brun et al. (2018) (67.6% for iSeq vs 74.9% for MiSeq). In our study, we used an increased concentration of PhiX at the level of 30%, which allowed us to optimize the sequencing process and which has already been used in our other study (Kowalska-Duplaga et al. 2019). Additionally, our study found a higher number of identified bacterial species when MiSeq was used rather than iSeq (9184 vs 7026, respectively), which may be important for the final evaluation of the investigated microbiomes.
We showed the scale of this difference by assessing the technical aspects of iSeq and MiSeq, comparing the sequencing results obtained on the two platforms for one randomly selected sample (Fig. 2). The results of Nakao et al. (2021) indicate that the freshwater fish microbiota exhibited differences in the number of species between iSeq and MiSeq, with a higher number of species per sample on iSeq. However, for example, at L4, all species detected only by iSeq were below the cut-off value after rarefaction. The authors suggested that for the same sequencing depth, the difference in the number of species between iSeq and MiSeq would be much smaller (Nakao et al. 2021). Our rarefaction curve analysis demonstrated that the species richness obtained with iSeq technology was significantly lower than that obtained with MiSeq (Fig. 1). This means that most of the OTUs detected with iSeq belonged to the same species, unlike MiSeq, where more species could be identified. This seems to be confirmed by the abundance results at the different taxonomic levels (Fig. 2 and Supplemental Table S2). Furthermore, the discrepancies in the proportions of a substantial number of taxa at the species level (Fig. 4) suggest that the MiSeq platform would be more beneficial for detecting less well-known bacteria.
The results comparing alpha and beta diversity (Finotello et al. 2018) were also significantly different for the two platforms, starting at the order level, and the differences were most pronounced at the species level. Lower alpha diversity in the samples sequenced on the iSeq platform indicated less biodiversity and therefore a smaller number of species in a single sample (lower compositional complexity of a community within the site). In turn, the differences in beta diversity allowed us to assume that the taxa identified on the iSeq platform were completely different from those identified on the MiSeq platform. This is particularly evident at the species level, as shown in the dendrogram (Fig. 5B). Our comparative studies suggest that choosing a sequencer to evaluate the microbiome is essential and depends on various factors. When analyzing the technical aspect of a randomly selected sample, we noticed differences in the obtained results due to the platform used. The results indicated significant differences in relative abundance at the phylum and family levels (Fig. 2). In such a situation, it is impossible to assume that the two sequencers can be used interchangeably to evaluate the gut microbiome.
Fig. 5 The dendrograms demonstrate the concordance in beta diversity between MiSeq and iSeq at the phylum level (A) and species level (B). The x-axis shows the cluster distances. The distance between two clusters is the average of the distances between all the features in those clusters
These observations suggest that selecting a specific sequencer to assess the microbiome is more critical than postulated by authors who found no significant difference between the results obtained from iSeq and MiSeq (Uelze et al. 2020; Nakao et al. 2021). Moreover, the limitations of the iSeq system were highlighted by Román-Reyna et al. (2021). They used metagenomics to identify the plant pathogenic bacterium Xylella fastidiosa down to the strain level in various plant samples. They could not recover the complete sequences of the seven genes of this bacterium, and the sequence types were not determined for any field sample. The authors believed this was probably associated with low genome coverage during sequencing by this sequencer (Román-Reyna et al. 2021). This may be important for the interpretation of results from other studies in which sequencing was carried out on different platforms and the data obtained were evaluated in aggregate without taking into account the type of sequencer (e.g., virological studies on COVID-19) (Lu et al. 2020; Bhoyar et al. 2021). The above data show that the iSeq 100 system may be used to evaluate the bacterial profile of the samples to create an overall picture. However, because of twice the read length (MiSeq: 2 × 300 bp vs iSeq: 2 × 150 bp) and more than six times the maximum reads per run (MiSeq: 25 million sequencing reads vs iSeq: 4 million sequencing reads), the MiSeq system offers a greater probability of a correct, more precise identification of microorganisms at the genus and species levels. It therefore seems to be better for a detailed analysis of the differences in the microbiota composition of the studied samples.
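The read-length and throughput figures quoted above can be turned into a short back-of-the-envelope calculation. The Python sketch below is only illustrative and rests on several assumptions: the 550 bp PCR product reported in the library preparation (which includes the overhang adapter sequences, so the biological insert is shorter) is used as the fragment length, the reads are divided evenly among 20 samples, and the PhiX spike-in and % PF losses are ignored.

```python
# Maximum read length and reads per run as quoted in the Discussion.
platforms = {
    "MiSeq": {"read_len": 300, "reads_per_run": 25_000_000},
    "iSeq": {"read_len": 150, "reads_per_run": 4_000_000},
}
AMPLICON_BP = 550  # assumption: PCR product size, adapters included
N_SAMPLES = 20     # samples pooled per run

summary = {}
for name, p in platforms.items():
    pair_span = 2 * p["read_len"]  # bases covered by one read pair
    summary[name] = {
        "pair_span_bp": pair_span,
        # positive: the two reads overlap; negative: a gap remains between them
        "overlap_bp": pair_span - AMPLICON_BP,
        "max_reads_per_sample": p["reads_per_run"] // N_SAMPLES,
        "max_yield_gb": p["reads_per_run"] * pair_span / 1e9,
    }

reads_ratio = (platforms["MiSeq"]["reads_per_run"]
               / platforms["iSeq"]["reads_per_run"])

for name, row in summary.items():
    print(name, row)
print("reads-per-run ratio (MiSeq / iSeq):", reads_ratio)
```

Under these assumptions a 2 × 300 bp read pair spans the whole fragment with some overlap, whereas a 2 × 150 bp pair leaves the middle of the fragment unread, and the per-sample read budget differs by the same factor of about six as the per-run maximum.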
The weakness of this analysis was the small number of samples and the fact that they came only from animals, which made it difficult to interpret the specific results for all bacterial taxa. However, this work aimed to assess the technical characteristics of the two sequencers and not to interpret the results for the particular microbiome. In addition, it was one of the first studies to evaluate the efficiency of the two recommended NGS sequencing platforms using the gut microbiome as an example.