A comparison between full-length 16S rRNA Oxford nanopore sequencing and Illumina V3-V4 16S rRNA sequencing in head and neck cancer tissues

Describing the microbial community within the tumour has been a key aspect in understanding the pathophysiology of the tumour microenvironment. In head and neck cancer (HNC), most studies on tissue samples have only performed 16S rRNA short-read sequencing (SRS) on the V3-V5 region. SRS is mostly limited to genus level identification. In this study, we compared full-length 16S rRNA long-read sequencing (FL-ONT) from Oxford Nanopore Technology (ONT) to V3-V4 Illumina SRS (V3V4-Illumina) in 26 HNC tumour tissues. Further validation was also performed using culture-based methods in 16 bacterial isolates obtained from 4 patients using MALDI-TOF MS. We observed similar alpha diversity indexes between FL-ONT and V3V4-Illumina. However, beta-diversity was significantly different between techniques (PERMANOVA - R2 = 0.131, p < 0.0001). At higher taxonomic levels (Phylum to Family), all metrics were more similar among sequencing techniques, while lower taxonomic levels displayed more discrepancies. At higher taxonomic levels, the correlation in relative abundance between FL-ONT and V3V4-Illumina was higher, while this correlation decreased at lower levels. Finally, FL-ONT was able to identify more isolates at the species level that were identified using MALDI-TOF MS (75% vs. 18.8%). FL-ONT was able to identify lower taxonomic levels at a better resolution as compared to V3V4-Illumina 16S rRNA sequencing.
Supplementary Information: The online version contains supplementary material available at 10.1007/s00203-024-03985-7.


Introduction
The effect of tumour associated microbial communities on tumour biology is under intense investigation (Helmink et al. 2019; Cullin et al. 2021; Sepich-Poore et al. 2021; Yang et al. 2023a). To date, the tumour microbiome has been implicated in modulating anti-tumoural immune responses, chemotherapy efficacy, and tumour progression (Helmink et al. 2019; Cullin et al. 2021; Sepich-Poore et al. 2021). Apart from tissues, microbial signatures from other collection sites such as stool and saliva may have diagnostic or prognostic roles in various cancers (Thomas et al. 2019; Ratiner et al. 2023; Yang et al. 2023a). Together, these studies demonstrate the potential impact of understanding the tumour microbiome in cancers. However, as a prerequisite to further research, it is critical to use the right tools for robust microbiome identification.
DNA sequencing techniques such as targeted sequencing of the 16S ribosomal RNA (rRNA) gene, metagenomics, and to a lesser extent, meta-transcriptomics have been instrumental in microbiome identification (Cullin et al. 2021). Of these, Illumina based short-read sequencing (SRS) of the 16S rRNA gene has been widely adopted due to its relatively low cost and high throughput (Cullin et al. 2021; Kim et al. 2024). The 16S rRNA gene is approximately 1,500 to 1,600 base pairs (bp) long in most bacteria, and is composed of nine variable regions which allow taxonomical identification of microbial communities. Although sequencing all nine variable regions offers better taxonomic resolution, most studies sequence only a selection of variable regions, limiting the capacity for species level identification (Yeo et al. 2024).
In head and neck cancer (HNC), most studies on microbiome identification relied on SRS of the 16S V3-V5 regions (V3-V4: ~465 bp, V4: ~250 bp, V4-V5: ~392 bp) on tissues, swabs, saliva, and oral rinse (Ting et al. 2023; Yeo et al. 2024). Our recent meta-analysis of V3-V5 short-read Illumina sequencing datasets identified key oral microbes localised in HNC tumours (Yeo et al. 2024). However, taxonomic classifications were limited to the genus level, with species-specific contributions to HNC pathophysiology largely unknown (Curry et al. 2022; Yeo et al. 2024). Given that several oral species such as Fusobacterium nucleatum and Porphyromonas gingivalis can promote tumour progression and alter anti-tumour immunity (Lan et al. 2023), utilising cutting-edge technologies that can provide species level information will provide critical insights into the role of the microbiome in HNC.
In this study, we comprehensively evaluated the differences in microbiome diversities and abundance between ONT and Illumina 16S rRNA sequencing techniques on HNC tissue samples. Bacterial abundance between ONT and Illumina was evaluated at each taxonomic level using a paired Wilcoxon test on relative abundance and paired ANOVA-Like Differential Expression tool 2 (ALDEx2) differential abundance analysis, which takes into account the compositional and zero-inflated nature of microbiome datasets (Fernandes et al. 2014). Furthermore, matrix assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) was also performed on bacteria isolated from 4 patient tissue samples for comparison to the 16S rRNA sequencing performed. To the best of our knowledge, this is the first study to perform long-read 16S rRNA sequencing on HNC tissue samples, and the first to evaluate ONT and Illumina 16S rRNA sequencing on HNC tissue samples.

Sample collections
Tumour samples were collected from 26 HNC patients undergoing surgical resection of primary tumours at the Royal Adelaide Hospital (Adelaide, SA, Australia) and The Memorial Hospital (Adelaide, SA, Australia). Tumour samples were placed into a sterile cryotube immediately after surgical excision to prevent any environmental contamination. Ethics approval for the collection and storage of patient samples was granted by the Central Adelaide Local Health Network Human Research Ethics Committee (Adelaide, South Australia) (HREC MYIP14116), and all patients provided written informed consent.

DNA extraction
DNA was extracted in a laminar flow cabinet with aseptic technique, using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) with some modifications, as described previously (Hang et al. 2014). Briefly, prior to DNA extraction, the tissue samples were homogenised using 3 mm stainless steel beads (Qiagen) and a TissueLyser II (Qiagen) at 23 Hz for 3 min. Afterwards, the homogenised tissues were incubated with 1 mg/mL lysozyme (cat no: L3790, Sigma Aldrich, MO, USA) and 0.2 mg/mL lysostaphin (L7386, Sigma) at 37 °C for 1 h, followed by 0.5 mg/mL proteinase K (Qiagen) incubation at 56 °C for 2 h, before proceeding with the manufacturer's DNA extraction protocol. The DNA was quantified using the Qubit™ dsDNA Quantification Assay Kit (Invitrogen, MA, USA), before undergoing Illumina 16S rRNA V3-V4 SRS (referred to as V3V4-Illumina) and ONT full-length V1-V9 16S rRNA long-read sequencing (LRS; referred to as FL-ONT). Negative controls were also included in the extraction process.

Alpha- and beta-diversity analysis
Since short-read Illumina 16S rRNA sequencing is limited to genus level resolution (Curry et al. 2022), we performed alpha- and beta-diversity analyses at the genus level. Before diversity analysis, samples were rarefied to the depth of the sample with the fewest reads (1612 reads) using rarefy_even_depth in phyloseq (McMurdie and Holmes 2013). Alpha-diversity was measured using Shannon, Simpson, InvSimpson and Observed indexes, using the microeco R package (Liu et al. 2021). The Wilcoxon matched-pairs signed rank test was performed to determine differences between paired samples sequenced using different techniques.
For beta-diversity analysis, the rarefied relative abundance of all genera was ordinated using Bray-Curtis distance and plotted on a principal coordinate analysis (PCoA) using phyloseq v1.46 and ggpubr v0.6 R packages (McMurdie and Holmes 2013). Permutational multivariate analysis of variance (PERMANOVA) and analysis of similarities (ANOSIM), with strata set to the paired sample, were performed to assess differences in beta-diversity between paired ONT and Illumina sequencing groups (Dixon 2003). Additionally, we also included the W_d test, a test which is robust for heteroscedastic datasets, to determine differences in beta-diversity between ONT and Illumina (Hamidi et al. 2019). Variance between groups was measured using the betadisper test from vegan v2.6 (Dixon 2003). Permutations for all tests were set to n = 9999. An additional compositional approach was also applied to beta-diversity by performing centred log-ratio (CLR) normalisation (offset = 0.5) of all counts at the genus level (Gloor et al. 2017). CLR abundance was ordinated using Euclidean distance and plotted on a principal coordinate analysis (PCoA) using phyloseq v1.46 and ggpubr v0.6 R packages (McMurdie and Holmes 2013) (Figure S1).

Differential abundance analysis
We analysed differential abundance at all taxonomic levels: phylum, class, order, family, genus, and species. Data were agglomerated to each level before downstream analysis. For differential relative abundance analysis, data were normalised into relative abundance (%), and the Wilcoxon matched-pairs signed rank test with Benjamini-Hochberg false discovery rate (FDR) correction was used to determine differences between paired samples. Additionally, we also applied ALDEx2 differential abundance analysis, which uses a Monte Carlo Dirichlet sampling approach that accounts for the compositional and zero-inflated nature of microbiome datasets when determining differences between the ONT and Illumina sequencing groups (Fernandes et al. 2014).

FL-ONT and V3V4-Illumina 16S rRNA sequencing groups display comparable alpha diversity indexes at the genus level
To compare observed richness and evenness between FL-ONT and V3V4-Illumina, alpha diversity was measured using Shannon, Simpson, InvSimpson, and Observed indexes (Fig. 1). Since Illumina SRS 16S rRNA sequencing is largely limited to genus level resolution, alpha diversity was measured at genus level (Curry et al. 2022). After agglomerating rarefied datasets to genus level, a total of 92 genera were identified. Similar to previous findings comparing LRS and SRS (Heikema et al. 2020), we found no significant differences (p > 0.05) between ONT and Illumina 16S rRNA sequencing for Shannon (median difference = -0.207), InvSimpson (median difference = -1.140) and Observed genera (median difference = 1.00) (Fig. 1A, C and D). However, the Simpson index (median difference = -0.078, p = 0.027) showed statistically significant but subtle differences between groups (Fig. 1B). Overall, these results suggest that there are subtle differences between ONT and Illumina 16S rRNA sequencing groups with respect to alpha diversity at the genus level.
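The rarefaction step and the four alpha-diversity indexes compared above can be illustrated with a minimal Python sketch. This is not the phyloseq or microeco implementation used in the study; the function names, the random seed, and the example counts are hypothetical, and the Simpson index is assumed to be the Gini-Simpson form (1 minus the sum of squared proportions).

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample a genus-count vector to `depth` reads without replacement."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand counts into one entry per read, labelled by genus index.
    reads = [i for i, c in enumerate(counts) for _ in range(c)]
    picked = random.Random(seed).sample(reads, depth)
    rarefied = [0] * len(counts)
    for i in picked:
        rarefied[i] += 1
    return rarefied


def alpha_diversity(counts):
    """Return Observed, Shannon, Simpson and InvSimpson for one sample."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    sum_sq = sum(p * p for p in props)
    return {
        "Observed": len(props),                        # genera with >= 1 read
        "Shannon": -sum(p * math.log(p) for p in props),
        "Simpson": 1.0 - sum_sq,                       # assumed Gini-Simpson form
        "InvSimpson": 1.0 / sum_sq,
    }


# Hypothetical genus counts for one sample (illustrative only).
sample = [500, 300, 150, 50, 0]
rare = rarefy(sample, depth=200)
even = alpha_diversity([25, 25, 25, 25])
print(alpha_diversity(rare))
```

For a perfectly even community of four genera, the sketch gives Shannon = ln 4, Simpson = 0.75 and InvSimpson = 4, which is a quick sanity check on the formulas.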

Differences in beta-diversity were observed between paired FL-ONT and V3V4-Illumina sequencing on tumour samples at the genus level
Differences in β-diversity between FL-ONT and V3V4-Illumina were assessed using a PCoA plot of Bray-Curtis distance on rarefied relative abundance, PERMANOVA, ANOSIM and the W_d test (Fig. 2). The PCoA ordination of Bray-Curtis distances suggests that there is a shift in beta diversity between FL-ONT and V3V4-Illumina 16S rRNA sequencing (Fig. 2). Similarly, we observed significant differences in β-diversity between FL-ONT and V3V4-Illumina using

FL-ONT and V3V4-Illumina 16S rRNA sequencing displays greater discrepancies in microbial community profiling at the genus level
Since Illumina 16S rRNA SRS is capable of identifying taxa mostly to the genus level, with limited capability of identification at species level, we compared FL-ONT and V3V4-Illumina at the genus level (Martínez-Porchas et al. 2016; Curry et al. 2022). When we compared relative abundance between sequencing techniques, we found that 29/92 bacterial genera were significantly different in relative abundance (Fig. 4A and B, Table S8A). Haemophilus (mean diff = 14.6%, p < 0.0001) and Campylobacter (mean diff = 10.5%, p < 0.0001) had significantly higher relative abundance in the FL-ONT group, while Prevotella (mean diff = -15.4%, p < 0.0001) had significantly higher relative abundance in the V3V4-Illumina group (Fig. 5B, Table S8A). Other notable bacterial genera such as Streptococcus (mean diff = 9.71%, p < 0.0001) and Fusobacterium (mean diff = -6.72%, p = 0.00002) also had significantly higher relative abundances in the FL-ONT and V3V4-Illumina groups, respectively (Table S8A). Of these 29 bacterial genera, 22 had less than 5% difference in relative abundance between techniques, despite being statistically significant (Table S8A).
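The two quantities reported above for each genus, the mean paired difference and an FDR-adjusted p-value, can be sketched as follows. This is an illustrative Python sketch rather than the analysis code used in the study: the sign convention (ONT minus Illumina) is inferred from the reported values, the function names and input numbers are hypothetical, and the raw p-values are assumed to come from a separate paired Wilcoxon test that is not implemented here.

```python
def mean_paired_difference(ont, illumina):
    """Mean of per-patient differences (ONT minus Illumina), in percentage points."""
    diffs = [a - b for a, b in zip(ont, illumina)]
    return sum(diffs) / len(diffs)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical relative abundances (%) of one genus in three paired samples.
ont_abundance = [30.0, 20.0, 10.0]
illumina_abundance = [10.0, 5.0, 0.0]
diff = mean_paired_difference(ont_abundance, illumina_abundance)

# Hypothetical raw p-values for four genera.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(diff, adj)
```

A positive mean difference in this sketch corresponds to higher relative abundance in the FL-ONT group, and a negative one to higher abundance in the V3V4-Illumina group.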

Paired sample analysis of FL-ONT and V3V4-Illumina 16S rRNA sequencing reveals decreasing correlation in relative abundance from higher to lower taxonomic levels
To determine taxonomic differences at each taxonomic level between FL-ONT and V3V4-Illumina sequencing technologies, we performed correlation analysis of relative abundance between paired samples, and paired Wilcoxon signed rank tests on CLR-normalised abundance (ALDEx2) and on relative abundance (Fig. 3, Supplementary Table S4-S9, Supplementary Figure S2-S5) (Fernandes et al. 2014). A full description of taxonomic differences at the phylum, class, order, and family levels is presented in the Supplementary Materials (Figure S2-S5). Overall, the bacteria identified by the FL-ONT and V3V4-Illumina groups were mostly from the same lineage at the phylum, class, order, and family taxonomical levels. However, we also detected bacteria that were unique to each sequencing technique, albeit at very low abundance (< 0.1%) (Table S4-S7). Furthermore, we observed good concordance in the relative abundance of the top bacteria detected, with both techniques detecting similar top bacteria (Figure S2-S5).
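As a rough illustration of CLR normalisation, the following Python sketch applies a plain centred log-ratio transform with a pseudo-count offset, as described for the compositional beta-diversity analysis (offset = 0.5). It is an assumption-laden sketch and not the ALDEx2 procedure, which draws Monte Carlo Dirichlet instances before the CLR step; the function names and counts are hypothetical.

```python
import math


def clr(counts, offset=0.5):
    """Centred log-ratio transform of one sample's counts with a pseudo-count."""
    logs = [math.log(c + offset) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]


def euclidean(a, b):
    """Euclidean distance between two CLR vectors (Aitchison distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical genus counts for the same tissue sequenced by both techniques.
ont_counts = [120, 30, 0, 50]
illumina_counts = [80, 60, 5, 55]
ont_clr = clr(ont_counts)
illumina_clr = clr(illumina_counts)
print(euclidean(ont_clr, illumina_clr))
```

Because each CLR vector is centred on the log geometric mean of its own sample, its values sum to zero, which removes the dependence on sequencing depth.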
We further performed correlation analysis of relative abundance between paired FL-ONT and V3V4-Illumina sequencing (Fig. 3). Moderate correlations in relative abundance (R > 0.7) were observed from phylum to family level, with the phylum (R = 0.758), class (R = 0.779) and order (R = 0.761) levels showing median R values similar to that of the family (R = 0.708) level (Fig. 3).
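The per-sample correlation summarised above can be sketched in Python as a Spearman rank correlation computed for each paired sample, followed by the median across samples. This is a minimal illustration under stated assumptions, not the code used for Fig. 3; the function names and the abundance values are hypothetical.

```python
import math
import statistics


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical phylum-level relative abundances (%) for two paired samples:
# each tuple is (FL-ONT profile, V3V4-Illumina profile).
paired_samples = [
    ([40.0, 30.0, 20.0, 10.0], [35.0, 33.0, 22.0, 10.0]),
    ([50.0, 10.0, 25.0, 15.0], [45.0, 20.0, 15.0, 20.0]),
]
per_sample_r = [spearman(ont, ill) for ont, ill in paired_samples]
print(statistics.median(per_sample_r))
```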

ONT LRS full-length 16S rRNA sequencing is superior for species level bacterial identification
Illumina SRS is limited to sequencing short fragments, which results in poor capacity to differentiate and identify highly similar species (Martínez-Porchas et al. 2016; Curry et al. 2022). By sequencing the full-length 16S rRNA gene, FL-ONT is able to provide bacterial community identification at the species level. We further compared FL-ONT to V3V4-Illumina in HNC tissue samples at the species level. Furthermore, we also isolated bacteria from 4 HNC patients and identified these bacteria using MALDI-TOF MS to confirm that FL-ONT was able to identify the correct bacterial species.

Discussion
Since recent studies suggest that microbiome contributions to tumour pathobiology can be attributed to specific bacterial species (Helmink et al. 2019; Cullin et al. 2021; Sepich-Poore et al. 2021), there is a significant need to adopt sequencing technologies capable of species level identification such as FL-ONT 16S rRNA sequencing (Curry et al. 2022). We have previously reported a consensus tissue microbiome signature in HNC using previously published Illumina SRS 16S rRNA sequencing data (Yeo et al. 2024).
To the best of our knowledge, this is the first study to perform FL-ONT 16S rRNA sequencing on HNC tumour samples. Furthermore, we comprehensively compared the performance of FL-ONT with that of V3V4-Illumina sequencing. We found that alpha diversity was comparable between paired FL-ONT and V3V4-Illumina 16S rRNA sequencing. In contrast, beta-diversity was significantly different between paired FL-ONT and V3V4-Illumina 16S rRNA sequenced HNC samples. At higher taxonomic levels (phylum, class, order, and family), we observed moderate correlations between the two sequencing methodologies for bacterial relative abundance, while at lower taxonomic levels, particularly species level, the correlations were poor. Importantly, FL-ONT identified more unique species, which were also detected at higher abundance than with V3V4-Illumina.
In this study, we compared alpha- and beta-diversities between FL-ONT and V3V4-Illumina 16S rRNA sequencing data at the genus level, which is the current acceptable limit for short-read Illumina 16S rRNA sequencing based taxonomic classification (Curry et al. 2022). Similar to our previous work on nasal swabs (Connell et al. 2024), we identified comparable alpha-diversities between paired FL-ONT and V3V4-Illumina 16S rRNA sequencing in HNC tissue samples. Out of the four alpha-diversity metrics tested, only the Simpson index showed a statistically significant but minimal difference (mean difference = -0.07) in our study. A previous study also reported minimal but statistically significant differences in alpha-diversity measured using the InvSimpson index between the two sequencing techniques (Heikema et al. 2020). Importantly, in our HNC tissue samples, we showed minimal or no differences in alpha-diversities. Consistent with our findings, previous reports have shown significant beta-diversity differences between ONT and Illumina based 16S rRNA sequencing in the gut and nasal microbiome (Heikema et al. 2020; Szoboszlay et al. 2023). Critically, our beta-diversity analyses were stratified by patient, accounting for inter-patient sample differences.
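The Bray-Curtis dissimilarity underlying the beta-diversity comparison discussed above can be written out in a few lines of Python. This is a minimal sketch for a single pair of profiles, not the phyloseq or vegan workflow used in the study; the function name and the example profiles are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator


# Hypothetical genus-level profiles of one tumour sequenced by both techniques.
ont_profile = [6, 4, 0]
illumina_profile = [2, 4, 4]
print(bray_curtis(ont_profile, illumina_profile))
```

Computing this dissimilarity for every pair of samples yields the distance matrix that is ordinated by PCoA and tested with permutation-based methods, with permutations restricted within each patient when the design is paired.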
As expected, we observed the most discrepancies in bacterial species identification between the FL-ONT and V3V4-Illumina sequencing groups. Moreover, FL-ONT was able to identify more unique bacterial species at a higher bacterial abundance than V3V4-Illumina sequencing. Furthermore, MALDI-TOF MS identifications agreed more closely with FL-ONT than with V3V4-Illumina. Similar to previous studies, poorest correlation between FL-ONT and V3V4-Illumina [...]
Consistent with most studies (Shin et al. 2016; Wei et al. 2020; Matsuo et al. 2021), we observed a decrease in correlation between relative abundance produced from different sequencing techniques from higher to lower taxonomic levels (Figure S1). In our study, we observed differences in relative abundance between FL-ONT and V3V4-Illumina 16S rRNA sequencing, especially for bacteria belonging to the phyla Campylobacterota, Proteobacteria, Actinobacteriota and Firmicutes. Furthermore, we found that there were biases in the bacteria detected in FL-ONT or V3V4-Illumina 16S rRNA sequencing. ALDEx2 is an alternative method that accounts for the compositional and zero-inflated nature of microbiome datasets and is more robust than standard relative abundance analyses (Nearing et al. 2022). Using ALDEx2, we also identified differences at every taxonomic level, although fewer in number, reflecting its conservative nature, which reduces false-positive detection (Gloor et al. 2017; Nearing et al. 2022). Taken together, we have comprehensively shown that there are significant differences in the two sequencing technologies' ability to estimate bacterial relative abundance in HNC tissues.
The microbiome has been reported to influence numerous facets of tumour pathobiology including treatment efficacy, tumour immunity and tumour progression (Yang et al. 2023a). Gemcitabine, a chemotherapeutic treatment for pancreatic, bladder and metastatic triple-negative breast cancers, can be transported into the cytoplasm of Gammaproteobacteria (class) using a nucleoside transporter (NupC), where it is inactivated by bacterial cytidine deaminase (Geller et al. 2017; Gallagher et al. 2022; Yang et al. 2023b). Gut-derived Bifidobacterium spp. are associated with increased response rates and progression free survival to PD-1 checkpoint inhibitors (Dizman et al. 2022). Notably, well-studied microbial metabolites such as butyrate can also improve PD-1 checkpoint inhibitor response rates (Gopalakrishnan et al. 2018; Zhu et al. 2023). Butyrate can be produced by Faecalibacterium (genus) and Akkermansia muciniphila (species) (Gopalakrishnan et al. 2018; Zhu et al. 2023). Of note, these tumour modulating abilities are dependent on specific genomic features shared within a taxonomic level (Yang et al. 2023b). Thus, microbiome identification at higher taxonomical levels that can be accurately identified by Illumina 16S rRNA sequencing is important (Kim et al. 2024). However, our study shows that FL-ONT 16S rRNA sequencing is similar in precision to V3V4-Illumina at higher taxonomical levels, with the advantage of providing species level identification in a cost-effective manner. A major benefit for ONT sequencers [...] cost-effective. We expect this technology to be more widely adopted in future cancer microbiome studies.
Together, these findings indicate that ONT and Illumina 16S rRNA sequencing have minimal impact on bacterial genera richness and evenness; however, overall bacterial composition was affected by the sequencing technique employed.
We next determined whether the observed differences in bacterial composition were present at every taxonomic level. Previous studies have examined FL-ONT and Illumina 16S rRNA sequencing datasets for differences in relative abundance at the phylum (Szoboszlay et al. 2023), order (Shin et al. 2016), family (Shin et al. 2016; Acharya et al. 2019; Winand et al. 2019; Connell et al. 2024), genus (Shin et al. 2016; Acharya et al. 2019; Winand et al. 2019; Fujiyoshi et al. 2020; Heikema et al. 2020; Wei et al. 2020; de Siqueira et al. 2021; Low et al. 2021; Matsuo et al. 2021; Oberle et al. 2021; Rozas et al. 2021; Connell et al. 2024), and species (Shin et al. 2016; Winand et al. 2019; Wei et al. 2020; Low et al. 2021; Connell et al. 2024) level. However, these studies have used different analytical approaches that may affect the interpretation of their results. Some compared relative abundance of paired samples without paired differential abundance analysis (de Siqueira et al. 2021; Oberle et al. 2021; Szoboszlay et al. 2023), while others compared averages within each sequencing group (Heikema et al. 2020; Wei et al. 2020). Most compared correlation in abundance between ONT and Illumina (Shin et al. 2016; Wei et al. 2020; Matsuo et al. 2021; Rozas et al. 2021), specifically for the top 10 to 15 bacteria (Shin et al. 2016; Acharya et al. 2019; Wei et al. 2020; Matsuo et al. 2021), thus not reflecting the magnitude of differences in abundance between sequencing techniques. Furthermore, a few studies had small sample sizes (< 10), which limits their interpretation (Shin et al. 2016; Fujiyoshi et al. 2020; Oberle et al. 2021; Szoboszlay et al. 2023). Notably, in addition to this study, our previous study on nasal swabs was the only study to have applied paired analysis to evaluate differences in relative abundance (family, genus) and diversities between ONT and Illumina sequencing (Connell et al. 2024). Paired differential abundance analysis should be employed to account for inter-sample differences such as lifestyle activities including smoking, alcohol or diet intake that are known to affect the microbiome (Yu et al. 2017; Fan et al. 2018; Shoer et al. 2023).
Fig. 5 Comparison of abundance between FL-ONT and V3V4-Illumina 16S rRNA sequencing at the genus level. After agglomerating to genus level, a total of 92 genera were identified. (A) Relative abundance (%) of the top 10 genera, stratified per patient. For each patient panel, ONT and Illumina sequencing are represented by the left and right bars, respectively. (B) Relative abundance (%) of genera with > 10% differences between techniques. Paired Wilcoxon tests were performed to compare differences between ONT and Illumina sequencing. Additionally, ALDEx2 was performed to assess differences in genera between sequencing techniques. (C) ALDEx2 volcano plot. Red points represent Benjamini-Hochberg corrected FDR p-value of Wilcoxon test < 0.05. Rab.win.group refers to the median bacterial clr value for the group of samples. (D) Genera that were significantly different between ONT and Illumina using ALDEx2 analysis. Diff.btw refers to the median difference in bacterial clr values between ONT and Illumina groups (Illumina - ONT). ****p < 0.0001
Although we have thoroughly investigated differences in both techniques, there are limitations to this study. While biological replicates were included, this study lacks technical replicates for each patient's sequenced sample, which would provide more confidence in the results. In addition, this study did not include an oral mock microbial community as a reference. Having a commercial oral mock community would allow benchmarking of library preparation steps such as primer efficacy and PCR conditions between FL-ONT and V3V4-Illumina 16S rRNA sequencing. Furthermore, future studies should consider including other primers or all primer sets to cover the entire region of the 16S rRNA gene for short-read Illumina sequencing. This will ensure better coverage and comparison between full-length ONT and full-length short-read Illumina sequencing (Johnson et al. 2019). Additionally, future studies should also include more samples and culture conditions (i.e. aerobic and anaerobic) in the MALDI-TOF culturomics approach to provide substantial confidence in sequencing results. Lastly, adding a metagenomics approach can also provide greater confidence with extra sequencing coverage outside of the 16S rRNA gene (Kim et al. 2024). In the context of HNC pathobiology, future addition of matched non-cancer and cancer samples could provide more insights into microbial differences at the species level.
In conclusion, our study provides the first comprehensive comparison of FL-ONT and V3V4-Illumina 16S rRNA microbial sequencing in HNC tumour tissue samples. We have shown that there were key differences, such as in beta-diversity and in some bacterial groups at every taxonomic level. Critically, we show that FL-ONT can provide more information about the microbiome that is

Fig. 1 Paired alpha diversity analysis of FL-ONT and V3V4-Illumina at the genus level. Tissues were sequenced using ONT and Illumina technologies and data were aligned to the SILVA 16S rRNA database. To compare the differences in alpha diversity between technologies, paired Wilcoxon signed rank tests (adjusted for FDR) were performed for (A) Shannon index, (B) Simpson index, (C) InvSimpson, and (D) Observed genera using the R package microeco

Fig. 2 Beta diversity analysis of paired FL-ONT and V3V4-Illumina 16S rRNA sequencing on tissue samples at the genus level. Principal Coordinate Analysis (PCoA) plot of Bray-Curtis distance on rarefied normalized abundance. PERMANOVA, ANOSIM and W_d test

Fig. 3 Correlation in bacterial relative abundance (%) at every taxonomic level between FL-ONT and V3V4-Illumina 16S rRNA sequencing. Spearman correlation analysis was performed for paired FL-ONT
(D) ALDEx2 volcano plot. Red points represent Benjamini-Hochberg corrected FDR p-value of Wilcoxon test < 0.05. Rab.win.group refers to the median bacterial species clr value for the group of samples. (E) Top 5 species detected based on effect size using ALDEx2 analysis. Diff.btw refers to the median difference in bacterial species clr values between FL-ONT and V3V4-Illumina groups (Illumina - ONT). ****p < 0.0001, ***p < 0.001