Abstract
High-throughput, multiplexed amplicon sequencing has become a core tool for understanding environmental microbiomes. As sequencing has been widely adopted, many open-source analysis pipelines have been developed to compare microbiomes using compositional analysis frameworks. However, there is increasing evidence that compositional analyses do not provide the information necessary to accurately interpret many community assembly processes, especially when large gradients drive distinct community assembly processes. Recently, sequencing has been combined with QPCR (among other sources of total quantitation) to generate “Quantitative Sequencing” (QSeq) data. QSeq more accurately estimates the true abundance of taxa, is a more reliable basis for inferring correlation, and, ultimately, can be more reliably related to environmental data to infer community assembly processes. In this paper, we use a combination of published data sets, synthesis, and empirical modeling to offer guidance on the contexts in which QSeq is advantageous. As little as 5% variation in total abundance among experimental groups resulted in more accurate inference by QSeq than by compositional methods. Compositional methods for differential abundance and correlation did not reliably detect patterns in abundance and covariance when total abundance varied by more than 20% among experimental groups. Whether QSeq performs better for beta diversity analysis depends on the question being asked and on the analytic strategy (e.g., which distance metric is used); for many questions and methods, QSeq and compositional analysis are equivalent for beta diversity analysis. QSeq is especially useful for taxon-specific analysis, and QSeq transformation and analysis should be the default for answering taxon-specific questions with amplicon sequence data. Publicly available bioinformatics pipelines should incorporate support for QSeq transformation and analysis.
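The difference between compositional and quantitative views can be illustrated with a small numeric sketch. It assumes that a QSeq transformation scales each taxon's proportion of reads by a per-sample total abundance (for example, gene copies measured by QPCR); the abstract does not spell out the exact transformation, so the function name `qseq_transform`, the two samples, and all numbers below are hypothetical.

```python
def qseq_transform(read_counts, total_abundance):
    """Scale each taxon's proportion of reads by the sample's total abundance.

    Assumption: total_abundance is an independent per-sample measurement
    (e.g., QPCR gene copies); this is an illustrative sketch, not the
    authors' published implementation.
    """
    depth = sum(read_counts)
    if depth == 0:
        raise ValueError("sample has no reads")
    return [total_abundance * count / depth for count in read_counts]


def relative_abundance(read_counts):
    """Proportion of reads assigned to each taxon (the compositional view)."""
    depth = sum(read_counts)
    return [count / depth for count in read_counts]


# Hypothetical samples with two taxa each; sample B has twice the total
# abundance of sample A.
reads_a, total_a = [500, 500], 1.0e6
reads_b, total_b = [250, 750], 2.0e6

rel_a = relative_abundance(reads_a)
rel_b = relative_abundance(reads_b)
abs_a = qseq_transform(reads_a, total_a)
abs_b = qseq_transform(reads_b, total_b)

for taxon in range(2):
    print(
        f"taxon {taxon + 1}: "
        f"relative {rel_a[taxon]:.2f} -> {rel_b[taxon]:.2f}, "
        f"absolute {abs_a[taxon]:.0f} -> {abs_b[taxon]:.0f}"
    )
```

In this toy case the proportion of taxon 1 halves between samples even though its scaled abundance is unchanged; only taxon 2 actually increased. That is the kind of taxon-specific pattern a purely compositional comparison can misattribute when total abundance differs among groups.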
Data Availability
The datasets supporting the conclusions of this article are available at https://github.com/djeppschmidt/QSeq_Model along with scripts used to generate the results.
Abbreviations
- CLR: Centered log-ratio
- QPCR: Quantitative polymerase chain reaction
- QSeq: Quantitative sequencing
- ASV: Amplicon sequence variant
- TFW: Tidal freshwater wetlands
- FSP: Farming Systems Project
- GLUSEEN: Global Urban Soil Ecology and Education Network
- USDA: United States Department of Agriculture
Acknowledgements
We thank Dr. Mihai Pop for feedback that substantially improved the manuscript. The Long-Term Agroecosystem Research (LTAR) network is supported by the United States Department of Agriculture.
Funding
Dietrich Epp Schmidt was supported by NRT-INFEWS: UMD Global STEWARDS (STEM Training at the Nexus of Energy, WAter Reuse and FooD Systems), which was awarded to the University of Maryland School of Public Health by the National Science Foundation National Research Traineeship Program, grant number 1828910.
Author information
Contributions
DES, SAY, and JEM conceptualized and revised the manuscript. DES ran the analyses.
Ethics declarations
Ethics Approval and Consent to Participate
Not applicable.
Consent for Publication
All authors consent to publish.
Competing Interests
The authors declare no competing interests.
About this article
Cite this article
Epp Schmidt, D., Maul, J.E. & Yarwood, S. Quantitative Amplicon Sequencing Is Necessary to Identify Differential Taxa and Correlated Taxa Where Population Sizes Differ. Microb Ecol 86, 2790–2801 (2023). https://doi.org/10.1007/s00248-023-02273-z
DOI: https://doi.org/10.1007/s00248-023-02273-z