
Analytic Methods in Microbiome Studies

  • Philipp Rausch
  • Axel Künstner
Chapter

Abstract

Most autoimmune diseases are polygenic conditions with strong environmental components. For example, rheumatoid arthritis (RA) has approximately 100 risk alleles, each with a small individual effect size, which suggests that environmental factors interact with genetic susceptibility in disease development (Okada et al., Nature 506:376–381, 2014). A theme throughout this textbook is the association of microbial populations with disease development. The last 10 years have witnessed an explosion of technologies and informatics pipelines that permit in-depth exploration of these microbial communities. It is therefore important to investigate microbial communities in autoimmune diseases like RA in a systematic and comparable way, so that common dynamics in the microbiome can be found within and across studies and may lead to diagnostic or therapeutic innovations. The previous chapter reviewed multiple technologies, old and new, for querying the microbiota; this chapter focuses on the analysis of commonly used sequencing approaches.

Keywords

Rheumatoid arthritis, Microbiota, Methods

Abbreviations

AL: Average linkage
ANOSIM: Analysis of similarity
BLAST: Basic Local Alignment Search Tool
CCA: Canonical correspondence analysis
CL: Complete linkage
COI: Cytochrome c oxidase subunit I
DA: Differentially abundant
dbRDA: Distance-based redundancy analysis
DGE: Differential gene expression
ITS: Internal transcribed spacer
MDS: Multidimensional scaling
NAST: Nearest alignment space termination
NMDS: Nonmetric multidimensional scaling
OTU: Operational taxonomic unit
PAM: Partitioning around medoids
PCR: Polymerase chain reaction
PD: Phylogenetic diversity
PERMANOVA: Permutational multivariate analysis of variance
RA: Rheumatoid arthritis
RDA: Redundancy analysis
RDP: Ribosomal Database Project
rRNA: Ribosomal ribonucleic acid
SL: Single linkage
UniFrac: Unique fraction metric
UPGMA: Unweighted pair group method with arithmetic mean
ZIG: Zero-inflated Gaussian

References

  1. Okada Y, Wu D, Trynka G, Raj T, Terao C, Ikari K, et al. Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature. 2014;506(7488):376–81.
  2. He X, McLean JS, Edlund A, Yooseph S, Hall AP, Liu S-Y, et al. Cultivation of a human-associated TM7 phylotype reveals a reduced genome and epibiotic parasitic lifestyle. Proc Natl Acad Sci U S A. 2015;112(1):244–9.
  3. Vartoukian SR, Adamowska A, Lawlor M, Moazzez R, Dewhirst FE, Wade WG. In vitro cultivation of ‘unculturable’ oral bacteria, facilitated by community culture and media supplementation with siderophores. PLoS One. 2016;11(1):e0146926.
  4. Solden L, Lloyd K, Wrighton K. The bright side of microbial dark matter: lessons learned from the uncultivated majority. Curr Opin Microbiol. 2016;31:217–26.
  5. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A. 1990;87(12):4576–9.
  6. Woese CR, Fox GE. Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Natl Acad Sci U S A. 1977;74(11):5088–90.
  7. Lane DJ, Pace B, Olsen GJ, Stahl DA, Sogin ML, Pace NR. Rapid determination of 16S ribosomal RNA sequences for phylogenetic analyses. Proc Natl Acad Sci U S A. 1985;82(20):6955–9.
  8. Yarza P, Yilmaz P, Pruesse E, Glockner FO, Ludwig W, Schleifer K-H, et al. Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences. Nat Rev Microbiol. 2014;12(9):635–45.
  9. Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456(7218):53–9.
  10. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005;437(7057):376–80.
  11. Rothberg JM, Hinz W, Rearick TM, Schultz J, Mileski W, Davey M, et al. An integrated semiconductor device enabling non-optical genome sequencing. Nature. 2011;475(7356):348–52.
  12. Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009;323(5910):133–8.
  13. Schloss PD. The effects of alignment quality, distance calculation method, sequence filtering, and region on the analysis of 16S rRNA gene-based studies. PLoS Comput Biol. 2010;6(7):e1000844.
  14. Huttenhower C, Gevers D, Knight R, Abubucker S, Badger JH, Chinwalla AT. Structure, function and diversity of the healthy human microbiome. Nature. 2012;486:207.
  15. Hebert PDN, Cywinska A, Ball SL, deWaard JR. Biological identifications through DNA barcodes. Proc R Soc Lond Ser B Biol Sci. 2003;270(1512):313–21.
  16. Klindworth A, Pruesse E, Schweer T, Peplies J, Quast C, Horn M, et al. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res. 2013;41(1):e1.
  17. Eloe-Fadrosh EA, Ivanova NN, Woyke T, Kyrpides NC. Metagenomics uncovers gaps in amplicon-based detection of microbial diversity. Nat Microbiol. 2016;1:15032.
  18. Brown CT, Hug LA, Thomas BC, Sharon I, Castelle CJ, Singh A, et al. Unusual biology across a group comprising more than 15% of domain Bacteria. Nature. 2015;523(7559):208–11.
  19. Rinke C, Schwientek P, Sczyrba A, Ivanova NN, Anderson IJ, Cheng J-F, et al. Insights into the phylogeny and coding potential of microbial dark matter. Nature. 2013;499(7459):431–7.
  20. Bokulich NA, Subramanian S, Faith JJ, Gevers D, Gordon JI, Knight R, et al. Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing. Nat Methods. 2013;10(1):57–9.
  21. Edgar RC, Flyvbjerg H. Error filtering, pair assembly, and error correction for next-generation sequencing reads. Bioinformatics. 2015;31(21):3476–82.
  22. Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD. Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Appl Environ Microbiol. 2013;79(17):5112–20.
  23. McInerney P, Adams P, Hadi MZ. Error rate comparison during polymerase chain reaction by DNA polymerase. Mol Biol Int. 2014;2014:8.
  24. Gohl DM, Vangay P, Garbe J, MacLean A, Hauge A, Becker A, et al. Systematic improvement of amplicon marker gene methods for increased accuracy in microbiome studies. Nat Biotechnol. 2016;34(9):942–9.
  25. Huse SM, Dethlefsen L, Huber JA, Welch DM, Relman DA, Sogin ML. Exploring microbial diversity and taxonomy using SSU rRNA hypervariable tag sequencing. PLoS Genet. 2008;4(11):e1000255.
  26. Haas BJ, Gevers D, Earl AM, Feldgarden M, Ward DV, Giannoukos G, et al. Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. Genome Res. 2011;21(3):494–504.
  27. Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics. 2011;27(16):2194–200.
  28. Ashelford KE, Chuzhanova NA, Fry JC, Jones AJ, Weightman AJ. New screening software shows that most recent large 16S rRNA gene clone libraries contain chimeras. Appl Environ Microbiol. 2006;72(9):5734–41.
  29. Edgar R. UCHIME2: improved chimera prediction for amplicon sequencing. bioRxiv. 2016.
  30. DeSantis TZ, Hugenholtz P, Keller K, Brodie EL, Larsen N, Piceno YM, et al. NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes. Nucleic Acids Res. 2006;34(suppl 2):W394–W9.
  31. Caporaso JG, Bittinger K, Bushman FD, DeSantis TZ, Andersen GL, Knight R. PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics. 2010;26(2):266–7.
  32. Schloss PD. A high-throughput DNA sequence aligner for microbial ecology studies. PLoS One. 2009;4(12):e8230.
  33. Schloss PD. Secondary structure improves OTU assignments of 16S rRNA gene sequences. ISME J. 2013;7(3):457–60.
  34. White J, Navlakha S, Nagarajan N, Ghodsi M-R, Kingsford C, Pop M. Alignment and clustering of phylogenetic markers – implications for microbial diversity studies. BMC Bioinformatics. 2010;11(1):152.
  35. Moeller AH, Degnan PH, Pusey AE, Wilson ML, Hahn BH, Ochman H. Chimpanzees and humans harbour compositionally similar gut enterotypes. Nat Commun. 2012;3:1179.
  36. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7(5):335–6.
  37. Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, et al. Introducing mothur: open source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol. 2009;75(23):7537–41.
  38. Rognes T, Flouri T, Nichols B, Quince C, Mahé F. VSEARCH: a versatile open source tool for metagenomics. PeerJ. 2016;4:e2584.
  39. Westcott SL, Schloss PD. De novo clustering methods outperform reference-based methods for assigning 16S rRNA gene sequences to operational taxonomic units. PeerJ. 2015;3:e1487.
  40. Schmidt TSB, Matias Rodrigues JF, von Mering C. Limits to robustness and reproducibility in the demarcation of operational taxonomic units. Environ Microbiol. 2015;17(5):1689–706.
  41. Schloss P, Handelsman J. Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness. Appl Environ Microbiol. 2005;71(3):1501–6.
  42. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26(19):2460–1. doi: 10.1093/bioinformatics/btq461.
  43. Li W, Jaroszewski L, Godzik A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics. 2001;17(3):282–3.
  44. DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, et al. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol. 2006;72(7):5069–72.
  45. Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, et al. The ribosomal database project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res. 2009;37(suppl_1):D141–5.
  46. Cole JR, Chai B, Marsh TL, Farris RJ, Wang Q, Kulam SA, et al. The ribosomal database project (RDP-II): previewing a new autoaligner that allows regular updates and the new prokaryotic taxonomy. Nucleic Acids Res. 2003;31(1):442–3.
  47. Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W, Peplies J, et al. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res. 2007;35(21):7188–96.
  48. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41(D1):D590–6.
  49. Schloss PD, Westcott SL. Assessing and improving methods used in operational taxonomic unit-based approaches for 16S rRNA gene sequence analysis. Appl Environ Microbiol. 2011;77(10):3219–26.
  50. Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007;73(16):5261–7.
  51. Liu K-L, Porras-Alfaro A, Kuske CR, Eichorst SA, Xie G. Accurate, rapid taxonomic classification of fungal large subunit rRNA genes. Appl Environ Microbiol. 2011;78(5):1523–33.
  52. Newton I, Roeselers G. The effect of training set on the classification of honey bee gut microbiota using the naive Bayesian classifier. BMC Microbiol. 2012;12(1):221.
  53. Werner JJ, Koren O, Hugenholtz P, DeSantis TZ, Walters WA, Caporaso JG, et al. Impact of training sets on classification of high-throughput bacterial 16S rRNA gene surveys. ISME J. 2012;6(1):94–103.
  54. Edgar R. SINTAX: a simple non-Bayesian taxonomy classifier for 16S and ITS sequences. bioRxiv. 2016.
  55. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.
  56. Liu Z, DeSantis TZ, Andersen GL, Knight R. Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers. Nucleic Acids Res. 2008;36(18):e120.
  57. Price MN, Dehal PS, Arkin AP. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol. 2009;26(7):1641–50.
  58. Price MN, Dehal PS, Arkin AP. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490.
  59. Ludwig W, Strunk O, Westram R, Richter L, Meier H, Yadhukumar, et al. ARB: a software environment for sequence data. Nucleic Acids Res. 2004;32(4):1363–71.
  60. Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 2010;11(1):538.
  61. Evans J, Sheneman L, Foster J. Relaxed neighbor joining: a fast distance-based phylogenetic tree construction method. J Mol Evol. 2006;62(6):785–92.
  62. Sheneman L, Evans J, Foster JA. Clearcut: a fast implementation of relaxed neighbor joining. Bioinformatics. 2006;22(22):2823–4.
  63. Huson DH, Richter DC, Rausch C, Dezulian T, Franz M, Rupp R. Dendroscope: an interactive viewer for large phylogenetic trees. BMC Bioinformatics. 2007;8(1):460.
  64. Whittaker RH. Evolution and measurement of species diversity. Taxon. 1972;21(2/3):213–51.
  65. Whittaker RH. Vegetation of the Siskiyou mountains, Oregon and California. Ecol Monogr. 1960;30(3):279–338.
  66. Shannon CE. A mathematical theory of communication. Bell Syst Tech J. 1948;27:623–56.
  67. Simpson EH. Measurement of diversity. Nature. 1949;163(4148):688.
  68. Jost L. Entropy and diversity. Oikos. 2006;113(2):363–75.
  69. Jost L. Partitioning diversity into independent alpha and beta components. Ecology. 2007;88(10):2427–39.
  70. Faith DP. Conservation evaluation and phylogenetic diversity. Biol Conserv. 1992;61(1):1–10.
  71. Cavender-Bares J, Kozak KH, Fine PVA, Kembel SW. The merging of community ecology and phylogenetic biology. Ecol Lett. 2009;12:693–715.
  72. Chao A. Nonparametric-estimation of the number of classes in a population. Scand J Stat. 1984;11(4):265–70.
  73. Chazdon RL, Colwell RK, Denslow JS, Guariguata MR. Statistical methods for estimating species richness of woody regeneration in primary and secondary rain forests of Northeastern Costa Rica; 1998. p. 285–309.
  74. Chiu C-H, Wang Y-T, Walther BA, Chao A. An improved nonparametric lower bound of species richness via a modified Good–Turing frequency formula. Biometrics. 2014;70(3):671–82.
  75. Koleff P, Gaston KJ, Lennon JJ. Measuring beta diversity for presence–absence data. J Anim Ecol. 2003;72(3):367–82.
  76. Jaccard P. Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bull Soc Vaud Sci Nat. 1901;37:547–79.
  77. Bray JR, Curtis JT. An ordination of the upland forest communities of southern Wisconsin. Ecol Monogr. 1957;27(4):326–49.
  78. Lozupone C, Knight R. UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol. 2005;71(12):8228–35.
  79. Lozupone CA, Hamady M, Kelley ST, Knight R. Quantitative and qualitative beta diversity measures lead to different insights into factors that structure microbial communities. Appl Environ Microbiol. 2007;73(5):1576–85.
  80. Chang Q, Luan Y, Sun F. Variance adjusted weighted UniFrac: a powerful beta diversity measure for comparing communities based on phylogeny. BMC Bioinformatics. 2011;12(1):118.
  81. Chen J, Bittinger K, Charlson ES, Hoffmann C, Lewis J, Wu GD, et al. Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics. 2012;28(16):2106–13.
  82. Swenson NG. Phylogenetic resolution and quantifying the phylogenetic diversity and dispersion of communities. PLoS One. 2009;4(2):e4390.
  83. Davies TJ, Kraft NJB, Salamin N, Wolkovich EM. Incompletely resolved phylogenetic trees inflate estimates of phylogenetic conservatism. Ecology. 2011;93(2):242–7.
  84. Vellend M, Drummond EBM, Tomimatsu H. Measuring phylogenetic biodiversity. In: Magurran AE, McGill BJ, editors. Biological diversity: frontiers in measurement and assessment. Oxford: Oxford University Press; 2011. p. 193–206.
  85. Kanagawa T. Bias and artifacts in multitemplate polymerase chain reactions (PCR). J Biosci Bioeng. 2003;96(4):317–23.
  86. von Wintzingerode F, Göbel UB, Stackebrandt E. Determination of microbial diversity in environmental samples: pitfalls of PCR-based rRNA analysis. FEMS Microbiol Rev. 1997;21(3):213–29.
  87. Weider LJ, Elser JJ, Crease TJ, Mateos M, Cotner JB, Markow TA. The functional significance of ribosomal (r)DNA variation: impacts on the evolutionary ecology of organisms. Annu Rev Ecol Evol Syst. 2005;36:219–42.
  88. Kembel SW, Wu M, Eisen JA, Green JL. Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance. PLoS Comput Biol. 2012;8(10):e1002743.
  89. Salter SJ, Cox MJ, Turek EM, Calus ST, Cookson WO, Moffatt MF, et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol. 2014;12:12.
  90. Weiss S, Amir A, Hyde ER, Metcalf JL, Song SJ, Knight R. Tracking down the sources of experimental contamination in microbiome studies. Genome Biol. 2014;15(12):564.
  91. Anderson MJ. A new method for non-parametric multivariate analysis of variance. Austral Ecol. 2001;26(1):32–46.
  92. Legendre P, Anderson MJ. Distance-based redundancy analysis: testing multispecies responses in multifactorial ecological experiments. Ecol Monogr. 1999;69(1):1–24.
  93. ter Braak CJF. Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology. 1986;67(5):1167–79.
  94. Anderson MJ. Permutation tests for univariate or multivariate analysis of variance and regression. Can J Fish Aquat Sci. 2001;58(3):626–39.
  95. Clarke KR. Non-parametric multivariate analyses of changes in community structure. Aust J Ecol. 1993;18(1):117–43.
  96. Carignan V, Villard M-A. Selecting indicator species to monitor ecological integrity: a review. Environ Monit Assess. 2002;78(1):45–61.
  97. Dufrene M, Legendre P. Species assemblages and indicator species: the need for a flexible asymmetrical approach. Ecol Monogr. 1997;67(3):345–66.
  98. De Cáceres M, Legendre P. Associations between species and groups of sites: indices and statistical inference. Ecology. 2009;90(12):3566–74.
  99. De Cáceres M, Legendre P, Moretti M. Improving indicator species analysis by combining groups of sites. Oikos. 2010;119(10):1674–84.
  100. Knights D, Costello EK, Knight R. Supervised classification of human microbiota. FEMS Microbiol Rev. 2011;35(2):343–59.
  101. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
  102. McMurdie PJ, Holmes S. Waste not, want not: why rarefying microbiome data is inadmissible. PLoS Comput Biol. 2014;10(4):e1003531.
  103. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40.
  104. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.
  105. Hardcastle TJ, Kelly KA. baySeq: empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics. 2010;11(1):422.
  106. Xu L, Paterson AD, Turpin W, Xu W. Assessment and selection of competing models for zero-inflated microbiome data. PLoS One. 2015;10(7):e0129606.
  107. Paulson JN, Stine OC, Bravo HC, Pop M. Differential abundance analysis for microbial marker-gene surveys. Nat Methods. 2013;10(12):1200–2.
  108. Fernandes A, Reid J, Macklaim J, McMurrough T, Edgell D, Gloor G. Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis. Microbiome. 2014;2(1):15.
  109. Thorsen J, Brejnrod A, Mortensen M, Rasmussen MA, Stokholm J, Al-Soud WA, et al. Large-scale benchmarking reveals false discoveries and count transformation sensitivity in 16S rRNA gene amplicon data analysis methods used in microbiome studies. Microbiome. 2016;4(1):62.
  110. Faust K, Sathirapongsasuti JF, Izard J, Segata N, Gevers D, Raes J, et al. Microbial co-occurrence relationships in the human microbiome. PLoS Comput Biol. 2012;8(7):e1002606.
  111. Weiss S, Van Treuren W, Lozupone C, Faust K, Friedman J, Deng Y, et al. Correlation detection strategies in microbial data sets vary widely in sensitivity and precision. ISME J. 2016;10(7):1669–81.
  112. Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, et al. Detecting novel associations in large data sets. Science. 2011;334(6062):1518–24.
  113. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol. 2012;8(9):e1002687.
  114. Faust K, Raes J. CoNet app: inference of biological association networks using Cytoscape. F1000Res. 2016;5:1519.
  115. Ruan Q, Dutta D, Schwalbach MS, Steele JA, Fuhrman JA, Sun F. Local similarity analysis reveals unique associations among marine bacterioplankton species and environmental factors. Bioinformatics. 2006;22(20):2532–8.
  116. Berry D, Widder S. Deciphering microbial interactions and detecting keystone species with co-occurrence networks. Front Microbiol. 2014;5(219):219.
  117. Butts CT. Social network analysis with sna. J Stat Softw. 2008;24(6):1–51.
  118. Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal Complex Syst. 2006;1695:1–9.
  119. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504.
  120. Freeman LC. A set of measures of centrality based on betweenness. Sociometry. 1977;40(1):35–41.
  121. Freeman LC. Centrality in social networks conceptual clarification. Soc Netw. 1979;1(3):215–39.
  122. Bavelas A. Communication patterns in task-oriented groups. J Acoust Soc Am. 1950;22(6):723–30.
  123. Allesina S, Pascual M. Googling food webs: can an eigenvector measure species’ importance for coextinctions? PLoS Comput Biol. 2009;5(9):e1000494.
  124. Newman MEJ, Girvan M. Finding and evaluating community structure in networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2004;69(2):026113.
  125. Okuda S, Tsuchiya Y, Kiriyama C, Itoh M, Morisaki H. Virtual metagenome reconstruction from 16S rRNA gene sequences. Nat Commun. 2012;3:1203.
  126. Langille MGI, Zaneveld J, Caporaso JG, McDonald D, Knights D, Reyes JA, et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol. 2013;31(9):814–21.
  127. Aßhauer KP, Wemheuer B, Daniel R, Meinicke P. Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics. 2015;31(17):2882–4.
  128. Iwai S, Weinmaier T, Schmidt BL, Albertson DG, Poloso NJ, Dabbagh K, et al. Piphillin: improved prediction of metagenomic content by direct inference from human microbiomes. PLoS One. 2016;11(11):e0166104.
  129. Jing G, Sun Z, Wang H, Gong Y, Huang S, Ning K, et al. Parallel-META 3: comprehensive taxonomical and functional analysis platform for efficient comparison of microbial communities. Sci Rep. 2017;7:40371.
  130. McDonald D, Price MN, Goodrich J, Nawrocki EP, DeSantis TZ, Probst A, et al. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 2012;6(3):610–8.
  131. Abubucker S, Segata N, Goll J, Schubert AM, Izard J, Cantarel BL, et al. Metabolic reconstruction for metagenomic data and its application to the human microbiome. PLoS Comput Biol. 2012;8(6):e1002358.
  132. Xu Z, Malmer D, Langille MGI, Way SF, Knight R. Which is more important for classifying microbial communities: who’s there or what they can do? ISME J. 2014;8(12):2357–9.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Max Planck Institute for Evolutionary Biology, Plön, Germany
  2. Institute for Experimental Medicine, Christian-Albrechts-University of Kiel, Kiel, Germany
  3. Group for Medical Systems Biology, Lübeck Institute of Experimental Dermatology (LIED), University of Lübeck, Lübeck, Germany
