Introductory Overview of Statistical Analysis of Microbiome Data

  • Yinglin Xia
  • Jun Sun
  • Ding-Geng Chen
Part of the ICSA Book Series in Statistics book series (ICSABSS)


In this chapter, we first introduce and discuss the themes and statistical hypotheses in human microbiome studies in Sect. 3.1. We then review the classic statistical methods and models for microbiome studies in Sect. 3.2. In Sect. 3.3, we introduce newly developed multivariate statistical methods. Section 3.4 introduces the compositional analysis of microbiome data. In Sect. 3.5, we discuss longitudinal data analysis and causal inference in microbiome studies. In Sect. 3.6, we introduce some statistical packages for analyzing microbiome data. Finally, in Sect. 3.7, we cover the limitations of existing statistical methods and directions for future development.


  1. Adams, R.I., A.C. Bateman, et al. 2015. Microbiota of the indoor environment: A meta-analysis. Microbiome 3 (1): 49.Google Scholar
  2. Aitchison, J. 1981. A new approach to null correlations of proportions. Mathematical Geology 13 (2): 175–189.MathSciNetCrossRefGoogle Scholar
  3. Aitchison, J. 1982. The statistical analysis of compositional data (with discussion). Journal of the Royal Statistical Society, Series B (Statistical Methodology) 44 (2): 139–177.Google Scholar
  4. Aitchison, J. 1983. Principal component analysis of compositional data. Biometrika 70 (1): 57–65.MathSciNetzbMATHCrossRefGoogle Scholar
  5. Aitchison, J. 1984. Reducing the dimensionality of compositional data sets. Journal of the International Association for Mathematical Geology 16 (6): 617–635.CrossRefGoogle Scholar
  6. Aitchison, J. 1986. The statistical analysis of compositional data. London: Chapman and Hall Ltd. Reprinted in 2003 with additional material by The Blackburn Press. Google Scholar
  7. Albenberg, L.G., and G.D. Wu. 2014. Diet and the intestinal microbiome: Associations, functions, and implications for health and disease. Gastroenterology 146 (6): 1564–1572.CrossRefGoogle Scholar
  8. Albenberg, L.G., J.D. Lewis, et al. 2012. Food and the gut microbiota in IBD: A critical connection. Current Opinion in Gastroenterology 28 (4): 314–320.
  9. Alekseyenko, A.V., G.I. Perez-Perez, et al. 2013. Community differentiation of the cutaneous microbiota in psoriasis. Microbiome 1 (1): 31.CrossRefGoogle Scholar
  10. Anders, S., and W. Huber. 2010. Differential expression analysis for sequence count data. Genome Biology 11 (10): R106–R106.CrossRefGoogle Scholar
  11. Anderson, M.J. 2001. A new method for non-parametric multivariate analysis of variance. Austral Ecology 26: 32–46.Google Scholar
  12. Backhed, F., J. Roswall, et al. 2015. Dynamics and stabilization of the human gut microbiome during the first year of life. Cell Host & Microbe 17 (6): 852.CrossRefGoogle Scholar
  13. Baxter, N.T., J.J. Wan, et al. 2015. Intra- and interindividual variations mask interspecies variation in the microbiota of sympatric peromyscus populations. Applied and Environment Microbiology 81 (1): 396–404.CrossRefGoogle Scholar
  14. Bhattacharya, A., D. Pati, et al. 2015. Dirichlet–Laplace priors for optimal shrinkage. Journal of the American Statistical Association 110 (512): 1479–1490.MathSciNetzbMATHCrossRefGoogle Scholar
  15. Bhute, S., P. Pande, et al. 2016. Molecular characterization and meta-analysis of gut microbial communities illustrate enrichment of prevotella and megasphaera in Indian subjects. Frontiers in Microbiology 7: 660.Google Scholar
  16. Bokulich, N.A., J. Chung, et al. 2016. Antibiotics, birth mode, and diet shape microbiome maturation during early life. Science Translational Medicine 8 (343): 343ra382.Google Scholar
  17. Bose, E., M. Hravnak, et al. 2017. Vector autoregressive (VAR) models and granger causality in time series analysis in nursing research: Dynamic changes among vital signs prior to cardiorespiratory instability events as an example. Nursing Research 66 (1): 12–19.CrossRefGoogle Scholar
  18. Bucci, V., B. Tzen, et al. 2016. MDSINE: Microbial dynamical systems inference engine for microbiome time-series analyses. Genome Biology 17 (1): 016–0980.Google Scholar
  19. Caporaso, J.G., J. Kuczynski, et al. 2010. QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7 (5): 335–336.CrossRefGoogle Scholar
  20. Castellarin, M., R.L. Warren, et al. 2012. Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma. Genome Research 22 (2): 299–306.CrossRefGoogle Scholar
  21. Chang, Q., Y. Luan, et al. 2011. Variance adjusted weighted UniFrac: A powerful beta diversity measure for comparing communities based on phylogeny. BMC Bioinformatics 12: 118.CrossRefGoogle Scholar
  22. Charlson, E.S., J. Chen, et al. 2010. Disordered microbial communities in the upper respiratory tract of cigarette smokers. PLoS ONE 5 (12): e15216.CrossRefGoogle Scholar
  23. Chen, J., and H. Li. 2013. Variable selection for sparse dirichlet-multinomial regression with an application to microbiome data analysis. The Annals of Applied Statistics 7 (1): 418–442.MathSciNetzbMATHCrossRefGoogle Scholar
  24. Chen, E.Z., and H. Li. 2016. A two-part mixed-effects model for analyzing longitudinal microbiome compositional data. Bioinformatics 32 (17): 2611–2617.CrossRefGoogle Scholar
  25. Chen, D., and K. Peace. 2013. Applied meta-analysis with R. New York: Chapman and Hall/CRC.Google Scholar
  26. Chen, J., K. Bittinger, et al. 2012a. Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics 28 (16): 2106–2113.CrossRefGoogle Scholar
  27. Chen, W., F. Liu, et al. 2012b. Human intestinal lumen and mucosa-associated microbiota in patients with colorectal cancer. PLoS ONE 7 (6): e39743.CrossRefGoogle Scholar
  28. Chen, J., E. Ryu, et al. 2016. Impact of demographics on human gut microbial diversity in a US Midwest population. PeerJ 4: e1514.CrossRefGoogle Scholar
  29. Chen, J., E. King, et al. 2018. An omnibus test for differential distribution analysis of microbiome sequencing data. Bioinformatics 34 (4): 643–651.CrossRefGoogle Scholar
  30. Cook, R.D. 1994. On the interpretation of regression plots. Journal of the American Statistical Association 89 (425): 177–189.MathSciNetzbMATHCrossRefGoogle Scholar
  31. Cook, R.D. 1996. Graphics for regressions with a binary response. Journal of the American Statistical Association 91 (435): 983–992.MathSciNetzbMATHCrossRefGoogle Scholar
  32. Cook, R.D. 2004. Testing predictor contributions in sufficient dimension reduction. The Annals of Statistics 32 (3): 1062–1092.MathSciNetzbMATHCrossRefGoogle Scholar
  33. Costello, E.K., C.L. Lauber, et al. 2009. Bacterial community variation in human body habitats across space and time. Science 326 (5960): 1694–1697.CrossRefGoogle Scholar
  34. Curtis, H. 1997. What is normal vaginal flora? Genitourinary Medicine 73 (3): 230.CrossRefGoogle Scholar
  35. D’Argenio, V., G. Casaburi, et al. 2014. Comparative metagenomic analysis of human gut microbiome composition using two different bioinformatic pipelines. BioMed Research International, 2014.Google Scholar
  36. Degnan, P.H., A.E. Pusey, et al. 2012. Factors associated with the diversification of the gut microbial communities within chimpanzees from Gombe national park. Proceedings of the National Academy of Sciences of the United States of America 109 (32): 13034–13039.CrossRefGoogle Scholar
  37. Dennis, S.Y. 1991. On the hyper-dirichlet type 1 and hyper-liouville distributions. Communications in Statistics—Theory and Methods 20 (12): 4069–4081.Google Scholar
  38. Dethlefsen, L., and D.A. Relman. 2011. Incomplete recovery and individualized responses of the human distal gut microbiota to repeated antibiotic perturbation. Proceedings of the National Academy of Sciences of the United States of America 108 (Suppl 1): 4554–4561.CrossRefGoogle Scholar
  39. Dethlefsen, L., S. Huse, et al. 2008. The pervasive effects of an antibiotic on the human gut microbiota, as revealed by deep 16S rRNA sequencing. PLoS Biology 6 (11): e280.CrossRefGoogle Scholar
  40. Dhariwal, A., J. Chong, et al. 2017. MicrobiomeAnalyst: a web-based tool for comprehensive statistical, visual and meta-analysis of microbiome data. Nucleic Acids Research 45 (W1): W180–W188CrossRefGoogle Scholar
  41. Diggle, P.J., P. Heagerty, et al. 2002. Analysis of longitudinal data. Oxford: Oxford University Press.Google Scholar
  42. Donoho, D.L. 2000. High-dimensional data analysis: The curses and blessings of dimensionality. In Conference on Mathematical Challenges of the 21st Century. American Mathematical Society.Google Scholar
  43. Duvallet, C., S.M. Gibbons, et al. 2017. Meta-analysis of gut microbiome studies identifies disease-specific and shared responses. Nature Communications 8 (1): 1784.Google Scholar
  44. Erb, I., and C. Notredame. 2016. How should we measure proportionality on relative gene expression data? Theory in Biosciences 135: 21–36.CrossRefGoogle Scholar
  45. Erb, I., T. Quinn, et al. 2017. Differential proportionality—A normalization-free approach to differential gene expression. bioRxiv.Google Scholar
  46. Fan, J., and R. Li. 2006. Statistical challenges with high dimensionality: Feature selection in knowledge discovery. In Proceedings of the international congress of mathematicians, vol. III, ed. M. Sanz-Sole, J. Soria, J.L. Varona, and J. Verdera, 595–622. Freiburg: European Mathematical Society.Google Scholar
  47. Fang, R., B.D. Wagner, et al. 2016. Zero-inflated negative binomial mixed model: An application to two microbial organisms important in oesophagitis. Epidemiology and Infection 144 (11): 2447–2455.CrossRefGoogle Scholar
  48. Fei, N., and L. Zhao. 2013. An opportunistic pathogen isolated from the gut of an obese human causes obesity in germfree mice. ISME Journal 7 (4): 880–884.CrossRefGoogle Scholar
  49. Fernandes, A.D., J.M. Macklaim, et al. 2013. ANOVA-like differential expression (ALDEx) analysis for mixed population RNA-Seq. PLoS ONE 8 (7): e67019.CrossRefGoogle Scholar
  50. Finucane, M.M., T.J. Sharpton, et al. 2014. A taxonomic signature of obesity in the microbiome? Getting to the guts of the matter. PLoS ONE 9 (1): e84689.CrossRefGoogle Scholar
  51. Fisher, C.K., and P. Mehta. 2014. Identifying keystone species in the human gut microbiome from metagenomic timeseries using sparse linear regression. PLoS ONE 9 (7): e102451.CrossRefGoogle Scholar
  52. Fitzmaurice, G.M., N.M. Laird, et al. 2004. Applied longitudinal analysis. NJ: Wiley.Google Scholar
  53. Gajer, P., R.M. Brotman, et al. 2012. Temporal dynamics of the human vaginal microbiota. Science Translational Medicine 4 (132): 3003605.CrossRefGoogle Scholar
  54. Gerber, G.K. 2014. The dynamic microbiome. FEBS Letters 588 (22): 4131–4139.CrossRefGoogle Scholar
  55. Gerber, G.K. 2015. Longitudinal microbiome data analysis. In Metagenomics for microbiology, ed. J. Izard and M.C. Rivera. London, UK: Elsevier Inc.CrossRefGoogle Scholar
  56. Gerber, G.K., A.B. Onderdonk, et al. 2012. Inferring dynamic signatures of microbes in complex host ecosystems. PLoS Computational Biology 8 (8): e1002624.CrossRefGoogle Scholar
  57. Giatsis, C., D. Sipkema, et al. 2014. The colonization dynamics of the gut microbiota in tilapia larvae. PLoS ONE 9 (7): e103641.CrossRefGoogle Scholar
  58. Gibbons, S.M., S.M. Kearney, et al. 2017. Two dynamic regimes in the human gut microbiome. PLoS Computational Biology 13 (2): e1005364.CrossRefGoogle Scholar
  59. Gloor, G.B., and G. Reid. 2016. Compositional analysis: A valid approach to analyze microbiome high-throughput sequencing data. Canadian Journal of Microbiology 62 (8): 692–703.CrossRefGoogle Scholar
  60. Gloor, G.B., J.R. Wu, et al. 2016. It’s all relative: Analyzing microbiome data as compositions. Annals of Epidemiology 26 (5): 322–329.CrossRefGoogle Scholar
  61. Gorzelak, M.A., S.K. Gill, et al. 2015. Methods for improving human gut microbiome data by reducing variability through sample processing and storage of stool. PLoS ONE 10 (8): e0134802.CrossRefGoogle Scholar
  62. Grantham, Neal S., Brian J. Reich, et al. 2017. MIMIX: A Bayesian mixed-effects model for microbiome data from designed experiments. arXiv:1703.07747 [stat.ME].
  63. Guo, Y., H.L. Logan, et al. 2013. Selecting a sample size for studies with repeated measures. BMC Medical Research Methodology 13 (1): 100.Google Scholar
  64. Gupta, K., A.E. Stapleton, et al. 1998. Inverse association of H2O2-producing lactobacilli and vaginal Escherichia coli colonization in women with recurrent urinary tract infections. The Journal of Infectious Diseases 178 (2): 446–450.CrossRefGoogle Scholar
  65. Hastie, T.J., and R.J. Tibshirani. 1990. Generalized additive models. London: Chapman & Hall.Google Scholar
  66. He, Y., B.-J. Zhou, et al. 2013. Comparison of microbial diversity determined with the same variable tag sequence extracted from two different PCR amplicons. BMC Microbiology 13 (1): 208.CrossRefGoogle Scholar
  67. Hillier, S.L., M.A. Krohn, et al. 1993. The normal vaginal flora, H2O2-producing lactobacilli, and bacterial vaginosis in pregnant women. Clinical Infectious Diseases 16: S273–S281.Google Scholar
  68. Ho, N. T., and F. Li. 2018. MetamicrobiomeR: An R package for analysis of microbiome relative abundance data using zero-inflated beta GAMLSS and meta-analysis across studies using random effect models. bioRxiv preprint first posted online, 4 Apr 2018.Google Scholar
  69. Holman, D.B., B.W. Brunelle, et al. 2017. Meta-analysis to define a core microbiota in the swine gut. mSystems 2 (3): e00004–e00017.Google Scholar
  70. Holmes, I., K. Harris, et al. 2012. Dirichlet multinomial mixtures: Generative models for microbial metagenomics. PLoS ONE 7 (2): e30126.CrossRefGoogle Scholar
  71. Huang, J., P. Breheny, et al. 2012. A selective review of group selection in high-dimensional models. Statistical Science: A Review Journal of The Institute of Mathematical Statistics 27 (4): 481–499.
  72. Ivanov, I.I., and D.R. Littman. 2010. Segmented filamentous bacteria take the stage. Mucosal Immunology 3 (3): 209–212.CrossRefGoogle Scholar
  73. Ivanov, I.I., K. Atarashi, et al. 2009. Induction of intestinal Th17 cells by segmented filamentous bacteria. Cell 139 (3): 485–498.CrossRefGoogle Scholar
  74. Jakobsson, H.E., C. Jernberg, et al. 2010. Short-term antibiotic treatment has differing long-term impacts on the human throat and gut microbiome. PLoS ONE 5 (3): e9836.CrossRefGoogle Scholar
  75. Jannicke Moe, S., A.B. Kristoffersen, et al. 2005. From patterns to processes and back: Analysing density-dependent responses to an abiotic stressor by statistical and mechanistic modelling. Proceedings of the Royal Society B: Biological Sciences 272 (1577): 2133–2142.CrossRefGoogle Scholar
  76. Jin, D., S. Wu, et al. 2015. Lack of vitamin D receptor causes dysbiosis and changes the functions of the murine intestinal microbiome. Clinical Therapeutics 37 (5): 996–1009. e1007.CrossRefGoogle Scholar
  77. Jonsson, V. 2017. Statistical analysis and modelling of gene count data in metagenomics. Sweden: Göteborg.Google Scholar
  78. Jonsson, V., T. Osterlund, et al. 2017. Variability in metagenomic count data and its influence on the identification of differentially abundant genes. J Comput Biol 24 (4): 311–326.CrossRefGoogle Scholar
  79. Kanhere, M., J. He, et al. 2018. Bolus weekly vitamin D3 supplementation impacts gut and airway microbiota in adults with cystic fibrosis: A double-blind, randomized, placebo-controlled clinical trial. Journal of Clinical Endocrinology and Metabolism 103 (2): 564–574.CrossRefGoogle Scholar
  80. Kelley, S.T., D.V. Skarra, et al. 2016. The gut microbiome is altered in a Letrozole-induced mouse model of polycystic ovary syndrome. PLoS ONE 11 (1): e0146509.CrossRefGoogle Scholar
  81. Kelly, B.J., R. Gross, et al. 2015. Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA. Bioinformatics 31 (15): 2461–2468.CrossRefGoogle Scholar
  82. Kim, K.A., I.H. Jung, et al. 2013. Comparative analysis of the gut microbiota in people with different levels of ginsenoside Rb1 degradation to compound K. PLoS ONE 8 (4): e62409.CrossRefGoogle Scholar
  83. Koenig, J.E., A. Spor, et al. 2011. Succession of microbial consortia in the developing infant gut microbiome. Proceedings of the National Academy of Sciences of the United States of America 1: 4578–4585.CrossRefGoogle Scholar
  84. Kostic, A.D., D. Gevers, et al. 2012. Genomic analysis identifies association of fusobacterium with colorectal carcinoma. Genome Research 22 (2): 292–298.CrossRefGoogle Scholar
  85. Kostic, A. D., D. Gevers, et al. 2015. The dynamics of the human infant gut microbiome in development and in progression towards type 1 diabetes. Cell Host & Microbe 17 (2): 260-273CrossRefGoogle Scholar
  86. Kuczynski, J., Z. Liu, et al. 2010. Microbial community resemblance methods differ in their ability to detect biologically relevant patterns. Nature Methods 7 (10): 813–819.CrossRefGoogle Scholar
  87. La Rosa, P.S., J.P. Brooks, et al. 2012a. Hypothesis testing and power calculations for taxonomic-based human microbiome data. PLoS ONE 7 (12): e52078.CrossRefGoogle Scholar
  88. La Rosa, P.S., B. Shands, et al. 2012b. Statistical object data analysis of taxonomic trees from human microbiome data. PLoS ONE 7 (11): e48996.CrossRefGoogle Scholar
  89. La Rosa, P.S., B.B. Warner, et al. 2014. Patterned progression of bacterial populations in the premature infant gut. Proceedings of the National Academy of Sciences 111 (34): 12522–12527.CrossRefGoogle Scholar
  90. La Rosa, P.S., Y. Zhou, et al. 2015. Hypothesis testing of metagenomic data. In Metagenomics for microbiology, ed. J. Izard and M.C. Rivera, 81–96. Waltham, MA, USA: Academic Press.Google Scholar
  91. La Rosa, Patricio S., Elena Deych, et al. 2016. HMP: Hypothesis testing and power calculations for comparing metagenomic samples from HMP. R package version 1.4.3.
  92. Lahti, L., and J. Salojarvi. 2014–2016. Microbiome R package. URL:
  93. Lahti, L., A. Salonen, et al. 2013. Associations between the human intestinal microbiota, Lactobacillus rhamnosus GG and serum lipids indicated by integrated analysis of high-throughput profiling data. PeerJ 1: e32.Google Scholar
  94. Lee, J., and M. Sison-Mangus. 2018. A Bayesian semiparametric regression model for joint analysis of microbiome data. Frontiers in Microbiology 9: 522.Google Scholar
  95. Lee, K.H., Brent A. Coull, et al. 2017. Bayesian variable selection for multivariate zero-inflated models: Application to microbiome count data. arXiv:1711.00157 [stat.AP].
  96. Lewis, J.D., E.Z. Chen, et al. 2015. Inflammation, antibiotics, and diet as environmental stressors of the gut microbiome in pediatric Crohn’s disease. Cell Host & Microbe 18 (4): 489–500.CrossRefGoogle Scholar
  97. Ley, R.E., F. Backhed, et al. 2005. Obesity alters gut microbial ecology. Proceedings of the National Academy of Sciences of the United States of America 102 (31): 11070–11075.CrossRefGoogle Scholar
  98. Ley, R.E., P.J. Turnbaugh, et al. 2006. Microbial ecology: Human gut microbes associated with obesity. Nature 444 (7122): 1022–1023.CrossRefGoogle Scholar
  99. Li, K.-C. 1991. Sliced inverse regression for dimension reduction. Journal of the American Statistical Association 86 (414): 316–327.MathSciNetzbMATHCrossRefGoogle Scholar
  100. Love, M.I., W. Huber, et al. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15 (12): 550.Google Scholar
  101. Lovell, D., V. Pawlowsky-Glahn, et al. 2015. Proportionality: A valid alternative to correlation for relative data. PLoS Computational Biology 11 (3): e1004075.CrossRefGoogle Scholar
  102. Lozupone, C.A., and R. Knight. 2005. UnifFrac: A new phylogenetic method for comparing microbial communities. Applied and Environmental Microbiology 71 (12): 8228–8235.CrossRefGoogle Scholar
  103. Lozupone, C.A., M. Hamady, et al. 2007. Quantitative and qualitative beta diversity measures lead to different insights into factors that structure microbial communities. Applied and Environment Microbiology 73 (5): 1576–1585.CrossRefGoogle Scholar
  104. Lozupone, C., M.E. Lladser, et al. 2011. UniFrac: An effective distance metric for microbial community comparison. ISME Journal 5 (2): 169–172.CrossRefGoogle Scholar
  105. Lozupone, C.A., J.I. Stombaugh, et al. 2012. Diversity, stability and resilience of the human gut microbiota. Nature 489 (7415): 220–230.CrossRefGoogle Scholar
  106. Lozupone, C.A., J. Stombaugh, et al. 2013. Meta-analyses of studies of the human microbiota. Genome Research 23 (10): 1704–1714.CrossRefGoogle Scholar
  107. Lu, J., J.K. Tomfohr, et al. 2005. Identifying differential expression in multiple SAGE libraries: An overdispersed log-linear model approach. BMC Bioinformatics 6 (1): 165.CrossRefGoogle Scholar
  108. MacKinnon, D.P. 2008. Introduction to statistical mediation analysis. Mahwah, NJ: Erlbaum.Google Scholar
  109. MacKinnon, D.P., A.J. Fairchild, et al. 2007. Mediation analysis. Annual Review of Psychology 58: 593–614.CrossRefGoogle Scholar
  110. Mancabelli, L., C. Milani, et al. 2017. Meta-analysis of the human gut microbiome from urbanized and pre-agricultural populations. Environmental Microbiology 19 (4): 1379–1390.CrossRefGoogle Scholar
  111. Mandal, S., W. Van Treuren, et al. 2015. Analysis of composition of microbiomes: A novel method for studying microbial composition. Microbial Ecology in Health and Disease 26: 27663.Google Scholar
  112. Mao, Jialiang, Yuhan Chen, et al. 2017. Bayesian graphical compositional regression for microbiome data. arXiv:1712.04723 [stat.ME].
  113. Marino, S., N.T. Baxter, et al. 2014. Mathematical modeling of primary succession of murine intestinal microbiota. Proceedings of the National Academy of Sciences of the United States of America 111 (1): 439–444.CrossRefGoogle Scholar
  114. McArdle, B.H., and M.J. Anderson. 2001. Fitting multivariate models to community data: A comment on distance based redundancy analysis. Ecology 82: 290–297.CrossRefGoogle Scholar
  115. McCarthy, D.J., Y. Chen, et al. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Research 40 (10): 4288–4297.CrossRefGoogle Scholar
  116. McCord, A.I., C.A. Chapman, et al. 2014. Fecal microbiomes of non-human primates in Western Uganda reveal species-specific communities largely resistant to habitat perturbation. American Journal of Primatology 76 (4): 347–354.CrossRefGoogle Scholar
  117. McMurdie, P.J., and S. Holmes. 2013. Phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE 8 (4): e61217.CrossRefGoogle Scholar
  118. McMurdie, P.J., and S. Holmes. 2014. Waste not, want not: Why rarefying microbiome data is inadmissible. PLoS Computational Biology 10 (4): e1003531.CrossRefGoogle Scholar
  119. Moher, D., A. Liberati, et al. 2009. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLOS Medicine 6 (7): e1000097.CrossRefGoogle Scholar
  120. Morgan, X.C., and C. Huttenhower. 2012. Human microbiome analysis. PLoS Comput Biol 8 (12): 27. (Chapter 12).CrossRefGoogle Scholar
  121. Morgan, X.C., T.L. Tickle, et al. 2012. Dysfunction of the intestinal microbiome in inflammatory bowel disease and treatment. Genome Biology 13 (9): R79.CrossRefGoogle Scholar
  122. Narrowe, A.B., M. Albuthi-Lantz, et al. 2015. Perturbation and restoration of the fathead minnow gut microbiome after low-level triclosan exposure. Microbiome 3: 6.Google Scholar
  123. Nilakanta, H., K.L. Drews, et al. 2014. A review of software for analyzing molecular sequences. BMC Research Notes 7 (1): 830.CrossRefGoogle Scholar
  124. Nobel, Y.R., L.M. Cox, et al. 2015. Metabolic and metagenomic outcomes from early-life pulsed antibiotic treatment. Nature Communications 6: 7486.Google Scholar
  125. Oksanen, Jari, F. Guillaume Blanchet, et al. 2016. Vegan: Community ecology package. R package version 2.4-1.
  126. Palmer, C., E.M. Bik, et al. 2007. Development of the human infant intestinal microbiota. PLoS Biology 5 (7): 26.CrossRefGoogle Scholar
  127. Paulson, J.N., O.C. Stine, et al. 2013a. Differential abundance analysis for microbial marker-gene surveys. Nature Methods 10 (12): 1200–1202.CrossRefGoogle Scholar
  128. Paulson, J.N., O.C. Stine, et al. 2013b. Robust methods for differential abundance analysis in marker gene surveys. Nature Methods 10 (12): 1200–1202.CrossRefGoogle Scholar
  129. Pawlowsky-Glahn, V., and A. Buccianti. 2011. Compositional data analysis: Theory and applications. Chichester, UK: Wiley.Google Scholar
  130. Pawlowsky-Glahn, V., J.J. Egozcue, et al. 2015. Modeling and analysis of compositional data. London, UK: Springer. Wiley.Google Scholar
  131. Pearson, K. 1897. Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs. Proceedings of the Royal Society of London LX: 489–502.zbMATHCrossRefGoogle Scholar
  132. Perez-Cobas, A.E., M.J. Gosalbes, et al. 2013. Gut microbiota disturbance during antibiotic therapy: A multi-omic approach. Gut 62 (11): 1591–1601.CrossRefGoogle Scholar
  133. Peterfreund, G.L., L.E. Vandivier, et al. 2012. Succession in the gut microbiome following antibiotic and antibody therapies for clostridium difficile. PLoS ONE 7 (10): 10.CrossRefGoogle Scholar
  134. Peterson, J., S. Garges, et al. 2009. The NIH human microbiome project. Genome Research 19: 2317–2323.Google Scholar
  135. Plummer, E., J. Twin, et al. 2015. A comparison of three bioinformatics pipelines for the analysis of preterm gut microbiota using 16S rRNA gene sequencing data. Journal of Proteomics and Bioinformatics 8: 283–291.Google Scholar
  136. Praveen, P., F. Jordan, et al. 2015. The role of breast-feeding in infant immune system: A systems perspective on the intestinal microbiome. Microbiome 3 (1): 41.Google Scholar
  137. Rawls, J.F., B.S. Samuel, et al. 2004. Gnotobiotic zebrafish reveal evolutionarily conserved responses to the gut microbiota. Proceedings of the National Academy of Sciences of the United States of America 101 (13): 4596–4601.CrossRefGoogle Scholar
  138. Rawls, J.F., M.A. Mahowald, et al. 2006. Reciprocal gut microbiota transplants from zebrafish and mice to germ-free recipients reveal host habitat selection. Cell 127 (2): 423–433.CrossRefGoogle Scholar
  139. Redondo-Lopez, V., R.L. Cook, et al. 1990. Emerging role of lactobacilli in the control and maintenance of the vaginal bacterial microflora. Reviews of Infectious Diseases 12 (5): 856–872.CrossRefGoogle Scholar
  140. Ridaura, V.K., J.J. Faith, et al. 2013. Gut microbiota from twins discordant for obesity modulate metabolism in mice. Science 341 (6150): 1241214.Google Scholar
  141. Rigby, R., and D. Stasinopoulos. 2001. The GAMLSS project: A flexible approach to statistical modelling. In New trends in statistical modelling: Proceedings of the 16th international workshop on statistical modelling, ed. B. Klein and L. Korsholm, 249–256. Odense, Denmark.Google Scholar
  142. Rigby, R.A., and D.M. Stasinopoulos. 2005. Generalized additive models for location, scale and shape. Journal of the Royal Statistical Society: Series C (Applied Statistics) 54 (3): 507–554.MathSciNetzbMATHCrossRefGoogle Scholar
  143. Robinson, M.D., and G.K. Smyth. 2007. Moderated statistical tests for assessing differences in tag abundance. Bioinformatics 23 (21): 2881–2887.CrossRefGoogle Scholar
  144. Robinson, M.D., and G.K. Smyth. 2008. Small-sample estimation of negative binomial dispersion, with applications to SAGE data. Biostatistics 9 (2): 321–332.zbMATHCrossRefGoogle Scholar
  145. Robinson, M.D., D.J. McCarthy, et al. 2010. edgeR: A bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26 (1): 139–140.CrossRefGoogle Scholar
  146. Rogosa, M., and M.E. Sharpe. 1960. Species differentiation of human vaginal lactobacilli. Journal of General Microbiology 23: 197–201.CrossRefGoogle Scholar
  147. Romero, R., S.S. Hassan, et al. 2014. The composition and stability of the vaginal microbiota of normal pregnant women is different from that of non-pregnant women. Microbiome 2 (1): 4.CrossRefGoogle Scholar
  148. Rowe, D.B. 2003. Multivariate Bayesian statistics: Models for source separation and signal unmixing. Boca Raton, FL: CRC Press.Google Scholar
  149. Rush, S., C. Lee, et al. 2016. The phylogenetic LASSO and the microbiome. arXiv:1607.08877 [stat.ML].
  150. Samuel, B.S., and J.I. Gordon. 2006. A humanized gnotobiotic mouse model of host-archaeal-bacterial mutualism. Proceedings of the National Academy of Sciences of the United States of America 103 (26): 10011–10016.CrossRefGoogle Scholar
  151. Sanders, J.G., S. Powell, et al. 2014. Stability and phylogenetic correlation in gut microbiota: Lessons from ants and apes. Molecular Ecology 23 (6): 1268–1283.CrossRefGoogle Scholar
  152. Schloss, P.D., S.L. Westcott, et al. 2009. Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environment Microbiology 75 (23): 7537–7541.CrossRefGoogle Scholar
  153. Segata, N., S.K. Haake, et al. 2012. Composition of the adult digestive tract bacterial microbiome based on seven mouth surfaces, tonsils, throat and stool samples. Genome Biology 13 (6): 2012–2013.CrossRefGoogle Scholar
  154. Ren, Boyu, Sergio Bacallado, et al. 2017a. Bayesian nonparametric mixed effects models in microbiome data analysis. arXiv:1711.01241 [stat.ME].
  155. Ren, Boyu, Sergio Bacallado, et al. 2017b. Bayesian nonparametric ordination for the analysis of microbial communities. arXiv:1601.05156 [stat.ME].
  156. Sewankambo, N., R.H. Gray, et al. 1997. HIV-1 infection associated with abnormal vaginal flora morphology and bacterial vaginosis. Lancet 350 (9077): 546–550.Google Scholar
  157. Smith, M.I., T. Yatsunenko, et al. 2013. Gut microbiomes of Malawian twin pairs discordant for kwashiorkor. Science 339 (6119): 548–554.CrossRefGoogle Scholar
  158. Smith, C.C., L.K. Snowberg, et al. 2015. Dietary input of microbes and host genetic variation shape among-population differences in stickleback gut microbiota. ISME Journal 9 (11): 2515–2526.CrossRefGoogle Scholar
  159. Smyth, G. 2005. Limma: Linear models for microarray data. In Bioinformatics and computational biology solutions using R and bioconductor, ed. R. Gentleman, V. Carey, S. Dudoit, and W.R. Irizarry, 397–420. New York: Springer.Google Scholar
  160. Spor, A., O. Koren, et al. 2011. Unravelling the effects of the environment and host genotype on the gut microbiome. Nature Reviews Microbiology 9 (4): 279–290.
  161. Stasinopoulos, D., and R. Rigby. 2007. Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software 23 (7): 1–46.
  162. Stein, R.R., V. Bucci, et al. 2013. Ecological modeling from time-series inference: Insight into dynamics and stability of intestinal microbiota. PLoS Computational Biology 9 (12): e1003388.
  163. Stenseth, N.C., M. Llope, et al. 2006. Seasonal plankton dynamics along a cross-shelf gradient. Proceedings of the Royal Society B: Biological Sciences 273 (1603): 2831–2838.
  164. Stige, L.C., J. Stave, et al. 2006. The effect of climate variation on agro-pastoral production in Africa. Proceedings of the National Academy of Sciences of the United States of America 103 (9): 3049–3053.
  165. Storey, J.D., and R. Tibshirani. 2003. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences 100 (16): 9440–9445.
  166. Sugihara, G., R. May, et al. 2012. Detecting causality in complex ecosystems. Science 338 (6106): 496–500.
  167. Sun, J., and E.B. Chang. 2014. Exploring gut microbes in human health and disease: Pushing the envelope. Genes & Diseases 1 (2): 132–139.
  168. Swenson, N.G. 2011. Phylogenetic beta diversity metrics, trait evolution and inferring the functional beta diversity of communities. PLoS ONE 6 (6): e21264.
  169. Sze, M., and P.D. Schloss. 2016. Looking for a signal in the noise: Revisiting obesity and the microbiome. bioRxiv.
  170. Tang, Y., L. Ma, et al. 2018. A phylogenetic scan test on a Dirichlet-tree multinomial model for microbiome data. The Annals of Applied Statistics 12 (1): 1–26.
  171. Tang, Y., and D.L. Nicolae. 2017. Mixed effect Dirichlet-tree multinomial for longitudinal microbiome data and weight prediction. arXiv:1706.06380v1 [stat.AP], 20 Jun 2017.
  172. Thioulouse, J. 2011. Simultaneous analysis of a sequence of paired ecological tables: A comparison of several methods. The Annals of Applied Statistics 5 (4): 2300–2325.
  173. Tibshirani, R. 1996. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society: Series B (Methodological) 58 (1): 267–288.
  174. Trosvik, P., K. Rudi, et al. 2008. Characterizing mixed microbial population dynamics using time-series analysis. ISME Journal 2 (7): 707–715.
  175. Trosvik, P., N.C. Stenseth, et al. 2010. Convergent temporal dynamics of the human infant gut microbiota. ISME Journal 4 (2): 151–158.
  176. Tseng, G.C., D. Ghosh, et al. 2012. Comprehensive literature review and statistical considerations for microarray meta-analysis. Nucleic Acids Research 40 (9): 3785–3799.
  177. Tung, J., L.B. Barreiro, et al. 2015. Social networks predict gut microbiome composition in wild baboons. eLife 4: e05224.
  178. Turnbaugh, P.J., V.K. Ridaura, et al. 2009. The effect of diet on the human gut microbiome: A metagenomic analysis in humanized gnotobiotic mice. Science Translational Medicine 1 (6): 6ra14.
  179. Udell, M., and A. Townsend. 2017. Nice latent variable models have log-rank. arXiv:1705.07474 [cs.LG].
  180. van den Boogaart, K.G., and R. Tolosana-Delgado. 2013a. Analyzing compositional data with R. Heidelberg: Springer.
  181. van den Boogaart, K.G., and R. Tolosana-Delgado. 2013b. Analyzing compositional data with R. London, UK: Springer.
  182. Voigt, A.Y., P.I. Costea, et al. 2015. Temporal and technical variability of human gut metagenomes. Genome Biology 16: 73.
  183. Wadsworth, W.D., R. Argiento, et al. 2017. An integrative Bayesian Dirichlet-multinomial regression model for the analysis of taxonomic abundances in microbiome data. BMC Bioinformatics 18 (1): 94.
  184. Walters, W.A., Z. Xu, et al. 2014. Meta-analyses of human gut microbes associated with obesity and IBD. FEBS Letters 588 (22): 4223–4233.
  185. Wang, Y., X. Hu, et al. 2015. Predicting microbial interactions by using network-constrained regularization incorporating covariate coefficients and connection signs. In 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM).
  186. Wang, J., L.B. Thingholm, et al. 2016. Genome-wide association analysis identifies variation in vitamin D receptor and other host factors influencing the gut microbiota. Nature Genetics (advance online publication).
  187. Wang, T., and H. Zhao. 2017. A Dirichlet-tree multinomial regression model for associating dietary nutrients with gut microorganisms. Biometrics 73 (3): 792–801.
  188. Wang, T., G. Cai, et al. 2012. Structural segregation of gut microbiota between colorectal cancer patients and healthy volunteers. ISME Journal 6 (2): 320–329.
  189. Wei, W.W.S. 2005. Time series analysis: Univariate and multivariate methods. Boston: Pearson.
  190. White, J.R., N. Nagarajan, et al. 2009. Statistical methods for detecting differentially abundant features in clinical metagenomic samples. PLoS Computational Biology 5 (4): e1000352.
  191. Wilson, W.G., P. Lundberg, et al. 2003. Biodiversity and species interactions: Extending Lotka-Volterra community theory. Ecology Letters 6 (10): 944–952.
  192. Witkin, S.S., and W.J. Ledger. 2012. Complexities of the uniquely human vagina. Science Translational Medicine 4 (132): 132fs11.
  193. Wong, R.G., J.R. Wu, et al. 2016. Expanding the UniFrac toolbox. PLoS ONE 11 (9): e0161196.
  194. Wu, G.D., J. Chen, et al. 2011. Linking long-term dietary patterns with gut microbial enterotypes. Science 334 (6052): 105–108.
  195. Wu, S., Y.G. Zhang, et al. 2015. Intestinal epithelial vitamin D receptor deletion leads to defective autophagy in colitis. Gut 64 (7): 1082–1094.
  196. Wu, G.D., C. Compher, et al. 2016. Comparative metabolomics in vegans and omnivores reveal constraints on diet-dependent gut microbiota metabolite production. Gut 65 (1): 63–72.
  197. Xia, Y., and J. Sun. 2017. Hypothesis testing and statistical analysis of microbiome. Genes & Diseases 4 (3): 138–148.
  198. Xia, Y., N. Lu, et al. 2012a. Statistical methods and issues in the study of suicide. In Frontiers in suicide risk: Research, treatment and prevention, ed. J. Lavigne and J. Kemp, 139–158. Hauppauge, New York: Nova Science.
  199. Xia, Y., D. Morrison-Beedy, et al. 2012b. Modeling count outcomes from HIV risk reduction interventions: A comparison of competing statistical models for count responses. AIDS Research and Treatment 2012: 11 pages.
  200. Xia, F., J. Chen, et al. 2013a. A logistic normal multinomial regression model for microbiome compositional data analysis. Biometrics 69 (4): 1053–1063.
  201. Xia, J., C.D. Fjell, et al. 2013b. INMEX—A web-based tool for integrative meta-analysis of expression data. Nucleic Acids Research 41 (web server issue): W63–W70.
  202. Xie, H., and J. Huang. 2009. SCAD-penalized regression in high-dimensional partially linear models. The Annals of Statistics 37 (2): 673–696.
  203. Xu, L., A.D. Paterson, et al. 2015. Assessment and selection of competing models for zero-inflated microbiome data. PLoS ONE 10 (7): e0129606.
  204. Yan, Q., J. Li, et al. 2016. Environmental filtering decreases with fish development for the assembly of gut microbiota. Environmental Microbiology 18 (12): 4739–4754.
  205. Yang, H., X. Huang, et al. 2016. Uncovering the composition of microbial community structure and metagenomics among three gut locations in pigs with distinct fatness. Scientific Reports 6: 27427.
  206. Yassour, M., T. Vatanen, et al. 2016. Natural history of the infant gut microbiome and impact of antibiotic treatment on bacterial strain diversity and stability. Science Translational Medicine 8 (343): 343ra81.
  207. Yin, X., and H. Hilafu. 2015. Sequential sufficient dimension reduction for large p, small n problems. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 77 (4): 879–892.
  208. Yin, X., J. Peng, et al. 2013. Structural changes of gut microbiota in a rat non-alcoholic fatty liver disease model treated with a Chinese herbal formula. Systematic and Applied Microbiology 36 (3): 188–196.
  209. Zhang, C., and L. Zhao. 2016. Strain-level dissection of the contribution of the gut microbiome to human metabolic disease. Genome Medicine 8 (1): 41.
  210. Zhang, H., Y. Xia, et al. 2011. Modeling longitudinal binomial responses: Implications from two dueling paradigms. Journal of Applied Statistics 38 (11): 2373–2390.
  211. Zhang, X., H. Mallick, et al. 2016. Zero-inflated negative binomial regression for differential abundance testing in microbiome studies. Journal of Bioinformatics and Genomics 2 (2): 1–9.
  212. Zhang, X., H. Mallick, et al. 2017. Negative binomial mixed models for analyzing microbiome count data. BMC Bioinformatics 18 (1): 4.
  213. Zhao, L. 2013. The gut microbiota and obesity: From correlation to causality. Nature Reviews Microbiology 11 (9): 639–647.
  214. Zhou, N., and J. Zhu. 2010. Group variable selection via a hierarchical lasso and its oracle property. Statistics and Its Interface 3: 557–574.
  215. Zou, H. 2006. The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101 (476): 1418–1429.

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Department of Medicine, University of Illinois at Chicago, Chicago, USA
  2. School of Social Work, University of North Carolina, Chapel Hill, USA
  3. Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, USA
  4. Department of Statistics, University of Pretoria, Pretoria, South Africa
