
Statistical Models and Analysis of Microbiome Data from Mice and Humans

Part of the Physiology in Health and Disease book series (PIHD)


After the Human Microbiome Project was launched in 2007, numerous statistical, bioinformatic, and computational tools were developed and applied to meet the needs of microbiome studies. A popular way to distribute newly developed statistical and bioinformatic methods and models is to implement them as R packages.

In this chapter, we introduce widely used and newly developed statistical methods and models from the ecology and microbiome fields. We show readers how to use currently available statistical tools based on the R programming language to analyze microbiome data. Our purpose is to provide analytical steps and tools that can be implemented by microbiome researchers who may not have advanced knowledge of statistical models or the R programming language. Specifically, this chapter covers frequently used univariate and multivariate statistical models and visualization tools, as well as alpha and beta diversity metrics and R programming skills, using real data from mouse and human microbiome studies.
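As a rough illustration of the alpha and beta diversity metrics mentioned above, the following base R sketch computes the Shannon index, the Simpson index (in its 1 - sum(p^2) form), and the Bray-Curtis dissimilarity. The count table and the function names `shannon`, `simpson`, and `bray_curtis` are hypothetical and written for this example only; they are not taken from the chapter.

```r
# Toy count table (hypothetical data): rows are samples, columns are taxa.
counts <- matrix(
  c(10, 20, 30, 40,
    25, 25, 25, 25,
     0,  5,  0, 95),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3"),
                  c("taxonA", "taxonB", "taxonC", "taxonD"))
)

# Alpha diversity: Shannon index H = -sum(p * log(p)) over taxa with p > 0.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# Alpha diversity: Simpson index in its 1 - sum(p^2) form.
simpson <- function(x) {
  p <- x / sum(x)
  1 - sum(p^2)
}

# Beta diversity: Bray-Curtis dissimilarity between two count vectors.
bray_curtis <- function(x, y) {
  sum(abs(x - y)) / sum(x + y)
}

# One row of alpha diversity values per sample.
alpha <- data.frame(
  shannon = apply(counts, 1, shannon),
  simpson = apply(counts, 1, simpson)
)

# Pairwise Bray-Curtis matrix across all samples.
n <- nrow(counts)
beta <- matrix(0, n, n,
               dimnames = list(rownames(counts), rownames(counts)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    beta[i, j] <- bray_curtis(counts[i, ], counts[j, ])
  }
}

print(round(alpha, 3))
print(round(beta, 3))
```

In practice these quantities are usually obtained from an established package rather than hand-written functions; for example, the vegan package provides `diversity()` and `vegdist()`. The hand-written versions are used here only to make the definitions explicit.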


  • Gut microbiome
  • Statistical methods
  • Statistical analysis
  • R package

  • DOI: 10.1007/978-1-4939-7534-1_12
  • Chapter length: 69 pages
  • ISBN: 978-1-4939-7534-1




We would like to acknowledge the NIDDK/National Institutes of Health grant R01 DK105118 and DOD BC160450P1 to Jun Sun. We thank the two anonymous reviewers whose comments/suggestions helped to improve and clarify this manuscript.


Corresponding authors

Correspondence to Yinglin Xia or Jun Sun.


Copyright information

© 2018 The American Physiological Society

About this chapter


Cite this chapter

Xia, Y., Sun, J. (2018). Statistical Models and Analysis of Microbiome Data from Mice and Humans. In: Sun, J., Dudeja, P. (eds) Mechanisms Underlying Host-Microbiome Interactions in Pathophysiology of Human Diseases. Physiology in Health and Disease. Springer, Boston, MA.
