A Model-Based Approach for Species Abundance Quantification Based on Shotgun Metagenomic Data

Abstract

The human microbiome, the collection of microbes residing in or on the human body, has a profound influence on human health. DNA sequencing technology has made large-scale human microbiome studies possible through shotgun metagenomic sequencing. An important step in analyzing such data is quantifying bacterial abundances from the sequencing reads. Existing methods almost always quantify abundances one sample at a time, which ignores systematic differences in read coverage along the genomes due to GC content, copy number variation, and the bacterial origin of replication. To account for such differences in read counts, we propose a multi-sample Poisson model that quantifies microbial abundances from read counts assigned to species-specific taxonomic markers. The model accounts for marker-specific effects when normalizing the sequencing count data, yielding more accurate quantification of species abundances. Compared with currently available methods on simulated and real data sets, our method demonstrates improved accuracy in bacterial abundance quantification, which leads to more biologically interesting results in downstream data analysis.
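To make the idea of a multi-sample Poisson model with marker-specific effects concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that, for a single species, the read count for sample i and marker j has mean abundance[i] * marker_effect[j], with the marker effects scaled to average 1 for identifiability. The function name, the parametrization, and the alternating update scheme are illustrative choices; the published model may include further terms (for example marker length or sequencing depth) that the abstract does not specify.

```python
import numpy as np


def fit_multisample_poisson(counts, n_iter=100):
    """Fit E[counts[i, j]] = abundance[i] * marker_effect[j] by maximum likelihood.

    Hypothetical helper for illustration only. `counts` is a
    (samples x markers) array of read counts for one species and is
    assumed to contain at least one nonzero entry.
    """
    counts = np.asarray(counts, dtype=float)
    n_markers = counts.shape[1]

    # Start from "all markers behave identically".
    marker_effect = np.ones(n_markers)
    abundance = counts.sum(axis=1) / marker_effect.sum()

    for _ in range(n_iter):
        # Poisson likelihood update for each sample's abundance,
        # holding the marker effects fixed.
        abundance = counts.sum(axis=1) / marker_effect.sum()
        # Poisson likelihood update for each marker effect,
        # holding the abundances fixed.
        marker_effect = counts.sum(axis=0) / abundance.sum()
        # Rescale so marker effects average 1; the products
        # abundance[i] * marker_effect[j] are unchanged.
        scale = marker_effect.mean()
        marker_effect = marker_effect / scale
        abundance = abundance * scale

    return abundance, marker_effect


# Noise-free toy data: 3 samples, 4 markers with unequal coverage.
true_abundance = np.array([2.0, 10.0, 5.0])
true_effect = np.array([0.5, 1.0, 1.5, 1.0])  # averages to 1
toy_counts = np.outer(true_abundance, true_effect)

est_abundance, est_effect = fit_multisample_poisson(toy_counts)
print(est_abundance)
print(est_effect)
```

In this simplified form the estimates can also be obtained in closed form from the row and column sums. The practical benefit of sharing marker effects across samples is more apparent in richer settings, such as when some markers are missing or downweighted in some samples, which this sketch does not cover.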



Acknowledgments

Supported by NIH Grants CA127334 and GM097505.

Author information


Corresponding author

Correspondence to Hongzhe Li.



Cite this article

Chen, E.Z., Bushman, F.D. & Li, H. A Model-Based Approach for Species Abundance Quantification Based on Shotgun Metagenomic Data. Stat Biosci 9, 13–27 (2017). https://doi.org/10.1007/s12561-016-9148-x


Keywords

  • Multi-sample Poisson model
  • Marker-specific effects
  • Microbiome
  • Read coverage variation