A Statistical Framework for the Functional Analysis of Metagenomes

  • Itai Sharon
  • Amrita Pati
  • Victor M. Markowitz
  • Ron Y. Pinter
Conference paper

DOI: 10.1007/978-3-642-02008-7_35

Volume 5541 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Sharon I., Pati A., Markowitz V.M., Pinter R.Y. (2009) A Statistical Framework for the Functional Analysis of Metagenomes. In: Batzoglou S. (eds) Research in Computational Molecular Biology. RECOMB 2009. Lecture Notes in Computer Science, vol 5541. Springer, Berlin, Heidelberg

Abstract

Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. We present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. We also provide a novel method for assessing the reliability of the estimates, which can be used to remove seemingly unreliable measurements. We tested our method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that our framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.
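For readers unfamiliar with the Lander-Waterman theory mentioned in the abstract, the following is a minimal sketch of its classic coverage statistics for a single-genome shotgun project. It is not the framework presented in the paper; the function name `lander_waterman` and its parameter names are assumptions chosen for illustration, and the formulas are the standard Poisson approximations (coverage c = NL/G, covered fraction 1 - e^(-c), expected contigs N·e^(-cσ)).

```python
import math


def lander_waterman(num_reads, read_len, genome_len, min_overlap=0):
    """Classic Lander-Waterman statistics (illustrative sketch only).

    num_reads   -- number of reads N (hypothetical parameter name)
    read_len    -- read length L in bases
    genome_len  -- genome length G in bases
    min_overlap -- minimum overlap T needed to join two reads
    """
    # Redundancy of coverage: c = N * L / G
    coverage = num_reads * read_len / genome_len
    # sigma = 1 - T/L is the fraction of a read that may remain
    # non-overlapping while still allowing a detectable join
    sigma = 1.0 - min_overlap / read_len
    return {
        "coverage": coverage,
        # Probability that a given base is covered by at least one read
        "fraction_covered": 1.0 - math.exp(-coverage),
        # Expected number of contigs (islands): N * exp(-c * sigma)
        "expected_contigs": num_reads * math.exp(-coverage * sigma),
    }


stats = lander_waterman(num_reads=10_000, read_len=500, genome_len=5_000_000)
print(stats)
```

With these example inputs the coverage is 1x, so roughly 63% of bases are expected to be covered. Estimating gene family frequencies in a metagenome requires further modelling beyond this single-genome setting, which is the subject of the paper itself.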

Keywords

metagenomics, functional analysis, function comparison, Lander-Waterman

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Itai Sharon
    • 1
  • Amrita Pati
    • 2
  • Victor M. Markowitz
    • 3
  • Ron Y. Pinter
    • 1
  1. Department of Computer Science, Technion, Haifa, Israel
  2. Genome Biology Program, DOE Joint Genome Institute, Walnut Creek
  3. Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley