A Statistical Framework for the Functional Analysis of Metagenomes

  • Itai Sharon
  • Amrita Pati
  • Victor M. Markowitz
  • Ron Y. Pinter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5541)

Abstract

Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than that of their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (the metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. We present a statistical framework for assessing these frequencies, based on the Lander-Waterman theory originally developed for Whole Genome Shotgun (WGS) sequencing projects. We also provide a novel method for assessing the reliability of the estimates, which can be used to remove seemingly unreliable measurements. We tested our method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that our framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.
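
As a rough illustration of the kind of reasoning the Lander-Waterman theory supports, the sketch below computes a few textbook quantities for uniform shotgun sampling of a single linear genome. It is not the framework presented in the paper: the function names, the uniform-sampling assumption, and the choice to count any overlap of at least one base as a hit are all assumptions made here for illustration.

```python
import math


def coverage(num_reads, read_len, genome_len):
    """Average sequencing depth c = N * L / G."""
    return num_reads * read_len / genome_len


def prob_base_uncovered(num_reads, read_len, genome_len):
    """Poisson approximation: a given base is missed with probability e^(-c)."""
    return math.exp(-coverage(num_reads, read_len, genome_len))


def expected_reads_hitting_gene(num_reads, read_len, gene_len, genome_len):
    """Expected number of reads overlapping a gene by at least one base.

    Assumes read start positions are uniform over a linear genome and the
    gene lies far enough from both ends that edge effects can be ignored.
    """
    possible_starts = genome_len - read_len + 1
    overlapping_starts = read_len + gene_len - 1
    return num_reads * overlapping_starts / possible_starts


if __name__ == "__main__":
    n, read_len, genome_len = 1000, 100, 100_000
    print("coverage:", coverage(n, read_len, genome_len))
    print("P(base uncovered):", prob_base_uncovered(n, read_len, genome_len))
    for gene_len in (300, 1000, 3000):
        hits = expected_reads_hitting_gene(n, read_len, gene_len, genome_len)
        print(f"gene of {gene_len} bp: about {hits:.2f} expected reads")
```

Under these assumptions the expected hit count grows with gene length, so raw read counts favor longer genes. This is one example of a length-dependent effect that a frequency estimate has to account for; whether it is among the biases the paper addresses is not stated in the abstract.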

Keywords

metagenomics; functional analysis; function comparison; Lander-Waterman

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Itai Sharon (1)
  • Amrita Pati (2)
  • Victor M. Markowitz (3)
  • Ron Y. Pinter (1)
  1. Department of Computer Science, Technion, Haifa, Israel
  2. Genome Biology Program, DOE Joint Genome Institute, Walnut Creek
  3. Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley
