A Probabilistic Model for Gene Content Evolution with Duplication, Loss, and Horizontal Transfer
We introduce a Markov model for the evolution of a gene family along a phylogeny. The model includes parameters for the rates of horizontal gene transfer, gene duplication, and gene loss, in addition to branch lengths in the phylogeny. The likelihood for the changes in the size of a gene family across different organisms can be calculated in O(N + hM^2) time and O(N + M^2) space, where N is the number of organisms, h is the height of the phylogeny, and M is the sum of family sizes. We apply the model to the evolution of gene content in Proteobacteria using the gene families in the COG (Clusters of Orthologous Groups) database.
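To make the three rate parameters concrete, the following is a minimal simulation sketch of family size along a single branch. It is not the likelihood algorithm described above. It assumes a linear birth-death-immigration form that the abstract does not spell out: transfer adds copies at a constant rate, while duplication and loss act per existing copy. The function name `simulate_family_size` and the parameters `gain`, `dup`, and `loss` are hypothetical, chosen for illustration only.

```python
import random


def simulate_family_size(n0, t, gain, dup, loss, rng):
    """Simulate the size of one gene family after time t on a single branch.

    Assumed (hypothetical) rates for a family with n copies:
      n -> n + 1 at rate gain + n * dup   (transfer plus duplication)
      n -> n - 1 at rate n * loss         (loss)
    """
    n = n0
    elapsed = 0.0
    while True:
        birth = gain + n * dup
        death = n * loss
        total = birth + death
        if total == 0.0:
            # No event can occur, so the size is fixed for the rest of the branch.
            return n
        # Waiting time to the next event is exponential with rate `total`.
        elapsed += rng.expovariate(total)
        if elapsed > t:
            return n
        # Pick the event type in proportion to its rate.
        if rng.random() * total < birth:
            n += 1
        else:
            n -= 1


if __name__ == "__main__":
    rng = random.Random(0)
    # Draw a few family sizes at the end of a branch of length 0.5,
    # starting from 2 copies; the rate values are arbitrary.
    sizes = [simulate_family_size(2, 0.5, 0.1, 0.3, 0.4, rng) for _ in range(10)]
    print(sizes)
```

Repeating such draws along every branch of a phylogeny would give simulated size profiles across organisms. The likelihood computation summarized in the abstract addresses the inverse problem, and this sketch does not attempt it.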