Abstract
We consider the following problem: given a forest of gene family trees on a set of genomes, find a first speciation which splits these genomes into two subsets and minimizes the number of gene duplications that happened before this speciation. We call this problem the Minimum Duplication Bipartition Problem. Using a generalization of the Minimum Edge-Cut Problem, known as Submodular Function Minimization, we propose a polynomial time and space 2-approximation algorithm for the Minimum Duplication Bipartition Problem. We illustrate the potential of this algorithm on both synthetic and real data.
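To make the objective concrete, here is a minimal brute-force sketch of the problem statement. It is not the 2-approximation algorithm of the paper and does not use Submodular Function Minimization. It assumes the usual LCA reconciliation rule: for a bipartition (A, B), a gene tree node counts as a duplication preceding the first speciation when at least one of its children contains genes from both A and B. The nested-tuple tree encoding and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Assumed encoding: a gene tree is a nested tuple, and a leaf is the name
# of the genome that the gene belongs to.

def genomes(tree):
    """Set of genomes appearing at the leaves of a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(genomes(child) for child in tree))

def spans(tree, side_a, side_b):
    """True if the subtree contains genes from both sides of the split."""
    g = genomes(tree)
    return bool(g & side_a) and bool(g & side_b)

def duplications_before_split(tree, side_a, side_b):
    """Count nodes that have a child spanning both sides (assumed LCA rule)."""
    if isinstance(tree, str):
        return 0
    here = 1 if any(spans(child, side_a, side_b) for child in tree) else 0
    return here + sum(duplications_before_split(child, side_a, side_b)
                      for child in tree)

def best_bipartition(forest):
    """Exhaustive search over bipartitions; exponential, for tiny inputs only."""
    all_genomes = sorted(frozenset().union(*(genomes(t) for t in forest)))
    first, rest = all_genomes[0], all_genomes[1:]
    best = None
    # Fix `first` on side A so each unordered bipartition is visited once;
    # k < len(rest) keeps side B non-empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            side_a = frozenset((first,) + extra)
            side_b = frozenset(all_genomes) - side_a
            cost = sum(duplications_before_split(t, side_a, side_b)
                       for t in forest)
            if best is None or cost < best[0]:
                best = (cost, side_a, side_b)
    return best

# Toy forest on genomes a, b, c, d (made-up data, not from the paper).
t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
forest = [t1, t1, t2]

cost, side_a, side_b = best_bipartition(forest)
print(cost, sorted(side_a), sorted(side_b))
```

On this toy forest the search selects the split {a, b} | {c, d} with one duplication before the speciation, contributed by the root of `t2`. The exhaustive enumeration grows exponentially with the number of genomes, which is the cost a polynomial-time approximation avoids.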
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ouangraoua, A., Swenson, K.M., Chauve, C. (2010). An Approximation Algorithm for Computing a Parsimonious First Speciation in the Gene Duplication Model. In: Tannier, E. (ed.) Comparative Genomics. RECOMB-CG 2010. Lecture Notes in Computer Science, vol 6398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16181-0_24
DOI: https://doi.org/10.1007/978-3-642-16181-0_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16180-3
Online ISBN: 978-3-642-16181-0
eBook Packages: Computer Science (R0)