Molecular Biology, Volume 46, Issue 1, pp 161–167

Fast algorithm to reconstruct a species supertree from a set of protein trees

  • K. Y. Gorbunov
  • V. A. Lyubetsky


This study addresses the problem of reconstructing a species supertree from a given set of protein, gene, and regulatory site trees. Under the traditional formulation, this problem is proven to be NP-hard. We propose a reformulation: to seek a supertree most of whose clades are supported by the original protein trees. In this variant, the problem appears biologically natural, and a fast algorithm can be developed to solve it. The algorithm was tested on artificial and biological sets of protein trees and proved efficient even under the assumption of horizontal gene transfer. When horizontal transfer is not allowed, the correctness of the algorithm is proved mathematically and its running time is estimated: in the worst case it is of the order $n^3 \cdot |V_0|^3$, where $n$ is the number of gene trees and $|V_0|$ is the number of species. Our software for supertree construction, examples of computations, and instructions are freely available. Events associated with horizontal gene transfer are not included either in this study or in any variant of the software. The general case is described in the authors' report (Problems of Information Transmission, 2011).
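To make the reformulated criterion concrete, the following minimal Python sketch counts how many input trees contain each clade of a candidate supertree. It is an illustration only and not the authors' algorithm: the nested-tuple tree encoding, the function names `clades`, `clade_support`, and `supported_fraction`, and the simplifying assumption that every leaf is labeled directly by a species name (one gene per species) are all hypothetical choices made for this example.

```python
from collections import Counter


def clades(tree):
    """Return the clades of a nested-tuple tree as frozensets of leaf names.

    A leaf is a string; an internal node is a tuple of child subtrees.
    Single leaves are not counted as clades; the root clade is.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        leaf_set = frozenset().union(*(leaves(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    leaves(tree)
    return found


def clade_support(candidate, gene_trees):
    """For each clade of the candidate, count the input trees that contain it."""
    counts = Counter()
    for gene_tree in gene_trees:
        counts.update(clades(gene_tree))
    return {clade: counts[clade] for clade in clades(candidate)}


def supported_fraction(candidate, gene_trees, min_trees=1):
    """Fraction of candidate clades found in at least `min_trees` input trees."""
    support = clade_support(candidate, gene_trees)
    supported = sum(1 for k in support.values() if k >= min_trees)
    return supported / len(support)


# Toy input (assumed data): three trees over the species A, B, C, D.
gene_trees = [
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    (("A", "C"), ("B", "D")),
]
candidate = (("A", "B"), ("C", "D"))

for clade, k in sorted(clade_support(candidate, gene_trees).items(),
                       key=lambda item: sorted(item[0])):
    print(sorted(clade), k)
```

In this toy input, the clade {A, B} of the candidate occurs in two of the three trees and {C, D} in one, so a search guided by such counts would favor candidates whose clades recur across many input trees. How the published algorithm scores and assembles clades is not specified in the abstract.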


Keywords: species tree; species supertree; new formulation of the problem of supertree reconstruction; fast algorithm to reconstruct a supertree; generation of a gene set from a supertree; modeling the gene evolution along a species tree




  1. Ma B., Li M., Zhang L., et al. 1998. On reconstructing species trees from gene trees in term of duplications and losses. Proc. Second Annu. Int. Conf. Res. Computat. Mol. Biol. NY: ACM, pp. 182–191.
  2. Phylogenetic Supertrees. Combining Information to Reveal the Tree of Life. 2004. Ed. Bininda-Emonds O.R.P. Dordrecht: Kluwer.
  3. Bansal M.S., Burleigh J.G., Eulenstein O., Fernández-Baca D. 2010. Robinson-Foulds Supertrees. Algorithms Mol. Biol. 5, 18.
  4. Lyubetsky V.A., Gorbunov K.Yu., Rusin L.Y., V’yugin V.V. 2006. Algorithms to reconstruct evolutionary events at molecular level and infer species phylogeny. In: Bioinformatics of Genome Regulation and Structure II. Eds. Kolchanov N., Hofestaedt R., Milanesi L. Springer, pp. 189–204.
  5. Gorbunov K.Yu., Lyubetsky V.A. 2009. Reconstructing the evolution of genes along the species tree. Mol. Biol. (Moscow). 43, 881–893.
  6. Gorbunov K.Yu., Lyubetsky V.A. 2010. An algorithm of reconciliation of gene and species trees and inferring gene duplications, losses and horizontal transfers. Inform. Protsesses. 10, 140–144.
  7. Gorbunov K.Yu., Lyubetsky V.A. 2011. The tree nearest on average to a given set of trees. Probl. Inform. Trans. 47, 274–288.
  8. Doyon J.-P., Scornavacca C., Gorbunov K.Yu., Szöllösi G.J., Ranwez V., Berry V. 2010. Lecture Notes Comp. Sci. 6398, 93–108.
  9. V’yugin V.V., Gelfand M.S., Lyubetsky V.A. 2002. Identification of horizontal gene transfer from phylogenetic gene trees. Mol. Biol. (Moscow). 37, 650–658.
  10. Bansal M.S., Burleigh J.G., Eulenstein O., Wehe A. 2007. Heuristics for the gene-duplication problem: A θ(n) speed-up for the local search. Lecture Notes Comp. Sci. 4453, 238–252.
  11. Pisani D., Cotton J.A., McInerney J.O. 2007. Supertrees disentangle the chimerical origin of eukaryotic genomes. Mol. Biol. Evol. 24, 1752–1760.
  12. Guigo R., Muchnik I., Smith T.F. 1996. Reconstruction of ancient molecular phylogeny. Mol. Phylogenet. Evol. 6, 189–213.
  13. Wu D., Hugenholtz P., Mavromatis K., et al. 2009. A phylogeny-driven genomic encyclopaedia of bacteria and archaea. Nature. 462, 1056–1060. doi:10.1038/nature08656.
  14. Wehe A., Bansal M.S., Burleigh J.G., Eulenstein O. 2008. DupTree: A program for large-scale phylogenetic analyses using gene tree parsimony. Bioinformatics. 24, 1540–1541.
  15. Gorbunov K.Yu., Lyubetsky V.A. 2005. Identification of ancestral genes that introduce incongruence between protein- and species trees. Mol. Biol. (Moscow). 39, 741–751.

Copyright information

© Pleiades Publishing, Ltd. 2012

Authors and Affiliations

  1. Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
