The Gene-Duplication Problem: Near-Linear Time Algorithms for NNI Based Local Searches

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4983)

Abstract

The gene-duplication problem is to infer a species supertree from a collection of gene trees that are confounded by complex histories of gene duplication events. This problem is NP-complete and thus requires efficient and effective heuristics. Existing heuristics perform a stepwise search of the tree space, where each step is guided by an exact solution to an instance of a local search problem. A classical local search problem is the NNI search problem, which is based on the nearest neighbor interchange operation. In this work we (i) provide a novel near-linear time algorithm for the NNI search problem, (ii) introduce extensions that significantly enlarge the search space of the NNI search problem, and (iii) present algorithms for these extended versions that are asymptotically just as efficient as our algorithm for the NNI search problem. The substantially extended NNI search problem, along with the exceptional speed-up achieved, makes the gene-duplication problem more tractable for large-scale phylogenetic analyses.
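To make the local search step concrete, the following is a minimal, deliberately naive sketch of one NNI search step; it is not the near-linear time algorithm of the paper. It assumes rooted binary trees written as nested 2-tuples with species names as leaf labels, gene-tree leaves labelled by the species they were sampled from, the usual LCA-mapping definition of a duplication, and one common rooted formulation of the NNI move. All function names are hypothetical.

```python
def leaves(t):
    """Return the set of leaf labels below tree t (nested 2-tuples, str leaves)."""
    if isinstance(t, str):
        return frozenset([t])
    return leaves(t[0]) | leaves(t[1])


def clusters(species_tree):
    """Collect the leaf set (cluster) of every node of the species tree."""
    if isinstance(species_tree, str):
        return [leaves(species_tree)]
    left, right = species_tree
    return [leaves(species_tree)] + clusters(left) + clusters(right)


def lca_cluster(species_clusters, label_set):
    """Smallest species cluster containing label_set; it identifies the LCA node."""
    return min((c for c in species_clusters if label_set <= c), key=len)


def duplication_cost(gene_tree, species_tree):
    """Count gene-tree nodes that map to the same species node as one of their children."""
    cl = clusters(species_tree)

    def visit(g):
        if isinstance(g, str):
            return 0
        m = lca_cluster(cl, leaves(g))
        is_dup = any(lca_cluster(cl, leaves(child)) == m for child in g)
        return int(is_dup) + visit(g[0]) + visit(g[1])

    return visit(gene_tree)


def nni_neighbors(t):
    """Yield every tree obtained by one rooted NNI move (assumed formulation):
    swap a child of an internal node with that node's sibling."""
    if isinstance(t, str):
        return
    left, right = t
    if not isinstance(left, str):
        a, b = left
        yield ((right, b), a)
        yield ((a, right), b)
    if not isinstance(right, str):
        a, b = right
        yield (a, (left, b))
        yield (b, (a, left))
    # Apply the same moves deeper in the tree, keeping the other side fixed.
    for n in nni_neighbors(left):
        yield (n, right)
    for n in nni_neighbors(right):
        yield (left, n)


def best_nni_neighbor(species_tree, gene_trees):
    """Exhaustively score each NNI neighbor by total duplication cost."""
    def total(s):
        return sum(duplication_cost(g, s) for g in gene_trees)
    return min(nni_neighbors(species_tree), key=total)


species = (("a", "b"), "c")
genes = [(("a", "c"), "b")]
print(duplication_cost(genes[0], species))
print(best_nni_neighbor(species, genes))
```

This sketch recomputes every cost from scratch, so a single step takes time polynomial but far from linear in the input size; avoiding that recomputation is the kind of speed-up the abstract refers to. It also assumes every gene-tree label occurs in the species tree and that the species tree has at least one internal edge.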

Editor information

Ion Măndoiu, Raj Sunderraman, Alexander Zelikovsky

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bansal, M.S., Eulenstein, O. (2008). The Gene-Duplication Problem: Near-Linear Time Algorithms for NNI Based Local Searches. In: Măndoiu, I., Sunderraman, R., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2008. Lecture Notes in Computer Science, vol 4983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79450-9_3

  • DOI: https://doi.org/10.1007/978-3-540-79450-9_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-79449-3

  • Online ISBN: 978-3-540-79450-9

  • eBook Packages: Computer Science, Computer Science (R0)
