An Exact and Polynomial Distance-Based Algorithm to Reconstruct Single Copy Tandem Duplication Trees

  • Conference paper
Combinatorial Pattern Matching (CPM 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2676)

Abstract

The problem of reconstructing the duplication tree of a set of tandemly repeated sequences, which are assumed to have arisen by unequal recombination, was first introduced by Fitch (1977) and has recently received a lot of attention. In this paper, we deal with the restricted problem of reconstructing single copy duplication trees. We describe an exact, polynomial-time, distance-based algorithm for solving this problem, the parsimony version of which has previously been shown to be NP-hard (like most evolutionary tree reconstruction problems). The algorithm is based on the minimum evolution principle and thus selects the shortest tree as the correct duplication tree. After presenting the mathematical concepts underlying the minimum evolution principle and some of its benefits (such as consistency), we provide a new recurrence equation to estimate the tree length using ordinary least squares, given a matrix of pairwise distances between the copies. We then show how this equation naturally forms the dynamic programming framework on which our algorithm is based, and provide an implementation in O(n^3) time and O(n^2) space, where n is the number of copies.
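
The minimum evolution principle described in the abstract can be illustrated on the smallest non-trivial case, four tandem copies. The sketch below is not the paper's recurrence or its dynamic program, neither of which is given on this page. It assumes that in a single copy duplication tree every clade spans adjacent copies, so that for copies ordered 0, 1, 2, 3 only the unrooted topologies 01|23 and 12|03 are candidates. It scores each candidate with the standard closed-form ordinary least-squares length of a four-leaf tree and keeps the shortest. The function names and the distance matrix are hypothetical.

```python
def ols_quartet_length(d, a, b, c, e):
    """OLS length of the unrooted four-leaf tree with split {a, b} | {c, e}.

    Standard closed form for four leaves: half the two within-pair
    distances plus a quarter of the four across-pair distances.
    """
    within = d[a][b] + d[c][e]
    across = d[a][c] + d[a][e] + d[b][c] + d[b][e]
    return within / 2 + across / 4


def best_quartet(d):
    """Return the shortest candidate topology and all candidate lengths."""
    # Assumption: only splits that group adjacent copies are allowed,
    # which rules out 02|13 for copies in the order 0, 1, 2, 3.
    candidates = {
        "01|23": (0, 1, 2, 3),
        "12|03": (1, 2, 0, 3),
    }
    lengths = {
        name: ols_quartet_length(d, *leaves)
        for name, leaves in candidates.items()
    }
    best = min(lengths, key=lengths.get)
    return best, lengths


# Hypothetical additive distances generated by the tree ((0,1),(2,3))
# with every leaf branch of length 1 and an internal branch of length 2.
d = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]

best, lengths = best_quartet(d)
print(best, lengths)
```

On these additive distances the generating topology 01|23 gets length 6, which equals the sum of its branch lengths, while the alternative 12|03 gets length 7, so the shortest tree is the generating one. For n copies the number of candidate trees grows quickly, which is why a polynomial-time method such as the dynamic program announced in the abstract is needed in place of enumeration.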

References

  1. Ohno, S.: Evolution by gene duplication. Springer Verlag, New York (1970)

  2. Smith, G.: Evolution of repeated DNA sequences by unequal crossover. Science 191 (1976) 528–535

  3. Fitch, W.: Phylogenies constrained by cross-over process as illustrated by human hemoglobins and a thirteen-cycle, eleven amino-acid repeat in human apolipoprotein A-I. Genetics 86 (1977) 623–644

  4. Jeffreys, A., Harris, S.: Processes of gene duplication. Nature 296 (1981) 9–10

  5. Elemento, O., Gascuel, O., Lefranc, M.P.: Reconstruction de l’histoire de duplication de gènes répétés en tandem. In: Actes des Journées Ouvertes Biologie Informatique Mathématiques. (2001) 9–11

  6. Elemento, O., Gascuel, O., Lefranc, M.P.: Reconstructing the duplication history of tandemly repeated genes. Molecular Biology and Evolution 19 (2002) 278–288

  7. Benson, G., Dong, L.: Reconstructing the duplication history of a tandem repeat. In Lengauer, T., Schneider, R., Bork, P., Brutlag, D., Glasgow, J., Mewes, H.W., Zimmer, R., eds.: Proceedings of Intelligent Systems in Molecular Biology ISMB’99. (1999) 44–53

  8. Tang, M., Waterman, M., Yooseph, S.: Zinc finger gene clusters and tandem gene duplication. In El-Mabrouk, N., Lengauer, T., Sankoff, D., eds.: Proceedings of RECOMB 2001. (2001) 297–304

  9. Tang, M., Waterman, M., Yooseph, S.: Zinc finger gene clusters and tandem gene duplication. Journal of Computational Biology 9 (2002) 429–446

  10. Jaitly, D., Kearney, P., Lin, G., Ma, B.: Methods for reconstructing the history of tandem repeats and their application to the human genome. Journal of Computer and System Sciences 65 (2002) 494–507

  11. Zhang, J., Nei, M.: Evolution of antennapedia-class homeobox genes. Genetics 142 (1996) 295–303

  12. Wang, L., Gusfield, D.: Improved approximation algorithms for tree alignment. Journal of Algorithms 25 (1997) 255–273

  13. Kidd, K., Sgaramella-Zonta, L.: Phylogenetic analysis: concepts and methods. American Journal of Human Genetics 23 (1971) 235–252

  14. Rzhetsky, A., Nei, M.: Theoretical foundation of the minimum-evolution method of phylogenetic inference. Molecular Biology and Evolution 10 (1993) 1073–1095

  15. Denis, F., Gascuel, O.: On the consistency of the minimum evolution principle of phylogenetic inference. Computational Molecular Biology Series, Issue IV. Discrete Applied Mathematics 127 (2003) 63–77

  16. Felsenstein, J.: Cases in which parsimony or compatibility methods will be positively misleading. Systematic Zoology 27 (1978) 401–410

  17. Vardi, I.: Computational Recreations in Mathematica. Addison-Wesley (1991)

  18. Saitou, N., Nei, M.: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4 (1987) 406–425

  19. Vach, W.: Least-squares approximation of additive trees. In Opitz, O., ed.: Conceptual and Numerical Analysis of Data, Heidelberg, Springer (1989) 230–238

  20. Gascuel, O.: Concerning the NJ algorithm and its unweighted version, UNJ. In Mirkin, B., McMorris, F., Roberts, F., Rzhetsky, A., eds.: Mathematical Hierarchies and Biology. DIMACS Series in Discrete Mathematics and Theoretical Computer Science. Amer. Math. Society, Providence (1997) 149–170

  21. Barthelemy, J., Guénoche, A.: Trees and proximity representations. Wiley and Sons (1991)

  22. Elemento, O., Gascuel, O.: A fast and accurate distance-based algorithm to reconstruct tandem duplication trees. Bioinformatics 18 (2002) S92–S99. Proceedings of European Conference on Computational Biology (ECCB2002)

  23. Fitch, W., Margoliash, E.: Construction of phylogenetic trees. Science 155 (1967) 279–284

  24. Felsenstein, J.: An alternating least squares approach to inferring phylogenies from pairwise distances. Systematic Biology 46 (1997) 101–111

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Elemento, O., Gascuel, O. (2003). An Exact and Polynomial Distance-Based Algorithm to Reconstruct Single Copy Tandem Duplication Trees. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds) Combinatorial Pattern Matching. CPM 2003. Lecture Notes in Computer Science, vol 2676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44888-8_8

  • DOI: https://doi.org/10.1007/3-540-44888-8_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40311-1

  • Online ISBN: 978-3-540-44888-4

  • eBook Packages: Springer Book Archive
