Approximation algorithms for tree alignment with a given phylogeny

Algorithmica

Abstract

We study the following fundamental problem in computational molecular biology: given a set of DNA sequences representing some species and a phylogenetic tree depicting the ancestral relationship among these species, compute an optimal alignment of the sequences by constructing a minimum-cost evolutionary tree. The problem is an important variant of multiple sequence alignment and is widely known as tree alignment. We design an efficient approximation algorithm with performance ratio 2 for tree alignment. The algorithm is then extended to a polynomial-time approximation scheme. The construction actually works for Steiner trees in any metric space, and thus implies a polynomial-time approximation scheme for planar Steiner trees under a given topology (with any constant degree). To our knowledge, this is the first polynomial-time approximation scheme in the fields of computational biology and Steiner trees. The approximation algorithms may be useful in evolutionary genetics practice, as they can provide a good initial alignment for the iterative method in [23].
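
To make the objective concrete, the following minimal sketch scores one candidate evolutionary tree as the sum of pairwise distances along its edges. It assumes unit-cost edit distance as the metric, which the abstract does not specify, and the names `edit_distance` and `tree_cost`, the toy phylogeny, and the sequences are all hypothetical. It only evaluates a given labeling of the internal nodes; it is not the approximation algorithm itself.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute); an assumed metric."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def tree_cost(edges, labels):
    """Sum of edit distances over all tree edges for a full node labeling."""
    return sum(edit_distance(labels[u], labels[v]) for u, v in edges)


# Hypothetical toy phylogeny: root r has internal child x and leaf c;
# x has leaves a and b.
edges = [("r", "x"), ("r", "c"), ("x", "a"), ("x", "b")]
leaves = {"a": "ACGT", "b": "ACGA", "c": "TCGT"}

# One candidate assignment of sequences to the internal nodes.
labels = {**leaves, "x": "ACGT", "r": "ACGT"}
print(tree_cost(edges, labels))  # prints 2
```

Tree alignment asks for the internal-node sequences that minimize this sum over a fixed topology; the sketch checks a single candidate rather than searching for the best one.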

References

  1. S. Altschul and D. Lipman, Trees, stars, and multiple sequence alignment, SIAM J. Appl. Math., 49 (1989), 197–209.

  2. S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy, Proof verification and hardness of approximation problems, Proc. 33rd IEEE Symp. on Foundations of Computer Science, 1992, pp. 14–23.

  3. M. Bern and P. Plassmann, The Steiner problem with edge lengths 1 and 2, Inform. Process. Lett., 32 (1989), 171–176.

  4. H. Carrillo and D. Lipman, The multiple sequence alignment problem in biology, SIAM J. Appl. Math., 48 (1988), 1073–1082.

  5. S. C. Chan, A. K. C. Wong, and D. K. T. Chiu, A survey of multiple sequence comparison methods, Bull. Math. Biol., 54(4) (1992), 563–598.

  6. D. Z. Du, Y. Zhang, and Q. Feng, On better heuristic for Euclidean Steiner minimum trees, Proc. 32nd IEEE Symp. on Foundations of Computer Science, 1991, pp. 431–439.

  7. J. S. Farris, Methods for computing Wagner trees, Systematic Zoology, 19 (1970), 83–92.

  8. M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, CA, 1979.

  9. D. Gusfield, Efficient methods for multiple sequence alignment with guaranteed error bounds, Bull. Math. Biol., 55 (1993), 141–154.

  10. J. J. Hein, A tree reconstruction method that is economical in the number of pairwise comparisons used, Mol. Biol. Evol., 6(6) (1989), 669–684.

  11. J. J. Hein, A new method that simultaneously aligns and reconstructs ancestral sequences for any number of homologous sequences, when the phylogeny is given, Mol. Biol. Evol., 6(6) (1989), 649–668.

  12. F. K. Hwang and D. S. Richards, Steiner tree problems, Networks, 22 (1992), 55–89.

  13. F. K. Hwang and J. F. Weng, The shortest network under a given topology, J. Algorithms, 13 (1992), 468–488.

  14. R. M. Karp, Probabilistic analysis of partitioning algorithms for the traveling salesman problem in the plane, Math. Oper. Res., 2 (1977), 209–224.

  15. R. M. Karp, Mapping the genome: some combinatorial problems arising in molecular biology, Proc. ACM Symp. on Theory of Computing, 1993, pp. 278–285.

  16. E. S. Lander, R. Langridge, and D. M. Saccocio, Mapping and interpreting biological information, Comm. ACM, 34(11) (1991), 33–39.

  17. C. H. Papadimitriou and M. Yannakakis, Optimization, approximation, and complexity classes, J. Comput. System Sci., 43 (1991), 425–440.

  18. D. Penny, Criteria for optimising phylogenetic trees and the problem of determining the root of a tree, J. Mol. Evol., 8 (1976), 95–116.

  19. D. Sankoff, Minimal mutation trees of sequences, SIAM J. Appl. Math., 28(1) (1975), 35–42.

  20. D. Sankoff and P. Rousseau, Locating the vertices of a Steiner tree in an arbitrary metric space, Math. Programming, 9 (1975), 240–246.

  21. D. Sankoff and R. Cedergren, Simultaneous comparisons of three or more sequences related by a tree, in D. Sankoff and J. Kruskal (eds.), Time Warps, String Edits, and Macromolecules: the Theory and Practice of Sequence Comparison, pp. 253–264, Addison-Wesley, Reading, MA, 1983.

  22. D. Sankoff and J. Kruskal (eds.), Time Warps, String Edits, and Macromolecules: the Theory and Practice of Sequence Comparison, Addison-Wesley, Reading, MA, 1983.

  23. D. Sankoff, R. Cedergren, and G. Lapalme, Frequency of insertion-deletion, transversion, and transition in the evolution of 5S ribosomal RNA, J. Mol. Evol., 7 (1976), 133–149.

  24. N. Saitou and M. Nei, The neighbor-joining method: a new method for reconstructing phylogenetic trees, Mol. Biol. Evol., 4(4) (1987), 406–425.

  25. L. Wang and T. Jiang, On the complexity of multiple sequence alignment, J. Comput. Biol., 1(4) (1994), 337–348.

  26. M. S. Waterman, Sequence alignments, in M. S. Waterman (ed.), Mathematical Methods for DNA Sequences, CRC, Boca Raton, FL, 1989, pp. 53–92.

  27. M. S. Waterman and M. D. Perlwitz, Line geometries for sequence comparisons, Bull. Math. Biol., 46 (1984), 567–577.

Additional information

Communicated by R. M. Karp.

Supported in part by NSERC Operating Grant OGP0046613.

Supported in part by NSERC Operating Grant OGP0046613 and a Canadian Genome Analysis and Technology Research Grant.

Supported in part by US Department of Energy Grant DE-FG03-90ER6099.

About this article

Cite this article

Wang, L., Jiang, T. & Lawler, E.L. Approximation algorithms for tree alignment with a given phylogeny. Algorithmica 16, 302–315 (1996). https://doi.org/10.1007/BF01955679

  • DOI: https://doi.org/10.1007/BF01955679
