Abstract
Multiple sequence alignment (MSA) is important in functional, structural and evolutionary studies of sequence data. While MSA construction has traditionally been an interactive process, the rapid growth of genetic sequence data has engendered a need for automated sequence analysis without human intervention. This requires more accurate methods based on rigorous mathematical models that reflect sequence biology in a realistic way. Focusing on MSA as an optimization problem, we examine the problem of unifying mathematical tractability with biological accuracy in cost function design. In particular, we consider tree alignment, which is often viewed as the most “biological” of the rigorous approaches to MSA. We point out several important pitfalls in current optimization approaches to MSA and identify characteristics for good cost function design. Design issues specific to approximation algorithms are also addressed. We hope these ideas will lead to future research on a biologically realistic and mathematically rigorous approach to MSA.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Durand, D., Farach-Colton, M. (2002). On the design of optimization criteria for multiple sequence alignment. In: Lässig, M., Valleriani, A. (eds) Biological Evolution and Statistical Physics. Lecture Notes in Physics, vol 585. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45692-9_2
DOI: https://doi.org/10.1007/3-540-45692-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43188-6
Online ISBN: 978-3-540-45692-6
eBook Packages: Springer Book Archive