Probabilistic ancestral sequences and multiple alignments
An evolutionary configuration (EC) is a set of aligned sequences of characters (possibly representing amino acids, DNA, RNA or natural language). We define the probability of an EC with respect to a given phylogenetic tree and give an algorithm that computes this probability efficiently. From these probabilities, we can compute the most likely sequence at any point in the phylogenetic tree, or its probability profile. The probability profile at the root of the tree is called the probabilistic ancestral sequence. Because the probability of an EC can be computed, dynamic programming can be used to find alignments over two subtrees, which yields an algorithm for computing multiple alignments. These multiple alignments are maximum-likelihood alignments and are a compatible generalization of two-sequence alignments.
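To make the idea of an EC probability and a root profile concrete, here is a minimal sketch. It is not the algorithm of the paper. It assumes DNA characters, gap-free columns, independent alignment columns, a uniform prior at the root, and a Jukes-Cantor substitution model; none of these are stated in the abstract. The tree encoding and all function names (`jc_transition`, `conditional`, `column_probability`, `root_profile`, `configuration_probability`) are hypothetical and chosen only for illustration.

```python
import math

ALPHABET = "ACGT"  # assumption: DNA, no gap characters


def jc_transition(t):
    """Jukes-Cantor substitution probabilities for branch length t (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e
    diff = 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def conditional(node, column):
    """Return L[x] = P(observed leaves below node | character x at node).

    A node is either a leaf name (str) or a tuple
    (left, left_branch_length, right, right_branch_length).
    `column` maps each leaf name to its character in one alignment column.
    """
    if isinstance(node, str):
        return [1.0 if ch == column[node] else 0.0 for ch in ALPHABET]
    left, t_left, right, t_right = node
    result = [1.0] * 4
    for child, t in ((left, t_left), (right, t_right)):
        p = jc_transition(t)
        below = conditional(child, column)
        for x in range(4):
            result[x] *= sum(p[x][y] * below[y] for y in range(4))
    return result


def column_probability(tree, column):
    """Probability of one aligned column, summing over the root character."""
    return sum(0.25 * value for value in conditional(tree, column))


def root_profile(tree, column):
    """Posterior distribution of the root character for one column.

    The uniform root prior cancels in the normalization.
    """
    cond = conditional(tree, column)
    total = sum(cond)
    return {ch: cond[i] / total for i, ch in enumerate(ALPHABET)}


def configuration_probability(tree, alignment):
    """Probability of a whole alignment, treating columns as independent."""
    length = len(next(iter(alignment.values())))
    prob = 1.0
    for k in range(length):
        column = {name: seq[k] for name, seq in alignment.items()}
        prob *= column_probability(tree, column)
    return prob


if __name__ == "__main__":
    # Hypothetical three-leaf tree: ((a, b), c) with made-up branch lengths.
    tree = (("a", 0.1, "b", 0.1), 0.2, "c", 0.3)
    alignment = {"a": "AC", "b": "AC", "c": "AT"}
    print(configuration_probability(tree, alignment))
    print(root_profile(tree, {"a": "C", "b": "C", "c": "T"}))
```

Under these assumptions, applying `root_profile` to every column gives a position-by-position profile at the root, which is the role the abstract assigns to the probabilistic ancestral sequence. Handling insertions and deletions, and the dynamic programming step that aligns two subtrees, are outside this sketch.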