Tree-based maximal likelihood substitution matrices and hidden Markov models
 G. Mitchison,
 R. Durbin
Abstract
There has been considerable interest in the problem of making maximum likelihood (ML) evolutionary trees which allow insertions and deletions. This problem is partly one of formulation: how does one define a probabilistic model for such trees which treats insertion and deletion in a biologically plausible manner? A possible answer to this question is proposed here by extending the concept of a hidden Markov model (HMM) to evolutionary trees. The model, called a tree-HMM, allows what may be loosely regarded as learnable affine-type gap penalties for alignments. These penalties are expressed in HMMs as probabilities of transitions between states. In the tree-HMM, this idea is given an evolutionary embodiment by defining trees of transitions. Just as the probability of a tree composed of ungapped sequences is computed, by Felsenstein's method, using matrices representing the probabilities of substitutions of residues along the edges of the tree, so the probabilities in a tree-HMM are computed by substitution matrices for both residues and transitions. How to define these matrices by a ML procedure using an algorithm that learns from a database of protein sequences is shown here. Given these matrices, one can define a tree-HMM likelihood for a set of sequences, assuming a particular tree topology and an alignment of the sequences to the model. If one could efficiently find the alignment which maximizes (or comes close to maximizing) this likelihood, then one could search for the optimal tree topology for the sequences. An alignment algorithm is defined here which, given a particular tree topology, is guaranteed to increase the likelihood of the model. Unfortunately, it fails to find global optima for realistic sequence sets. Thus further research is needed to turn the tree-HMM into a practical phylogenetic tool.
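As an illustration of the Felsenstein-style computation the abstract refers to, the following is a minimal sketch of the pruning recursion for a single ungapped alignment column. It is not the authors' implementation: the two-state symmetric substitution model, the uniform prior, the tree encoding, and all function names are assumptions chosen for brevity, and the learned residue and transition matrices of the tree-HMM are not reproduced here.

```python
import math

def two_state_matrix(t):
    """Hypothetical toy model: symmetric two-state substitution matrix
    for branch length t (not the learned matrices from the paper)."""
    e = math.exp(-2.0 * t)
    same, diff = (1 + e) / 2, (1 - e) / 2
    return [[same, diff], [diff, same]]

def conditional_likelihoods(node, subst, n_states):
    """Likelihood of the leaves below `node` for each state at `node`.

    A leaf is an int (the observed state index). An internal node is a
    tuple of (child, branch_length) pairs.
    """
    if isinstance(node, int):
        return [1.0 if s == node else 0.0 for s in range(n_states)]
    result = [1.0] * n_states
    for child, t in node:
        P = subst(t)
        below = conditional_likelihoods(child, subst, n_states)
        for a in range(n_states):
            # Sum over the child's state b, weighted by P(a -> b along the edge).
            result[a] *= sum(P[a][b] * below[b] for b in range(n_states))
    return result

def tree_likelihood(root, subst, prior):
    """Combine the root's conditional likelihoods with a prior over root states."""
    cond = conditional_likelihoods(root, subst, len(prior))
    return sum(p * c for p, c in zip(prior, cond))

PRIOR = [0.5, 0.5]                      # assumed uniform root distribution
cherry = ((0, 0.1), (0, 0.1))           # two leaves, both in state 0
tree = ((cherry, 0.2), (1, 0.3))        # cherry joined with a leaf in state 1
column_likelihood = tree_likelihood(tree, two_state_matrix, PRIOR)
```

In a tree-HMM as described in the abstract, the same kind of recursion would also be applied to trees of transitions, using a separate substitution matrix for transitions; that part is omitted from this sketch.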
 Title
 Tree-based maximal likelihood substitution matrices and hidden Markov models
 Journal

Journal of Molecular Evolution
Volume 41, Issue 6, pp 1139-1151
 Cover Date
 1995-12-01
 DOI
 10.1007/BF00173195
 Print ISSN
 0022-2844
 Online ISSN
 1432-1432
 Publisher
 Springer-Verlag
 Keywords

 Alignment
 Hidden Markov model
 Maximum likelihood
 Phylogenetic tree
 Protein sequences
 Substitution matrices
 Authors

 G. Mitchison ^{(1)}
 R. Durbin ^{(1)}
 Author Affiliations

 1. MRC Laboratory of Molecular Biology, Hills Road, CB2 2QH, Cambridge, England