Finding the Minimal Change in a Given Tree
It is a common task in biology to determine the genealogy of species, populations, people, or genes and to estimate the condition of the ancestral forms. That is often done for molecules such as proteins and nucleic acids. There are many procedures. I address here only the parsimony procedures, which ask, “How can I account for the descent of these various sequences from a common ancestor with the fewest changes?” The general problem being addressed is as follows. One has a set of s sequences, each sequence being a linear string of letters from some alphabet. In molecular biology the sequences are either proteins, whose alphabet has 20 letters (amino acids), or nucleic acids, whose alphabet has 4 letters (nucleotides). We assume that the sequences have been aligned by some method so that they are all of the same length, t. It is assumed that these s sequences arose from a common ancestral sequence by a branching process that is properly described as a strictly bifurcating tree, that is, as a graph in which there is one and only one path connecting any two nodes. The tree has s tips (exterior nodes of degree one), one for each of the s sequences, and s − 2 interior nodes of degree three, plus one node of degree two, called the root, which is the ultimate ancestor, the node at which the branching process began. The edges connecting two adjacent nodes are called branches. The task is to discover, for any given tree topology, the minimum amount of change, and its nature, on each branch.
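The problem statement above can be made concrete with a small sketch. The Python below counts the minimum number of changes on a fixed rooted bifurcating tree using a Fitch-style set-intersection pass, assuming uniform weighting (every change costs 1). The abstract does not name an algorithm, so that choice is an assumption. The nested-tuple tree encoding, the function names `fitch_site` and `min_changes`, and the four example sequences are likewise hypothetical and serve only as illustration. The sketch returns the total count only; it does not assign changes to individual branches.

```python
from typing import Set, Tuple, Union

# Hypothetical encoding: a tip is an aligned sequence (str);
# an interior node is a pair (left_subtree, right_subtree).
Tree = Union[str, tuple]


def fitch_site(tree: Tree, site: int) -> Tuple[Set[str], int]:
    """Return (candidate ancestral states, minimum changes) for one aligned column."""
    if isinstance(tree, str):
        # A tip: its state is observed, and no change has been counted yet.
        return {tree[site]}, 0
    left, right = tree
    left_states, left_cost = fitch_site(left, site)
    right_states, right_cost = fitch_site(right, site)
    common = left_states & right_states
    if common:
        # The two descendants can share a state: no extra change is needed here.
        return common, left_cost + right_cost
    # No shared state: at least one change occurs on a branch below this node.
    return left_states | right_states, left_cost + right_cost + 1


def min_changes(tree: Tree, length: int) -> int:
    """Sum the per-site minima over all t = length aligned positions."""
    return sum(fitch_site(tree, i)[1] for i in range(length))


# s = 4 tips, each of length t = 4, on the topology ((1,2),(3,4)).
example_tree = (("ACGT", "ACGA"), ("TCGA", "TCCA"))
print(min_changes(example_tree, 4))  # prints 3
```

Each site is treated independently, so the total is the sum of the per-column minima. In the example, columns 1, 3 and 4 each require one change and column 2 requires none.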
Keywords: Minimal Score, Interior Node, Unrooted Tree, Uniform Weighting, Descendent Node