Bulletin of Mathematical Biology

Volume 58, Issue 3, pp 449–469

A distance measure based on binary character data and its application to phylogeny reconstruction

  • W. Schmidt
  • E.-Ch. Müller


An essentially new method for relating a number of taxa on the basis of a predefined set of dichotomous properties (i.e. each either present or absent) is described. The basic step of the analysis is the derivation of a sophisticated distance measure that describes the pairwise dissimilarities quantitatively on the basis of the individual properties. The representation of the dissimilarity matrix by a tree-like structure is an obvious step implied by the distance measure and is related to the widely used method of successively joining nearest neighbors with respect to the distances. The distance measure makes no use of stochastic or other mathematical models of evolutionary processes and is best interpreted in terms of discrete information theory.
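The general workflow outlined in the abstract (binary characters, then a pairwise dissimilarity matrix, then successive joining of nearest neighbors) can be illustrated with a minimal sketch. It does not reproduce the authors' distance measure, which is not specified here: a plain Hamming fraction is swapped in as a stand-in, and average-linkage joining is used as one common variant of nearest-neighbor clustering. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def hamming_fraction(a, b):
    """Stand-in dissimilarity (NOT the paper's measure): fraction of
    binary characters on which two profiles disagree."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def dissimilarity_matrix(profiles):
    """Pairwise dissimilarities, keyed by an unordered pair of taxon names."""
    return {
        frozenset((p, q)): hamming_fraction(profiles[p], profiles[q])
        for p, q in combinations(profiles, 2)
    }


def join_nearest(profiles):
    """Repeatedly join the two closest clusters (average linkage) and
    return the resulting tree as nested tuples of taxon names."""
    dist = dissimilarity_matrix(profiles)
    # Maps each subtree (a name or a nested tuple) to its member taxa.
    clusters = {name: [name] for name in profiles}

    def cluster_distance(c1, c2):
        pairs = [(a, b) for a in clusters[c1] for b in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        members = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = members
    return next(iter(clusters))


# Toy data: four hypothetical taxa scored on four present/absent properties.
toy_profiles = {
    "A": [1, 1, 0, 0],
    "B": [1, 1, 0, 1],
    "C": [0, 0, 1, 1],
    "D": [0, 0, 1, 0],
}
print(join_nearest(toy_profiles))  # (('A', 'B'), ('C', 'D'))
```

Any other dissimilarity on binary profiles could be substituted for `hamming_fraction` without changing the joining step, which is the point of separating the two functions.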


Distance Measure; Evolutionary Tree; Amino Acid Property; Property Pattern; Arbitrary Partition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Society for Mathematical Biology 1996

Authors and Affiliations

  • W. Schmidt (1)
  • E.-Ch. Müller (1)
  1. Max-Delbrück-Center for Molecular Medicine, Berlin, Germany
