# Statistical properties of the ordinary least-squares, generalized least-squares, and minimum-evolution methods of phylogenetic inference

## Summary

Statistical properties of the ordinary least-squares (OLS), generalized least-squares (GLS), and minimum-evolution (ME) methods of phylogenetic inference were studied by considering the case of four DNA sequences. Analytical study has shown that all three methods are statistically consistent: as the number of nucleotides examined (m) increases, they tend to choose the true tree, provided that the evolutionary distances used are unbiased. When the evolutionary distances (d_{ij}) are large and the sequences under study are not very long, however, the OLS criterion is often biased and may choose an incorrect tree more often than expected under random choice. It is also shown that the variance-covariance matrix of the d_{ij} becomes singular as the d_{ij} approach zero, so the GLS method may not be applicable when the d_{ij} are small. The ME method suffers from neither of these problems, and the ME criterion is statistically unbiased. Computer simulation has shown that the ME method is more efficient in obtaining the true tree than the OLS and GLS methods, and that the OLS method is more efficient than the GLS method when the d_{ij} are small, whereas the GLS method is more efficient otherwise.
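To make the OLS and ME criteria concrete for the four-sequence case, the following minimal sketch scores the three possible unrooted topologies from a set of six pairwise distances. It is an illustration, not the authors' code. The names `score_topology` and `pick_tree` and the example distances are hypothetical. The closed-form expressions are assumptions that follow from fitting five branch lengths to six distances by ordinary least squares: the residual sum of squares reduces to the squared violation of the four-point condition divided by four, and the ME score is the sum of the OLS branch-length estimates. GLS is omitted because it additionally requires the variance-covariance matrix of the distances.

```python
# Distances are stored in a dict keyed by (i, j) with i < j, for taxa 0..3.

# The three unrooted topologies for four taxa, each written as two sister pairs.
TOPOLOGIES = [
    ((0, 1), (2, 3)),
    ((0, 2), (1, 3)),
    ((0, 3), (1, 2)),
]


def get(d, i, j):
    """Look up a pairwise distance regardless of index order."""
    return d[(i, j)] if i < j else d[(j, i)]


def score_topology(d, topology):
    """Return (OLS residual sum of squares, ME tree length) for one topology.

    Assumes the standard four-taxon OLS fit: six distances, five branches,
    so exactly one residual degree of freedom remains.
    """
    (a, b), (c, e) = topology
    within = get(d, a, b) + get(d, c, e)            # distances inside each pair
    across = (get(d, a, c) + get(d, a, e)
              + get(d, b, c) + get(d, b, e))        # distances between the pairs
    # Violation of the four-point condition d_ac + d_be = d_ae + d_bc.
    violation = get(d, a, c) + get(d, b, e) - get(d, a, e) - get(d, b, c)
    rss = violation ** 2 / 4.0
    # Sum of the five OLS branch-length estimates (the ME criterion).
    tree_length = within / 2.0 + across / 4.0
    return rss, tree_length


def pick_tree(d):
    """Choose a topology under each criterion: smallest RSS (OLS), shortest tree (ME)."""
    scores = {t: score_topology(d, t) for t in TOPOLOGIES}
    ols_choice = min(scores, key=lambda t: scores[t][0])
    me_choice = min(scores, key=lambda t: scores[t][1])
    return ols_choice, me_choice, scores


# Hypothetical additive distances generated by the tree ((0,1),(2,3)) with
# exterior branches of 0.10 and an interior branch of 0.05.
d = {
    (0, 1): 0.20, (0, 2): 0.25, (0, 3): 0.25,
    (1, 2): 0.25, (1, 3): 0.25, (2, 3): 0.20,
}
ols_choice, me_choice, scores = pick_tree(d)
print("OLS choice:", ols_choice)
print("ME choice:", me_choice)
```

With error-free additive distances such as these, both criteria recover the generating topology. The differences between the methods discussed in the summary arise only when the distances carry sampling error, which this sketch does not simulate.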

## Key words

- Phylogenetic inference
- Least-squares method
- Minimum-evolution method
- Statistical biases
- Efficiency of obtaining the true tree
