Statistical properties of the ordinary least-squares (OLS), generalized least-squares (GLS), and minimum-evolution (ME) methods of phylogenetic inference were studied by considering the case of four DNA sequences. Analytical study has shown that all three methods are statistically consistent, in the sense that as the number of nucleotides examined (m) increases they tend to choose the true tree, as long as the evolutionary distances used are unbiased. When the evolutionary distances (d_{ij}'s) are large and the sequences under study are not very long, however, the OLS criterion is often biased and may choose an incorrect tree more often than expected under random choice. It is also shown that the variance-covariance matrix of the d_{ij}'s becomes singular as the d_{ij}'s approach zero, and thus the GLS method may not be applicable when the d_{ij}'s are small. The ME method suffers from neither of these problems, and the ME criterion is statistically unbiased. Computer simulation has shown that the ME method is more efficient in obtaining the true tree than the OLS and GLS methods, and that the OLS method is more efficient than the GLS method when the d_{ij}'s are small, whereas the GLS method is more efficient otherwise.
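The difference between the OLS and ME criteria in the four-sequence case can be illustrated with a minimal sketch. It assumes that branch lengths are fitted by unweighted least squares for each of the three possible unrooted topologies, that the OLS criterion selects the topology with the smallest residual sum of squares, and that the ME criterion selects the topology with the smallest sum of fitted branch lengths. The function names, the topology labels, and the example distances are hypothetical and are not taken from the study; the GLS criterion, which would additionally require the variance-covariance matrix of the d_{ij}'s, is not shown.

```python
import itertools

import numpy as np

# The six sequence pairs (i, j), i < j, in the order used for the distance vector.
PAIRS = list(itertools.combinations(range(4), 2))

# Each unrooted four-sequence topology is identified by the pair of sequences
# that sits on the same side of the interior branch as sequence 0.
TOPOLOGIES = {
    "(01)(23)": {0, 1},
    "(02)(13)": {0, 2},
    "(03)(12)": {0, 3},
}


def design_matrix(side):
    """Return the 6 x 5 matrix mapping branch lengths to pairwise distances.

    Columns 0-3 are the exterior branches; column 4 is the interior branch,
    which lies on the path between i and j only if they are on opposite sides.
    """
    a = np.zeros((len(PAIRS), 5))
    for row, (i, j) in enumerate(PAIRS):
        a[row, i] = 1.0
        a[row, j] = 1.0
        if (i in side) != (j in side):
            a[row, 4] = 1.0
    return a


def score_topologies(d):
    """Fit branch lengths by unweighted least squares for every topology."""
    d = np.asarray(d, dtype=float)
    results = {}
    for name, side in TOPOLOGIES.items():
        a = design_matrix(side)
        b, *_ = np.linalg.lstsq(a, d, rcond=None)
        results[name] = {
            "branches": b,
            "rss": float(np.sum((d - a @ b) ** 2)),  # OLS criterion
            "tree_length": float(np.sum(b)),         # ME criterion
        }
    return results


def choose(results, criterion):
    """Return the topology label that minimizes the given criterion."""
    return min(results, key=lambda name: results[name][criterion])


# Hypothetical, exactly additive distances generated on topology (01)(23)
# with exterior branches 0.10, 0.20, 0.15, 0.25 and interior branch 0.05.
d = [0.30, 0.30, 0.40, 0.40, 0.50, 0.40]  # d01, d02, d03, d12, d13, d23
results = score_topologies(d)
ols_choice = choose(results, "rss")
me_choice = choose(results, "tree_length")
print(ols_choice, me_choice)
```

With error-free additive distances both criteria recover the generating topology. The two criteria can disagree only when the d_{ij}'s contain sampling error, which is the situation the analysis above addresses.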

Key words

Phylogenetic inference; Least-squares method; Minimum-evolution method; Statistical biases; Efficiency of obtaining the true tree