Statistical properties of the ordinary least-squares (OLS), generalized least-squares (GLS), and minimum-evolution (ME) methods of phylogenetic inference were studied by considering the case of four DNA sequences. Analytical study has shown that all three methods are statistically consistent in the sense that, as the number of nucleotides examined (m) increases, they tend to choose the true tree, provided that the evolutionary distances used are unbiased. When the evolutionary distances (d_ij) are large and the sequences under study are not very long, however, the OLS criterion is often biased and may choose an incorrect tree more often than expected under random choice. It is also shown that the variance-covariance matrix of the d_ij becomes singular as the d_ij approach zero, and thus the GLS method may not be applicable when the d_ij are small. The ME method suffers from neither of these problems, and the ME criterion is statistically unbiased. Computer simulation has shown that the ME method is more efficient in obtaining the true tree than the OLS and GLS methods, and that the OLS method is more efficient than the GLS method when the d_ij are small, whereas the GLS method is more efficient otherwise.
Keywords: Phylogenetic inference; Least-squares method; Minimum-evolution method; Statistical biases; Efficiency of obtaining the true tree
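As a rough illustration of how the OLS and ME criteria rank the three possible unrooted trees for four sequences, the following minimal sketch may help. It rests on several assumptions that are not stated in the text above: branch lengths are fitted by ordinary least squares, the OLS criterion is taken to be the residual sum of squares, and the ME criterion is taken to be the sum of the OLS branch-length estimates. For four taxa the least-squares fit has a single residual degree of freedom, so both quantities have simple closed forms, which the sketch uses directly. The names `ols_fit`, `choose_tree`, and `additive_distances` are hypothetical helpers, and the distance values are made up for the example. GLS is omitted because it would require the variance-covariance matrix of the distances.

```python
# Illustrative sketch only; helper names and example values are assumptions.

# The three unrooted topologies for four taxa, written as ((i, j), (k, l)).
TOPOLOGIES = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def ols_fit(d, topology):
    """Return (tree_length, rss) for one topology.

    d is a symmetric 4x4 matrix of pairwise distances. tree_length is the
    sum of the OLS branch-length estimates; rss is the residual sum of
    squares of the OLS fit.
    """
    (i, j), (k, l) = topology
    within = d[i][j] + d[k][l]                        # pairs on the same side
    across = d[i][k] + d[i][l] + d[j][k] + d[j][l]    # pairs crossing the interior branch
    tree_length = across / 4 + within / 2
    # The only direction orthogonal to every additive distance vector.
    contrast = d[i][k] - d[i][l] - d[j][k] + d[j][l]
    rss = contrast ** 2 / 4
    return tree_length, rss


def choose_tree(d, criterion):
    """Pick the topology minimizing 'me' (tree length) or 'ols' (rss)."""
    index = {"me": 0, "ols": 1}[criterion]
    return min(TOPOLOGIES, key=lambda t: ols_fit(d, t)[index])


def additive_distances(ext, internal):
    """Exact path-length distances on the tree ((0, 1), (2, 3))."""
    side = [0, 0, 1, 1]
    d = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        for b in range(4):
            if a != b:
                d[a][b] = ext[a] + ext[b] + (internal if side[a] != side[b] else 0.0)
    return d


if __name__ == "__main__":
    d = additive_distances([0.10, 0.20, 0.15, 0.25], 0.05)
    for topology in TOPOLOGIES:
        print(topology, ols_fit(d, topology))
    print("ME choice: ", choose_tree(d, "me"))
    print("OLS choice:", choose_tree(d, "ols"))
```

With error-free (additive) distances both criteria recover the generating topology. The situations described above, in which the criteria disagree or fail, arise only once sampling error is added to the distances, which this sketch does not simulate.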