Sensitivity of Patterns of Molecular Evolution to Alterations in Methodology: A Critique of Hughes and Yeager
Employing a set of 43 orthologous mouse and rat genes, Hughes and Yeager (J. Mol. Evol. 45:125–130, 1997) reported (1) no correlation between synonymous (Ks) and nonsynonymous (Ka) rates of nucleotide substitution, (2) a positive correlation between intronic GC content (GCi) and intronic substitution rate (Ki), (3) that the average Ki value was very similar to the average Ks value, and (4) that the compositional correlation between the rat and the mouse genes was stronger at the third codon position (GC3) than at the first and second codon positions (GC12). We have examined the robustness of these results to alterations in substitution rate estimation protocol, alignment protocol, and statistical procedure. We find that a significant correlation between Ka and Ks is observed under any one of three changes: if a rank correlation statistic is used instead of regression analysis, if one outlier is excluded from the analysis, or if a regression weighted by gene size is employed. We find the correlation between Ki and GCi to be sensitive to changes in alignment protocol, and it disappears when weighted means are used. The finding that Ks and Ki are approximately the same depends on the method used to estimate Ks. Finally, we find the variance around the regression line of rat GC3 versus mouse GC3 to be significantly higher than that for GC12. The source of the discrepancy between this and Hughes and Yeager's result is unclear. The variance around the line for GC4 is higher still, as might be expected. Using a methodology that may be considered preferable to that of Hughes and Yeager, we find that all four of their results are contradicted. More importantly, this analysis reinforces the need for caution in assembling and analyzing data sets, as the degree of sensitivity to what many might consider minor methodological alterations is unexpected.
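The sensitivity of the Ka versus Ks correlation to the choice of statistic can be illustrated with a minimal sketch. The values below are invented toy numbers in arbitrary units, not data from either study, and the function names (`weighted_pearson`, `average_ranks`, `spearman`) are hypothetical helpers written for this illustration. The sketch only compares correlation coefficients; it does not perform the significance tests referred to above, and it uses a weighted correlation coefficient as a stand-in for a regression weighted by gene size.

```python
from math import sqrt


def weighted_pearson(x, y, w=None):
    """Pearson correlation of x and y; optional weights w (e.g. gene length)."""
    if w is None:
        w = [1.0] * len(x)
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / sqrt(vx * vy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return weighted_pearson(average_ranks(x), average_ranks(y))


# Toy values (assumption): nine genes with Ka rising with Ks, plus one
# short gene that behaves as an outlier.
ks = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
ka = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 0.0]
gene_len = [1000.0] * 9 + [10.0]  # hypothetical gene sizes used as weights

r_all = weighted_pearson(ks, ka)                    # unweighted, all genes
r_rank = spearman(ks, ka)                           # rank correlation
r_trim = weighted_pearson(ks[:-1], ka[:-1])         # outlier excluded
r_weighted = weighted_pearson(ks, ka, gene_len)     # weighted by gene size

print(f"unweighted r      = {r_all:.3f}")
print(f"rank correlation  = {r_rank:.3f}")
print(f"outlier excluded  = {r_trim:.3f}")
print(f"weighted by size  = {r_weighted:.3f}")
```

In this toy case the unweighted coefficient is close to zero, while each of the three alternatives yields a clearly positive value, which is the qualitative pattern described above. How closely real data follow this pattern depends on the actual rates and gene sizes.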