Evolving proteins are under selection for the ability to perform precise biochemical functions at minimal metabolic cost in a complex cellular environment. One way to investigate the different selective pressures is to examine which factors influence the rate of protein sequence evolution. In a study published in 2002, Fraser et al. suggested that proteins that participate in more protein-protein interactions are under greater evolutionary constraint [1]. The basis for this claim was a weak but statistically significant correlation between a protein's rate of sequence evolution and its number of interaction partners, as measured by various studies of protein-protein interactions in yeast. However, subsequent studies found this correlation to be highly dependent on the particular choice of protein-protein interaction data set [2, 3].

We resolved this controversy by demonstrating that the correlation between evolutionary rate and the number of interaction partners is linked to a bias towards counting more interactions for abundant proteins [4]. Abundant proteins evolve more slowly [5], and some studies are biased towards finding more protein-protein interactions for abundant proteins [6]. Only those data sets that carry this bias suggest a correlation between evolutionary rate and the number of interaction partners (Figure 1). Some of our findings have subsequently been echoed by others [7].

Now, Fraser and Hirsh again argue for a meaningful connection between the number of interaction partners and evolutionary rate [8]. We still cannot agree with their analysis. First, we note that the single data set they have re-analyzed is precisely the one that we identified as the most biased (Figure 1). Their choice to count interactions only for the untagged proteins in mass-spectrometry studies not only fails to account for effects due to the choice of which protein to overexpress (as an interaction is inherently at least pairwise), but in fact increases the net bias in this data set [4]. Fraser and Hirsh also use partial correlation statistics to argue that abundance does not account for all of the correlation. While it is true that some of the data sets still show a statistically significant partial correlation (as we noted in [4]), statistical tests are only as good as the data to which they are applied and are not a substitute for carefully inspecting the effects of biases in individual data sets. Figure 1 shows a direct linear relationship between the apparent correlation and the bias of the data set, and data sets with no bias show no correlation.
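To make the partial correlation argument concrete, the following is a minimal sketch of how such a statistic can be computed. It is not the procedure used in [4] or [8]: it assumes a Pearson (linear) partial correlation, whereas a rank-based version may be more appropriate for real data, and every number in it is synthetic. The names `partial_corr`, `abundance`, `rate` and `degree` are hypothetical and chosen only for illustration. In this idealized simulation, abundance is the sole cause of the association between evolutionary rate and the number of interaction partners, and it is observed without error.

```python
import numpy as np


def partial_corr(x, y, z):
    """Pearson correlation of x and y after linearly regressing z out of both."""
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    # Design matrix with an intercept column and the control variable z.
    design = np.column_stack([np.ones_like(z), z])
    # Residuals of x and y after a least-squares fit on z.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Synthetic, purely illustrative data (NOT measurements from any study).
rng = np.random.default_rng(0)
n = 5000
abundance = rng.normal(size=n)                  # hypothetical abundance score
rate = -0.5 * abundance + rng.normal(size=n)    # abundant proteins evolve more slowly
degree = 0.5 * abundance + rng.normal(size=n)   # biased assay: more partners if abundant

raw = float(np.corrcoef(rate, degree)[0, 1])
partial = partial_corr(rate, degree, abundance)

print(f"raw correlation:     {raw:+.3f}")
print(f"partial correlation: {partial:+.3f}")
```

Under these assumptions the raw correlation is clearly negative, while the partial correlation is close to zero. The sketch covers only the favorable case in which the control variable captures the bias completely. When abundance is measured with noise, or when the bias differs between data sets, a residual partial correlation can survive without reflecting any real constraint, which is why we argue for inspecting each data set directly.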

Fraser and Hirsh comment that some of our previous analysis was based on expression levels measured in an aneuploid strain of yeast. This is true but irrelevant, since we observe identical trends if we quantify abundance using codon adaptation index [4] or expression levels from the microarray study preferred by Fraser and Hirsh (data not shown).

We readily acknowledge the possibility that there is a real connection between the number of interaction partners and evolutionary rate hidden in all the noise and biases. However, we feel that the appropriate null hypothesis is that there is no correlation, and we do not believe this null hypothesis has been convincingly rejected.

Hirsh and Fraser's original claim [1] rested on the idea that evolutionary constraints due to protein-protein interactions could be represented by a protein's total number of unique interaction partners. We suggest that if interactions do impose constraints on sequence evolution, these constraints are likely to depend on more subtle factors, such as the fraction of a protein's residues directly involved in an intermolecular contact or the total number of monomers present in a macromolecular complex. In fact, one study has investigated the effect of an interaction's type (transient or stable), although this analysis also failed to control for protein abundance [9].

The ultimate lesson of this controversy is that the complexities and interdependencies of protein evolutionary constraints must be properly controlled for. Many factors have now been investigated for their effects on protein evolutionary rate, and one of the interesting conclusions is that protein abundance has a far greater effect [5] than other, more intuitively appealing factors such as protein dispensability [10–12] or the number of interaction partners. The best studies have acknowledged this fact by carefully controlling for protein abundance (see for example [13]), and we suggest that this should become standard procedure in the future.

Figure 1