Understanding relationship between sequence and functional evolution in yeast proteins
The underlying relationship between functional variables and sequence evolutionary rates is often assessed by partial correlation analysis. However, this strategy is impeded by the difficulty of conducting meaningful statistical analysis using noisy biological data. A recent study suggested that partial correlation analysis is misleading when data are noisy and that principal component regression analysis is a better tool for analyzing biological data. In this paper, we evaluate how these two statistical tools (partial correlation and principal component regression) perform when data are noisy. Contrary to the earlier conclusion, we found that the two tools perform comparably in most cases. Furthermore, when there is more than one ‘true’ independent variable, partial correlation analysis delivers a better representation of the data. Employing both tools may provide a more complete and complementary representation of the real data. In this light, and with new analyses, we suggest that protein length and gene dispensability play significant, independent roles in yeast protein evolution.
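To make the two tools concrete, the following is a minimal sketch of both computations on synthetic data, not the authors' analysis or data. It assumes exactly two predictors, which allows the principal components of the standardized predictors to be written in closed form as their scaled sum and difference. All names (`driver`, `rate`, `x1`, `x2`, `demo`) and the noise levels are hypothetical choices made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def standardize(v):
    """Rescale a sequence to mean 0 and (population) standard deviation 1."""
    n = len(v)
    m = sum(v) / n
    sd = math.sqrt(sum((a - m) ** 2 for a in v) / n)
    return [(a - m) / sd for a in v]


def pcr_two_predictors(y, x1, x2):
    """Principal component regression for exactly two predictors.

    For two standardized predictors with correlation r, the correlation
    matrix [[1, r], [r, 1]] has eigenvectors (1, 1)/sqrt(2) and
    (1, -1)/sqrt(2), so the components are the scaled sum and difference.
    Returns the fraction of variance in y explained by each component.
    Assumes x1 and x2 are not perfectly collinear.
    """
    z1, z2 = standardize(x1), standardize(x2)
    pc_sum = [(a + b) / math.sqrt(2) for a, b in zip(z1, z2)]
    pc_diff = [(a - b) / math.sqrt(2) for a, b in zip(z1, z2)]
    return {
        "pc_sum": pearson(y, pc_sum) ** 2,
        "pc_diff": pearson(y, pc_diff) ** 2,
    }


def demo(seed=0, n=2000, noise_sd=1.0):
    """Hypothetical scenario: one hidden driver, two noisy proxies of it."""
    rng = random.Random(seed)
    driver = [rng.gauss(0, 1) for _ in range(n)]
    # The response depends only on the hidden driver, plus noise.
    rate = [-d + rng.gauss(0, 0.5) for d in driver]
    # x1 is a noisy measurement of the driver.
    x1 = [d + rng.gauss(0, noise_sd) for d in driver]
    # x2 is correlated with the driver but has no separate effect on rate.
    x2 = [d + rng.gauss(0, 1.0) for d in driver]
    return {
        "partial_x1_given_x2": partial_corr(rate, x1, x2),
        "partial_x2_given_x1": partial_corr(rate, x2, x1),
        "pcr_r2": pcr_two_predictors(rate, x1, x2),
    }


if __name__ == "__main__":
    print(demo())
```

Varying `noise_sd` in `demo` is one way to explore how measurement noise in a predictor changes what each tool reports. For more than two predictors, the closed-form components no longer apply and a general eigendecomposition of the predictor correlation matrix would be needed.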
Keywords: Partial correlation, Principal component regression, Functional genomic data, Yeast protein evolution
We thank D. Allan Drummond and Claus Wilke for helpful personal communications and Charles Warden for critical reading of the manuscript. SY is supported by funds from the Georgia Institute of Technology.