Abstract
The density of contacts or the fraction of buried sites in a protein structure is thought to be related to a protein’s designability, and genes encoding more designable proteins should evolve faster than other genes. Several recent studies have tested this hypothesis but have found conflicting results. Here, we investigate how a gene’s evolutionary rate is affected by its protein’s contact density, considering the four species Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens. We find for all four species that contact density correlates positively with evolutionary rate, and that these correlations do not seem to be confounded by gene expression level. The strength of this signal, however, varies widely among species. We also study the effect of contact density on domain evolution in multidomain proteins and find that a domain’s contact density influences the domain’s evolutionary rate. Within the same protein, a domain with higher contact density tends to evolve faster than a domain with lower contact density. Our study provides evidence that contact density can increase evolutionary rates, and that it acts similarly on the level of entire proteins and of individual protein domains.
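As an illustration of the quantity discussed above, the following is a minimal sketch of how a contact density could be computed from a protein structure. It is not the procedure used in the study: the function name `contact_density`, the use of Cα coordinates, the 8 Å distance cutoff, and the exclusion of directly adjacent residues are all assumptions made for this example.

```python
import math


def contact_density(ca_coords, cutoff=8.0, min_seq_sep=2):
    """Return the mean number of contacts per residue.

    ca_coords   : list of (x, y, z) C-alpha coordinates in angstroms,
                  in sequence order (assumed input format).
    cutoff      : distance threshold for a contact (assumed value).
    min_seq_sep : minimum sequence separation for a pair to count,
                  so that chain neighbors are ignored (assumed value).
    """
    n = len(ca_coords)
    if n == 0:
        raise ValueError("need at least one residue")

    contacts = 0
    for i in range(n):
        # Only look at residues far enough along the chain from i.
        for j in range(i + min_seq_sep, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                contacts += 1

    # Each contact involves two residues, hence the factor of 2.
    return 2.0 * contacts / n


# Toy example: four residues on a straight line, 3.8 angstroms apart.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
density = contact_density(chain)
```

Applied per protein or per domain, a value of this kind could then be compared with evolutionary rate, for example through a rank correlation that also controls for expression level.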
Acknowledgments
This work was supported by NIH Grant AI 065960. D.A.D. received support through an NIH center grant to the FAS Center for Systems Biology. We would like to thank Jesse Bloom for helpful comments on this work.
Cite this article
Zhou, T., Drummond, D.A. & Wilke, C.O. Contact Density Affects Protein Evolutionary Rate from Bacteria to Animals. J Mol Evol 66, 395–404 (2008). https://doi.org/10.1007/s00239-008-9094-4