Abstract
In general, the evolutionary rate of proteins is not primarily related to protein and amino acid functions; factors such as protein abundance, codon usage, and the protein's melting temperature (Tm) are more important. To better understand the factors that affect protein evolution, E. coli MG1655 orthologs were compared to those in closely related bacteria and to those in more distantly related prokaryotes, eukaryotes, and archaea. The evolution of different types of proteins was also studied. The analyses indicate that amino acid conservation differs between other enzymes and small molecule enzymes, i.e. enzymes that do not use macromolecules (e.g. DNA, RNA, and proteins) as substrates and that carry out metabolic processes involving small molecules. For example, the small molecule enzymes have a lower percent identity than other enzymes when sequences from closely related bacteria are compared. The analyses indicate that this lower percent identity is not a result of the amino acid or codon usage of the small molecule enzymes. The small molecule enzymes also do not have a significantly lower protein abundance, indicating that abundance is unlikely to be an important factor driving the differences in amino acid conservation. The analyses further indicate that Tm values obtained by different measurement methods have different relationships with amino acid conservation over different evolutionary distances. Taken together, the results demonstrate that the relationships between protein evolution and the factors thought to affect it (protein abundance, codon usage, and protein Tm) are complex and depend on the factor, the organisms, and the type of proteins being analyzed.
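The percent identity comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the author's pipeline: the function names, the choice to skip alignment columns containing a gap, and the toy sequences and group labels are all assumptions made for illustration only.

```python
from statistics import mean


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by the full alignment length) also exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def mean_identity_by_group(pairs_by_group):
    """Average the pairwise percent identity within each protein group."""
    return {
        group: mean(percent_identity(a, b) for a, b in pairs)
        for group, pairs in pairs_by_group.items()
    }


# Hypothetical toy alignments (not real E. coli orthologs), grouped by
# the two enzyme classes named in the abstract.
toy_pairs = {
    "small_molecule_enzyme": [("MKTAYIAK", "MKTGYLAK"), ("MSLE-KQA", "MSLDQKQA")],
    "other_enzyme": [("MARTKQTA", "MARTKQTA"), ("MGKV-LNE", "MGKVALNE")],
}

if __name__ == "__main__":
    for group, value in mean_identity_by_group(toy_pairs).items():
        print(f"{group}: {value:.1f}% mean identity")
```

In a real analysis the aligned pairs would come from an alignment tool, and the group means would be compared with a statistical test rather than by inspection.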
Data Availability
The data were taken from public sources.
Code Availability
Available upon request.
Funding
None.
Author information
Contributions
All work and ideas are the work of Peter M. Palenchar.
Ethics declarations
Conflict of interest
None.
Ethical Approval
Not applicable.
Informed Consent
Not applicable.
Consent for Publication
Not applicable.
About this article
Cite this article
Palenchar, P.M. The Influence of Codon Usage, Protein Abundance, and Protein Stability on Protein Evolution Vary by Evolutionary Distance and the Type of Protein. Protein J 41, 216–229 (2022). https://doi.org/10.1007/s10930-022-10045-w
DOI: https://doi.org/10.1007/s10930-022-10045-w