Abstract
Genes can be classified as essential or nonessential based on their indispensability for a living organism. Previous research has suggested that essential genes evolve more slowly than nonessential genes, but that the impact of gene dispensability on a gene’s evolutionary rate is not as strong as expected. However, findings have been inconsistent, and the relationship between gene indispensability and the rate of gene evolution remains controversial. Understanding how different classes of genes evolve is essential for a full understanding of evolutionary biology, and may have medical relevance in the design of new antibacterial agents. We therefore investigated the properties of essential and nonessential genes. Analysis of evolutionary conservation, protein length distribution and amino acid usage in essential and nonessential genes of Escherichia coli K12 demonstrated that essential genes are relatively well preserved throughout the bacterial kingdom when compared to nonessential genes. Furthermore, the results show that essential genes, compared to nonessential genes, have a significantly higher proportion of large (>534 amino acids) and small (<139 amino acids) proteins relative to medium-sized proteins. The pattern of amino acid usage shows a similar trend for essential and nonessential genes, although some notable exceptions are observed. These findings help to clarify the evolutionary mechanisms of essential and nonessential genes, are relevant to the study of mutagenesis, and may allow prediction of gene properties in other poorly understood organisms.
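The protein length comparison described above can be illustrated with a minimal sketch. It assumes only the two cut-offs given in the abstract (139 and 534 amino acids); the function names and the toy length lists are hypothetical and are not data from the study, and the statistical test the authors applied is not stated here, so none is shown.

```python
from collections import Counter

# Length cut-offs taken from the abstract (in amino acids).
SMALL_MAX = 139   # proteins shorter than this count as "small"
LARGE_MIN = 534   # proteins longer than this count as "large"

def size_class(length):
    """Assign a protein length to the small, medium or large class."""
    if length < SMALL_MAX:
        return "small"
    if length > LARGE_MIN:
        return "large"
    return "medium"

def class_proportions(lengths):
    """Return the fraction of proteins falling in each size class."""
    counts = Counter(size_class(n) for n in lengths)
    total = len(lengths)
    return {c: counts[c] / total for c in ("small", "medium", "large")}

# Hypothetical toy lengths for illustration only, NOT data from the study.
essential_lengths = [95, 120, 300, 610, 700, 880]
nonessential_lengths = [150, 210, 260, 330, 410, 560]

essential = class_proportions(essential_lengths)
nonessential = class_proportions(nonessential_lengths)

for name, props in (("essential", essential), ("nonessential", nonessential)):
    print(name, {c: round(p, 2) for c, p in props.items()})
```

With real data, the two length lists would be built from the protein sequences of the essential and nonessential gene sets, and the resulting class counts could then be compared with a suitable test of proportions.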
Acknowledgments
We are grateful to N.G.C. Smith and E.V. Koonin for critical reading and constructive comments. We would also like to thank all members of the Bioinformatics Center at Northwest A&F University for useful daily discussions.
Additional information
Communicated by D. Ussery.
Cite this article
Gong, X., Fan, S., Bilderbeck, A. et al. Comparative analysis of essential genes and nonessential genes in Escherichia coli K12. Mol Genet Genomics 279, 87–94 (2008). https://doi.org/10.1007/s00438-007-0298-x