Comparative analysis of essential genes and nonessential genes in Escherichia coli K12

  • Original Paper
Molecular Genetics and Genomics

Abstract

Genes can be classified as essential or nonessential based on their indispensability for a living organism. Previous research has suggested that essential genes evolve more slowly than nonessential genes, but that the impact of gene dispensability on a gene’s evolutionary rate is not as strong as expected. However, findings have been inconsistent, and the evidence on the relationship between gene indispensability and the rate of gene evolution remains controversial. Understanding how different classes of genes evolve is essential for a full understanding of evolutionary biology, and may have medical relevance in the design of new antibacterial agents. We therefore investigated the properties of essential and nonessential genes. Analysis of evolutionary conservation, protein length distribution and amino acid usage in essential and nonessential genes of Escherichia coli K12 demonstrated that essential genes are relatively well conserved throughout the bacterial kingdom compared to nonessential genes. Furthermore, the results show that essential genes, compared to nonessential genes, have a significantly higher proportion of large (>534 amino acids) and small (<139 amino acids) proteins relative to medium-sized proteins. The pattern of amino acid usage shows a similar trend for essential and nonessential genes, although some notable exceptions are observed. These findings help to clarify the evolutionary mechanisms of essential and nonessential genes, are relevant to the study of mutagenesis, and may allow prediction of gene properties in other poorly understood organisms.
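As a rough illustration of the length-class and amino acid usage comparisons summarized above, the following minimal sketch bins protein sequences with the two thresholds quoted in the abstract (<139 and >534 amino acids) and computes per-class proportions and residue frequencies. The function names and the toy sequences are hypothetical, and this is not the authors' pipeline; the statistical test used in the paper is not given here, so none is shown.

```python
from collections import Counter

# Thresholds taken from the abstract; everything else here is an assumption.
SMALL_MAX = 139  # fewer residues than this counts as "small"
LARGE_MIN = 534  # more residues than this counts as "large"


def length_class(seq):
    """Assign a protein sequence to a length class by residue count."""
    n = len(seq)
    if n < SMALL_MAX:
        return "small"
    if n > LARGE_MIN:
        return "large"
    return "medium"


def class_proportions(seqs):
    """Return the fraction of sequences falling in each length class."""
    if not seqs:
        raise ValueError("need at least one sequence")
    counts = Counter(length_class(s) for s in seqs)
    return {c: counts[c] / len(seqs) for c in ("small", "medium", "large")}


def amino_acid_usage(seqs):
    """Return the relative frequency of each residue over all sequences."""
    counts = Counter("".join(seqs))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one residue")
    return {aa: n / total for aa, n in counts.items()}


# Toy sequences (hypothetical, not real E. coli proteins).
essential = ["M" * 100, "MK" * 300, "MKL" * 100]
nonessential = ["MKL" * 100, "MKV" * 120, "M" * 600]

print(class_proportions(essential))
print(class_proportions(nonessential))
print(amino_acid_usage(essential))
```

In practice the two input lists would come from the translated essential and nonessential gene sets, and the resulting proportions would be compared with an appropriate significance test.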

Acknowledgments

We are grateful to N.G.C. Smith and E.V. Koonin for critical reading and constructive comments. We would also like to thank all the members of the Bioinformatics Center at Northwest A&F University for useful daily discussions.

Author information

Corresponding author

Correspondence to Shiheng Tao.

Additional information

Communicated by D. Ussery.

About this article

Cite this article

Gong, X., Fan, S., Bilderbeck, A. et al. Comparative analysis of essential genes and nonessential genes in Escherichia coli K12. Mol Genet Genomics 279, 87–94 (2008). https://doi.org/10.1007/s00438-007-0298-x

  • DOI: https://doi.org/10.1007/s00438-007-0298-x
