A Statistical Framework for Improving Genomic Annotations of Transposon Mutagenesis (TM) Assigned Essential Genes

  • Jingyuan Deng
Part of the Methods in Molecular Biology book series (MIMB, volume 1279)


Whole-genome transposon mutagenesis (TM) followed by sequence-based identification of insertion sites is the most popular genome-wide experiment for identifying essential genes in Prokaryota. However, owing to the limitations of the high-throughput technique, this approach yields substantial systematic biases that result in incorrect assignments of many essential genes. To obtain unbiased and accurate annotations of essential genes from TM experiments, we developed a novel Poisson model-based statistical framework to refine these TM assignments. In the model, we first identified several potential factors that may cause TM assignment biases, such as gene length and TM insertion information, and incorporated them into the basic Poisson model. We then calculated the conditional probability that a gene is essential given the observed TM insertion number. By factorizing this probability through the introduction of a latent variable, the real insertion number, we formalized the statistical framework. We finalized the model by iteratively updating and optimizing the model parameters to maximize the goodness of fit of the model to the observed TM insertion data. Using this model, we are able to assign a probability score of essentiality to each individual gene given its TM assignment, which subsequently corrects the experimental biases. To make our model widely usable, we established a user-friendly Web server that is accessible to the public.
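The idea of scoring essentiality from an observed insertion count can be illustrated with a minimal sketch. It is not the model of this chapter, which additionally factorizes the probability through the latent real insertion number and fits its parameters iteratively. The sketch assumes a simple two-component Poisson mixture in which the expected insertion count scales with gene length. The function names and the parameter values (`prior_essential`, `insertion_density`, `noise_density`) are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Poisson probability of observing k events at the given rate."""
    if rate <= 0:
        return 1.0 if k == 0 else 0.0
    # Work in log space to stay stable for large k.
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


def essentiality_posterior(
    observed_insertions: int,
    gene_length: int,
    prior_essential: float = 0.1,      # assumed prior fraction of essential genes
    insertion_density: float = 0.01,   # assumed insertions per bp, non-essential genes
    noise_density: float = 0.0005,     # assumed spurious insertions per bp, essential genes
) -> float:
    """Bayes posterior P(essential | observed insertions, gene length).

    Illustrative assumption: insertion counts are Poisson with a rate
    proportional to gene length, with a much lower rate for essential genes.
    """
    like_essential = poisson_pmf(observed_insertions, noise_density * gene_length)
    like_nonessential = poisson_pmf(observed_insertions, insertion_density * gene_length)
    numerator = prior_essential * like_essential
    denominator = numerator + (1.0 - prior_essential) * like_nonessential
    return numerator / denominator


# A long gene with no insertions is strong evidence for essentiality;
# a short gene with no insertions may simply have been missed by chance.
p_long = essentiality_posterior(0, 1500)
p_short = essentiality_posterior(0, 150)
p_hit = essentiality_posterior(10, 1500)
print(p_long, p_short, p_hit)
```

Under these assumed parameters, a short gene without insertions receives a much lower score than a long gene without insertions, which is the kind of gene-length bias the framework is meant to correct.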

Key words

Genomic annotations; Essential genes; Transposon mutagenesis; Statistical framework



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Division of Epidemiology and Biostatistics, Department of Environmental Health, University of Cincinnati Medical Center, Cincinnati, USA
