Abstract
The genus Clostridium contains some of the most widely studied biofuel-producing organisms, such as Clostridium thermocellum, as well as several human pathogens and a few less well-characterized strains. Here, we present a comparative genomic analysis of 40 fully sequenced clostridial genomes, paying particular attention to the biomass-degrading strains. Our analysis indicates that some Clostridium botulinum strains may have been incorrectly classified in the current taxonomy and should be renamed according to the 16S ribosomal RNA (rRNA) phylogeny. A core-genome analysis suggests that only 169 orthologous gene groups are shared by all the strains, whereas the strain-specific gene pool consists of 22,668 genes. This is consistent with these bacteria living in very diverse environments and having evolved a large number of strain-specific genes to adapt to them. Across the 40 genomes, 1.4–5.8% of genes fall into the carbohydrate-active enzyme (CAZyme) families, and 20 of the 40 genomes may encode cellulosomes, with each genome having 1 to 76 genes bearing cellulosome-related modules such as dockerins and cohesins. A phylogenetic footprinting analysis identified cis-regulatory motifs enriched in the promoters of the CAZyme genes, yielding 32 statistically significant motif candidates.
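To make the core-genome terminology concrete, the following minimal sketch shows one way to separate "core" orthologous groups (present in every genome) from strain-specific genes (groups confined to a single genome). It is only an illustration under stated assumptions: the function name `classify_groups`, the input layout, and the toy data are hypothetical and are not taken from the authors' pipeline, which may define orthology and strain specificity differently.

```python
from collections import defaultdict


def classify_groups(groups, genomes):
    """Split ortholog groups into core groups and strain-specific genes.

    groups  : dict mapping a group id to a list of (genome, gene_id) pairs
              (hypothetical layout chosen for this illustration)
    genomes : iterable of all genome names under comparison
    """
    all_genomes = set(genomes)
    core = []
    strain_specific = defaultdict(list)
    for group_id, members in groups.items():
        present = {genome for genome, _ in members}
        if present == all_genomes:
            # The group has at least one member in every genome.
            core.append(group_id)
        elif len(present) == 1:
            # All members come from a single genome.
            (genome,) = present
            strain_specific[genome].extend(gene for _, gene in members)
    return sorted(core), dict(strain_specific)


# Toy data: three made-up genomes (A, B, C) and four made-up groups.
toy_groups = {
    "OG1": [("A", "a1"), ("B", "b1"), ("C", "c1")],  # shared by all
    "OG2": [("A", "a2"), ("B", "b2")],               # shared by some
    "OG3": [("C", "c2")],                            # only in C
    "OG4": [("A", "a3"), ("A", "a4")],               # paralogs only in A
}
core, specific = classify_groups(toy_groups, ["A", "B", "C"])
```

In this toy input, only `OG1` counts as core, while the genes in `OG3` and `OG4` are strain-specific; `OG2` falls into neither class because it is shared by some but not all genomes.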
Acknowledgments
This research was supported in part by the National Science Foundation (NSF DEB-0830024 and NSF MCB-0958172), the US Department of Energy’s BioEnergy Science Center (BESC) grant through the Office of Biological and Environmental Research, and the National Science Foundation of China (NSFC 61272016 and 61303084). The BioEnergy Science Center is a US Department of Energy Bioenergy Research Center supported by the Office of Biological and Environmental Research in the DOE Office of Science. Funding for the open access charge was provided by the US Department of Energy’s BioEnergy Science Center (BESC).
Author Contribution
Y.Y. and Y.X. conceived the basic idea and planned the project. Q.M. and C.Z. carried out the experiments and analyzed the data. X.M. performed the pathway enrichment analysis and offered suggestions for interpreting the data from a biological perspective. All authors edited and approved the final manuscript. Q.M., C.Z., and X.M. contributed equally to this paper.
Additional information
Chuan Zhou, Qin Ma, and Xizeng Mao contributed equally to this paper.
Cite this article
Zhou, C., Ma, Q., Mao, X. et al. New Insights into Clostridia Through Comparative Analyses of Their 40 Genomes. Bioenerg. Res. 7, 1481–1492 (2014). https://doi.org/10.1007/s12155-014-9486-9
DOI: https://doi.org/10.1007/s12155-014-9486-9