Journal of Microbiology, Volume 56, Issue 4, pp 281–285

UBCG: Up-to-date bacterial core gene set and pipeline for phylogenomic tree reconstruction

  • Seong-In Na
  • Yeong Ouk Kim
  • Seok-Hwan Yoon
  • Sung-min Ha
  • Inwoo Baek
  • Jongsik Chun
Systems and Synthetic Microbiology and Bioinformatics

Abstract

Genome-based phylogeny plays a central role in the future taxonomy and phylogenetics of Bacteria and Archaea, replacing 16S rRNA gene phylogeny. Concatenated core gene alignments are frequently used for this purpose. Bacterial core genes are defined as single-copy, homologous genes that are present in most known bacterial species. Several studies have described such gene sets, but the number of species they considered was rather small. Here we present an up-to-date bacterial core gene set, named UBCG, and software suites that cover the steps needed to generate and evaluate phylogenetic trees. The method was successfully used to infer the phylogenomic relationships of Escherichia and related taxa, and it can be applied to sets of genomes at any taxonomic rank of Bacteria. The UBCG pipeline and file viewer are freely available at https://www.ezbiocloud.net/tools/ubcg and https://www.ezbiocloud.net/tools/ubcg_viewer, respectively.
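
The two ideas in the abstract, selecting single-copy genes present in most genomes and concatenating their alignments, can be illustrated with a minimal Python sketch. This is not the UBCG implementation: the function names, the `min_fraction` threshold, the gap-padding of missing genes, and the toy gene names and sequences are all assumptions made for illustration.

```python
from collections import Counter


def select_core_genes(gene_hits, min_fraction=0.95):
    """Return genes that are single-copy in at least min_fraction of genomes.

    gene_hits maps a genome name to the list of gene names detected in it;
    a gene listed more than once is treated as multi-copy in that genome.
    The 0.95 default is an illustrative assumption, not a UBCG parameter.
    """
    n_genomes = len(gene_hits)
    single_copy = Counter()
    for hits in gene_hits.values():
        for gene, copies in Counter(hits).items():
            if copies == 1:
                single_copy[gene] += 1
    return sorted(g for g, k in single_copy.items() if k / n_genomes >= min_fraction)


def concatenate_alignments(alignments, genomes):
    """Join per-gene alignments into one sequence per genome.

    alignments maps a gene name to {genome: aligned sequence}. A genome
    that lacks a gene is padded with gap characters (an assumed convention).
    Genes are joined in sorted name order so the result is deterministic.
    """
    parts = {genome: [] for genome in genomes}
    for gene in sorted(alignments):
        aligned = alignments[gene]
        length = len(next(iter(aligned.values())))
        for genome in genomes:
            parts[genome].append(aligned.get(genome, "-" * length))
    return {genome: "".join(chunks) for genome, chunks in parts.items()}


# Toy input: gyrB is duplicated in genome B, recA is missing from genome B.
gene_hits = {
    "A": ["rpoB", "gyrB", "recA"],
    "B": ["rpoB", "gyrB", "gyrB"],
    "C": ["rpoB", "recA"],
}
core = select_core_genes(gene_hits, min_fraction=0.6)

alignments = {
    "rpoB": {"A": "ATG-", "B": "ATGC", "C": "AT-C"},
    "recA": {"A": "GG", "C": "GA"},
}
supermatrix = concatenate_alignments(
    {gene: alignments[gene] for gene in core}, ["A", "B", "C"]
)
```

With the toy input, `gyrB` is excluded because it is single-copy in only one of three genomes, while `recA` passes the relaxed 0.6 threshold and genome B receives gaps for it in the concatenated sequence.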

Keywords

phylogeny, phylogenetic analysis, phylogenomics, bacterial core gene

Supplementary material

12275_2018_8014_MOESM1_ESM.pdf (702 kb)
Supplementary material, approximately 701 KB.

Copyright information

© The Microbiological Society of Korea and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Seong-In Na (1, 2)
  • Yeong Ouk Kim (1, 2)
  • Seok-Hwan Yoon (4)
  • Sung-min Ha (3, 4)
  • Inwoo Baek (2, 3)
  • Jongsik Chun (1, 2, 3, 4)
  1. Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  2. Institute of Molecular Biology & Genetics, Seoul National University, Seoul, Republic of Korea
  3. School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
  4. ChunLab, Inc., Seoul, Republic of Korea
