The concept of operational taxonomic units revisited: genomes of bacteria that are regarded as closely related are often highly dissimilar
The concept of operational taxonomic units (OTUs), which constructs “mathematically” defined taxa, is widely accepted and applied to describe bacterial communities using amplicon sequencing of the 16S rRNA gene. OTUs are often used to infer functional traits since they are considered to fairly represent community members. However, the link between molecular taxa, real taxa, and OTUs seems to be much more complicated. Strains of the same bacterial species (ideally belonging to the same OTU) typically share only some genes (the core genome), while other genes are strain-specific and unique. It is thus unclear to what extent important functional traits are homogeneous within an OTU and how correctly functional traits can be inferred for individual OTU members. Here, we have tested in silico the similarity of all genes and, more specifically, of the set of genes encoding glycoside hydrolases (GH) in pairs of bacterial genomes that belong to the same OTU. Genome similarity varied among OTUs: between 5% and 78% of genes were not shared between the two bacterial genomes in a pair. The complement of GH families (the presence of gene families and the number of genes per family) differed in 95% of OTUs. On average, 43% of GH families either differed in gene counts or were present in one genome and absent in the other. These results reveal a serious limitation of OTU-based approaches when they are used to infer the functional traits of bacterial communities, and they raise the question of how to link environmental sequencing data and microbial functions.
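The two pairwise measures reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy identifiers, and the choice of denominators (the union of gene clusters, and the union of GH families found in either genome) are assumptions made here for illustration, since the abstract does not state how the percentages were normalized.

```python
from collections import Counter


def unshared_gene_fraction(genes_a, genes_b):
    """Fraction of gene clusters found in only one genome of a pair.

    Assumption: genes are already grouped into comparable cluster IDs
    (e.g. ortholog groups), and the denominator is the union of both sets.
    """
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0
    # Symmetric difference = clusters present in exactly one genome.
    return len(a ^ b) / len(union)


def differing_gh_fraction(gh_a, gh_b):
    """Fraction of GH families whose gene counts differ between two genomes.

    A family present in one genome and absent in the other has counts
    n vs. 0, so it is counted as differing as well.
    """
    counts_a, counts_b = Counter(gh_a), Counter(gh_b)
    families = set(counts_a) | set(counts_b)
    if not families:
        return 0.0
    differing = [f for f in families if counts_a[f] != counts_b[f]]
    return len(differing) / len(families)


# Hypothetical toy data for two genomes assigned to the same OTU.
genome_a_genes = {"og1", "og2", "og3", "og4"}
genome_b_genes = {"og1", "og2", "og5"}
genome_a_gh = ["GH3", "GH3", "GH5", "GH13"]          # one entry per GH gene
genome_b_gh = ["GH3", "GH3", "GH5", "GH5", "GH43"]

gene_frac = unshared_gene_fraction(genome_a_genes, genome_b_genes)
gh_frac = differing_gh_fraction(genome_a_gh, genome_b_gh)
print(f"unshared genes: {gene_frac:.0%}, differing GH families: {gh_frac:.0%}")
```

In this toy pair, three of five gene clusters are unshared, and three of four GH families differ (GH5 in gene count, GH13 and GH43 in presence). Choosing a different denominator, such as the gene count of the smaller genome, would change the percentages but not the logic of the comparison.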
This work was supported by the Czech Science Foundation (18-25706S) and by the Ministry of Education, Youth and Sports of the Czech Republic (LTT17022).