Metaxa: a software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets

  • Short Communication
  • Published in Antonie van Leeuwenhoek

Abstract

The ribosomal small subunit (SSU) rRNA gene has emerged as an important genetic marker for taxonomic identification in environmental sequencing datasets. In addition to being present in the nucleus of eukaryotes and the core genome of prokaryotes, the gene is also found in the mitochondria of eukaryotes and in the chloroplasts of photosynthetic eukaryotes. These three sets of genes are conceptually paralogous and should in most situations not be aligned and analyzed jointly. Identifying the origin of SSU sequences in complex sequence datasets has hitherto been a time-consuming and largely manual undertaking. The present study introduces Metaxa (http://microbiology.se/software/metaxa/), an automated software tool that extracts full-length and partial SSU sequences from larger sequence datasets and assigns them to an archaeal, bacterial, nuclear eukaryote, mitochondrial, or chloroplast origin. Using data from reference databases and from full-length organelle and organism genomes, we show that Metaxa detects and scores SSU sequences for origin with very low proportions of false positives and negatives. We believe that this tool will be useful in microbial and evolutionary ecology as well as in metagenomics.
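
The idea of scoring a sequence against each candidate origin and then assigning it to one of them can be illustrated with a minimal Python sketch. This is not Metaxa's implementation: the abstract does not describe how scores are computed, so the function name `assign_origin`, the score dictionary, and both thresholds (`min_score`, `min_margin`) are hypothetical and chosen only for illustration. Only the five origin labels come from the abstract.

```python
from typing import Dict, Optional, Tuple

# The five origins named in the abstract.
ORIGINS = ("archaea", "bacteria", "eukaryota", "mitochondria", "chloroplast")


def assign_origin(
    scores: Dict[str, float],
    min_score: float = 20.0,   # hypothetical detection threshold (assumption)
    min_margin: float = 5.0,   # hypothetical lead required over the runner-up
) -> Tuple[Optional[str], str]:
    """Return (origin, status) for one sequence given per-origin scores."""
    unknown = set(scores) - set(ORIGINS)
    if unknown:
        raise ValueError(f"unexpected origin labels: {sorted(unknown)}")

    # Rank origins from highest to lowest score; missing origins count as 0.
    ranked = sorted(ORIGINS, key=lambda o: scores.get(o, 0.0), reverse=True)
    best, runner_up = ranked[0], ranked[1]
    best_score = scores.get(best, 0.0)

    if best_score < min_score:
        # No origin scores high enough to call the sequence an SSU at all.
        return None, "not detected as SSU"
    if best_score - scores.get(runner_up, 0.0) < min_margin:
        # Two origins score similarly, so the call is flagged for review.
        return best, "uncertain"
    return best, "assigned"


# Made-up scores for a single sequence (illustrative values only).
example = {"bacteria": 41.2, "chloroplast": 38.9, "mitochondria": 12.0}
print(assign_origin(example))  # ('bacteria', 'uncertain')
```

Flagging near-ties separately from clear assignments reflects the point made above that chloroplast, mitochondrial, and nuclear or prokaryotic SSU genes should not be pooled: a sequence whose top two scores are close is the one most likely to be misplaced.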

Acknowledgments

The Frontiers in Biodiversity Research Centre of Excellence (University of Tartu) and the Platform in Ecotoxicology—From Gene to Ocean (University of Gothenburg) are gratefully acknowledged for their support.

Conflict of interest

The authors declare that they have no conflict of interest.

Corresponding author

Correspondence to Johan Bengtsson.

About this article

Cite this article

Bengtsson, J., Eriksson, K.M., Hartmann, M. et al. Metaxa: a software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets. Antonie van Leeuwenhoek 100, 471–475 (2011). https://doi.org/10.1007/s10482-011-9598-6

  • DOI: https://doi.org/10.1007/s10482-011-9598-6
