Mycoscience, Volume 52, Issue 4, pp 278–282

A note on the incidence of reverse complementary fungal ITS sequences in the public sequence databases and a software tool for their detection and reorientation

  • R. Henrik Nilsson
  • Vilmar Veldre
  • Zheng Wang
  • Martin Eckart
  • Sara Branco
  • Martin Hartmann
  • Christopher Quince
  • Anna Godhe
  • Yann Bertrand
  • Johan F. Alfredsson
  • Karl-Henrik Larsson
  • Urmas Kõljalg
  • Kessy Abarenkov
Note

Abstract

Reverse complementary DNA sequences (sequences that are inadvertently cast backward and in which all purines and pyrimidines are transposed) are not uncommon in sequence databases, where they may introduce noise into sequence-based research. We show that about 1% of the public fungal sequences of the ITS region, the most commonly sequenced genetic marker in mycology, are reverse complementary, and we introduce an open source software solution to automate their detection and reorientation. The Mac OS X/Linux/UNIX software operates on public or private datasets of any size, although some 50 base pairs of the 5.8S gene of the ITS region are needed for the analysis.
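
To make the notion of a reverse complementary sequence concrete, the following is a minimal Python sketch and not the software described in this note. The function names `reverse_complement` and `guess_orientation`, and the idea of passing a short user-supplied anchor fragment, are assumptions made for illustration only. The published tool relies on hidden Markov models of the 5.8S gene; the exact substring match used here is a deliberately simplified stand-in for that step.

```python
# IUPAC nucleotide codes mapped to their complements (ambiguity codes included).
_COMPLEMENT = str.maketrans(
    "ACGTURYSWKMBDHVNacgturyswkmbdhvn",
    "TGCAAYRSWMKVHDBNtgcaayrswmkvhdbn",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then read the result backward.
    return seq.translate(_COMPLEMENT)[::-1]


def guess_orientation(seq: str, anchor: str) -> str:
    """Toy orientation check (hypothetical helper, not the published method).

    `anchor` is a short fragment, supplied by the caller, that is expected
    to occur in correctly oriented sequences (for example, part of 5.8S).
    An exact match is used here where the real tool uses profile HMMs.
    """
    s, a = seq.upper(), anchor.upper()
    if a in s:
        return "forward"
    if a in reverse_complement(s):
        return "reverse"
    return "unknown"  # anchor missing or sequence quality too poor


def reorient(seq: str, anchor: str) -> str:
    """Return seq in forward orientation when the anchor allows a call."""
    if guess_orientation(seq, anchor) == "reverse":
        return reverse_complement(seq)
    return seq
```

A sequence for which the anchor is found in neither orientation is returned unchanged, which mirrors the requirement stated above that part of 5.8S must be present for a call to be made.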

Keywords

  • DNA barcoding
  • Environmental sampling
  • Hidden Markov models
  • Quality assessment
  • Sequence identification

Notes

Acknowledgments

R.H.N. and K.A. gratefully acknowledge support from the Frontiers in Biodiversity Research Centre of Excellence (University of Tartu) and the Fungi in Boreal Forest Soils network. Matt von Konrat and Anders Hagborg are acknowledged for valuable advice on the liverwort data. Two anonymous reviewers are acknowledged for valuable input on the manuscript. The authors declare that they have no conflict of interest. No laboratory experiments were undertaken as a component, or result, of the present study.

Supplementary material

10267_2010_86_MOESM1_ESM.zip (13 MB)
Supplementary Item 1. The software package together with its documentation, reference sequences from INSD, and a test dataset (ten sequences that are in the correct orientation, ten that are reverse complementary, and ten that cannot be processed for lack of 5.8S or for reasons of poor sequence quality). The user will also have to install HMMER (and optionally MAFFT and NCBI-BLAST); detailed installation instructions are provided in the documentation (ZIP 13298 kb)
10267_2010_86_MOESM2_ESM.zip (16.1 MB)
Supplementary Item 2. The 3,443 multiple alignments used to evaluate the performance of the software. The first 1,000 alignments form the dataset used to assess the proportion of correct calls on sequences that were given in the correct orientation to begin with (true negatives). The set of 1,443 alignments corresponds to the dataset used to assess the proportion of correct calls on sequences that were reverse complementary to begin with (true positives + false positives). The final 1,000 alignments form the dataset for examination of the sequences for which 5.8S could not be found (false negatives) (ZIP 16440 kb)
10267_2010_86_MOESM3_ESM.zip (24 kb)
Supplementary Item 3. HMMs of the 5′ and 3′ ends of 5.8S for animals, plants, oomycetes, red algae, green algae, brown algae, mosses, and liverworts, provided to facilitate the implementation of the software for other groups of organisms (ZIP 23 kb)

Copyright information

© The Mycological Society of Japan and Springer 2010

Authors and Affiliations

  • R. Henrik Nilsson (1, 2)
  • Vilmar Veldre (1)
  • Zheng Wang (3)
  • Martin Eckart (4)
  • Sara Branco (5)
  • Martin Hartmann (6)
  • Christopher Quince (7)
  • Anna Godhe (8)
  • Yann Bertrand (2)
  • Johan F. Alfredsson (9)
  • Karl-Henrik Larsson (10)
  • Urmas Kõljalg (1)
  • Kessy Abarenkov (1)

  1. Department of Botany, Institute of Ecology and Earth Sciences, University of Tartu, Tartu, Estonia
  2. Department of Plant and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
  3. Department of Ecology and Evolutionary Biology, Yale University, New Haven, USA
  4. Jena Microbial Resource Collection, Department of Molecular and Applied Microbiology, HKI, University of Jena, Jena, Germany
  5. Forschungsinstitut Senckenberg, BiK-F, Frankfurt, Germany
  6. Department of Microbiology and Immunology, Life Sciences Centre, University of British Columbia, Vancouver, Canada
  7. Department of Civil Engineering, Glasgow University, Glasgow, United Kingdom
  8. Department of Marine Ecology, University of Gothenburg, Göteborg, Sweden
  9. Oepir Consulting, Göteborg, Sweden
  10. The Mycological Herbarium, Natural History Museum, University of Oslo, Oslo, Norway