Improving ITS sequence data for identification of plant pathogenic fungi
Plant pathogenic fungi are a large and diverse assemblage of eukaryotes with substantial impacts on natural ecosystems and human endeavours. These taxa often have complex and poorly understood life cycles, lack observable, discriminatory morphological characters, and may not be amenable to in vitro culturing. As a result, species identification is frequently difficult. Molecular (DNA sequence) data have emerged as crucial information for the taxonomic identification of plant pathogenic fungi, with the nuclear ribosomal internal transcribed spacer (ITS) region being the most popular marker. However, international nucleotide sequence databases are accumulating numerous sequences with compromised or low-resolution taxonomic annotations and of substandard technical quality, making their use in the molecular identification of plant pathogenic fungi problematic. Here we report on a concerted effort to identify high-quality reference sequences for various plant pathogenic fungi and to re-annotate incorrectly or insufficiently annotated public ITS sequences from these fungal lineages. A third objective was to enrich the sequences with geographical and ecological metadata. The results – a total of 31,954 changes – are incorporated in and made available through the UNITE database for molecular identification of fungi (http://unite.ut.ee), including standalone FASTA files of sequence data for local BLAST searches, use in the next-generation sequencing analysis platforms QIIME and mothur, and related applications. The present initiative is only a first step toward covering the wide spectrum of plant pathogenic fungi, and we invite all researchers with pertinent expertise to join the annotation effort.