IMA Genome-F 2A

Draft genome sequence of the pine fungal pathogen Diplodia sapinea

Introduction

Diplodia sapinea, also known as Diplodia pinea or Sphaeropsis sapinea (Phillips et al. 2013), was first reported in France in 1842 as a saprobe on dead Pinus sylvestris. It has subsequently been reported from many countries of the world on Pinus species, both where they grow in their natural environment and where they are propagated as non-natives in commercially managed plantations (Swart et al. 1991, Burgess et al. 2004). This fungus exists as an endophyte in healthy tree tissues, but causes disease when trees are stressed (Swart et al. 1991, Stanosz et al. 2001, 2007).

No sexual morph has been reported for D. sapinea (Smith et al. 2000, Burgess et al. 2004). However, a population genetics study of the fungus found a lack of linkage disequilibrium amongst alleles as well as generally high genotypic diversity, and on this basis proposed that a cryptic sexual state probably exists (Bihon et al. 2012). In support of this conclusion, a recent study of mating type loci showed that various populations of D. sapinea contained the two mating type idiomorphs in more or less equal frequency, which is indicative of a heterothallic sexual cycle (Bihon et al. 2014).

The aim of this study was to produce a full genome sequence for an isolate of D. sapinea and to make this available for further study. Such studies could address aspects of the biology of the pathogen such as its selective pathogenicity on conifers, compared to most other Botryosphaeriales that infect angiosperms.

Sequenced Strains

USA: Wisconsin: isol. ex Pinus banksiana, June 1986, M. Palmer (CMW 190/CBS 117911; CBS H-21777 — dried cultures). — South Africa: Kwa-Zulu Natal: 7-Oaks, isol. ex Pinus patula, Sept. 2008, W. Bihon (CMW 39103/CBS 138184; CBS H-21778 — dried cultures).

Nucleotide Sequence Accession Number

The Whole Genome Shotgun projects have been deposited at DDBJ/EMBL/GenBank under the accessions AXCF00000000 and JHUM00000000. The versions described in this paper are AXCF01000000 and JHUM01000000 for strains CMW 190 and CMW 39103, respectively.

Methods

DNA from single-spore cultures of two strains of Diplodia sapinea (CMW190/CBS117911 and CMW39103) was extracted and sequenced using the Illumina HiSeq and MiSeq platforms at the Agricultural Research Council (ARC) and Inqaba biotech, Pretoria, South Africa, respectively. The reads were subjected to sequence quality analysis, and those shorter than 30 bases were removed. The remaining reads were assembled into a draft genome using CLC Genomic de novo assembler 6.0 (CLC bio, Aarhus, Denmark). Completeness of the genome was estimated using the Core Eukaryotic Genes Mapping Approach (CEGMA) analysis (Parra et al. 2007). Genes were predicted from the genome using AUGUSTUS (Stanke et al. 2006).

Results and Discussion

The final assembly of isolate CMW190/CBS117911 consisted of 2194 contigs with an N50 contig size of 37659 bases, and that of CMW39103 consisted of 4102 contigs with an N50 of 38230 bases. The maximum contig size was 265719 bases. Contigs of ≥ 200 bases were submitted to the genome database of NCBI. The CEGMA pipeline (Parra et al. 2007) estimated the genome sequence to be >95.6 % complete, based on mapping to the more conserved set of 248 Core Eukaryotic Genes (CEGs). Gene prediction using AUGUSTUS (Stanke et al. 2006) identified 13020 putative open reading frames (ORFs).
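The N50 statistic reported above can be illustrated with a short sketch. This is not the assembler's own code; it assumes the common definition of N50 (the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length), and the contig lengths in `toy_lengths` are hypothetical values chosen only for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are sorted from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (not data from this study).
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_lengths)      # half of 300 is reached at the 80-base contig
toy_max = max(toy_lengths)      # maximum contig size
print(toy_n50, toy_max)
```

Applied to the real contig lengths of an assembly, the same function would give the kind of N50 value quoted for the two strains.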

The estimated genome size of Diplodia sapinea was 36.97 Mb, which is smaller than the genomes of the most closely related sequenced species, Botryosphaeria dothidea (43.50 Mb) and Neofusicoccum parvum (42.50 Mb). It contains fewer predicted genes than B. dothidea (14999) but more than N. parvum (10470) (https://doi.org/genome.jgi-psf.org/Botdo1/Botdo1.info.html; Blanco-Ulate et al. 2013). The genome of Diplodia sapinea is similar in size to that of Fusarium graminearum (36.1 Mb), but it contains a greater number of predicted genes (13020 versus 11640 for F. graminearum) (Cuomo et al. 2007). The genome sequence of D. sapinea has already made the characterisation of the MAT locus possible (Bihon et al. 2014), and access to this genome will no doubt facilitate further research on this important tree pathogen.

Authors: W. Bihon, M.J. Wingfield, Bernard Slippers, and B.D. Wingfield

IMA Genome-F 2B

Draft nuclear genome sequence for the sapstain fungus Ceratocystis moniliformis

Introduction

The Ceratocystis moniliformis complex defines one of several monophyletic assemblages in the genus Ceratocystis sensu lato (Yuan & Mohammed 2002, van Wyk et al. 2006, Kamgan Nkuekam et al. 2008, 2013, Tarigan et al. 2010, 2011). Members of this complex produce hat-shaped ascospores from ascomata with spiny bases and have disc-like structures at the bases of their ascomatal necks (Hunt 1956, Upadhyay 1981, van Wyk et al. 2004, 2006). These fungi are relatively fast-growing and produce strong fruity aromas and enzymes that could be industrially relevant. These include invertases that catalyse sucrose biotransformation (van Wyk et al. 2013) and various terpenes with fruity or floral odours that are used for large-scale production of bioflavours (Krings & Berger 1998, Vandamme & Soetaert 2002).

Species in the C. moniliformis complex are found on the surfaces of freshly wounded woody plants, especially trees (Kile 1993, Roux et al. 2004, Tarigan et al. 2010). Interestingly, the fungi in this group are all saprobes (Kile 1993, Yuan & Mohammed 2002, Tarigan et al. 2010), unlike species in the C. fimbriata complex, which includes serious pathogens of economically important plants (Roux et al. 2000, Baker et al. 2003, Barnes et al. 2003, van Wyk et al. 2007, Heath et al. 2009). In some cases, species in the C. moniliformis complex cause sapstain that can result in economic losses because it lowers the value of timber (van Wyk et al. 2006). Ceratocystis species are known to be transported to wounded surfaces by insects such as sap-feeding beetles (Coleoptera: Nitidulidae) (Kirisits 2004). One species, C. bhutanensis, is also associated with a bark beetle (Ips schmutzenhoferi) on Picea spinulosa in Bhutan, but it does not appear to be a pathogen (van Wyk et al. 2004, Kirisits et al. 2013).

Overall, little is known regarding the biology of species in the C. moniliformis complex. The availability of the nuclear genome sequence for one of its members, C. moniliformis s.str., would improve our knowledge regarding the molecular processes underlying their ecology and potentially inform industrial applications for the production of biocompounds. Together with the publicly available genome sequences for other species of Ceratocystis, particularly the sweet potato pathogen C. fimbriata (Wilken et al. 2013) and the mango wilt pathogen C. manginecans, the C. moniliformis s.str. genome will also be a valuable resource for comparative genomics studies into the evolution and general biology of these important fungi.

Sequenced Strain

South Africa: Mpumalanga: Sabie, isol. ex Eucalyptus grandis, Apr. 2002, M. van Wyk (CMW 10134, CBS 118127; CBS H-21775 — dried culture).

Nucleotide Sequence Accession Number

The Whole Genome Shotgun project of the Ceratocystis moniliformis genome has been deposited at DDBJ/EMBL/GenBank under the accession no. JMSH00000000. The version described in this paper is version JMSH01000000.

Methods

Genomic DNA was isolated and sequenced using the Genome Analyzer IIx platform (Illumina) at the Genome Centre, University of California at Davis (CA, USA). For this purpose, paired-end libraries with respective insert sizes of approximately 350 and 600 bases were used to produce reads with an average length of 100 bases. Poor-quality reads and/or terminal nucleotides were discarded using the software package CLC Genomics Workbench v. 6.0.1 (CLCbio, Aarhus, Denmark). The remaining reads were assembled using ABySS v. 1.3.7 with an optimized k-mer size of 91 (Simpson et al. 2009). Open reading frames (ORFs) were predicted using AUGUSTUS (Stanke et al. 2006) based on the gene models for Fusarium graminearum (https://doi.org/bioinf.uni-greifswald.de/augustus), while genome completeness was evaluated using the Core Eukaryotic Genes Mapping Approach (CEGMA) pipeline (Parra et al. 2007).

Results and Discussion

The draft nuclear genome of Ceratocystis moniliformis has an estimated size of 25 429 610 bases, an N50 of 191 280 bases and a mean GC content of 48 %. The ABySS assembly generated 680 contigs, of which 365 were retained after filtering out contigs consisting of fewer than 500 nucleotides. This assembly was also predicted to encode 6 832 ORFs at a density of 269 ORFs/Mb. A CEGMA completeness score of at least 96.4 % was obtained for this version of the assembly.
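The contig filtering and GC content described above can be sketched as follows. This is an illustrative reconstruction rather than the pipeline used in the study: the function names, the dictionary representation of contigs, and the toy sequences are all assumptions, and the 500-nucleotide threshold is the only value taken from the text.

```python
def filter_contigs(contigs, min_length=500):
    """Keep only contigs with at least `min_length` nucleotides."""
    return {name: seq for name, seq in contigs.items() if len(seq) >= min_length}


def gc_percent(contigs):
    """Mean GC content (%) over all unambiguous bases (A, C, G, T)."""
    gc = at = 0
    for seq in contigs.values():
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / (gc + at)


# Hypothetical toy contigs (not sequences from this study).
toy_contigs = {
    "contig_1": "ACGT" * 200,    # 800 nt, retained
    "contig_2": "GGCC" * 100,    # 400 nt, removed by the length filter
    "contig_3": "AATTGC" * 100,  # 600 nt, retained
}
retained = filter_contigs(toy_contigs, min_length=500)
print(sorted(retained), round(gc_percent(retained), 2))
```

Filtering before computing summary statistics means that very short contigs, which are often assembly artefacts, do not influence the reported GC content.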

Comparison of the C. moniliformis genome to those of C. fimbriata s.str. (Wilken et al. 2013) and C. manginecans showed differences in several key genome statistics. The C. moniliformis genome is 4.0 Mb smaller than the 29.4 Mb C. fimbriata s.str. genome, and 6.3 Mb smaller than the 31.7 Mb C. manginecans genome. Additionally, based on the 6 832 ORFs predicted here, 434 and 662 fewer protein coding genes are predicted for the C. moniliformis genome than for C. fimbriata s.str. with its 7 266 predicted genes (Wilken et al. 2013) and C. manginecans with its 7 494 genes (see below), respectively. This is despite the fact that the three genomes are characterized by similar levels of completeness (i.e. 96.8 % for C. manginecans and 96.9 % for C. fimbriata s.str.). Although these genome differences for C. moniliformis could be linked to its non-pathogenic lifestyle (i.e. C. moniliformis is a saprophytic fungus that occurs on a wide range of woody hosts; van Wyk et al. 2006), further research is required to determine the significance of these differences in the overall biology of this group of fungi.
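The genome size differences quoted above follow directly from the stated sizes, as the short check below shows. The variable names are illustrative; the sizes are the values given in the text, with the C. moniliformis size converted from bases to Mb.

```python
# Genome sizes as stated in the text (Mb).
c_moniliformis_mb = 25_429_610 / 1_000_000
reference_sizes_mb = {
    "C. fimbriata s.str.": 29.4,
    "C. manginecans": 31.7,
}

# Difference between each reference genome and C. moniliformis,
# rounded to one decimal place as in the text.
size_gap_mb = {
    name: round(size - c_moniliformis_mb, 1)
    for name, size in reference_sizes_mb.items()
}
print(size_gap_mb)
```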

Authors: M.A. van der Nest, K. Naidoo, P.M. Wilken, E. Rubagotti, A. Wilson, L. De Vos, E.T. Steenkamp, M.J. Wingfield, and B.D. Wingfield

IMA Genome-F 2C

Draft nuclear genome sequence for Ceratocystis manginecans, the causal agent of mango wilt disease

Introduction

The genus Ceratocystis (Ascomycota, Microascales) represents an important group of plant pathogens (Roux & Wingfield 2009). These fungi cause diseases on a wide range of root and tree crops, where they are associated with significant economic losses (Roux & Wingfield 2009). The mango (Mangifera indica) wilt pathogen, Ceratocystis manginecans, is a particularly virulent member of this genus that has devastated the mango industry in Oman and Pakistan (Al Adawi et al. 2006, 2013, van Wyk et al. 2007, Al-Sadi et al. 2010). This pathogen also threatens leguminous trees in Oman, Pakistan and Indonesia (Poussio et al. 2010, Tarigan et al. 2011, Al Adawi et al. 2013).

Ceratocystis manginecans is a member of the C. fimbriata s.lat. species complex, which is an assemblage of morphologically similar and phylogenetically closely related species (Webster & Butler 1967, van Wyk et al. 2007, Wingfield et al. 2013). In this complex, C. manginecans is closely related to C. acaciivora, which is responsible for a debilitating canker and wilt disease of plantation-grown Acacia mangium in Indonesia (van Wyk et al. 2007, Tarigan et al. 2011). Although there is a need to refine the taxonomic position of some species in the complex (Wingfield et al. 2013), the close relationships among its members could indicate similar or shared mechanisms relating to their biology and role as pathogens. Elucidation of questions regarding their pathology and general biology would be facilitated by genome sequence comparisons (Rokas et al. 2003, Wall & Tonellato 2012). For this reason, the genome of the sweet potato pathogen, C. fimbriata was recently sequenced and shared publicly (Wilken et al. 2013). In this study we determined the genome sequence for C. manginecans, which will allow for comparisons between the two species, advancing studies on various aspects of the biology of species in this complex.

Sequenced Strain

Oman: Sohar area, isol. ex Prosopis cineraria, Mar. 2005, A. O. Al Adawi (CBS 138185, CMW 17570; CBS H-21776 — dried).

Nucleotide Sequence Accession Number

The Whole Genome Shotgun project of the Ceratocystis manginecans genome has been deposited at DDBJ/EMBL/GenBank under the accession number JJRZ00000000. The version described in this paper is version JJRZ01000000.

Methods

All sequencing was performed on the Genome Analyzer IIx platform (Illumina) at the Genome Centre, University of California at Davis (CA, USA). Paired-end libraries with respective insert sizes of approximately 350 and 600 bases were used to produce read lengths of 100 bases. The software package CLC Genomics Workbench v. 6.0.1 (CLCbio, Aarhus, Denmark) was used to discard poor-quality reads and/or terminal nucleotides. The remaining reads were assembled using the Velvet de novo assembler (Zerbino & Birney 2008), with an optimised k-mer size of 71. The pre-assemblies were scaffolded using SSPACE v. 2.0 (Boetzer et al. 2011) and the gaps were reduced using GapFiller v. 2.2.1 (Boetzer & Pirovano 2012). Open Reading Frames (ORFs) were predicted using AUGUSTUS (Stanke et al. 2006) based on the gene models for Fusarium graminearum (https://doi.org/bioinf.uni-greifswald.de/augustus), while genome completeness was evaluated using the Core Eukaryotic Genes Mapping Approach (CEGMA) pipeline (Parra et al. 2007).

Results and Discussion

The Ceratocystis manginecans draft genome had an estimated size of 31 706 104 bases, a 22× average coverage, an N50 contig size of 77 070 bases and a mean GC content of 47.9 %. The assembly generated 2 234 contigs, of which 980 were retained after filtering out contigs consisting of fewer than 500 nucleotides. The filtered assembly had a CEGMA completeness score of at least 96.4 % and was predicted to encode 7 494 putative ORFs at a density of 236 ORFs/Mb.

The C. manginecans draft genome is similar in size to the genomes of the sweet potato pathogen, C. fimbriata (29.4 Mb, 7 266 ORFs) (Wilken et al. 2013), and the wood-staining fungus Ophiostoma piceae (32.84 Mb, 8 884 ORFs) (Haridas et al. 2013). However, the C. manginecans genome appears to be relatively small and to harbour fewer genes than those of other species in Sordariomycetes. For example, the genomes of Podospora anserina (35.01 Mb, 10 588 ORFs) (Espagne et al. 2008), Fusarium fujikuroi (43.83 Mb, 14 813 ORFs) (Wiemann et al. 2013) and Cryphonectria parasitica (43.9 Mb, 11 184 ORFs) (https://doi.org/genome.jgi.doe.gov/Crypa2/Crypa2.home.html) are much larger and harbour more genes. The genome sequence information for C. manginecans will, therefore, increase our understanding of the biology, systematics and pathology of this group of globally important pathogens.
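To put the comparison above on a common scale, the ORF density (ORFs per Mb) of each genome can be computed from the sizes and ORF counts quoted in the text. The sketch below is only an illustration: the dictionary layout and function name are assumptions, and the densities for species other than C. manginecans are derived here rather than reported by the cited studies.

```python
# (genome size in Mb, predicted ORFs), as quoted in the text.
genomes = {
    "Ceratocystis manginecans": (31.706104, 7494),
    "Ceratocystis fimbriata": (29.4, 7266),
    "Ophiostoma piceae": (32.84, 8884),
    "Podospora anserina": (35.01, 10588),
    "Fusarium fujikuroi": (43.83, 14813),
    "Cryphonectria parasitica": (43.9, 11184),
}


def orfs_per_mb(size_mb, n_orfs):
    """ORF density: number of predicted ORFs per megabase of assembly."""
    return n_orfs / size_mb


densities = {name: orfs_per_mb(*stats) for name, stats in genomes.items()}

# List the genomes from lowest to highest ORF density.
for name, density in sorted(densities.items(), key=lambda item: item[1]):
    print(f"{name:<28}{density:6.0f} ORFs/Mb")
```

With these inputs, C. manginecans has the lowest density of the six genomes, at about 236 ORFs/Mb, which matches the value reported for the filtered assembly.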

Authors: M.A. van der Nest, K. Naidoo, P.M. Wilken, E. Rubagotti, D. Roodt, L. De Vos, E.T. Steenkamp, M.J. Wingfield, and B.D. Wingfield