Draft genome sequence of the apple pathogen Colletotrichum chrysophilum strain M932

Riccardo Baroncelli and Antonio Prodi contributed equally to this work.

Colletotrichum chrysophilum (Ascomycota, Sordariomycetes, Glomerellaceae) is a species belonging to the C. gloeosporioides complex. Described in 2017 as responsible for anthracnose on Musa acuminata (banana plants; Vieira et al. 2017), C. chrysophilum has been associated with Persea americana (avocado) and Prunus persica (peach) (Talhinhas and Baroncelli 2021). Moreover, together with Colletotrichum fructicola and C. noveboracense, it is considered one of the major causal agents of Glomerella leaf spot (GLS) and apple bitter rot (ABR) diseases on Malus domestica (apple) (Astolfi et al. 2022; Khodadadi et al. 2020). Originally, C. chrysophilum was presumed to be limited to the American and Asian continents (Astolfi et al. 2022; Talhinhas and Baroncelli 2021); however, reports of GLS and ABR caused by this pathogen in European apple orchards, such as in Italy and Spain, started emerging in 2022 (Cabrefiga et al. 2022; Deltedesco and Oettl 2022).
Colletotrichum chrysophilum was isolated in September 2021 from leaves showing GLS symptoms in an apple orchard in northern Italy (Province of Ferrara, Emilia-Romagna) with a disease incidence close to 50%. The monosporic strain M932 was transferred onto fresh PDA medium (supplemented with 200 ml/L streptomycin and 200 ml/L neomycin) and incubated at 20 °C for 10 days to obtain mycelium for genomic DNA extraction using a modified CTAB method (Prodi et al. 2011). The DNA of C. chrysophilum strain M932 was sequenced using the Illumina NovaSeq 6000 150 bp paired-end sequencing system. NovaSeq 6000 adapters were trimmed using Trimmomatic v0.39 (Bolger et al. 2014) and low-quality reads were removed using TrimGalore v0.6.4 (Krueger 2015). The quality of the reads was assessed and compared using FastQC v0.11.9 (Andrews 2010). Illumina reads were assembled using SPAdes v3.15.1 (Bankevich et al. 2012).
The first draft of the nuclear genome of C. chrysophilum consists of 1,497 scaffolds with a total length of 55.56 Mbp (N50 = 86,538 bp and N75 = 44,545 bp). BUSCO v5.2.2 (Seppey et al. 2019) was used to assess the completeness of the fungal genome assembly, while assembly statistics were evaluated with QUAST v5.0.2 (Gurevich et al. 2013). Results are reported in Table 1. A total of 20,041 protein-coding genes were predicted to be encoded by the nuclear genome using the MAKER v3.01.02 pipeline (Holt and Yandell 2011) with self-trained GeneMark-ES v4.10 (Borodovsky and Lomsadze 2011) and AUGUSTUS v3.3 prediction performed using the "Fusarium" model (Stanke et al. 2008). SignalP v5.0 (Almagro Armenteros et al. 2019) predicted that 2,350 proteins in C. chrysophilum are secreted, and 991 of these were predicted to be candidate effectors by EffectorP v3.0 (Sperschneider and Dodds 2022).
A comparative analysis of the newly sequenced genome with those publicly available (Gan et al. 2013; Armitage et al. 2020; Gan et al. 2021; Baroncelli et al. 2022) showed similar genomic features in terms of genome size and GC% but a high diversity in gene content among strains of C. chrysophilum and between C. chrysophilum and closely related species (Fig. 1). A phylogenomic approach, performed as described in Baroncelli et al. (2022), also highlighted incongruence in the taxonomic designation of deposited data, as strains of C. nupharicola and C. noveboracense do not form distinct clusters (Fig. 1); further analyses are needed to fully understand the diversity and the taxonomy of this group.
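For readers less familiar with the contiguity statistics quoted for the assembly, the following is a minimal sketch of how N50 and N75 are commonly defined: the scaffold length at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches 50% or 75% of the total assembly length. The function name `nx` and the toy scaffold lengths are hypothetical and are not data from strain M932; the values reported here were obtained with QUAST, whose implementation is assumed, not verified, to follow this standard definition.

```python
def nx(lengths, percent):
    """Return the Nx statistic for a list of scaffold lengths (in bp).

    Nx is the length of the shortest scaffold in the smallest set of
    longest scaffolds whose combined length covers at least `percent`
    percent of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= total * percent:
            return length


# Hypothetical toy assembly (total length 300 bp), not M932 data.
toy_scaffolds = [100, 80, 60, 40, 20]
n50 = nx(toy_scaffolds, 50)  # cumulative 100, 180 -> reaches 150 at the 80 bp scaffold
n75 = nx(toy_scaffolds, 75)  # cumulative 100, 180, 240 -> reaches 225 at the 60 bp scaffold
print(n50, n75)
```

Applied to the scaffold lengths of a real assembly (for example, parsed from a FASTA file), the same function would be expected to reproduce the kind of N50 and N75 values summarized in Table 1.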
The availability of the genome of C. chrysophilum M932 makes it possible to perform further comparative analyses, to better define species boundaries within the Colletotrichum gloeosporioides species complex, and to develop molecular diagnostic methods.

Nucleotide sequence accession numbers
This whole-genome shotgun project has been deposited in GenBank under the accession no. JAQOWY000000000 (BioProject: PRJNA928458; BioSample: SAMN32933927).

Data Availability
The "data availability statement" is reported in the "Nucleotide sequence accession numbers" section.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.