Draft genome sequence of the Key lime (Citrus × aurantiifolia) pathogen Colletotrichum limetticola

Andrea Menicucci and Isis Tikami contributed equally to this work.

Many species belonging to the genus Colletotrichum are causal agents of plant diseases, generally referred to as anthracnose, in a wide range of hosts worldwide. Colletotrichum spp. affect numerous economically important crops on a global scale. The genus comprises approximately 257 distinct species, which are organized into at least 15 major phylogenetic lineages known as species complexes (Talhinhas and Baroncelli 2021). Virtually every crop grown in the world is susceptible to one or more species of Colletotrichum (Baroncelli et al. 2014). Among these lineages, the Colletotrichum acutatum species complex stands out as a diverse group of closely related plant pathogenic fungi (Baroncelli et al. 2017). Members of the C. acutatum species complex have a wide host range in both domesticated and wild plant species, and their capability to infect insects has also been described (Damm et al. 2012; Marcelino et al. 2008). Within this species complex, Colletotrichum limetticola (formerly known as Gloeosporium limetticola; Clausen 1912) was initially described in 2012 as a species predominantly associated with wither tip symptoms on sour lime (Citrus × aurantiifolia) in Cuba and the USA during the 1910s (Damm et al. 2012). Later descriptions associated the disease with strains of C. gloeosporioides (Brown et al. 1996) or C. acutatum (Peres et al. 2008). Recent findings in Brazil have revealed C. limetticola causing Glomerella leaf spot on apples, where it shows low prevalence but high virulence (Moreira et al. 2019). To the best of our knowledge, no further occurrences of C. limetticola have been documented, despite the presence of other known Colletotrichum species that infect citrus and apples (Talhinhas and Baroncelli 2021). This raises concerns regarding the conservation status of C. limetticola, considering the scarcity of records on its original hosts and the occurrence of cross-infections.
In the present study, Colletotrichum limetticola strain KLA-Anderson was isolated from leaf tissue of Citrus × aurantiifolia, commonly known as the Key lime or Mexican lime, in the Lake Alfred region (Florida, USA). The C. limetticola genome was sequenced on the Illumina NovaSeq 6000 system with 150 bp paired-end reads. Read quality was assessed with FastQC (Babraham Bioinformatics). Adapter sequences and low-quality reads were trimmed with TrimGalore! v0.6.10 (Krueger et al. 2021). Paired-end reads were merged with FLASH v1.2.11 (Magoč and Salzberg 2011). Merged and unmerged reads were then assembled using SPAdes v3.15.1 (Bankevich et al. 2012). Scaffolds with low coverage were removed as possible contaminants. Scaffolds corresponding to mitochondrial DNA (mtDNA) and ribosomal DNA (rDNA) were identified by BLASTN v2.9.0 (Camacho et al. 2009) using queries from the closely related species Colletotrichum lupini (Baroncelli et al. 2021), which was the closest complete genome to C. limetticola. The completeness of the assembly was assessed using BUSCO v5.4.7 (Simão et al. 2015), while assembly statistics were evaluated with QUAST v5.2.0 (Gurevich et al. 2013). The total size of the nuclear genome assembly was 50.48 Mb, with an N50 contig length of 68.638 kb and an L50 of 229. The nuclear genome assembly comprised 1750 contigs with an average coverage of 90× and was assessed to be 97.7% complete (Table 1). A total of 15248 protein-coding genes were predicted using the MAKER v3.01.02 pipeline (Holt and Yandell 2011) with both self-trained GeneMark-ES v4.10 (Borodovsky and Lomsadze 2011) and AUGUSTUS v3.3 prediction using the "Colletotrichum" model (Becerra et al. 2023). SignalP v5.0 (Almagro Armenteros et al. 2019) predicted 1981 secreted proteins in C. limetticola, of which 624 were predicted by EffectorP v3.0 (Sperschneider and Dodds 2022) to be candidate effectors. A comparative analysis of the newly sequenced genome with those publicly available (Baroncelli et al. 2016, 2022; Goulin et al. 2023) revealed similar genomic features and gene content among closely related species (Fig. 1).
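The coverage-based scaffold filtering and the N50/L50 statistics reported above can be illustrated with a minimal Python sketch. It is not the code used in this study: it assumes scaffold names follow the SPAdes convention (NODE_<id>_length_<bp>_cov_<value>, where the value is a k-mer coverage rather than read depth), and the function names and the coverage threshold are hypothetical, since the cutoff applied here is not stated.

```python
import re
from typing import Iterable, List, Tuple

# SPAdes names assembled sequences like "NODE_1_length_68638_cov_90.25".
SPADES_HEADER = re.compile(r"NODE_\d+_length_(\d+)_cov_([\d.]+)")


def parse_spades_header(header: str) -> Tuple[int, float]:
    """Return (length in bp, k-mer coverage) parsed from a SPAdes sequence name."""
    match = SPADES_HEADER.search(header)
    if match is None:
        raise ValueError(f"not a SPAdes-style header: {header!r}")
    return int(match.group(1)), float(match.group(2))


def filter_by_coverage(headers: Iterable[str], min_cov: float) -> List[int]:
    """Return the lengths of sequences whose coverage is at least min_cov.

    min_cov is a hypothetical threshold chosen by the user.
    """
    kept = []
    for header in headers:
        length, cov = parse_spades_header(header)
        if cov >= min_cov:
            kept.append(length)
    return kept


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Compute N50 and L50 from sequence lengths.

    Sequences are sorted longest first. N50 is the length of the sequence
    at which the running total first reaches half of the assembly size,
    and L50 is the number of sequences needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no sequence lengths given")
```

Applied to a toy set of five sequences of 100, 80, 60, 40 and 20 bp, where only the 60 bp sequence falls below the chosen coverage threshold, the filter keeps 240 bp in total; the running sum reaches half of that (120 bp) at the second sequence, giving an N50 of 80 bp and an L50 of 2.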
In this study, we present a draft genome sequence of C. limetticola, obtained using Illumina sequencing technology, which provides a new resource for further research in the comparative genomics of fungi. Further analysis of these genomes will enhance our understanding of the molecular mechanisms underlying the pathogenicity and virulence of Colletotrichum species, facilitating the exploration of targeted and environmentally friendly strategies for their control.

Data availability
The data generated in this study are publicly available from the NCBI GenBank database under BioProject ID PRJNA952538 and BioSample ID SAMN34075281. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession JARUPT000000000. The version described in this paper is version JARUPT010000000.

Conflict of interest
The authors have no relevant financial or nonfinancial interests to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.