Abstract
Lemon (Citrus limon (L.) Burm. f.) is an evergreen tree belonging to the genus Citrus. The fruits are particularly prized for the organoleptic and nutraceutical properties of the juice and for the quality of the essential oils in the peel.
Herein, we report, for the first time, a high-quality reference genome covering the two haplotypes of lemon. Sequencing was carried out by combining Illumina short reads with Oxford Nanopore data, leading to a primary and an alternative assembly with genome sizes of 312.8 Mb and 324.74 Mb, respectively, which agree well with an estimated genome size of 312 Mb. The analysis of transposable elements (TEs) identified 2878 regions on the primary assembly and 2897 on the alternative assembly, distributed across the nine chromosomes. Furthermore, an in silico analysis of the microRNA genes was carried out using 246 mature miRNAs and the respective pre-miRNA hairpin sequences of Citrus sinensis. This analysis highlighted high conservation between the two species, with 233 mature miRNAs and 51 pre-miRNA stem-loops aligning to the lemon genome with a perfect match.
In parallel, total RNA was extracted from fruit, flower, leaf, and root, enabling the detection of 35,020 and 34,577 predicted transcripts on the primary and alternative assemblies, respectively. To further characterize the annotated transcripts by function, gene ontology and gene orthology analyses with other Citrus and Citrus-related species were carried out.
The availability of a reference genome is an important prerequisite both for the setup of high-throughput genotyping analysis and for functional genomic approaches toward the characterization of the genetic determinism of traits of agronomic interest.
Change history
27 November 2021
A Correction to this paper has been published: https://doi.org/10.1007/s11295-021-01531-w
Funding
The projects ‘Development of Resistance Inductor against Citrus Vascular Pathogens’ (Sviluppo di Induttori di Resistenza a Patogeni Vascolari negli Agrumi, S.I.R.P.A., http://www.progettosirpa.it/home, 08CT7211000254), ‘Fruit Crops Resilience to Climate Change in the Mediterranean Basin’ (FREECLIMB, https://primafreeclimb.com/), and ‘Valutazione di genotipi di agrumi per l’individuazione di fonti di resistenza a stress biotici e abiotici’ (Linea 2 del Piano della Ricerca di Ateneo 2020, University of Catania) support this work on new biotechnological approaches aimed at unlocking the genetic basis of mal secco resistance and obtaining new tolerant genotypes. The APC was funded by Fondi di Ateneo 2020–2022, University of Catania, linea Open Access. Mario Di Guardo took part in this work within the framework of the PON ‘AIM: Attrazione e Mobilità Internazionale’, project number 1848200–2.
Author information
Contributions
MDG: conceptualization, investigation, writing—original draft; MM: conceptualization, formal analysis, investigation, writing—original draft; MM: conceptualization, formal analysis, investigation, writing—original draft; CC: investigation; TM: conceptualization; DZ: investigation; CA: investigation; CM: investigation, writing—review and editing; DG: investigation, writing—review and editing, funding acquisition; LMS: conceptualization, writing—review and editing, funding acquisition; BL: conceptualization, formal analysis, investigation, writing—original draft; GA: conceptualization, writing—review and editing, funding acquisition.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Data archiving statement
The genome assembly sequences and gene predictions have been submitted to the Citrus Genome Database (https://www.citrusgenomedb.org/Analysis/1462349), where they can be downloaded and accessed through the genome browser and BLAST services. Raw data have been submitted to NCBI’s SRA under BioProject ID PRJNA732837.
Additional information
Communicated by D. Chagné
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Mario Di Guardo, Marco Moretto and Mirko Moser contributed equally to this work and share first authorship.
The original online version of this article was revised: the author names and contribution text were modified in the original proof.
Cite this article
Guardo, M.D., Moretto, M., Moser, M. et al. The haplotype-resolved reference genome of lemon (Citrus limon L. Burm f.). Tree Genetics & Genomes 17, 46 (2021). https://doi.org/10.1007/s11295-021-01528-5
DOI: https://doi.org/10.1007/s11295-021-01528-5