Chromosome-scale genome assembly of the ‘Munstead’ cultivar of Lavandula angustifolia

Hamilton, John P.; Vaillancourt, Brieanne; Wood, Joshua C.; Wang, Haiyan; Jiang, Jiming; Soltis, Douglas E.; Buell, C. Robin; Soltis, Pamela S.

doi:10.1186/s12863-023-01181-y

Chromosome-scale genome assembly of the ‘Munstead’ cultivar of Lavandula angustifolia

Data Note
Open access
Published: 13 December 2023

Volume 24, article number 75, (2023)
Cite this article

Download PDF

You have full access to this open access article

BMC Genomic Data Aims and scope Submit manuscript

Chromosome-scale genome assembly of the ‘Munstead’ cultivar of Lavandula angustifolia

Download PDF

John P. Hamilton^1,2,
Brieanne Vaillancourt¹,
Joshua C. Wood¹,
Haiyan Wang³,
Jiming Jiang^3,4,
Douglas E. Soltis^5,6,
C. Robin Buell^1,2,7 &
…
Pamela S. Soltis^5,6

791 Accesses
1 Citation
1 Altmetric
Explore all metrics

Abstract

Objectives

Lavandula angustifolia (English lavender) is commercially important not only as an ornamental species but also as a major source of fragrances. To better understand the genomic basis of chemical diversity in lavender, we sequenced, assembled, and annotated the ‘Munstead’ cultivar of L. angustifolia.

Data description

A total of 80 Gb of Oxford Nanopore Technologies reads was used to assemble the ‘Munstead’ genome using the Canu genome assembler software. Following multiple rounds of error correction and scaffolding using Hi-C data, the final chromosome-scale assembly represents 795,075,733 bp across 25 chromosomes with an N50 scaffold length of 31,371,815 bp. Benchmarking Universal Single Copy Orthologs analysis revealed 98.0% complete orthologs, indicative of a high-quality assembly representative of genic space. Annotation of protein-coding sequences revealed 58,702 high-confidence genes encoding 88,528 gene models. Access to the ‘Munstead’ genome will permit comparative analyses within and among lavender accessions and provides a pivotal species for comparative analyses within Lamiaceae.

View this article's peer review reports

Objective

Lavender (Lavandula) of Lamiaceae, or the mint family, comprises nearly 50 species, a number of which are used as ornamentals, culinary herbs, and/or sources of essential oils. Lavender produces a rich array of chemical compounds that largely serve a defensive role in nature to deter herbivory. Lavender oil, for example, contains roughly 100 different chemical compounds; a large part of this secondary metabolite diversity is primarily due to the diversity of terpenes (e.g., monoterpenes, sesquiterpenes, diterpenes). The most widely grown and commercially important species is Lavandula angustifolia (English lavender), grown as both an ornamental and a source of oil, but Lavandula contains a wide range of cultivated species and hybrids; for example, L. stoechas, L. dentata, and L. multifida are widely grown as ornamentals. In addition, a hybrid between L. angustifolia and L. latifolia (Lavandula × intermedia), Dutch lavender, is grown commercially. Essential oils of L. angustifolia are widely used in various medicines, including balms and salves, as well as in perfumes and cosmetics. Although lavender is of great economic and ornamental importance, its underlying chemical diversity remains poorly understood. To help elucidate the genetic control of the rich chemical diversity of lavender, we report here the sequencing and chromosome-scale assembly and annotation of the complete nuclear genome of L. angustifolia cv. ‘Munstead’. Two other genome assemblies of L. angustifolia have been reported, from the ‘Maillette’ [1] and ‘Jingxun 2’ [2] cultivars as well as from L. × intermedia ‘Super’ [3]. Given the large number of lavender cultivars, access to the ‘Munstead’ genome will be of further utility by facilitating comparative analyses of the genetic basis of chemodiversity within this commercially important species.

Data description

Lavandula angustifolia ‘Munstead’ was obtained from Van Atta’s nursery (Bath, MI, USA) and grown in a growth chamber under 300 μE light intensity, 70% relative humidity, 15 h day length, 23.3 °C day, and 13.8 °C night. Flow cytometry of Munstead leaf tissue revealed an estimated genome size (1C) of 907 Mb, and a chromosome squash of root tips [4] revealed 50 chromosomes, indicating a haploid chromosome number of 25 (Table 1, Data file 1, [5]). High-molecular-weight DNA was isolated from immature leaves of a single ‘Munstead’ plant using a modified CTAB isolation protocol followed by a Qiagen Genomic Tip and an Amicon buffer exchange as described previously [6]. High-molecular-weight DNA was used to construct libraries for sequencing using the Oxford Nanopore Technologies (ONT) and Illumina platforms (Table 1, Data file 2, Data sets 1–16, [5, 7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22]). For whole genome shotgun sequence data, a single Illumina TruSeq Nano DNA library was constructed and sequenced on a HiSeq 4000 in paired-end mode, generating ~ 310 M paired-end 150-nt reads (Table 1, Data file 2, Data set 16, [5, 22]). These reads were used to estimate heterozygosity using Genomescope (v2.0) [23] using the diploid model in which the observed k-mer distribution did not match the expected model (Table 1, Data file 3, [5]). Estimation of ploidy using Smudgeplot [23] suggested Munstead was a tetraploid genome as shown by the allele distribution (Table 1, Data file 3, [5]). Next, a single SQK-LSK108 and 14 SQK-LSK109 ONT Ligation Sequencing libraries were constructed and sequenced on five FLO-MIN106 and 10 FLO-MIN106 Rev D ONT flow cells (Table 1, Data file 2, Data sets 1–15, [5, 7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]). ONT genomic DNA reads were base called using Guppy (v3.4.1) and reads less than 10 kb removed. The final dataset used for assembly included 2,953,837 reads (80.4 Gb), providing an estimated 89 × coverage of the genome. Reads were assembled using Canu (v1.9) [24] with the options: minOverlapLength = 4000 and genomeSize = 980 m. The contigs were error corrected using two rounds of Racon (v1.4.10) [25] followed by two rounds of Medaka (v0.11.5) [26] and two rounds of Pilon (v1.23) [27]. Due to the heterozygosity of the Munstead genome, haplotigs were removed through two rounds of purge_dups (v1.0.0) [28]. Hi-C Illumina HiSeq 4000 paired-end 150-nt reads were generated from two Dovetail Hi-C [29] libraries constructed from immature leaves (Table 1, Data files 2 & 4, Data sets 17 & 18, [5, 30, 31]) and used to scaffold the contigs into 25 chromosomes using Juicer (v1.6) [32] and 3D-DNA (git commit: 529ccf4) [33]. To screen the final contigs for contamination, contigs were split into 10-kb windows and searched against the NCBI nt database using Centrifuge (v1.0.4-beta; [34]); no full contigs or regions were identified as contaminants. The final assembly is 795,075,733 bp with an N50 scaffold length of 31,371,815 bp (Table 1, Data files 5 & 6, Data set 19, [5, 35]). Estimation of k-mer representation in the assembly using KAT (v3.4.1) [36] revealed nearly complete purging of haplotigs (Table 1, Data file 7, [5]) although k-mers not represented in the assembly suggest that lavender is a diploidized autopolyploid in which a subset of the alleles are heterozygous. Benchmarking Universal Single Copy Orthologs (BUSCO, v5.4.3) [37] analysis of the genome assembly revealed 98.0% complete BUSCOs and 68.3% complete and duplicated BUSCOs (Table 1, Data file 8, [5]) indicating a near-complete genic space consistent with a polyploid ancestry and recent diploidization of lavender.

Table 1 Overview of data files and data sets used in this study

Full size table

A custom repeat library was created using RepeatModeler (v2.0.1) [42, 43] and used to mask the genome (65.2% masked, Table 1, Data file 9, [5]) prior to annotation with RepeatMasker (v4.1.0) [44]. To provide empirical transcript evidence for annotation, RNA from three plants was isolated from a tissue panel (immature leaf, mature leaf, inflorescence, and stem) using the hot phenol extraction method as described previously [45]. Illumina TruSeq Stranded mRNA libraries were constructed and sequenced on an Illumina HiSeq 4000, generating 150-nt paired-end reads (Table 1, Data file 2, Data sets 20–24, [5, 38,39,40,41]). Reads were cleaned of adaptors and low-quality sequences using Cutadapt (v2.9) [46] and then aligned to the genome assembly with HISAT2 (v2.2.0) [47] and assembled using Stringtie (v2.1.1) [48] to generate genome-guided transcript assemblies. Ab initio gene models were created using the BRAKER2 (v2.1.5) [49] pipeline using the RNA-seq alignments as hints and then refined using PASA (v2.4.1) [50, 51]. Functional annotation was assigned to the working gene models using matches to the Arabidopsis predicted proteome (TAIR10) [52], Pfam domains [53], and SwissProt plant protein sequences as described previously [43]. High-confidence models within the working model set were defined based on the presence of a Pfam domain and/or gene expression data as defined as a transcript per million greater than zero. Representative gene models were defined as the gene model with the longest coding sequence at a locus. The final working gene set is composed of 68,432 genes and 98,924 gene models (Table 1, Data file 10, [5]). The high-confidence set is composed of 58,702 genes and 88,528 gene models (Table 1, Data file 10, [5]). BUSCO analyses of the gene models revealed > 95% complete BUSCOs (Table 1, Data file 8, [5]) in the working, high-confidence, and high-confidence representative gene model sets including high-quality annotation of the ‘Munstead’ genome.

Limitations

The estimated genome size of ‘Munstead’ is greater than the final, error-corrected assembly, suggesting that we are missing portions of the genome. Due to sequence complexity, it is likely that highly repetitive sequences within the centromeric and pericentromeric regions are under-represented in the assembly. K-mer estimations of heterozygosity suggest the ‘Munstead’ genome is heterozygous, which is consistent with the presence of haplotigs in the initial Canu assembly. By purging duplicates to generate a haploid assembly, we may have removed structural variants such as diverged paralogs or presence/absence variants from the final assembly. To prevent over-annotation of pseudogenes, our annotation pipeline is dependent on empirical support and/or similarity to annotation proteins and Pfam domains. Thus, it is possible that genes shorter than our minimum length (50 amino acids) and/or that lack expression or protein evidence are not present in our annotated gene sets.

Availability of data and materials

The data described in this Data note can be freely and openly accessed at the National Center for Biotechnology Information Short Read Archive under BioProject ID PRJNA762277 (https://identifiers.org/bioproject:PRJNA762277; [35]). The assembled genome is available in GenBank under the accession JAPVEC000000000 (https://identifiers.org/assembly:GCA_028984105.1; [35]) and on Figshare [5]. A summary of data sets available on Figshare [5] is available in Table 1. A voucher specimen is available at the Michigan State University Herbarium (MSC0273865).

Abbreviations

BUSCO:: Benchmarking Universal Single Copy Orthologs
ONT:: Oxford Nanopore Technologies
PASA:: Program to Assemble Spliced Alignments

References

Malli RPN, Adal AM, Sarker LS, Liang P, Mahmoud SS. De novo sequencing of the Lavandula angustifolia genome reveals highly duplicated and optimized features for essential oil production. Planta. 2019;249:251–6.
Article CAS PubMed Google Scholar
Li J, Wang Y, Dong Y, Zhang W, Wang D, Bai H, et al. The chromosome-based lavender genome provides new insights into lamiaceae evolution and terpenoid biosynthesis. Hortic Res. 2021;8:53.
Article PubMed PubMed Central Google Scholar
Li J, Li H, Wang Y, Zhang W, Wang D, Dong Y, et al. Decoupling subgenomes within hybrid lavandin provide new insights into speciation and monoterpenoid diversification of Lavandula. Plant Biotechnol J. 2023;21(10):2084–99.
Article CAS PubMed PubMed Central Google Scholar
Braz GT, He L, Zhao H, Zhang T, Semrau K, Rouillard JM, et al. Comparative oligo-FISH mapping: an efficient and powerful methodology to reveal karyotypic and chromosomal evolution. Genetics. 2018;208:513–23.
Article CAS PubMed Google Scholar
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Chromosome-scale genome assembly of the ‘Munstead’ cultivar of Lavandula angustifolia. Data files and data sets associated with “Chromosome-scale genome assembly of the ‘Munstead’ cultivar of Lavandula angustifolia.” 2023. https://doi.org/10.6084/m9.figshare.23982972.v3. Accessed 05 Dec 2023
Vaillancourt B, Buell CR. High molecular weight DNA isolation method from diverse plant species for use with Oxford Nanopore sequencing. BioRxiv. 2019; 783159; doi: https://doi.org/10.1101/783159.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929008. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929008. 2023. https://doi.org/10.6084/m9.figshare.23982972.v3. Accessed 05 Dec 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929007. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929007. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15929007. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929001. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929001. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15929001. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929000. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929000. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15929000. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928999. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928999. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15928999. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928998. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928998. 2023. https://identifiers.org/ncbi/insdc.sra: SRR15928998. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928997. Oxford Nanopore Technologies High molecular weight genomic DNA, SSRR15928997. 2023. https://identifiers.org/ncbi/insdc.sra: SSRR15928997. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928996. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928996. 2023. https://identifiers.org/ncbi/insdc.sra: SRR15928996. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928995. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928995. 2023. https://identifiers.org/ncbi/insdc.sra: SRR15928995. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928994. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15928994. 2023. https://identifiers.org/ncbi/insdc.sra: SRR15928994. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929006. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929006. 2023. https://identifiers.org/ncbi/insdc.sra: SRR15929006. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929005. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929005. 2023. https://identifiers.org/ncbi/insdc.sra: SRR15929005. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929004. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929004. 2023. https://identifiers.org/ncbi/insdc.sra: SRR15929004. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929003. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929003. 2023. https://identifiers.org/ncbi/insdc.sra: SRR15929003. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929002. Oxford Nanopore Technologies High molecular weight genomic DNA, SRR15929002. 2023. https://identifiers.org/ncbi/insdc.sra: SRR15929002. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Illumina WGS DNA, SRR15915200. Illumina WGS DNA, SRR15915200. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15915200. Accessed 21 Aug 2023.
Ranallo-Benavidez TR, Jaron KS, Schatz MC. GenomeScope 20 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun. 2020;11:1432.
Article CAS PubMed PubMed Central Google Scholar
Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27:722–36.
Article CAS PubMed PubMed Central Google Scholar
Vaser R, Sovic I, Nagarajan N, Sikic M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017;27:737–46.
Article CAS PubMed PubMed Central Google Scholar
Medaka tool for Oxford Nanopore Sequences. https://nanoporetech.github.io/medaka/index.html. Accessed 2020/9.
Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9:e112963.
Article PubMed PubMed Central Google Scholar
purge_dups. https://github.com/dfguan/purge_dups. Accessed Oct 2022.
Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009;326:289–93.
Article CAS PubMed PubMed Central Google Scholar
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Illumina Hi-C DNA, SRR15931069. Illumina Hi-C DNA, SRR15931069. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15931069. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Illumina Hi-C DNA, SRR15931068. Illumina Hi-C DNA, SRR15931068. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15931068. Accessed 21 Aug 2023.
Durand NC, Shamim MS, Machol I, Rao SSP, Huntley MH, Lander ES, et al. Juicer provides a one-click system for analyzing loop-resolution hi-C experiments. Cell Syst. 2016;3:95–8.
Article CAS PubMed PubMed Central Google Scholar
Dudchenko O, Batra SS, Omer AD, Nyquist SK, Hoeger M, Durand NC, et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 2017;356:92–5.
Article CAS PubMed PubMed Central Google Scholar
Kim D, Song L, Breitwieser FP, Salzberg SL. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Res. 2016;26:1721–9.
Article CAS PubMed PubMed Central Google Scholar
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Genome assembly of Lavandula angustifolia fasta file. Genome assembly of Lavandula angustifolia fasta file. 2023. https://identifiers.org/assembly/GCA_028984105. Accessed 21 Aug 2023.
Mapleson D, Garcia Accinelli G, Kettleborough G, Wright J, Clavijo BJ. KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies. Bioinformatics. 2017;33:574–6.
Article CAS PubMed Google Scholar
Waterhouse RM, Seppey M, Simão FA, Manni M, Ioannidis P, Klioutchnikov G, et al. BUSCO applications from quality assessments to gene prediction and phylogenomics. Mol Biol Evol. 2018;35:543–8.
Article CAS PubMed Google Scholar
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Illumina RNA-seq: RNA-seq-mature leaf, SRR15915199. Illumina RNA-seq: RNA-seq-mature leaf, SRR15915199. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15915199. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Illumina RNA-seq: immature leaf, SRR15915191. Illumina RNA-seq: immature leaf, SRR15915191. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15915191. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Illumina RNA-seq: inflorescence , SRR15915190. Illumina RNA-seq: inflorescence , SRR15915190. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15915190. Accessed 21 Aug 2023.
Hamilton JP, Vaillancourt B, Wood JC, Wang H, Jiang J, Soltis DE, et al. Illumina RNA-seq: stem, SRR15915189. Illumina RNA-seq: stem, SRR15915189. 2023. https://identifiers.org/ncbi/insdc.sra:SRR15915189. Accessed 21 Aug 2023.
Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci U S A. 2020;117:9451–7.
Article CAS PubMed PubMed Central Google Scholar
Pham GM, Hamilton JP, Wood JC, Burke JT, Zhao H, Vaillancourt B, et al. Construction of a chromosome-scale long-read reference genome assembly for potato. Gigascience. 2020;9:giaa100.
Article PubMed PubMed Central Google Scholar
Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2004;Chapter 4:Unit 4 10.
Davidson RM, Hansey CN, Gowda M. Utility of RNA sequencing for analysis of maize reproductive transcriptomes. Plant Genome. 2011;4:191–203.
Article CAS Google Scholar
Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2011;17:10–2.
Article Google Scholar
Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37:907–15.
Article CAS PubMed PubMed Central Google Scholar
Kovaka S, Zimin AV, Pertea GM, Razaghi R, Salzberg SL, Pertea M. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 2019;20:278.
Article CAS PubMed PubMed Central Google Scholar
Hoff KJ, Lomsadze A, Borodovsky M, Stanke M. Whole-Genome Annotation with BRAKER. In: Kollmar M, editor. Gene Prediction: Methods and Protocols. New York: Springer, New York; 2019. p. 65–95.
Google Scholar
Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK Jr, Hannick LI, et al. Improving the arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003;31:5654–66.
Article CAS PubMed PubMed Central Google Scholar
Campbell MA, Haas BJ, Hamilton JP, Mount SM, Buell CR. Comprehensive analysis of alternative splicing in rice and comparative analyses with arabidopsis. BMC Genomics. 2006;7:327.
Article PubMed PubMed Central Google Scholar
Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, et al. The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 2012;40 Database issue:D1202-10.
Article Google Scholar
El-Gebali S, Mistry J, Bateman A, Eddy SR, Luciani A, Potter SC, et al. The Pfam protein families database in 2019. Nucleic Acids Res. 2019;47:D427–32.
Article CAS PubMed Google Scholar

Download references

Acknowledgements

We acknowledge the sequencing performed at the Michigan State University Research Technology Support Facility.

Funding

Funding for this work was provided via grants to CRB, DES, and PSS from the National Science Foundation (IOS-1444499), the Georgia Research Alliance (CRB), Georgia Seed Development (CRB), and the University of Georgia (CRB). The funders had no role in the design, execution, interpretation, or written summary of this study.

Author information

Authors and Affiliations

Center for Applied Genetic Technologies, University of Georgia, Athens, GA, USA
John P. Hamilton, Brieanne Vaillancourt, Joshua C. Wood & C. Robin Buell
Department of Crop and Soil Sciences, University of Georgia, Athens, GA, USA
John P. Hamilton & C. Robin Buell
Department of Plant Biology, Michigan State University, East Lansing, MI, USA
Haiyan Wang & Jiming Jiang
Department of Horticulture, Michigan State University, East Lansing, MI, USA
Jiming Jiang
Department of Biology, University of Florida, Gainesville, FL, USA
Douglas E. Soltis & Pamela S. Soltis
Museum of Natural History, University of Florida, Gainesville, FL, USA
Douglas E. Soltis & Pamela S. Soltis
Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, GA, USA
C. Robin Buell

Authors

John P. Hamilton
View author publications
You can also search for this author in PubMed Google Scholar
Brieanne Vaillancourt
View author publications
You can also search for this author in PubMed Google Scholar
Joshua C. Wood
View author publications
You can also search for this author in PubMed Google Scholar
Haiyan Wang
View author publications
You can also search for this author in PubMed Google Scholar
Jiming Jiang
View author publications
You can also search for this author in PubMed Google Scholar
Douglas E. Soltis
View author publications
You can also search for this author in PubMed Google Scholar
C. Robin Buell
View author publications
You can also search for this author in PubMed Google Scholar
Pamela S. Soltis
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

BV and JCW generated sequence, performed quality assessments, and performed data management. JPH assembled and annotated the genome. HW and JJ performed chromosome counting. CRB, JPH, DES, PES, and BV wrote the manuscript. CRB, DES, and PES conceived of the study and obtained project funding. All authors approved the manuscript.

Corresponding authors

Correspondence to Douglas E. Soltis, C. Robin Buell or Pamela S. Soltis.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Hamilton, J.P., Vaillancourt, B., Wood, J.C. et al. Chromosome-scale genome assembly of the ‘Munstead’ cultivar of Lavandula angustifolia. BMC Genom Data 24, 75 (2023). https://doi.org/10.1186/s12863-023-01181-y

Download citation

Received: 01 September 2023
Accepted: 06 December 2023
Published: 13 December 2023
DOI: https://doi.org/10.1186/s12863-023-01181-y

Chromosome-scale genome assembly of the ‘Munstead’ cultivar of Lavandula angustifolia

Abstract

Objectives

Data description

Objective

Data description

Limitations

Availability of data and materials

Abbreviations

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Chromosome-scale genome assembly of the ‘Munstead’ cultivar of Lavandula angustifolia

Abstract

Objectives

Data description

Objective

Data description

Limitations

Availability of data and materials

Abbreviations

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation