Chromosome-level genome assembly of ridgetail white shrimp Exopalaemon carinicauda

Wang, Jiajia; Lv, Jianjian; Shi, Miao; Ge, Qianqian; Wang, Qiong; He, Yuying; Li, Jian; Li, Jitao

doi:10.1038/s41597-024-03423-9

Data Descriptor
Open access
Published: 04 June 2024

Volume 11, article number 576, (2024)

Scientific Data

Jiajia Wang (ORCID: orcid.org/0000-0003-2292-6528)^1,2,na1,
Jianjian Lv^1,2,na1,
Miao Shi^3,na1,
Qianqian Ge²,
Qiong Wang^1,2,
Yuying He^1,2,
Jian Li^1,2 &
Jitao Li^1,2

Abstract

Exopalaemon carinicauda, a eurythermal and euryhaline shrimp, contributes one third of the total biomass production of polyculture ponds in eastern China and is considered a potentially ideal experimental animal for crustacean research. We generated a high-quality chromosome-level genome assembly of E. carinicauda by combining PacBio HiFi and Hi-C sequencing data. The total assembly size was 5.86 Gb, with a contig N50 of 235.52 kb and a scaffold N50 of 138.24 Mb. Approximately 95.29% of the assembled sequences were anchored onto 45 pseudochromosomes. BUSCO analysis revealed that 92.89% of 1,013 single-copy genes were highly conserved orthologs. A total of 44,288 protein-coding genes were predicted, of which 70.53% were functionally annotated. Given its high heterozygosity (2.62%) and large proportion of repeat sequences (71.49%), this genome is among the most complex to have been assembled. This chromosome-scale genome will be a valuable resource for future molecular breeding and functional genomics research on E. carinicauda.

Background & Summary

The family Palaemonidae, which includes more than 1400 species in 181 genera, is the largest family of the order Decapoda¹. Animals from this family are found in marine and freshwater environments in tropical to temperate regions worldwide. The family includes several shrimps of high economic value, such as Macrobrachium rosenbergii, Macrobrachium nipponense and Exopalaemon carinicauda. The ridgetail white shrimp E. carinicauda is a eurythermal and euryhaline shrimp distributed over a wide geographical area throughout tropical, subtropical, and temperate coastal waters^2,3. It tolerates a wide range of environmental extremes: it has a broad salinity tolerance of 2–44 and can survive in freshwater after domestication⁴, and it can inhabit waters at temperatures as low as −3 °C and as high as 39 °C^5,6. As one of the most commercially valuable pond-raised shrimp species, E. carinicauda contributes one third of the total production of polyculture ponds in eastern China⁷.

In addition to its economic value in aquaculture, E. carinicauda is considered a potentially ideal experimental animal for crustacean research owing to its moderate size, transparent body (Fig. 1), short reproductive cycle, large eggs (diameters ranging from 0.57 to 1.08 mm) and ease of culturing and breeding in captivity⁸. CRISPR/Cas9-mediated genome editing has been successfully applied in E. carinicauda, the first time that gene editing has been achieved in a decapod crustacean^9,10. However, the absence of genomic data limits the further application of gene editing in studying the molecular biology, cytobiology and genetics of crustaceans. A high-quality reference genome is therefore essential for understanding the molecular biology, genetics, breeding, ecology and adaptation of E. carinicauda.

A fragmented draft genome of E. carinicauda, containing 13,897,062 scaffolds (contig N50, 263 bp), has previously been assembled from Illumina short reads¹¹. Genome survey analysis indicated that E. carinicauda has a relatively large genome of 5.73 Gb, at least twice as large as that of many decapod shrimps^12,13,14. In this study, an improved chromosome-level genome of E. carinicauda was assembled using the PacBio sequencing platform, Illumina paired-end sequencing, and high-throughput chromatin conformation capture (Hi-C) technology. Our previous studies suggested that the E. carinicauda karyotype is 2n = 90¹⁵, similar to that of other Exopalaemon species¹⁶. The final genome size was 5.86 Gb, with a contig N50 length of 235.52 kb and a scaffold N50 length of 138.24 Mb. A total of 44,288 protein-coding genes were predicted in the genome of E. carinicauda. This chromosome-level assembly provides a valuable genomic resource for genetic improvement and for understanding the functional genes and molecular mechanisms of E. carinicauda.

Methods

Animal materials and genome sequencing

A female shrimp was collected from Rizhao Haichen Aquatic Co., Ltd. Muscle tissue was sampled for DNA extraction and library construction. Total genomic DNA was extracted using a cetyltrimethylammonium bromide method. For the genome survey, a 350 bp paired-end library was constructed according to the manufacturer’s instructions (Illumina, San Diego, CA, USA) and sequenced on an Illumina NovaSeq 6000 platform. A total of 276.18 Gb of raw data were obtained, representing approximately 54× coverage of the estimated genome (Table 1).

Table 1 Genome assembly statistics of E. carinicauda.

For PacBio sequencing, a 15 kb library was constructed using the SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences, Menlo Park, CA, USA) and sequenced in circular consensus sequencing (CCS) mode using a single 8 M SMRT Cell on the PacBio Sequel II platform (Pacific Biosciences). After filtering out low-quality reads and adapter sequences, 3,636.91 Gb of PacBio subreads were obtained, representing approximately 708× sequence coverage based on the estimated genome size (Table 1). Finally, 203.27 Gb of CCS reads were generated using SMRTLink 9.0, representing approximately 40× coverage of the estimated genome.

For the construction of the Hi-C library, DNA was fixed with 4% formaldehyde solution and digested with the 4-cutter restriction enzyme MboI. The digested fragments were labeled with biotin-14-dCTP, then the cross-linked fragments were subjected to blunt-end ligation. The library was sequenced on the Illumina NovaSeq 6000 platform, and approximately 552.65 Gb of Hi-C clean reads were generated, covering approximately 108 × of the estimated genome (Table 1).
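The coverage figures quoted for the Illumina, CCS and Hi-C datasets can be reproduced as data yield divided by genome size. The sketch below assumes that the "estimated genome" is the 5.12 Gb k-mer estimate reported in the Genome survey section; the function and variable names are illustrative and do not come from the study.

```python
# Fold coverage = sequenced bases / estimated genome size.
# Assumption: the denominator is the 5.12 Gb k-mer estimate from the genome survey.
# The helper name `coverage` is hypothetical.

ESTIMATED_GENOME_GB = 5.12


def coverage(yield_gb: float, genome_gb: float = ESTIMATED_GENOME_GB) -> float:
    """Return fold coverage for a dataset of `yield_gb` gigabases."""
    if genome_gb <= 0:
        raise ValueError("genome size must be positive")
    return yield_gb / genome_gb


# Yields (Gb) as reported in the text.
datasets = {
    "Illumina survey": 276.18,
    "PacBio CCS": 203.27,
    "Hi-C": 552.65,
}

for name, gb in datasets.items():
    print(f"{name}: {coverage(gb):.1f}x")
```

Rounded to whole numbers, these ratios agree with the approximately 54×, 40× and 108× stated in the text.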

Genome survey

The genome size and heterozygosity were estimated using the k-mer method before genome assembly¹⁷. The k-mer distribution was calculated from Illumina short reads using Jellyfish with k = 17¹⁸. The heterozygosity ratio was estimated using GenomeScope¹⁹ (https://github.com/schatzlab/genomescope). Based on the 17-mer analysis, the genome size of E. carinicauda was estimated at approximately 5.12 Gb, with 84.74% repetitive sequences and a heterozygosity of 2.62% (Fig. 2), indicating a complex genome.
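As a rough illustration of the k-mer method, genome size can be approximated as the total number of k-mers divided by the depth of the main peak in the k-mer histogram. The sketch below is a toy, pure-Python version of that idea; it is not how Jellyfish or GenomeScope work internally (GenomeScope fits a mixture model that also handles heterozygosity and repeats), and the function names and the `min_depth` cutoff are assumptions made for the example.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer of length k across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(kmer_counts, min_depth=2):
    """Estimate genome size as total k-mers / main peak depth.

    k-mers seen fewer than `min_depth` times are treated as likely
    sequencing errors and ignored (an assumed, simplistic cutoff).
    """
    # histogram: depth -> number of distinct k-mers observed at that depth
    histogram = Counter(kmer_counts.values())
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth


# Toy data: a 24 bp "genome" with no repeated 5-mers, sequenced four times.
toy_genome = "ACGTTGCAAGCTTAGGCATCCGAT"
toy_reads = [toy_genome] * 4
toy_counts = count_kmers(toy_reads, k=5)
print(estimate_genome_size(toy_counts))
```

In the toy case every 5-mer is seen exactly four times, so the estimate equals the number of distinct 5-mers in the sequence. Real data need a fitted model because heterozygosity and repeats produce additional peaks.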

Chromosome-level genome assembly

The initial genome was assembled from HiFi reads using Peregrine (v0.1.6.1) (https://github.com/cschin/peregrine), which applies a modified “best overlap graph” strategy to build contigs from the overlap graph. Redundant overlapping contigs were removed from the assembly using Purge_dups (https://github.com/dfguan/purge_dups). De novo assembly of the PacBio sequences yielded a preliminary assembly of 5.86 Gb, containing 47,421 contigs with a contig N50 length of 235.28 kb, a maximum length of 3,038,493 bp and a GC content of 34.79% (Table 1).
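For readers unfamiliar with the N50 statistic used throughout this descriptor, the following minimal sketch computes it from a list of sequence lengths: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name and the toy lengths are illustrative only.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Toy example (lengths in kb); these are not values from the assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

The same function applies to scaffold lengths, which is how a scaffold N50 such as the one reported after Hi-C scaffolding would be obtained.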

Chromosome-level assembly of E. carinicauda was conducted using Hi-C data. The filtered Hi-C reads were aligned to the initial draft genome using Juicer (v1.6.2)²⁰, and only uniquely mapped, valid paired-end reads were used for scaffolding with 3D-DNA (v180922)²¹. Juicebox (v1.9.8) was used to manually reorder the scaffolds according to the chromosomal interaction heatmap²², yielding a more precise chromosome-level genome of E. carinicauda. Contact maps were visualized using HiCExplorer (v3.3)²³. The chromosome number of 90 was determined from karyological observations of E. carinicauda chromosomes in our previous study¹⁵. The contigs were ultimately clustered into 45 pseudochromosomes, with a scaffold N50 length of 138.24 Mb. The total length of the 45 pseudochromosomes was 5.58 Gb (95.29% of the assembly) (Fig. 3a,b), with individual lengths ranging from 46.25 Mb to 338.48 Mb. The unplaced scaffolds totaled 275.86 Mb (Table 2).

Table 2 Statistics of cluster number and length of single chromosome.

The quality of the final chromosome-level genome assembly was assessed using three methods. First, we aligned the Illumina DNA short reads obtained in our previous study to the assembled genome using BWA (v0.7.15)²⁴ and found that approximately 99.00% of the reads mapped to the assembly. Second, read depth and GC content in 10 kb windows were used to check for significant GC bias or sample contamination; the results showed that the assembled genome was free of contamination (Fig. 4). Finally, assembly completeness was evaluated using Benchmarking Universal Single-Copy Orthologs (BUSCO, v5.2.2) with the arthropoda_odb10 database²⁵. The results showed that 92.89% of the 1013 single-copy genes were highly conserved orthologs (88.75% complete, 4.15% fragmented, and 7.11% missing) (Table 3).
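The GC half of the windowed depth/GC check can be sketched as follows. This is only an illustration of computing GC fraction in fixed, non-overlapping windows; the study does not specify its tooling for this step, the function name is hypothetical, and the read-depth half would additionally require the read alignments.

```python
def gc_windows(sequence, window=10_000):
    """Return the GC fraction of each non-overlapping window.

    Ambiguous bases (for example N) are excluded from the denominator.
    Windows without any A/C/G/T yield None.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = sequence.upper()
    fractions = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        at = chunk.count("A") + chunk.count("T")
        fractions.append(gc / (gc + at) if gc + at else None)
    return fractions


# Toy sequence with a deliberately tiny window so the output is easy to follow.
print(gc_windows("GGCCAATTACGTNNNN", window=4))
```

Plotting each window's GC fraction against its mean read depth gives the kind of scatter used to spot contamination, where foreign sequence tends to form a separate cluster.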

Table 3 Universal single copy ortholog (BUSCO) assessment of E. carinicauda.

Compared with the published genome of E. carinicauda¹¹, our assembly shows substantially improved contiguity and completeness. The contig N50 increased from 263 bp to 235,277 bp, a nearly 900-fold increase, and the scaffold N50 increased from 816 bp to 138,242,434 bp. Meanwhile, the proportion of complete orthologues increased from 43.44% to 88.75% according to the BUSCO assessment.
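The fold improvements follow directly from the N50 values quoted in this comparison. A minimal check, using a hypothetical helper name:

```python
def fold_change(old: float, new: float) -> float:
    """Return how many times larger `new` is than `old`."""
    if old <= 0:
        raise ValueError("old value must be positive")
    return new / old


# N50 values (bp) for the earlier draft and the present assembly, from the text.
contig_fold = fold_change(263, 235_277)
scaffold_fold = fold_change(816, 138_242_434)

print(f"contig N50: {contig_fold:.0f}-fold")
print(f"scaffold N50: {scaffold_fold:.0f}-fold")
```

The contig ratio is just under 900, matching the "nearly 900-fold" statement; the scaffold ratio is several orders of magnitude larger because the earlier draft had no chromosome-scale scaffolding.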

Repetitive and non-coding gene prediction

To detect repeat elements in the E. carinicauda genome, de novo and homology-based strategies were combined. For de novo annotation, miniature inverted-repeat transposable elements (MITEs) were identified using MITE-Hunter (v1.0)²⁶. Long terminal repeat (LTR) retrotransposons were detected using LTRharvest²⁷ and LTR_Finder (v1.07)²⁸, and the predictions of these two programs were integrated using LTR_retriever (v2.8.2)²⁹. For homology-based annotation, RepeatMasker (v4.1.0)³⁰ was used to search the E. carinicauda genome against the RepBase database (http://www.girinst.org/repbase). RepeatMasker was then used to mask the repetitive sequences identified by the above methods, and RepeatModeler (v2.0)³¹ was used for de novo identification of additional repetitive sequences in the repeat-masked genome. Ultimately, we identified approximately 4.19 Gb of repetitive sequences, accounting for approximately 71.49% of the assembled genome, among which 9.97% were tandem repeat sequences. Among these repetitive sequences, LTRs (42.52%) accounted for the highest proportion of the assembly, followed by DNA transposons (10.81%) and LINEs (3.33%) (Table 4).

Table 4 Repeat components in E. carinicauda genome.

Five types of non-coding RNA (ncRNA) were identified in the genome of E. carinicauda: microRNAs (miRNAs), transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs) and small nucleolar RNAs (snoRNAs). tRNAs were predicted using tRNAscan-SE (v2.0)³². The other types of ncRNA were detected by alignment to the Rfam database³³ using Infernal (v1.1.3)³⁴. In total, 10,249 ncRNAs were annotated, including 3,702 rRNAs, 386 miRNAs, 5,811 tRNAs, 269 snRNAs, and 81 snoRNAs (Table 5).
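The per-class ncRNA counts can be tallied to confirm the reported total and to express each class as a share of all annotated ncRNAs. The counts are taken from the text; the variable names are illustrative.

```python
# ncRNA counts per class, as reported in the text.
NCRNA_COUNTS = {
    "rRNA": 3702,
    "miRNA": 386,
    "tRNA": 5811,
    "snRNA": 269,
    "snoRNA": 81,
}
REPORTED_TOTAL = 10249

total = sum(NCRNA_COUNTS.values())
# Percentage share of each class among all annotated ncRNAs.
shares = {name: 100 * count / total for name, count in NCRNA_COUNTS.items()}

print(f"total: {total} (reported: {REPORTED_TOTAL})")
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.1f}%")
```

The class counts sum to the reported total, with tRNAs forming the largest class.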

Table 5 Classification of ncRNAs in E. carinicauda genome.

Gene prediction and annotation

We predicted protein-coding genes in the E. carinicauda genome assembly using a comprehensive strategy that combined ab initio prediction, protein-based homology searches, and RNA sequencing evidence. For ab initio prediction, AUGUSTUS (v3.2.2)³⁵, SNAP (v6.0)³⁶, GlimmerHMM (v3.0.4)³⁷ and GeneMark-ET³⁸ were used to predict gene structures in the repeat-masked genome. For protein-based homology prediction, the protein sequences of Daphnia pulex (GCA_021134715.1), Procambarus virginalis (GCA_020271785.1), Fenneropenaeus chinensis (GCA_019202785.2), Penaeus japonicus (GCA_017312705.1), Penaeus monodon (GCA_015228065.1), Litopenaeus vannamei (GCA_003789085.1), Portunus trituberculatus (GCA_017591435.1) and M. nipponense (GCA_015104395.1) were downloaded from the NCBI database and aligned against the E. carinicauda genome using GeMoMa (v1.7.1)³⁹. Furthermore, RNA-seq data from different tissues and embryonic development stages (PRJNA594425, PRJNA746617, PRJNA756619, PRJNA881755, and PRJNA881756) were mapped to the genome using HISAT2 (v2.1.0)⁴⁰. The full-length transcripts (PRJNA594425) from our previous study⁴¹ were assembled using Cufflinks (v2.1.1)⁴², and open reading frames were then predicted using PASA (v20140417)⁴³. EVidenceModeler⁴⁴ was used to integrate the gene predictions from these three approaches. Finally, 44,288 high-quality protein-coding genes were predicted, with an average gene length of 28,448 bp, an average coding length of 1,424 bp and an average of 6.09 coding exons per gene.

These genes were functionally annotated using BLAST against the NR, SwissProt, eggNOG, InterPro, GO and KEGG databases⁴⁵, and the annotation results from these databases were merged. In total, 70.53% of the predicted genes were assigned at least one functional annotation (Table 6).

Table 6 Statistical results of gene function annotation.

Data Records

All sequencing data have been uploaded to the NCBI SRA database. The Illumina sequencing data for the genome survey have been deposited in the NCBI Sequence Read Archive with accession number SRR27880589⁴⁶ under BioProject accession number PRJNA1070324.

The genomic PacBio sequencing data have been deposited in the NCBI Sequence Read Archive with accession numbers SRR27756800⁴⁷, SRR27756801⁴⁸, SRR27862044⁴⁹ and SRR27862045⁵⁰ under BioProject accession number PRJNA1070324.

The Hi-C sequencing data have been deposited in the NCBI Sequence Read Archive with accession numbers SRR27880535⁵¹, SRR27880536⁵², SRR27880537⁵³, SRR27880538⁵⁴, SRR27880539⁵⁵ and SRR27880540⁵⁶ under BioProject accession number PRJNA1073006.

The final chromosome-level assembled genome file has been uploaded to the GenBank database under the accession JAZBEV000000000⁵⁷.

Technical Validation

To evaluate the integrity and accuracy of the genome assembly, the completeness of the final assembly was assessed using BUSCO (v5.2.2) with the arthropoda_odb10 database²⁵. Of the 1013 single-copy genes, 92.89% were highly conserved orthologs (88.75% complete, 4.15% fragmented, and 7.11% missing). When the Illumina sequencing reads (PRJNA471201)³ were aligned to the genome using BWA (v0.7.15)²⁴, the read-mapping rate was 99.00%, indicating high mapping efficiency. Together, these results indicate that we obtained a high-quality genome of E. carinicauda.
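The BUSCO percentages relate to each other in a simple way: the share of orthologs detected is the sum of the complete and fragmented categories, or equivalently 100% minus the missing category. The sketch below checks the reported figures under that reading; the function name is hypothetical, and the small tolerance allows for rounding of the published percentages to two decimals.

```python
def busco_detected(complete_pct: float, fragmented_pct: float) -> float:
    """Return the percentage of BUSCO genes found as complete or fragmented."""
    return complete_pct + fragmented_pct


# Percentages as reported in the text.
COMPLETE, FRAGMENTED, MISSING = 88.75, 4.15, 7.11
REPORTED_DETECTED = 92.89

detected = busco_detected(COMPLETE, FRAGMENTED)
print(f"complete + fragmented: {detected:.2f}%")
print(f"100 - missing:         {100 - MISSING:.2f}%")
```

The two routes differ by 0.01 percentage points, which is consistent with independent rounding of each category.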

Code availability

No specific code was used in this study. The data analyses used standard bioinformatic tools specified in the methods.

References

1. World Register of Marine Species https://www.marinespecies.org (2024).
2. Zhang, Q., Zhang, C., Yu, Y. & Li, F. Analysis of genetic diversity and population structure of the ridgetail white prawn Exopalaemon carinicauda in China. Aquacult Rep. 27, 101369 (2022).
3. Li, J. et al. Genome survey and high-resolution backcross genetic linkage map construction of the ridgetail white prawn Exopalaemon carinicauda applications to QTL mapping of growth traits. BMC Genomics. 20, 598 (2019).
4. Ge, Q., Li, Z., Li, J., Wang, J. & Li, J. Effects of acute salinity stress on the survival and prophenoloxidase system of Exopalaemon carinicauda. Acta Oceanol Sin. 39, 57–64 (2020).
5. Wang, X., Yan, B., Ma, S. & Dong, S. Study on The Biology and Cultural Ecology of Exopalaemon carinicauda. Shandong Fisheries. 22, 21–24 (2005).
6. Huan, G. et al. Analysis to the Activities of Five Factors in Response to Temperature in Exopalaemon carinicauda. Journal of Huaihai Institute of Technology. 23, 72–75 (2014).
7. Zhang, Z. et al. Effects of adding EM bacteria and mechanical aeration on water quality, growth and antioxidant status of Meretrix meretrix and Exopalaemon carinicauda farmed in the clam–shrimp polyculture system. Aquac Res. 53, 1823–1832 (2022).
8. Gui, T. et al. CRISPR/Cas9-Mediated Genome Editing and Mutagenesis of EcChi4 in Exopalaemon carinicauda. G3 Genes Genom Genet. 6, 3757–3764 (2016).
9. Miao, M. et al. CRISPR/Cas9-mediated gene mutation of EcIAG leads to sex reversal in the male ridgetail white prawn Exopalaemon carinicauda. Front Endocrinol. 14, 1266641 (2023).
10. Gao, Y. et al. CRISPR/Cas9-mediated mutation on an insulin-like peptide encoding gene affects the growth of the ridgetail white prawn Exopalaemon carinicauda. Front Endocrinol. 13, 986491 (2022).
11. Yuan, J. et al. Genome Sequences of Marine Shrimp Exopalaemon carinicauda Holthuis Provide Insights into Genome Size Evolution of Caridea. Mar Drugs. 15, 213–230 (2017).
12. Uengwetwanit, T. et al. A chromosome-level assembly of the black tiger shrimp (Penaeus monodon) genome facilitates the identification of growth-associated genes. Mol Ecol Resour. 21, 1620–1640 (2021).
13. Wang, Q. et al. Improved genome assembly of Chinese shrimp (Fenneropenaeus chinensis) suggests adaptation to the environment during evolution and domestication. Mol Ecol Resour. 22, 334–344 (2022).
14. Zhang, X. et al. Penaeid shrimp genome provides insights into benthic adaptation and frequent molting. Nat Commun. 10, 356 (2019).
15. Li, Y., Liu, P., Li, J., Li, J. & Gao, B. The chromosome preparation and karyotype in the ridgetail white prawn Exopalaemon carinicauda. Journal of Dalian Ocean University. 27, 453–456 (2012).
16. Jiang, Q., Xie, S., Zhou, Q. & Lan, W. Chromosome Karyotype in Freshwater Prown Exopalaemon modestus. Fisheries Science. 27, 470–472 (2008).
17. Liu, B. et al. Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects (2013).
18. Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 27, 764–770 (2011).
19. Vurture, G. W. et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics. 33, 2202–2204 (2017).
20. Durand, N. C. et al. Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 3, 95–98 (2016).
21. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 356, 92–95 (2017).
22. Durand, N. C. et al. Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst. 3, 99–101 (2016).
23. Wolff, J. et al. Galaxy HiCExplorer 3: a web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization. Nucleic Acids Res. 48, W177–W184 (2020).
24. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25, 1754–1760 (2009).
25. Seppey, M., Manni, M. & Zdobnov, E. M. BUSCO: Assessing Genome Assembly and Annotation Completeness. Methods Mol Biol. 1962, 227–245 (2019).
26. Han, Y. & Wessler, S. R. MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences. Nucleic Acids Res. 38, e199 (2010).
27. Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics. 9, 18 (2008).
28. Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–268 (2007).
29. Ou, S. & Jiang, N. LTR_retriever: A Highly Accurate and Sensitive Program for Identification of Long Terminal Repeat Retrotransposons. Plant Physiol. 176, 1410–1422 (2018).
30. Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. Chapter 4, Unit 4.10 (2004).
31. Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci USA 117, 9451–9457 (2020).
32. Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997).
33. Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res. 33, D121–D124 (2005).
34. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 29, 2933–2935 (2013).
35. Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–439 (2006).
36. Korf, I. Gene finding in novel genomes. BMC Bioinformatics. 5, 59 (2004).
37. Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics. 20, 2878–2879 (2004).
38. Lomsadze, A., Burns, P. D. & Borodovsky, M. Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm. Nucleic Acids Res. 42, e119 (2014).
39. Keilwagen, J., Hartung, F. & Grau, J. GeMoMa: Homology-Based Gene Prediction Utilizing Intron Position Conservation and RNA-seq Data. Methods Mol Biol. 1962, 161–177 (2019).
40. Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 37, 907–915 (2019).
41. Shi, K. et al. Full-length transcriptome sequences of ridgetail white prawn Exopalaemon carinicauda provide insight into gene expression dynamics during thermal stress. Sci Total Environ. 747, 141238 (2020).
42. Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 28, 511–515 (2010).
43. Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003).
44. Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).
45. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J Mol Biol. 215, 403–410 (1990).
46. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27880589 (2024).
47. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27756800 (2024).
48. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27756801 (2024).
49. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27862044 (2024).
50. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27862045 (2024).
51. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27880535 (2024).
52. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27880536 (2024).
53. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27880537 (2024).
54. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27880538 (2024).
55. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27880539 (2024).
56. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR27880540 (2024).
57. NCBI GenBank https://identifiers.org/ncbi/insdc:JAZBEV000000000 (2024).

Acknowledgements

This research was funded by the National Key Research and Development Program of China (No. 2023YFD2401001), the National Natural Science Foundation of China (32072974), the China Agriculture Research System of MOF and MARA (CARS-48) and the Central Public-interest Scientific Institution Basal Research Fund, CAFS (2023TD50).

Author information

These authors contributed equally: Jiajia Wang, Jianjian Lv, Miao Shi.

Authors and Affiliations

State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, Shandong, 266071, China
Jiajia Wang, Jianjian Lv, Qiong Wang, Yuying He, Jian Li & Jitao Li
Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao Marine Science and Technology Center, Qingdao, Shandong, 266237, China
Jiajia Wang, Jianjian Lv, Qianqian Ge, Qiong Wang, Yuying He, Jian Li & Jitao Li
Berry Genomics Co., Ltd., Beijing, China
Miao Shi

Contributions

J.W., J.L. and J.L. (Jitao Li) conceived and designed the study. Q.G. and Q.W. prepared the material. J.W., J.L. (Jianjian Lv) and M.S. analyzed the data. J.W. and Y.H. prepared the results. J.W. drafted the manuscript. J.L. (Jianjian Lv) and J.L. (Jitao Li) edited and improved the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jitao Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wang, J., Lv, J., Shi, M. et al. Chromosome-level genome assembly of ridgetail white shrimp Exopalaemon carinicauda. Sci Data 11, 576 (2024). https://doi.org/10.1038/s41597-024-03423-9

Received: 04 March 2024
Accepted: 24 May 2024
Published: 04 June 2024
DOI: https://doi.org/10.1038/s41597-024-03423-9
Springer Nature Limited
