Chromosome-level genome assembly of the northern Pacific seastar Asterias amurensis

Wang, Yanlin; Wang, Yixin; Yang, Yujia; Ni, Gang; Li, Yulong; Chen, Muyan

doi:10.1038/s41597-023-02688-w


Data Descriptor
Open access
Published: 04 November 2023

Volume 10, article number 767, (2023)

Scientific Data


Yanlin Wang ORCID: orcid.org/0009-0000-9005-9708¹,
Yixin Wang¹,
Yujia Yang¹,
Gang Ni¹,
Yulong Li² &
…
Muyan Chen¹


Abstract

Asterias amurensis has attracted widespread concern because of its population outbreaks, which have impacted fisheries and aquaculture and disrupted local ecosystems. A high-quality reference genome is necessary to better investigate the mechanisms of outbreak and adaptive change. Combining PacBio HiFi and Hi-C sequencing data, we generated a chromosome-level A. amurensis genome with a size of 491.53 Mb. The contig N50 and scaffold N50 were 8.05 and 23.75 Mb, respectively. BUSCO analysis revealed a completeness score of 98.85%. A total of 16,531 protein-coding genes were predicted in the genome, of which 94.63% were functionally annotated. This high-quality genome assembly will provide a valuable genetic resource for future research on the mechanisms of population outbreaks and invasion ecology.


Background & Summary

Asterias amurensis (class: Asteroidea), also known as the northern Pacific seastar, is widely distributed in the northwest and northeast Pacific and is native to the coasts of Alaska¹, China², Japan³, Korea⁴, and Russia⁵. As a benthic echinoderm with a distinct evolutionary classification⁶, it reproduces not only sexually, with separate sexes (dioecy), but also asexually by arm regeneration^7,8. Females have high fecundity and can spawn ~20 million eggs annually³. The planktonic larval stage can last from seven weeks to several months, which enables larvae to spread rapidly in a suitable environment^9,10. A. amurensis occupies the highest trophic level among the benthic invertebrates as a voracious and efficient generalist predator¹¹, and it has been reported to impact a variety of infaunal taxa, especially commercial bivalves^12,13,14. It has even been associated with the decline of some fish species¹⁵.

In the early 1980s, the free-spawning starfish A. amurensis was first spotted in southeast Tasmania, Australia, possibly introduced from central Japan through ship ballast water³. Since its first detection, this starfish has established populations in a short period and gradually expanded to Victoria^16,17,18. As one of the most successful invasive species, A. amurensis has become a significant threat to native assemblages and commercial marine species and has damaged native ecosystems in Australia^13,19. Thus, this starfish was listed as one of the high-priority marine pests in Australia²⁰. Although its invasive range in Australia is limited so far²¹, A. amurensis will likely continue to expand owing to its high fecundity, wide environmental tolerance, and long larval duration²², and may even invade the Southern Ocean²³. However, owing to the lack of genomic information for A. amurensis, the genetic changes associated with invasive lineages remain unknown^16,24.

Periodic and massive outbreaks of A. amurensis populations have been reported in several countries, including Australia, China, and Japan. These outbreaks have significantly impacted fishery and mariculture grounds and destroyed the original ecological balance, leading to serious economic losses^25,26,27. Unfortunately, no effective biocontrol method for this pest has been reported to date. To provide early warning of possible outbreaks of A. amurensis, detection technologies have been developed that target rRNA²⁸ and the mitochondrial cytochrome c oxidase subunit I (COI) gene^21,29,30. However, the mechanism of aggregation and outbreak is complex and remains unclear. Relevant studies require the support of a high-quality genome assembly, which may help to identify species-specific factors associated with aggregating starfish³¹.

In the present study, a de novo chromosome-level A. amurensis genome assembly was generated using PacBio HiFi and Hi-C sequencing data. The final genome size was 491.53 Mb, with a scaffold N50 of 23.75 Mb. Using three approaches for gene structure annotation, we identified a total of 16,531 protein-coding genes, of which 15,643 were functionally annotated in at least one public database. A high-quality reference genome for A. amurensis will be a useful genomic resource to explore both the mechanism of population outbreak and the genetic basis underlying adaptive change during the invasion process. The A. amurensis genome will also be a noteworthy addition to the existing suite of Asteroidea genomes for future cell, developmental, and evolutionary biology research.

Methods

Sample collection

All samples used in this study were from a male adult A. amurensis collected by diving in Qingdao, Shandong Province, China (36°03′04″N, 120°21′26″E) in November 2022. Fresh gonad tissue from the base of the arm was excised and washed with phosphate buffered saline (PBS, 1X). It was then immediately frozen in liquid nitrogen and transferred to −80 °C for storage. High-quality DNA was extracted from the gonad using the DNeasy Blood & Tissue Kit (Qiagen, Germany) for long-read and short-read whole genome sequencing. To aid in structural annotation, nine tissues (gonad, body wall, madreporite, spine, mouth, stomach, muscle, podia, and eye spot) were used for transcriptome sequencing. All tissues were isolated separately with scissors and forceps and then treated in the same way as the gonad sample. Total RNA was extracted using the TRIzol reagent (Vazyme, China).

Sequencing

For long-read sequencing, high molecular weight genomic DNA (gDNA) was fragmented to approximately 15 kb to construct a PacBio HiFi library. The sequencing library was generated using the SMRTbell Express Template Prep kit 2.0 (Pacific Biosciences, USA), following the manufacturer’s recommendations, as described in a previous study³². The library was sequenced in circular consensus sequencing (CCS) mode on the PacBio Sequel II system using a single 8 M cell. After filtering out low-quality reads and adapter sequences, a total of 11.15 Gb of CCS data was obtained, with a mean read length of 12.51 kb (Table 1).

Table 1 Statistical analysis of sequencing reads from BGI, Illumina and PacBio.


For short-read whole genome sequencing, gDNA was fragmented to approximately 350 bp for library construction. The library was sequenced on the DNBSEQ-T7 platform to generate 150 bp paired-end (PE150) reads. Low-quality reads were filtered out, including reads shorter than 100 bp, reads containing >10% “N”, and reads containing >50% low-quality bases (Phred score ≤10). The resulting clean data totaled 112.58 Gb, which covered the genome at ~229× (Table 1).
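The three filtering criteria above can be written as a per-read predicate. The following is a minimal sketch, not the authors' pipeline: the paper does not name the filtering tool, so the function name `passes_filter`, its parameter names, and the Phred+33 quality encoding are all assumptions made for illustration.

```python
def passes_filter(seq: str, qual: str,
                  min_len: int = 100,
                  max_n_frac: float = 0.10,
                  max_lowq_frac: float = 0.50,
                  lowq_phred: int = 10,
                  phred_offset: int = 33) -> bool:
    """Return True if a read survives the three criteria described in the text.

    Hypothetical helper; assumes FASTQ qualities are Phred+33 encoded.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    # Criterion 1: discard reads shorter than 100 bp.
    if len(seq) < min_len:
        return False
    # Criterion 2: discard reads with more than 10% ambiguous bases.
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    # Criterion 3: discard reads in which more than 50% of bases have Phred <= 10.
    low_q = sum(1 for c in qual if ord(c) - phred_offset <= lowq_phred)
    if low_q / len(qual) > max_lowq_frac:
        return False
    return True
```

For paired-end data, a pair would typically be dropped if either mate fails; whether that was done here is not stated in the text.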

The chromosome conformation capture (Hi-C) technique was employed to assemble a chromosome-level genome. The fresh gonad was crosslinked using formaldehyde solution and digested with the four-cutter restriction enzyme DpnII. The ends of the restriction fragments were labeled with biotinylated nucleotides, and the ligated DNA was then sheared into fragments of 300 bp to 700 bp for Hi-C library construction. The resulting library was quantified by Q-PCR and sequenced on the DNBSEQ-T7 platform. After removing adapters and low-quality short reads, a total of 102.75 Gb (209.04× coverage) of clean data was generated, with Q20 = 97.32% and Q30 = 92.33% (Table 2).
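The fold-coverage figures quoted for the short-read and Hi-C data follow from dividing the clean data volume by the assembled genome size (491.53 Mb). A minimal sketch of that arithmetic, assuming 1 Gb = 1,000 Mb; the function name `fold_coverage` is hypothetical:

```python
def fold_coverage(total_bases_gb: float, genome_size_mb: float) -> float:
    """Sequencing depth = total sequenced bases / genome size (1 Gb = 1000 Mb assumed)."""
    return total_bases_gb * 1000.0 / genome_size_mb


GENOME_MB = 491.53  # final assembly size reported in the text

wgs = fold_coverage(112.58, GENOME_MB)  # short-read whole genome data
hic = fold_coverage(102.75, GENOME_MB)  # Hi-C data
print(f"WGS: {wgs:.2f}x, Hi-C: {hic:.2f}x")  # prints "WGS: 229.04x, Hi-C: 209.04x"
```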

Table 2 Statistical analysis of sequencing data from Hi-C.


For transcriptome sequencing, total RNA from nine tissues of the same starfish was extracted and pooled in equal amounts for cDNA library construction. The library was constructed with the NEBNext® Ultra™ RNA Library Prep Kit (NEB, USA) according to the manufacturer’s instructions and sequenced on the Illumina NovaSeq 6000 system, generating 13.47 Gb of clean data to support gene structure annotation.

Genome assembly

Based on the PacBio HiFi reads, Hifiasm (v0.18.4)³³ was applied for de novo assembly of primary contigs with default parameters. Haplotypic and heterozygous duplications were removed using purge_dups (v1.2.6)³⁴ with the cutoff parameters ‘-l 5 -m 18 -u 54’. The resulting primary assembly consisted of 90 contigs spanning 491.50 Mb. The contig N50 and the maximum contig length were 8.05 and 28.59 Mb, respectively (Table 3).
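For readers unfamiliar with the N50 statistic reported here, it is the length of the sequence at which the cumulative length of the sequences, sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch follows; the function name and the toy lengths are illustrative only and are not taken from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("no lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example (hypothetical lengths, arbitrary units): total = 20, half = 10.
# Sorted: 8, 5, 4, 2, 1 -> cumulative 8, 13 -> N50 is 5.
toy = [4, 8, 1, 5, 2]
print(n50(toy))  # prints 5
```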

Table 3 Assembly statistics of A. amurensis genome.


We further scaffolded the contigs using Hi-C sequencing data to obtain a high-quality chromosome-scale genome. Juicer (v1.6)³⁵ was applied for raw sequence data analysis, and 3D-DNA (v190716)³⁶ was then used to anchor contigs into chromosomes. The assembly was further corrected manually according to the Hi-C heatmap using JuiceboxGUI (v1.11.08)³⁷, a visualization system for Hi-C contact maps. The final genome consisted of 22 chromosomes with lengths ranging from 13.43 to 38.00 Mb, and the scaffold N50 was 23.75 Mb (Table 3, Fig. 1, Fig. 2). A previous karyotype analysis³⁸ of A. amurensis reported a diploid chromosome number of 44, that is, a haploid number of 22, which is consistent with the 22 chromosomes assembled here.

Annotation of repetitive elements

The Extensive de novo TE Annotator (EDTA, v2.0.0)³⁹ and RepeatModeler (v2.0.3)⁴⁰ were used to build repetitive sequence libraries for the A. amurensis genome. We combined these two libraries into a final comprehensive repeat library for repeat annotation. RepeatMasker (v4.1.2)⁴¹ was then used to predict and classify the repetitive elements of the A. amurensis genome. Overall, 48.69% of the assembled genome was identified as repeats. The most abundant repetitive elements were long terminal repeats (LTR, 19.63%), followed by DNA transposons (18.20%) (Table 4, Fig. 2).

Table 4 Classification of repetitive sequences in A. amurensis genome.


Noncoding RNA (ncRNA) annotation

Ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs) were predicted by Barrnap (v0.9, https://github.com/tseemann/barrnap) and tRNAscan-SE (v2.0.11)⁴², respectively, with default parameters. Based on alignment with the Rfam database (v14.8)⁴³, Infernal (v1.1.4)⁴⁴ was used to annotate other ncRNAs, including small nuclear RNAs (snRNAs) and microRNAs (miRNAs). In total, we identified 37 miRNAs, 14,926 tRNAs, 415 rRNAs, and 202 snRNAs in the A. amurensis genome (Table 5, Fig. 2).

Table 5 Classification of ncRNAs in A. amurensis genome.


Gene prediction and functional annotation

We used three approaches to predict gene structures: de novo, homology-based, and RNA-seq-based prediction. Augustus (v3.4.0)⁴⁵, GlimmerHMM (v3.0.4)⁴⁶, GeneMark (v4.69)⁴⁷, SNAP (version 2006-07-28)⁴⁸, and BRAKER2 (v2.1.6)⁴⁹ were used for de novo gene model prediction, all with default parameters. For homology-based prediction, we downloaded protein sequences of the crown-of-thorns starfish Acanthaster sp. (https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/001/949/145/GCF_001949145.1_OKI-Apl_1.0/), the sea urchin Strongylocentrotus purpuratus (https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/002/235/GCF_000002235.5_Spur_5.0/), and the sea cucumber Apostichopus japonicus (https://ftp.ncbi.nlm.nih.gov/genomes/genbank/invertebrate/Apostichopus_japonicus/latest_assembly_versions/GCA_002754855.1_ASM275485v1/) from the National Center for Biotechnology Information (NCBI) as references and used MetaEuk (version aa7ac2eb7334405ad57d50d78361e3dcd61bb27a)⁵⁰ with default parameters to predict gene structures. For RNA-seq-based prediction, we first mapped short RNA reads to the reference genome using HISAT2 (v2.2.1)⁵¹ with the parameter ‘--dta’ and then assembled transcripts using StringTie (v2.2.1)⁵². In addition, the Program to Assemble Spliced Alignments (PASA, v2.4.1) pipeline (https://github.com/PASApipeline/PASApipeline) was used to identify possible coding regions based on a de novo transcriptome assembled by Trinity (v2.14.0)⁵³ with default parameters. EvidenceModeler (EVM, v1.1.1)⁵⁴ and the Funannotate (v1.8.14) pipeline (https://github.com/nextgenusfs/funannotate) were then applied to combine the predictions from the three strategies and remove low-quality gene annotations. Based on the RNA-seq data of A. amurensis from this study, together with data from adult stomach tissue⁵⁵ and bipinnaria larvae¹⁶ from other studies, PASA (v2.4.1) was applied to update untranslated regions (UTRs). The general annotation pipeline applied in the present study is shown in Fig. 3. As a result, a total of 16,531 protein-coding genes were predicted. The average gene length was 17,803.19 bp, with an average coding sequence (CDS) length of 1,885.87 bp and an average exon number of 10.07 (Table 6). Of these genes, 12,736 (77.04%) were supported by evidence from all three strategies (Fig. 4). We also calculated gene density along the chromosomes in 1 Mb windows (Fig. 5) and compared gene length, CDS length, exon length, intron length, and exon number per gene between A. amurensis and the other species used in homology-based prediction (Fig. 6). The 1 Mb region with the largest number of annotated genes was at the end of chromosome 18 (Fig. 5).
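The per-window gene density described above can be computed by binning gene coordinates into fixed 1 Mb windows. The sketch below is illustrative only: the paper does not state how genes spanning a window boundary were assigned, so assigning each gene by its 1-based start coordinate is an assumption, and the function name and toy coordinates are hypothetical.

```python
from collections import Counter


def gene_density(genes, window=1_000_000):
    """Count genes per fixed-size window.

    genes: iterable of (chromosome, start) pairs with 1-based start
    coordinates, as in GFF3. Each gene is assigned to the window that
    contains its start (an assumption, see lead-in).
    Returns a Counter keyed by (chromosome, zero-based window index).
    """
    counts = Counter()
    for chrom, start in genes:
        counts[(chrom, (start - 1) // window)] += 1
    return counts


# Toy input (hypothetical coordinates, not taken from the annotation).
toy_genes = [("chr1", 1), ("chr1", 1_000_000), ("chr1", 1_000_001), ("chr2", 500)]
density = gene_density(toy_genes)
densest_window, n_genes = density.most_common(1)[0]
print(densest_window, n_genes)  # prints ('chr1', 0) 2
```

In practice the (chromosome, start) pairs would be read from the gene features of the annotation file, and the densest window found with `most_common(1)` as shown.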

Table 6 Statistical results of the gene structure annotation in A. amurensis genome.


Functional annotation was performed using the Funannotate pipeline, based on databases including Clusters of Orthologous Groups of Proteins (COG)⁵⁶, eggNOG⁵⁷, Gene Ontology (GO)⁵⁸, InterPro⁵⁹, Kyoto Encyclopedia of Genes and Genomes (KEGG)⁶⁰, NCBI non-redundant protein (Nr), Pfam⁶¹, and Swiss-Prot⁶². The results showed that 15,643 protein sequences (94.63%) were annotated in at least one public database (Table 7, Fig. 7).

Table 7 Summary of the functional gene annotation in A. amurensis genome.


Comparative genomic analysis

The longest protein sequences of A. amurensis and five other asteroid species, namely Acanthaster sp.⁶³, Asterias rubens⁶⁴, Patiria miniata⁶⁵, Plazaster borealis⁶⁶, and Zoroaster cf. ophiactis⁶⁷, were used to identify orthologous groups with OrthoFinder (v2.5.5)⁶⁸ using the parameter ‘-S diamond’, and the sea urchin Lytechinus variegatus⁶⁹ was selected as an outgroup. A total of 5,315 single-copy orthogroups were obtained for subsequent phylogenetic analysis. Based on multiple sequence alignments of the single-copy orthogroups generated with MAFFT (v7.520)⁷⁰, IQ-TREE (v2.2.3)⁷¹ was applied to construct the species tree with the parameters ‘-m MFP -bb 1000’ and the best model of GTR + F + I + R4. As expected, A. amurensis was most closely related to A. rubens and P. borealis from the family Asteriidae (Fig. 8). Divergence times were then estimated using MCMCTREE in PAML (v4.9i)⁷² based on the divergence time (A. amurensis vs L. variegatus: 461.1.5-600.0 million years ago) extracted from TIMETREE (http://www.timetree.org/). The expansion and contraction of gene families were analyzed by Computational Analysis of gene Family Evolution (CAFE, v5.0.0)⁷³ with a p-value threshold of 0.05. The results revealed that 197 gene families were expanded and 482 were contracted in A. amurensis (Fig. 8).

Data Records

The PacBio, BGI, RNA-seq, and Hi-C sequencing data have been deposited in the NCBI Sequence Read Archive (SRA) database under accession numbers SRR24902114⁷⁴, SRR24831139⁷⁵, SRR24871501⁷⁶, and SRR24835318⁷⁷, respectively. The final chromosome assembly has been deposited in GenBank under assembly accession number GCA_032118995.1⁷⁸. The genome annotation files are available in the Figshare database⁷⁹.

Technical Validation

Nucleic acid quality

The concentration and quality of DNA were evaluated using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, USA) and agarose gel electrophoresis, respectively. RNA integrity was assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies, USA).

Genome assembly and annotation quality evaluation

The quality of the final chromosome-level genome assembly was assessed using four methods. First, we mapped clean PE150 reads from whole genome sequencing to the A. amurensis genome using BWA-MEM (v0.7.17)⁸⁰ and calculated the mapping rate using samtools (v1.9)⁸¹, obtaining a genome coverage rate of 99.95% and a mapping rate of 99.61%. Second, Benchmarking Universal Single-Copy Orthologs (BUSCO, v5.2.2)⁸² analysis based on the 954 genes of the metazoa_odb10 database indicated that 951 (99.69%) core metazoan genes were detected in the A. amurensis genome, consisting of 943 (98.85%) complete and 8 (0.84%) fragmented genes (Table 8). Third, the Core Eukaryotic Genes Mapping Approach (CEGMA, v2.5)⁸³, based on 248 core eukaryotic genes, identified 236 (95.16%) genes in the final genome assembly. Finally, meryl (v1.3)⁸⁴ was used to generate k-mer counts from the paired-end reads generated by whole genome sequencing, and Merqury (v1.3)⁸⁴ was used to estimate the consensus quality value (QV) of the A. amurensis genome, yielding a QV of 48.51. Together, these four methods indicate the high accuracy and completeness of the final genome assembly.
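The completeness percentages and the QV reported above can be checked or interpreted with a few lines of arithmetic. The sketch below recomputes the BUSCO and CEGMA percentages from the stated counts and converts the QV to a per-base error probability using the standard Phred relation QV = −10·log10(E). The function names are hypothetical, and the error-rate figure is a derived illustration rather than a value reported in the paper.

```python
import math


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


def qv_to_error_rate(qv: float) -> float:
    """Phred-scaled quality value -> per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Per-base error probability -> Phred-scaled quality value."""
    return -10 * math.log10(error_rate)


# Counts taken from the text (954 metazoa_odb10 genes, 248 CEGMA genes).
print(f"BUSCO complete:   {percent(943, 954):.2f}%")
print(f"BUSCO fragmented: {percent(8, 954):.2f}%")
print(f"BUSCO detected:   {percent(951, 954):.2f}%")
print(f"CEGMA detected:   {percent(236, 248):.2f}%")

# QV 48.51 corresponds to roughly 1.4e-5 errors per base.
print(f"Error rate at QV 48.51: {qv_to_error_rate(48.51):.2e}")
```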

Table 8 BUSCO evaluation of gene annotation in A. amurensis genome.


Code availability

No custom code was used in this study. Data processing was performed with the pipelines and software described in the Methods section, following their manuals and protocols; software versions and key parameters are given there. Where parameters are not specified, the default parameters suggested by the developers were used.

References

Fukuyama, A. K. & Oliver, J. S. Sea star and walrus predation on bivalves in Norton Sound, Bering Sea, Alaska. Ophelia 24, 17–36 (1985).
Li, B. et al. Size distribution of individuals in the population of Asterias amurensis (Echinodermata: Asteroidea) and its reproductive cycle in China. Acta Oceanol. Sin. 37, 96–103 (2018).
Ward, R. D. & Andrew, J. Population genetics of the northern Pacific seastar Asterias amurensis (Echinodermata: Asteriidae): allozyme differentiation among Japanese, Russian, and recently introduced Tasmanian populations. Mar. Biol. 124, 99–109 (1995).
Paik, S. G., Park, H. S., Yi, S. K. & Yun, S. G. Developmental duration and morphology of the sea star Asterias amurensis, in Tongyeong, Korea. Ocean Sci. J. 40, 65–70 (2005).
Kashenko, S. D. Responses of embryos and larvae of the starfish Asterias amurensis to changes in temperature and salinity. Russ. J. Mar. Biol. 31, 294–302 (2005).
Reich, A., Dunn, C., Akasaka, K. & Wessel, G. Phylogenomic analyses of Echinodermata support the sister groups of Asterozoa and Echinozoa. PLoS One 10, e0119627 (2015).
Dupont, S. & Thorndyke, M. Bridging the regeneration gap: insights from echinoderm models. Nat. Rev. Genet. 8, 320–320 (2007).
Medina-Feliciano, J. G. & Garcia-Arraras, J. E. Regeneration in echinoderms: molecular advancements. Front. Cell Dev. Biol. 9, 768641 (2021).
Byrne, M., Morrice, M. G. & Wolf, B. Introduction of the northern Pacific asteroid Asterias amurensis to Tasmania: reproduction and current distribution. Mar. Biol. 127, 673–685 (1997).
Kashenko, S. D. Development of the starfish Asterias amurensis under laboratory conditions. Russ. J. Mar. Biol. 31, 36–42 (2005).
Qu, P. et al. Trophic structure of common marine species in the Bohai Strait, North China Sea, based on carbon and nitrogen stable isotope ratios. Ecol. Indic. 66, 405–415 (2016).
Hutson, K. S., Ross, D. J., Day, R. W. & Ahern, J. J. Australian scallops do not recognise the introduced predatory seastar Asterias amurensis. Mar. Ecol. Prog. Ser. 298, 305–309 (2005).
Ross, D. J., Johnson, C. R. & Hewitt, C. L. Impact of introduced seastars Asterias amurensis on survivorship of juvenile commercial bivalves Fulvia tenuicostata. Mar. Ecol. Prog. Ser. 241, 99–112 (2002).
Nishimura, H., Miyoshi, K. & Chiba, S. Predatory behavior of the sea stars Asterias amurensis and Distolasterias nipon on the Japanese scallop, Mizuhopecten yessoensis. Plankton Benthos Res. 14, 1–7 (2019).
Parry, G. D. & Hirst, A. J. Decadal decline in demersal fish biomass coincident with a prolonged drought and the introduction of an exotic starfish. Mar. Ecol. Prog. Ser. 544, 37–52 (2016).
Richardson, M. F. & Sherman, C. D. De novo assembly and characterization of the invasive northern Pacific seastar transcriptome. PLoS One 10, e0142003 (2015).
Dunstan, P. K. & Bax, N. J. How far can marine species go? Influence of population biology and larval movement on future range limits. Mar. Ecol. Prog. Ser. 344, 15–28 (2007).
Ling, S. D., Johnson, C. R., Mundy, C. N., Morris, A. & Ross, D. J. Hotspots of exotic free-spawning sex: man-made environment facilitates success of an invasive seastar. J. Appl. Ecol. 49, 733–741 (2012).
Ross, D. J., Johnson, C. R. & Hewitt, C. L. Abundance of the introduced seastar, Asterias amurensis, and spatial variability in soft sediment assemblages in SE Tasmania: clear correlations but complex interpretation. Estuarine, Coastal Shelf Sci. 67, 695–707 (2006).
Hayes, K. R. & Sliwa, C. Identifying potential marine pests—a deductive approach applied to Australia. Mar. Pollut. Bull. 46, 91–98 (2003).
Ellis, M. R. et al. Detecting marine pests using environmental DNA and biophysical models. Sci. Total Environ. 816, 151666 (2022).
Richardson, M. F., Sherman, C. D., Lee, R. S., Bott, N. J. & Hirst, A. J. Multiple dispersal vectors drive range expansion in an invasive marine species. Mol. Ecol. 25, 5001–5014 (2016).
Byrne, M., Gall, M., Wolfe, K. & Aguera, A. From pole to pole: the potential for the Arctic seastar Asterias amurensis to invade a warming Southern Ocean. Glob. Chang. Biol. 22, 3874–3887 (2016).
Bock, D. G. et al. What we still don’t know about invasion genetics. Mol. Ecol. 24, 2277–2297 (2015).
Ross, D. J., Johnson, C. R. & Hewitt, C. L. Assessing the ecological impacts of an introduced seastar: the importance of multiple methods. Biol. Invasions 5, 3–21 (2003).
Li, L., Yu, Y., Wu, W. & Wang, P. Extraction, characterization and osteogenic activity of a type I collagen from starfish (Asterias amurensis). Mar. Drugs 21, 274 (2023).
Witman, J. D., Genovese, S. J., Bruno, J. F., McLaughlin, J. W. & Pavlin, B. I. Massive prey recruitment and the control of rocky subtidal communities on large spatial scales. Ecol. Monogr. 73, 441–462 (2003).
Smith, K. F. et al. Application of a sandwich hybridisation assay for rapid detection of the northern Pacific seastar, Asterias amurensis. N. Z. J. Mar. Freshwater Res. 45, 145–152 (2011).
Pochon, X., Bott, N. J., Smith, K. F. & Wood, S. A. Evaluating detection limits of next-generation sequencing for the surveillance and monitoring of international marine pests. PLoS One 8, e73935 (2013).
Deagle, B. E., Bax, N., Hewitt, C. L. & Patil, J. G. Development and evaluation of a PCR-based test for detection of Asterias (Echinodermata: Asteroidea) larvae in Australian plankton samples from ballast water. Mar. Freshwater Res. 54, 709–719 (2003).
Hall, M. R. et al. The crown-of-thorns starfish genome as a guide for biocontrol of this coral reef pest. Nature 544, 231–234 (2017).
Xu, C. et al. Chromosome level genome assembly of oriental armyworm Mythimna separata. Sci. Data 10, 597 (2023).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36, 2896–2898 (2020).
Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).
Saotome, K. & Komatsu, M. Chromosomes of Japanese starfishes. Zool. Sci. 19, 1095–1103 (2002).
Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275 (2019).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl. Acad. Sci. USA 117, 9451–9457 (2020).
Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinf. Chapter 4, 10.1–10.14 (2009).
Chan, P. P., Lin, B. Y., Mak, A. J. & Lowe, T. M. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 49, 9077–9096 (2021).
Kalvari, I. et al. Rfam 14: expanded coverage of metagenomic, viral and microRNA families. Nucleic Acids Res. 49, D192–D200 (2021).
Nawrocki, E. P., Kolbe, D. L. & Eddy, S. R. Infernal 1.0: inference of RNA alignments. Bioinformatics 25, 1335–1337 (2009).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).
Besemer, J. & Borodovsky, M. GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses. Nucleic Acids Res. 33, W451–W454 (2005).
Korf, I. Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).
Bruna, T., Hoff, K. J., Lomsadze, A., Stanke, M. & Borodovsky, M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genomics Bioinf. 3, lqaa108 (2021).
Levy Karin, E., Mirdita, M. & Soding, J. MetaEuk—sensitive, high-throughput gene discovery, and annotation for large-scale eukaryotic metagenomics. Microbiome 8, 48 (2020).
Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 20, 278 (2019).
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments. Genome Biol. 9, R7 (2008).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR26081154 (2023).
Galperin, M. Y. et al. COG database update: focus on microbial diversity, model organisms, and widespread pathogens. Nucleic Acids Res. 49, D274–D281 (2021).
Cantalapiedra, C. P., Hernandez-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
Ashburner, M. et al. Gene Ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).
Blum, M. et al. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res. 49, D344–D354 (2021).
Kanehisa, M., Sato, Y., Kawashima, M., Furumichi, M. & Tanabe, M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 44, D457–D462 (2016).
Mistry, J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2021).
Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28, 45–48 (2000).
Baughman, K. W. et al. Acanthaster planci, whole genome shotgun sequencing project. GenBank https://identifiers.org/ncbi/insdc:BDGF01000000 (2016).
Wellcome Sanger Institute. Asterias rubens, whole genome shotgun sequencing project. GenBank https://identifiers.org/ncbi/insdc:CABPRM030000000 (2019).
Ku, C. J., Cary, G. A. & Hinman, V. F. Patiria miniata isolate m_02_andy, whole genome shotgun sequencing project. GenBank https://identifiers.org/ncbi/insdc:JADOBP010000000 (2020).
Lee, Y. et al. Chromosome-level genome assembly of Plazaster borealis sheds light on the morphogenesis of multiarmed starfish and its regenerative capacity. GigaScience 11, giac063 (2022).
Liu, J., Zhou, Y., Pu, Y. & Zhang, H. A chromosome-level genome assembly of a deep-sea starfish (Zoroaster cf. ophiactis). Sci. Data 10, 506 (2023).
Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).
Davidson, P. L. et al. Chromosomal-level genome assembly of the sea urchin Lytechinus variegatus substantially improves functional genomic analyses. Genome Biol. Evol. 12, 1080–1086 (2020).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
Mendes, F. K., Vanderpool, D., Fulton, B. & Hahn, M. W. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics 36, 5516–5518 (2020).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR24902114 (2023).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR24831139 (2023).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR24871501 (2023).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR24835318 (2023).
Wang, Y. L. et al. Chromosome-level genome assembly of northern Pacific seastar Asterias amurensis. GenBank https://www.ncbi.nlm.nih.gov/assembly/GCA_032118995.1 (2023).
Wang, Y. L. et al. Chromosome-level genome assembly of Asterias amurensis. figshare. https://doi.org/10.6084/m9.figshare.23538585.v2 (2023).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Manni, M., Berkeley, M. R., Seppey, M., Simao, F. A. & Zdobnov, E. M. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38, 4647–4654 (2021).
Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007).
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).

Acknowledgements

We thank Prof. Scott F. Cummins for his help in polishing the manuscript. This research was supported by the National Natural Science Foundation of China (grant numbers 31972767 and 42276103).

Author information

Authors and Affiliations

The Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
Yanlin Wang, Yixin Wang, Yujia Yang, Gang Ni & Muyan Chen
CAS Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, China
Yulong Li

Authors

Yanlin Wang
Yixin Wang
Yujia Yang
Gang Ni
Yulong Li
Muyan Chen

Contributions

M.Y.C. conceived the study. Y.L.W. prepared the nucleic acid samples. Y.L.L. performed the data analysis. Y.L.W., Y.X.W., Y.J.Y. and G.N. visualized the results and wrote the manuscript. All authors reviewed and approved the manuscript.

Corresponding authors

Correspondence to Yulong Li or Muyan Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wang, Y., Wang, Y., Yang, Y. et al. Chromosome-level genome assembly of the northern Pacific seastar Asterias amurensis. Sci Data 10, 767 (2023). https://doi.org/10.1038/s41597-023-02688-w

Received: 06 July 2023
Accepted: 25 October 2023
Published: 04 November 2023
DOI: https://doi.org/10.1038/s41597-023-02688-w
Springer Nature Limited
