The genome of the giant Nomura’s jellyfish sheds light on the early evolution of active predation
Unique among cnidarians, jellyfish have remarkable morphological and biochemical innovations that allow them to actively hunt in the water column, and they were some of the first animals to become free-swimming. The class Scyphozoa, or true jellyfish, is characterized by a predominant medusa life-stage consisting of a bell and venomous tentacles used for hunting and defense, and by pulsed jet propulsion for mobility. Here, we present the genome of the giant Nomura’s jellyfish (Nemopilema nomurai) to understand the genetic basis of these key innovations.
We sequenced the genome and transcriptomes of the bell and tentacles of the giant Nomura’s jellyfish as well as transcriptomes across tissues and developmental stages of the Sanderia malayensis jellyfish. Analyses of the Nemopilema and other cnidarian genomes revealed adaptations associated with swimming, marked by codon bias in muscle contraction and expansion of neurotransmitter genes, along with expanded Myosin type II family and venom domains, possibly contributing to jellyfish mobility and active predation. We also identified gene family expansions of Wnt and posterior Hox genes and discovered the important role of retinoic acid signaling in this ancient lineage of metazoans, which together may be related to the unique jellyfish body plan (medusa formation).
Taken together, the Nemopilema jellyfish genome and transcriptomes genetically confirm their unique morphological and physiological traits, which may have contributed to the success of jellyfish as early multi-cellular predators.
Keywords: Jellyfish mobility; Medusa structure formation; Scyphozoa; de novo genome assembly
Results and discussion
Jellyfish genome assembly and annotation
Here, we present the first de novo genome assembly of the Nomura’s jellyfish (Nemopilema nomurai; Fig. 1b). The assembly is a 213-Mb genome comprising 255 scaffolds with an N50 length of 2.71 Mb and only 1.48% gaps (Additional file 1: Tables S2 and S3). The Nemopilema hybrid assembly was created using a combination of short and long read sequencing technologies, consisting of 38.2 Gb of Pacific Biosciences (PacBio) single-molecule real-time sequencing (SMRT) reads, along with 98.6 Gb of Illumina short-insert, mate-pair, and TruSeq synthetic long reads (Additional file 1: Figures S3–S5; Tables S4–S7). The resulting assembly shows the longest continuity among cnidarian genomes (Additional file 1: Table S9). We predicted 18,962 protein-coding jellyfish genes by combining de novo (using medusa bell and tentacle tissue transcriptomes) and homologous gene prediction methods (Additional file 1: Tables S10 and S11, Additional files 2 and 3). This process recovered the highest number of single-copy orthologous genes among all published non-bilaterian metazoan genome assemblies to date (Additional file 1: Table S12). A total of 21.07% of the jellyfish genome was found to be made up of transposable elements, compared to 9.45% in Acropora digitifera, 33.63% in Nematostella vectensis, and 42.87% in Hydra vulgaris (Additional file 1: Table S13).
We compared the Nemopilema genome to other cnidarian genomes, including the recently published Aurelia aurita and Clytia hemisphaerica genomes, all of which are from predominantly sessile taxa, to detect unique Scyphozoa function (active mobility), physical structure (medusa bell), and chemistry (venom). We also performed transcriptome analyses of both Nemopilema nomurai and the Sanderia malayensis jellyfish across three medusa tissue types and four developmental stages.
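The N50 length quoted above is a standard continuity statistic: it is the scaffold length at which scaffolds of that length or longer contain at least half of all assembled bases. The following is a minimal sketch of that definition only, not the script used for this assembly; the function name n50 and the toy scaffold lengths are assumptions made for illustration.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy scaffold lengths (hypothetical, not taken from the Nemopilema assembly).
toy_scaffolds = [8, 5, 4, 3, 2, 2, 1]
print(n50(toy_scaffolds))
```

Comparing integers as 2 * running >= total avoids floating-point rounding when the total is odd.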
Evolutionary analysis of the jellyfish
Genomic context and muscle-associated genes
Jellyfish have two primary muscle types: the epitheliomuscular cells, which are the predominant muscle cells found in sessile cnidarians, and the striated muscle cells located in the medusa bell that are essential for swimming. To understand the evolution of active swimming in jellyfish, we examined their codon bias compared to other metazoans by calculating the guanine and cytosine content at the third codon position (GC3) [14, 15] (Additional file 1: Figure S13). It has been suggested that genes with a high level of GC3 are more adaptable to external stresses (e.g., environmental changes). Among the 100 genes with the highest GC3 bias, GO terms for the regulation of muscle contraction and neuropeptide signaling pathways were specific to Nemopilema (Additional file 4: Tables S25 and S26). Calcium plays a key role in striated muscle contraction in jellyfish, and the calcium signaling pathway (GO:0004020, P = 5.60E−10) showed a high level of GC3 bias specific to Nemopilema. The top 500 GC3 genes of Nemopilema and Aurelia were enriched in GO terms associated with homeostasis (e.g., cellular chemical homeostasis and sodium ion transport), which we speculate is essential for the activation of muscle contractions that power the jellyfish’s mobile predation (Additional file 1: Section 5.1; Additional file 4: Tables S27 and S28).
Since cnidarians have been reported to lack titin and troponin complexes, which are critical components of bilaterian striated muscles, it has been suggested that the two clades independently evolved striated muscles. A survey of genes that encode muscle structural and regulatory proteins in cnidarians showed a conserved eumetazoan core actin-myosin contractile machinery shared with bilaterians (Additional file 1: Table S32). However, like other cnidarians, Nemopilema lacks titin and troponin complexes. In addition, γ-syntrophin, a component of the dystroglycan complex, was absent in Nemopilema, Aurelia, and Hydra. Nemopilema and Aurelia do, however, possess the dystroglycan-associated costamere proteins α/β-Dystrobrevin and α/ε-Sarcoglycan, indicating that several components of the dystroglycan complex were lost after the Scyphozoa-Hydrozoa split. It has been suggested that Hydra has undergone secondary simplifications relative to Nematostella, which has a greater degree of muscle-cell-type specialization. Nemopilema and Aurelia show a complexity of muscle structural and regulatory proteins that is intermediate between that of Hydra and that of Nematostella.
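GC3 is the fraction of codons in a coding sequence whose third base is G or C. The sketch below computes it for a single in-frame coding sequence and ranks genes by it. It illustrates the quantity only and is not the pipeline of refs [14, 15]; the function names gc3 and top_gc3_genes and the toy sequences are assumptions, and any treatment of stop codons or ambiguous bases in the actual analysis is not reproduced here.

```python
def gc3(cds):
    """Fraction of complete codons whose third base is G or C.

    Assumes `cds` is an in-frame coding sequence; trailing bases that
    do not form a complete codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    third_bases = seq[2:usable:3]
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def top_gc3_genes(cds_by_gene, n):
    """Return the n gene IDs with the highest GC3, highest first."""
    ranked = sorted(cds_by_gene, key=lambda g: gc3(cds_by_gene[g]), reverse=True)
    return ranked[:n]


# Toy sequences (hypothetical gene IDs, not from the Nemopilema annotation).
toy = {"geneA": "ATGGCCAAATAG", "geneB": "ATGAAATTTTAA", "geneC": "ATGGCGCCCTGC"}
print(gc3(toy["geneA"]))
print(top_gc3_genes(toy, 2))
```

In the analysis described above, the ranked list would be cut at the top 100 or top 500 genes before GO enrichment testing.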
Medusa bell and tentacle transcriptome profiling
Conversely, gene expression analyses of the tentacles revealed high RNA expression levels of neurotransmitter-associated functional categories (ion channel complex, postsynapse, and neurotransmitter receptor activity; Fig. 3b; Additional file 5: Tables S34–S37), consistent with the anatomy of jellyfish tentacles, which contain the sensory cells and a loose plexus of the neuronal subpopulation at the base of the ectoderm.
Body patterning in the jellyfish
There has been much debate surrounding the early evolution of body patterning in the metazoan common ancestor, particularly concerning the origin and expansion of Hox and Wnt gene families [22, 23, 24]. In total, 83 homeodomains were found in Nemopilema, while 82, 41, 120, and 148 homeodomains were found in Aurelia, Hydra, Acropora, and Nematostella, respectively (Additional file 1: Table S41). Five of the eight Hox genes in Nemopilema are of the posterior type, which is associated with aboral axis development, and clustered with Nematostella’s posterior Hox genes, HOXE and HOXF (Additional file 1: Figures S18–S20). Aurelia has six posterior type Hox genes, but does not have the HOXB, C, and D type (HOX2 in humans). Though absent in Hydra and Acropora, synteny analyses of ParaHox genes in Nemopilema show that the XLOX/CDX gene is located immediately downstream of GSX in the same tandem orientation as those in Nematostella, suggesting that XLOX/CDX was present in the cnidarian common ancestor and subsequently lost in some lineages (Additional file 1: Figure S21). Hox-related genes, EVX and EMX, are also present in Nemopilema and Aurelia, although they are absent in Hydra. Given the large amount of ancestral diversity in the Wnt genes, it has been proposed that Wnt signaling controlled body plan development in the early metazoans. Nemopilema possesses 13 Wnt orthologs representing 10 Wnt subfamilies (Additional file 1: Figure S22; Table S42). Wnt9 is absent from all cnidarians, likely representing a loss in the cnidarian common ancestor. Cnidarians have undergone dynamic lineage-specific Wnt subfamily duplications, such as Wnt8 (Nematostella, Acropora, and Aurelia), Wnt10 (Hydra), and Wnt11 and Wnt16 (Nemopilema and Aurelia). It has been proposed that a common cluster of Wnt genes (Wnt1–Wnt6–Wnt10) existed in the last common ancestor of arthropods and deuterostomes.
Our analyses of cnidarian and bilaterian genomes revealed that Acropora also possesses this cluster, while Nemopilema, Aurelia, and Hydra are missing Wnt6, suggesting loss of the Wnt6 gene in the Medusozoa common ancestor (Additional file 1: Figure S23). Taken together, the jellyfish have a comparable number of Hox and Wnt genes to other cnidarians, but the dynamic repertoire of these gene families suggests that cnidarians have evolved independently to adapt their physiological characteristics and life cycle.
Polyp to medusa transition in jellyfish
The polyp-to-medusa transition is prominent in jellyfish compared to the other sessile cnidarians. To understand the genetic basis of the medusa structure formation in the jellyfish, we compared transcriptional regulation between cnidarians and across jellyfish developmental stages (see Additional file 1: Sections 7.1 and 7.2). We assembled the Sanderia transcripts using six pooled samples of transcriptomes (Additional file 1: Table S43). The assembled transcripts had a total length of 61 Mb and resulted in 58,290 transcript isoforms and 43,541 unique transcripts, with an N50 of 2325 bp. On average, 87% of the RNA reads were aligned to the assembled transcripts (Additional file 1: Table S44), indicating that the transcript assembly represented the majority of sequenced reads. Furthermore, the composition of the protein domains contained in the top 20 ranks was quite similar between Nemopilema and Sanderia (Additional file 1: Table S45). To obtain differentially expressed genes for each stage, we compared each stage with the previous or next stage in the life cycle of the jellyfish. The polyp stage, which represents a sessile stage in the jellyfish life cycle, showed enriched terms related to ion channel activity and energy metabolism (regulation of metabolic process, and amino sugar metabolic process; Additional file 1: Table S46). Active feeding in the polyp stimulates either asexual proliferation into more polyps or metamorphosis to strobila. Since anthozoans do not form a medusa, the strobila asexual reproductive stage is an important stage in which to study the metamorphosis from polyp to medusa. In this stage, GO terms related to amide biosynthetic and metabolic process were highly expressed compared to the polyp stage (Additional file 1: Table S47). It has been reported that RF-amide and LW-amide neuropeptides were associated with metamorphosis in cnidarians [28, 29, 30].
However, we could not confirm this finding in our strobila and ephyra stage comparisons. In our system, the gene expression patterns of the two stages are quite similar. In the ephyra, the released mobile stage, GO terms involving amide biosynthetic and metabolic process were also highly expressed compared to the merged medusa stage (Additional file 1: Table S48). In the medusa, extracellular matrix, metallopeptidase activity, and immune system process terms were enriched (Additional file 1: Table S49), consistent with the physiology of their bell, tentacles, and oral arm tissue types.
Identification of toxin-related domains in jellyfish
An interesting branch on the tree of life, jellyfish have evolved remarkable morphological and biochemical innovations that allow them to actively hunt using pulsed jet propulsion and venomous tentacles. While the expansion and contraction of distinct families reflect the adaptation to salinity and predation and the convergent evolution of muscle elements, the Nemopilema genome strikes a balance between the conservation of many ancient genes and an innovative potential reflected in a significant number of new genes that have appeared since Rhizostomeae emerged. The Nemopilema nomurai genome has provided clues to the genetic basis of the innovative structure, function, and chemistry that have allowed this distinctive early group of predators to colonize the waters of the globe.
A Nemopilema nomurai medusa was collected at the Tongyeong Marine Science Station, KIOST (34.7699 N, 128.3828 E) on Sept. 12, 2013. The Sanderia malayensis samples were obtained from Aqua Planet Jeju Hanwha (Seogwipo, Korea) for transcriptome analyses of developmental stages since Nemopilema cannot be easily grown in the laboratory. The DNA and RNA preparation of Nemopilema and Sanderia are described in Additional file 1: Section 1.1. Species identification of Nemopilema was confirmed by comparing the MT-COI gene of five species of jellyfish. We aligned Nemopilema Illumina short reads (~ 400 bp insert-size) to the MT-COI gene of Chrysaora quinquecirrha (NC_020459.1), Cassiopea frondosa (NC_016466.1), Craspedacusta sowerbyi (NC_018537.1), and Aurelia aurita (NC_008446.1) jellyfish with the BWA-MEM aligner. Consensus sequences for each jellyfish were generated using SAMtools. The consensus sequence from C. sowerbyi was excluded due to low coverage. We conducted multiple sequence alignment using MUSCLE and built a neighbor-joining phylogenetic tree (gamma distribution) with 1000 bootstrap replicates in MEGA v7. Mitochondrial DNA phylogenetic analyses confirmed the identification of the Nemopilema sample as Nemopilema nomurai.
Genome sequencing and scaffold assembly
For the de novo assembly of Nemopilema, PacBio SMRT and five Illumina DNA libraries with various insert sizes (400 bp, 5 Kb, 10 Kb, 15 Kb, and 20 Kb) were constructed according to the manufacturers’ protocols. The Illumina libraries were sequenced using a HiSeq2500 with a read length of 100 bp (400 bp, 15 Kb, and 20 Kb) and a HiSeq2000 with a read length of 101 bp (5 Kb and 10 Kb). Quality filtered PacBio subreads were assembled into distinct contigs using the FALCON assembler with various read length cutoffs. To extend contigs to scaffolds, we aligned the Illumina long mate-pair libraries (5 Kb, 10 Kb, 15 Kb, and 20 Kb) to contig sets and extended the contigs using SSPACE. Gaps generated by SSPACE were filled by aligning the Illumina short-insert paired-end sequences using GapCloser. We also generated TruSeq synthetic long reads (TSLRs) using an Illumina HiSeq2000, which were aligned to scaffolds to correct erroneous sequences and to close gaps using an in-house script. Details of the genome sequencing and assembly process are provided in Additional file 1: Section 2.2.
The jellyfish genome was annotated for protein-coding genes and repetitive elements. We predicted protein-coding genes using a two-step process, with both homology- and evidence-based prediction. Protein sequences of the sea anemone, hydra, sponge, human, mouse, and fruit fly from the NCBI database and Cnidaria protein sequences from the NCBI Entrez protein database were used for homology-based gene prediction. Two tissue transcriptomes from Nemopilema were used for evidence-based gene prediction via AUGUSTUS. Final Nemopilema protein-coding genes were determined using AUGUSTUS with exon (from the homology-based gene prediction) and intron (from the evidence-based gene prediction) hints. Repetitive elements were also predicted using Tandem Repeats Finder and RepeatMasker. Details of the annotation process are provided in Additional file 1: Sections 3.1 and 3.2.
Gene age estimation
Phylostratigraphy employs BLASTP-scored sequence similarity to estimate the minimal age of every protein-coding gene. The protein sequence is used to query the NCBI non-redundant database and detect the most distant species in which a sufficiently similar sequence is present, inferring that the gene is at least as old as the common ancestor shared with that species. For every species, we use the NCBI taxonomy. The timing of most divergence events is estimated using TimeTree and the Encyclopedia of Life. To facilitate detection of sequence similarity, we use an E-value threshold of 10^−3. We evaluate the age of all proteins whose length is equal to or greater than 40 amino acids. We count the number of genes in each phylostratum, from the most ancient (PS 1) to the newest (PS 11). To see broad evolutionary patterns, we aggregate the counts from several phylostrata into three broad evolutionary eras: ancient (PS 1–5, cellular organisms to Eumetazoa, 4204 Mya to 741 Mya), middle (PS 6–7, Cnidaria to Scyphozoa, 741 Mya to 239 Mya), and young (PS 8–11, Rhizostomeae to Nemopilema nomurai, 239 Mya to present).
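The assignment and aggregation steps described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the function names, the (phylostratum, E-value) hit representation, the use of "less than or equal to" for the E-value threshold, and the toy numbers are all hypothetical. Mapping each BLASTP hit's species to a phylostratum via the NCBI taxonomy is assumed to have been done beforehand.

```python
E_VALUE_CUTOFF = 1e-3      # threshold stated in the text
MIN_PROTEIN_LENGTH = 40    # amino acids, stated in the text
YOUNGEST_PS = 11           # PS 11 = Nemopilema nomurai

# Era boundaries as given in the text (PS 1 is the most ancient).
ERAS = {"ancient": range(1, 6), "middle": range(6, 8), "young": range(8, 12)}


def gene_phylostratum(protein_length, hits):
    """Oldest phylostratum with a passing hit, or None if the protein is too short.

    `hits` is an iterable of (phylostratum_of_hit_species, e_value) pairs.
    A gene with no passing hit is treated as species-specific (PS 11).
    """
    if protein_length < MIN_PROTEIN_LENGTH:
        return None
    passing = [ps for ps, e_value in hits if e_value <= E_VALUE_CUTOFF]
    return min(passing, default=YOUNGEST_PS)


def aggregate_by_era(ps_counts):
    """Sum per-phylostratum gene counts into the three eras."""
    totals = {era: 0 for era in ERAS}
    for ps, count in ps_counts.items():
        for era, strata in ERAS.items():
            if ps in strata:
                totals[era] += count
                break
        else:
            raise ValueError(f"phylostratum {ps} is outside PS 1-11")
    return totals


# Toy inputs (hypothetical, not results from the paper).
print(gene_phylostratum(120, [(3, 1e-10), (1, 0.5), (8, 1e-30)]))
print(aggregate_by_era({1: 10, 5: 2, 6: 3, 7: 1, 8: 4, 11: 5}))
```

Taking the minimum phylostratum among passing hits encodes the "at least as old as" inference: the most distant species with a detectable homolog sets the gene's minimal age.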
Comparative evolutionary analyses
Orthologous gene clusters were constructed to examine the conservation of gene repertoires among the genomes of Nemopilema nomurai, Aurelia aurita, Hydra vulgaris, Clytia hemisphaerica, Acropora digitifera, Nematostella vectensis, Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, Homo sapiens, Trichoplax adhaerens, Amphimedon queenslandica, Mnemiopsis leidyi, and Monosiga brevicollis using OrthoMCL. To infer a phylogeny and divergence times, we used RAxML and the MEGA7 program, respectively. A gene family expansion and contraction analysis was conducted using the CAFE program. Domain regions were predicted by InterProScan with domain databases. Details of the comparative analysis are provided in Additional file 1: Sections 4.1–4.3.
Transcriptome sequencing and expression profiling
Illumina RNA libraries from Nemopilema nomurai and Sanderia malayensis were sequenced using a HiSeq2500 with 100-bp read lengths. Since there is no reference genome for S. malayensis, we de novo assembled a pooled set of six RNA-seq read sets using the Trinity assembler. Quality filtered RNA reads from Nemopilema and Sanderia were aligned to the Nemopilema genome assembly and the assembled transcripts, respectively, using the TopHat program. Expression values were calculated by the Fragments Per Kilobase Of Exon Per Million Fragments Mapped (FPKM) method using Cufflinks, and differentially expressed genes were identified by DEGseq. Details of the transcriptome analysis are presented in Additional file 1: Sections 5.2 and 7.1.
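As a reminder of what the FPKM unit means, the sketch below applies the textbook normalization: fragment count divided by exon length in kilobases and by total mapped fragments in millions. It is not a reimplementation of Cufflinks, which estimates abundances with its own statistical model; the function name fpkm and the example numbers are assumptions made for illustration.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of exon per Million fragments mapped.

    fragments: fragments assigned to the gene or transcript
    exon_length_bp: summed exon length in base pairs
    total_mapped_fragments: all mapped fragments in the library
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2-kb transcript in a library
# of 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))
```

Because both gene length and library size are divided out, FPKM values can be compared across genes within a sample and, with caution, across samples of different sequencing depth.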
Hox and ParaHox analyses
We examined the homeodomain regions in Nemopilema using the InterProScan program. Hox and ParaHox genes were identified in Nemopilema by aligning the homeodomain sequences of human and fruit fly to the identified Nemopilema homeodomains. We considered only domains that were aligned to both the human and fruit fly. We also used this process for Acropora, Hydra, and Nematostella for comparison. Additionally, we added one Hox gene for Acropora and two Hox genes for Hydra, which are absent in the NCBI gene set, though they were present in previous studies [23, 60]. Hox and ParaHox genes of Clytia hemisphaerica, a hydrozoan species with a medusa stage, were also added based on a previous study. Finally, a multiple sequence alignment of these domains was conducted using MUSCLE, and a FastTree maximum likelihood phylogeny was generated using the Jones–Taylor–Thornton (JTT) model with gamma option.
Wnt gene subfamily analyses
Wnt genes of Nematostella and Hydra were downloaded from previous studies [25, 63], and those of Acropora were downloaded from the NCBI database. Wnt genes in Nemopilema and Aurelia were identified using the Pfam database by searching for “wnt family” domains. A multiple sequence alignment of Wnt genes was conducted using MUSCLE, and aligned sequences were trimmed using the trimAl program with the “gappyout” option. A phylogenetic tree was generated using RAxML with the PROTGAMMAJTT model and 100 bootstraps.
Further information, including sample preparation, assembly, genome annotation, and evolutionary analyses, can be found in Additional file 1 [65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112].
Korea Institute of Science and Technology Information (KISTI) provided us with the Korea Research Environment Open NETwork (KREONET), which is the internet connection service for efficient information and data transfer.
This work was supported by the Genome Korea Project in Ulsan Research Funds (1.180024.01 and 1.180017.01) of Ulsan National Institute of Science & Technology (UNIST). This work was also supported by a grant from the Marine Biotechnology Program (20170305, Development of Biomedical materials based on marine proteins) and the Collaborative Genome Program (20180430) funded by the Ministry of Oceans and Fisheries, Korea. This work was also supported by the Collaborative Genome Program for Fostering New Post-Genome Industry of the National Research Foundation (NRF) funded by the Ministry of Science and ICT (MSIT) (NRF-2017M3C9A6047623 and NRF-2017R1A2B2012541). V.L. and M.W.K. gratefully acknowledge funding support from the National Institutes of Health of USA (R01 HD073104 and R01 HD091846 to M.W.K.).
Availability of data and materials
The jellyfish genome project has been deposited at DDBJ/ENA/GenBank under the accession PEDN00000000. The version described in this paper is version PEDN01000000. Raw DNA and RNA sequence reads for Nemopilema nomurai and Sanderia malayensis have been submitted to the NCBI Sequence Read Archive database (SRA627560). All other data can be obtained from the authors upon reasonable request.
JB and SY supervised the project. YSC, JB, and SY planned and coordinated the project. HMK, JAW, YSC, SY, and JB wrote the manuscript. NayoungL, NayunL, YJJ, SW, KS, JCR, HSY, JHL, and SY prepared the samples, performed the experiments, and provided toxinological considerations. VL, AK, and MWK performed the gene evolutionary age analysis. HMK, SGP, YSC, YB, YJ, SJ, OC, JSE, and AM performed the in-depth bioinformatics data analyses. All authors read and approved the final manuscript.
Ethics approval and consent to participate
This is not applicable.
YSC and OC are employees, and JB is on the scientific advisory board of Clinomics Inc. HMK, YSC and JB have an equity interest in the company. All other coauthors declare that they have no competing interests.
- 2. Arai MN. A functional biology of Scyphozoa: Springer Science & Business Media; 2012.
- 3. Hale G. The classification and distribution of the class Scyphozoa. Eugene: University of Oregon; 1999.
- 9. Leclère L, Horin C, Chevalier S, Lapébie P, Dru P, Peron S, Jager M, Condamine T, Pottin K, Romano S, et al. The genome of the jellyfish Clytia hemisphaerica and the evolution of the cnidarian life-cycle. bioRxiv. 2018. https://doi.org/10.1101/369959.
- 40. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv. 2013. https://arxiv.org/abs/1303.3997.
- 83. Tavaré S. Some probabilistic and statistical problems in the analysis of DNA sequences. Lect Math Life Sci. 1986;17(2):57–86.
- 87. Lesh-Laurie GE, Suchy PE. Cnidaria: scyphozoa and cubozoa. Microsc Anat Invertebrates. 1991;2:185–266.
- 93. Campbell NA, Reece JB, Taylor MR, Simon EJ, Dickey J. Biology: concepts & connections, vol. 3: Pearson/Benjamin Cummings; 2009.
- 97. Collins AG, Schuchert P, Marques AC, Jankowski T, Medina M, Schierwater B. Medusozoan phylogeny and character evolution clarified by new large and small subunit rDNA data and an assessment of the utility of phylogenetic mixture models. Syst Biol. 2006;55(1):97–115.
- 101. Haas B, Papanicolaou A. TransDecoder (find coding regions within transcripts); 2016.
- 103. Gutierrez-Mazariegos J, Schubert M, Laudet V. Evolution of retinoic acid receptors and retinoic acid signaling. In: The biochemistry of retinoic acid receptors I: Structure, activation, and function at the molecular level. Dordrecht: Springer; 2014. p. 55–73.
- 107. Lalevee S, Anno YN, Chatagnon A, Samarut E, Poch O, Laudet V, Benoit G, Lecompte O, Rochette-Egly C. Genome-wide in silico identification of new conserved and functional retinoic acid receptor response elements (direct repeats separated by 5 bp). J Biol Chem. 2011;286(38):33322–34.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.