The phylogenetic position of dicyemid mesozoans offers insights into spiralian evolution
Obtaining phylogenomic data for enigmatic taxa is essential to achieve a better understanding of animal evolution. Dicyemids have long fascinated biologists because of their highly simplified body organization, but their life-cycles remain poorly known. Based on the discovery of the dicyemid DoxC gene, which encodes a spiralian peptide, it has been proposed that dicyemids are members of the Spiralia. Other studies have suggested that dicyemids may have closer affinities to mollusks and annelids. However, the phylogenetic position of dicyemids has remained a matter of debate, leading to an ambiguous picture of spiralian evolution.
In the present study, newly sequenced transcriptomic data from Dicyema japonicum were complemented with published transcriptomic data or predicted gene models from 29 spiralian, ecdysozoan, and deuterostome species, generating a dataset (Dataset 1) for phylogenomic analyses, which contains 348 orthologs and 58,124 amino acids. In addition to this dataset, to eliminate systematic errors, two additional sub-datasets were created by removing compositionally heterogeneous or rapidly evolving sites and orthologs from Dataset 1, which may cause compositional heterogeneity and long-branch attraction artifacts. Maximum likelihood and Bayesian inference analyses both placed Dicyema japonicum (Dicyemida) in a clade with Intoshia linei (Orthonectida) with strong statistical support. Furthermore, maximum likelihood analyses placed the Dicyemida + Orthonectida clade within the Gastrotricha, while in Bayesian inference analyses, this clade is sister group to the clade of Gastrotricha + Platyhelminthes.
Whichever the case, in all analyses, Dicyemida, Orthonectida, Gastrotricha, and Platyhelminthes constitute a monophyletic group that is a sister group to the clade of Mollusca + Annelida. Based on present phylogenomic analyses, dicyemids display close affinity to orthonectids, and they may share a common ancestor with gastrotrichs and platyhelminths, rather than with mollusks and annelids. Regarding spiralian phylogeny, the Gnathifera forms the sister group to the Rouphozoa and Lophotrochozoa, as has been suggested by previous studies; thus our analysis supports the traditional acoeloid–planuloid hypothesis of a nearly microscopic, non-coelomate common ancestor of spiralians.
KeywordsDicyemida Mesozoa Spiralia Transcriptome Phylogenomics Molecular evolution
Understanding the origin and evolutionary history of metazoans has been a biological research objective for more than a century, but the phylogenetic relationships among many enigmatic taxa remain unsolved. Phylogenomic data from ambiguous taxa are essential to better comprehend animal evolution. Referring to the “new animal phylogeny,” bilaterians comprise three clades: Deuterostomia, Ecdysozoa, and Spiralia, as inferred from morphological and molecular data . Earlier proposed scenarios of spiralian evolution remain contentious, possibly due to poor sampling of species belonging to certain small taxa, problematic long branches of some taxa caused by unusually fast evolution, and a paucity of morphological synapomorphies . Recent phylogenomic studies have proposed that the Spiralia comprises three higher taxonomic units; i.e., that the Gnathifera (Syndermata, Gnathostomulida, and Micrognathozoa) forms a group with the clade comprising the Rouphozoa (Gastrotricha + Platyhelminthes) and the Lophotrochozoa (Mollusca, Annelida, Brachiopoda, Nemertea, etc.) . However, some microscopic lineages, e.g., dicyemids, remain poorly studied, which may cause phylogenetic analyses to produce erroneous interpretations of spiralian evolution.
Recently, large quantities of transcriptomic data from microscopic lineages have been incorporated into phylogenomic analyses, improving our understanding of spiralian evolution [2, 3]. However, dicyemids have been excluded from most recent phylogenomic studies of microscopic spiralian lineages. Here, we used genome-wide data to re-examine the phylogenetic position of dicyemids using maximum likelihood (ML) and Bayesian inference (BI) analyses. Because systematic biases sometimes occur in phylogenomic studies, thus confounding analyses due to artificial signals, we performed phylogenomic analyses using not only the complete dataset, but also sub-datasets from which rapidly evolving or compositionally heterogeneous sites had been eliminated, and from which orthologs that could have caused long-branch attraction artifacts had been excluded.
Sample collection, library preparation, and Illumina sequencing
A mixed life-stage sample of Dicyema japonicum was collected from urine in renal sacs of the host, Octopus vulgaris, and washed several times with artificial seawater . The sample was homogenized in TRIzol Reagent (Ambion, #15596026), and RNA was extracted using the phenol-chloroform method, after which it was further purified with a QIAGEN RNeasy Micro Kit (QIAGEN, #74004). A stranded library was prepared using a NEBNext Ultra Directional RNA Library Prep Kit for Illumina (NEB, #E7420), and sequencing was performed on an Illumina HiSeq2500.
Sequencing data sources and transcriptome assembly
Newly sequenced D. japonicum transcriptomic data were deposited in the DNA Data Bank of Japan (DDBJ accession number: DRA004566). Raw reads from transcriptome sequencing of Limnognathia maerski (SRR2131287), Macracanthorhynchus hirudinaceus (ERR454503), Mesodasys laticaudatus (SRR1797883), Stenostomum sthenum (SRR1801788) and Brachionus koreanus (SRR1658835) were downloaded from the NCBI database. Illumina raw reads were quality-trimmed with Trimmomatic (v0.33) . Trimmomatic removed three bases from both ends of all reads, and deleted them once the average quality within the window fell below a threshold of 20. If reads became shorter than 36 bases, they were discarded (LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36). Afterward, quality-trimmed reads were assembled de novo using Trinity (v2.0.6)  at default settings. Transcriptome assemblies of Adineta vaga, Brachionus plicatilis, Gnathostomula paradoxa, and Macrodasys sp. were adopted from the supplemental database of Struck et al. . Transcriptome assemblies of Stenostomum leucops, Microstomum lineare, Prostheceraeus vittatus, Geocentrophora applanata, Monocelis fusca, and Bothrioplana semperi were adopted from Laumer et al. . TransDecoder, a Trinity-plugin script, was used to extract likely coding regions within Trinity transcriptome assemblies, and transcripts were translated into amino acid sequences . Protein sequences of Tribolium castaneum, Drosophila melanogaster, Schistosoma mansoni, and Daphnia pulex were downloaded from the Uniprot database. Protein sequences contributed by the Intoshia linei genome project  were downloaded from NCBI BioProject: PRJNA316116. Gene models of Octopus bimaculatus were downloaded from the Octopus Genome website of the Molecular Genomics Unit at OIST [18, 19]. Gene models of Lottia gigantea [20, 21], Capitella teleta , and Helobdella robusta  were downloaded from the JGI database.
Compilation of datasets
Translated transcripts of all 29 taxa were assigned into ortholog groups (OGs) using a hidden Markov model-based search with HaMStR . In order to minimize missing data, only OGs that contained orthologs from all 29 taxa were selected. Each OG was aligned using MAFFT (v7.220) with the Smith-Waterman algorithm and 1000 cycles of iterative refinement (--localpair --maxiterate 1000) . Then alignments were trimmed using trimAl (v1.2)  with a gap threshold of 0.9, a similarity threshold of 0.001, and a window size of 6. Trimmed alignments shorter than 30 amino acids were discarded. Ortholog alignments were concatenated into a supermatrix using FASconCAT-G . For further filtering, the gene tree of each selected ortholog was reconstructed with PhyML (v3.1) . A paralog screening function of TreSpEx  detected possible paralogs. We use TIGER v1.2  to rank sites into 20 bins based upon relative evolutionary rate of the applied dataset. Sums of branch lengths of each gene tree were determined using customized Perl scripts. An index of long-branch heterogeneity was calculated by TreSpEx (fun -e) for each gene tree. BMGE (v1.12) identified and removed the high-entropy regions and compositionally heterogeneous sites . All aforementioned software was employed with default settings, unless otherwise specified.
Although we washed dicyemid samples with artificial seawater several times and carefully collected individual dicyemids under a microscope, we still could not preclude the possibility of contamination with host octopus cells. In order to avoid the octopus contamination, we performed an assessment to confirm that our dicyemid transcriptome assembly was uncontaminated. We mapped 562 million Illumina raw reads from the host Octopus vulgaris genome back to the dicyemid transcriptome assembly using Bowtie 2 (v2.2.3) . We couldn't find the octopus 18S sequence in our assembly. Only 1% of dicyemid transcripts were mapped to octopus reads, and none of the mapped transcripts were included in datasets in the present study.
RAxML (v8.1.20)  was employed to reconstruct phylogenetic trees using the maximum likelihood method with 100 bootstrap replications under the GAMMA model of rate heterogeneity. The partitioning scheme for each dataset was selected by PartitionFinderProtein v1.1.1  using RAxML and relaxed clustering algorithm. For Bayesian inference analyses, a Bayesian Markov chain Monte Carlo (MCMC) sampler, PhyloBayes-MPI (v1.6j)  was used. For each dataset, three independent chains were calculated using a CAT + GTR mixture model. We discarded at least the first 5000 trees from each chain, and then sub-sampled every 10 trees to calculate a majority rule consensus tree of all remaining trees pooled across three chains. The PhyloBayes “bpcomp” command was used to calculate the largest discrepancy (maxdiff) observed across three independent chains. For all analyses, maxdiff values were lower than 0.15, indicating that all chains had converged. Calculated cycles and consumed time until three independent chains converged, depended on the datasets used. For instance, the analyses of Dataset 1 took seven weeks to achieve convergence using 128 cores.
Immunostaining and imaging
Specimens for immunostaining were fixed in 4% PFA solution for 30 min, and then stored in 75% ethanol at –20°C. The immunostaining protocol was modified from a previous amphioxus study . Samples were blocked for 1 hr in blocking solution (3% BSA and 0.1% Triton X-100 in PBS) and incubated in a primary antibody solution of anti-acetylated tubulin mouse monoclonal antibody (Sigma, #T6793, 1:1000 diluted in blocking solution) at 4°C overnight. Fluorescent signals were detected after incubation in a secondary antibody solution of Alexa Fluor 594-conjugated, goat anti-mouse antibody. DAPI (Invitrogen, 1 μg/mL in PBST) was used for nuclear staining, and plasma membrane were stained with CellMask (Life technology, C10046). Fluorescent images were acquired using a Zeiss 780 confocal microscope with 20× and 100× objectives.
Results and discussion
Taking advantage of next-generation sequencing technology, phylogenomic methods utilize large molecular datasets to enable investigation of the phylogeny of animal taxa that have small body sizes, and that lack uniting synapomorphies. Newly sequenced dicyemid transcriptomic data were complemented with published transcriptomic data or predicted gene models from other spiralian, ecdysozoan, and deuterostome species. We developed a complete dataset for 29 taxa containing 348 orthologs, 58,124 amino acids, with only 6% missing data (Dataset 1). However, because large amounts of data may also amplify systematic biases, such as erroneously assigned orthologs, compositional heterogeneity, and long-branch attractions, we prepared two sub-datasets to assess the influences of potential bias sources.
Since orthologs were assigned by HaMStR , which sometimes groups paralogous sequences erroneously as sets of orthologous sequences, a paralog screening function of TreSpEx  was used to detect paralogs. After removing 17 possible paralogs from the complete dataset, two filtering criteria were used to generate Datasets 2 and 3 with potential sources of systematic bias removed. First, we obtained the sums of branch lengths of each gene tree using customized Perl scripts. If the branch length of dicyemids on a gene tree exceeded 30% of the sum of all branch lengths, this ortholog was excluded. Afterward, we ranked sites into 20 bins based upon relative evolutionary rate of the remaining alignment, and removed bin20 (the most rapidly evolving sites). That resulted in Dataset 2, containing 321 orthologs and 45,359 amino acids. For Dataset 3, an index of long-branch heterogeneity was calculated with TreSpEx for each gene tree. After excluding orthologs with index values over 100, BMGE removed the high-entropy regions and compositionally heterogeneous sites, and generated Dataset 3, with 302 orthologs and 41,202 amino acids.
Both D. japonicum (Dicyemida) and I. linei (Orthonectida) exhibit long branches not seen in other animal taxa (Figs. 3 and 4). In all ML and BI analyses, D. japonicum shows a close affinity to I. linei with strong statistical support, although we could not categorically exclude the influence of their long-branch lengths. However, parasitic organisms often have short generation times and large population sizes, which may be associated with rapid evolution [38, 39]. The long branch length probably reflects the accelerated evolutionary rate in dicyemids . It may be that dicyemids and orthonectids both evolved rapidly after diverging from a common ancestor. In addition, both dicyemids and orthonectids possess simplified morphologies without obvious synapomorphies; therefore, further comparative studies on the micro-structures or genomic features of these taxa may provide more evidence to facilitate the assessment of affinities between them.
ML analyses placed the clade Mesozoa [Dicyemida + Orthonectida] as a sister group to one gastrotrich clade [Diuronotus aspetos + Lepidodermella squamata] (Fig. 3a), and the result of a taxon-exclusion experiment based on Dataset 3 showed D. japonicum as nested within the Gastrotricha as well (Fig. 3b), indicating that the close relationship of D. japonicum with the Gastrotricha in ML analyses probably does not reflect the presence of the other long-branch taxon. Although the position of the Mesozoa does not alter across various ML analyses and is independent of potential systematic biases, the ML result for Dataset 3 (with long-branches and compositional heterogeneity removed) displayed poor bootstrap support for the root of the clade comprising dicyemids and gastrotrichs. In contrast, in BI analyses, the Dicyemida and Orthonectida formed a sister group to the Rouphozoa (Fig. 4), and BI analysis of Dataset 2 (Fig. 4b) revealed that the Mesozoa is a sister group to the Rouphozoa with significant statistical support.
A previous developmental study showed that early cleavage of dicyemids exhibits stereotypical spiral cleavage, as in the case of spiralians, such as annelids, mollusks, or platyhelminths . Furthermore, studies using different molecular markers have placed mesozoans as close relatives of annelids and mollusks [10, 11]. The present phylogenomic analyses, however, offer two additional possibilities. ML analyses suggest that the Mesozoa is nested within the Gastrotricha, implying that dicyemids may be degenerate gastrotrichs. However, taking morphological traits into account, the mono-ciliated epithelial cells of gastrotrichs has been considered a diagnostic trait , whereas dicyemids possess multi-ciliated epithelial cells (Fig. 1c). This morphological trait barely supports the hypothesis of dicyemids nested within the Gastrotricha. Alternatively, BI trees suggest that the Mesozoa diverged early from other rouphozoans. Even so, in all analyses, the Dicyemida, Orthonectida, Gastrotricha, and Platyhelminthes constitute a monophyletic group that is a sister group to the Trochozoa (Mollusca + Annelida). This indicates that dicyemids may share characters of a common acoelomate common ancestor with gastrotrichs and platyhelminths, rather than with mollusks and annelids; nevertheless, morphological synapomorphies between dicyemids and gastrotrichs (or rouphozoans) remain to be discovered. Developing a firm grasp of spiralian evolution will still require additional developmental or genomic studies of a wide range of microscopic spiralian taxa.
Previous hypothetical scenarios of spiralian evolution remain controversial. One supports the traditional acoeloid–planuloid hypothesis of a non-coelomate common ancestor of the spiralians, whereas another suggests that the common ancestor resembled an annelid-like organism with a segmented, coelomate body plan . According to the present analyses of spiralian phylogeny, the small, non-coelomate Gnathifera branched off first, and is a sister group to the Rouphozoa and Trochozoa, as reported in previous studies [2, 3]. Moreover, within the Gnathifera and Rouphozoa, most species are small, with acoelomate or pseudocoelomate body plans, whereas animals with coeloms are only found in the Lophotrochozoa (Mollusca, Annelida, etc.). The foregoing analyses and observations support the conclusion that the last common ancestor of spiralians may have been a microscopic animal lacking a coelom. It may be that microscopic lineages either maintained their ancestral morphology or that they underwent regressive evolution secondarily simplifying their morphologies, while lophotrochozoan lineages evolved more complex morphologies with a coelomic cavity and larger body size.
We provide the first evidence based on a large molecular dataset that the Dicyemida has a close affinity to the Orthonectida. These taxa constitute a monophyletic group with the Gastrotricha and Platyhelminthes. Datasets with systematic biases removed (Datasets 2 and 3) show concordant results with the complete dataset (Dataset 1). The present study indicates that the mesozoan clade [Dicyemida + Orthonectida] is either a sister group to the Rouphozoa or nested within the Gastrotricha in BI and ML analyses, respectively, but further genomic and developmental studies will be necessary to examine these competing hypotheses. Nonetheless, the present results agree with earlier findings that the Gnathifera branched off first, and comprise a sister group to the Rouphozoa and Lophotrochozoa, supporting the previously proposed acoeloid-planuloid hypothesis of a nearly microscopic non-coelomate common ancestor of the spiralians.
We thank Dr. Keisuke Nakashima for providing acetylated alpha-tubulin antibody and Yi-Jyun Luo for providing CellMask and DAPI reagents. We are grateful to all members of the Marine Genomics Unit, especially Dr. Jun Inoue for advice on analyses, and to Dr. Chuya Shinzato, who provided customized Perl scripts. We thank the OIST Imaging and Instrumental Analysis Section, the DNA Sequencing Section, and the Information Service Section for technical support. We also thank Dr. Steven D. Aird for editing the manuscript.
This study was supported by internal funds from the Okinawa Institute of Science and Technology (OIST). Tsai-Ming Lu was supported by a JSPS Research Fellowship for Young Scientists (DC1). Hidetaka Furuya was supported by Grants-in-Aid for Scientific Research (no. 26440216) from the Japan Society for the Promotion of Science.
Availability of data and materials
Transcriptomic data of Dicyema japonicum are deposited in the DNA Data Bank of Japan (DDBJ accession number: DRA004566).
T-ML and NS designed the study. HF collected the samples and identified species. MK contributed to library preparation and Illumina sequencing. T-ML carried out immunostaining and imaging and executed transcriptome assembly and phylogenomic analyses. T-ML interpreted data and wrote the manuscript. NS supervised the project. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
- 5.van Beneden É. Recherches sur les Dicyémides, survivants actuels d’un embranchement des Mésozoires. Bull Acad Roy Belg. 1876;42:3–111.Google Scholar
- 11.Petrov NB, Aleshin VV, Pegova AN, Ofitserov MV, Slyusarev GS. New insight into the phylogeny of Mesozoa: Evidence from the 18S and 28S rRNA genes. Moscow University biological sciences bulletin. Springer. 2010;65:167–9.Google Scholar
- 19.Octopus Genome, Molecular Genetics Unit, OIST. 2015. http://octopus.unit.oist.jp/OCTDATA/BASIC/Metazome/Obimaculoides_280_peptide.fa.gz. Accessed 19 Oct 2015.Google Scholar
- 21.JGI Genome Portal: Lottia gigantea (Project ID: 1091351). 2007. ftp://ftp.jgi-psf.org/pub/JGI_data/Lottia_gigantea/v1.0/Lotgi1_GeneModels_FilteredModels1_aa.fasta.gz. Accessed 5 Jun 2015.
- 22.JGI Genome Portal: Capitella teleta (Project ID: 16637). 2007. ftp://ftp.jgi-psf.org/pub/JGI_data/Capitella/v1.0/FilteredModelsv1.0.aa.fasta.gz. Accessed 5 Jun 2015.
- 23.JGI Genome Portal: Helobdella robusta (Project ID: 16078). 2007. ftp://ftp.jgi-psf.org/pub/JGI_data/Helobdella_robusta/v1.0/proteins.Helro1_FilteredModels3.fasta.gz. Accessed 5 June 2015.
- 29.Struck. TreSpEx – detection of misleading signal in phylogenetic reconstructions based on tree information. EBO. 2014;10:51–67.Google Scholar
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.