Next-generation sequencing of Dreissena polymorpha transcriptome sheds light on its mitochondrial DNA
- 1.8k Downloads
Zebra mussels Dreissena polymorpha (Veneroida, Dreissenidae) are known for their invasive behavior. Despite numerous studies dealing with this species, no results of large sequencing projects have been published to date, hampering marker development. In this study, we present a relatively large novel transcriptomic dataset obtained by Illumina MiSeq technology from mantle and male gonad of D. polymorpha sampled in Poland. The transcriptomic data were typical for the tissue analyzed. Moreover, they showed the expression of a single mitochondrial genome, indicating that this species do not have doubly uniparental inheritance of mitochondria. The sequences of mitochondrial transcripts were used to design primers and obtain nearly complete sequence of the zebra mussel mitochondrial genome by PCR and Sanger sequencing. This mitogenome has unique gene order, with the genes split into two blocks encoded in opposite directions. The closest mitogenome available in GenBank belongs to the marine clam Mya arenaria (Myoida, Myidae). The average divergence of the sequences of these mitogenomes is quite high, in the range of 20%. Both mitogenomic and transcriptomic resources should prove very useful for elucidating population genetics and conservation issues involving this important species.
KeywordsZebra mussel Invasive species Mitochondrial genome Genetic mapping NGS
The freshwater zebra mussel, Dreissena polymorpha (Pallas, 1771) (Dreissenidae), is one of the most successful invaders in terms of abundance and high density (Nalepa & Schlosser, 1993; Van der Velde et al., 2010; Schlosser & Schmuckal, 2012; Sousa et al., 2014). Native to the lakes, slow moving rivers and low salinity areas of the Black, Caspian and Azov Seas regions of central and eastern Europe, the zebra mussel was introduced to northwestern Russia, much of mainland Europe and England by at least the mid-1800s by the construction of canal systems in Europe and Russia that linked several major river drainages. The zebra mussel together with the quagga mussel (Dreissena bugensis, Andrusov, 1897) was introduced into the Great Lakes of North America in the mid-1980s by the discharge of ballast water from trans-Atlantic freighters (Stańczykowska, 1977; Hebert et al., 1989; Roberts, 1990; Bij de Vaate et al., 2002; Astanei et al., 2005; Bidwell, 2010). The colonization success of the zebra mussel is largely related to its method of reproduction external mode of fertilization, presence of free-swimming planktonic veliger larval stage, and attachment of juveniles and adults to hard surfaces via byssal threads (Borcherding, 1991; Johnson & Carlton, 1996). Despite this, there are only relatively few genetic studies of this invader (Stepien et al., 2002; Soroka, 1999, 2002, 2003, 2010; Astanei et al., 2005; Astanei & Gosling, 2010).
The zebra mussels started expanding their range in Poland approx. 200 years ago, and this expansion was correlated with the development of inland sailing (Stańczykowska, 1977; Bidwell, 2010). It is presumed to have arrived on the Baltic coast via River Nemunas which, at the end of the 18th century, was linked with River Dnieper via the Oginskij Canal (Nowak, 1974; Bidwell, 2010; Kołodziejczyk et al., 2011). In Poland, the earliest record of zebra mussel date from 1824, from the former Eastern Prussia, whereas it is currently present mainly in Pomeranian, Masurian and Suwalskie Lake District as well as in lagoons, lower stretches, and estuaries of the Vistula and Odra rivers (Stańczykowska, 1977; Lewandowski, 1991; Bidwell, 2010; Stańczykowska et al., 2010).
D. polymorpha is one of the most studied invasive species (Karatayev et al., 1997, 2007; Kołodziejczyk et al., 2011; Schlosser & Schmuckal, 2012; Sousa et al., 2014); however, most studies concentrate on their ecology, toxicology, and physiology, whereas purely genetic studies are less abundant and concentrate more on North American than European samples (Sousa et al., 2014). Few markers are available, mainly allozyme electrophoretic and RAPDs markers (Hebert et al., 1989; May & Marsden, 1992; Spidle et al., 1994; Marsden et al., 1995; Lewis et al., 2000; Elderkin et al., 2001, 2004; Stepien et al., 2002). The usefulness of cox1 appeared to be limited for the regional and fine-scale European D. polymorpha populations due to very low levels of polymorphism: only 5 haplotypes were observed over 338 individuals by Tarnowska et al. (2013) in France. Generally, the polymorphism measured with these low-resolution and/or anonymous markers led neither to the full understanding of the invasion nor to its better control (Marsden et al., 1995, 1996; Elderkin et al., 2001, 2004; Müller et al., 2002; Stepien et al., 2002).
Several studies used mitochondrial markers to assess genetic polymorphism and connectivity of D. polymorpha populations. The only available markers are based on the most conservative cox1 and lrn genes. Nevertheless, it has been shown that populations closest to the native range have marginally higher polymorphism in mitochondrial genomes, and a common haplotype is present in all sampled locations in Europe and North America, with the overall diversity derived from cox1 sequencing not exceeding 1.1% (Baldwin et al., 1996; Giribet & Wheeler, 2002; Therriault et al., 2004; Albrecht et al., 2007; Soroka, 2010).
We maintain that the available genetic resources, including the mitochondrial variation currently recognized, are inadequate to fully characterize the D. polymorpha range expansion in Europe. Consequently, we will present here an attempt to fill the gap by providing a large amount of original transcriptomic data and a nearly complete mitogenome of D. polymorpha.
Materials and methods
A sample of 10 zebra mussels was collected from lake Wdzydze (54°01′34.21″N, 17°55′31.96″E) in July 2010. Fresh animals were transported to the laboratory, sexed by microscopic examination of gonadal tissues. Both RNA and DNA were extracted from mantle/gonadal tissue of the same male individual using established protocols (Sańko & Burzyński, 2014). The total RNA sample was sent to Macrogen (South Korea). Library preparation and next-generation sequencing (NGS) was performed by Macrogen, using TrueSeq kit and MiSeq 2 × 150 bp technology. Approximately 3.5 × 106 raw sequence reads were obtained. They were preprocessed and assembled using CLC Genomics workbench software v. 9.01 (QIAGENE), under default settings for de novo NGS assembly. The assembly was evaluated by TransRate (Smith-Unna et al., 2016) and BUSCO (Simão et al., 2015) applications. TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies, using only the sequenced reads and the assembly as input. BUSCO (Benchmarking Universal Single-Copy Orthologs) is a measure for quantitative assessment of genome assembly and annotation completeness based on evolutionarily informed expectations of gene content. The obtained set of transcripts was first analyzed for the presence of mitochondrial transcripts by wise2 searches (Birney et al., 2004) using hmm profiles of mitochondrialy encoded proteins (Finn et al., 2010) and by nhmmer (Wheeler & Eddy, 2013) searches using hmm profiles of mitochondrialy encoded ribosomal RNA genes (Griffiths-Jones, 2003). The sequences of mitochondrial contigs were then used to design specific primers (Supplementary Table S1) for mtDNA amplification and Sanger sequencing. Sequence annotation was achieved by comparative analysis: the original annotations form wise2 searches were verified with ORF-finding and BLAST (Altschul et al., 1990) searches from within the CLC software. In addition, the trn genes were identified using arwen software (Laslett & Canbäck, 2008). To better evaluate the structure of mitochondrial transcripts, the quality filtered reads were mapped onto the final annotated mitochondrial contig in CLC software. The remaining contigs were analyzed using the BLAST2GO application (Conesa et al., 2005). Phylogenetic analysis based on transcriptomic data was conducted by adding D. polymorpha transcript sequences to the reference alignments downloaded from the DRYAD repository (González et al., 2014) and following similar methodology (González et al., 2015). Maximum-likelihood tree searches were conducted with RAxML v. 8.2.8 (Stamatakis, 2006), using a model of protein evolution with corrections for a discrete gamma distribution with the LG model (“-m PROTGAMMALG” option) (Le & Gascuel, 2008). Bootstrap resampling was conducted for 100 replicates using a rapid bootstrapping algorithm (Stamatakis et al., 2008), and the support values were mapped onto the optimal tree.
The annotated mitochondrial and assembled transcriptomic sequences as well as raw sequence data have been deposited in GenBank under the following accession numbers: mitogenome—KY091877, raw data—SRR5000302. The Transcriptome Shotgun Assembly project has been deposited at DDBJ/EMBL/GenBank under the accession GFBQ00000000. The version described in this study is the first version, GFBQ01000000.
General transcriptome data
The overall quality of the assembly was good, with a TransRate score at 0.18 and approximately 150 BUSCOs present. Given the general lack of annotated reference data for bivalves, this is a good result. There were 26636 contigs, out of which approx. 7500 were positively identified by Blast2GO application (Supplementary Fig. S1), showing diverse distribution of contigs among various GO terms, typical for bivalvian mantle (Sleight et al., 2016; Vendrami et al., 2016; Yarra et al., 2016).
Mitochondrial transcript mapping statistics
Total reference length
GC contents in %
Total read count
Mean read length
Total read length
Total reference length
Minimum excl. zero coverage regions
Average excl. zero coverage regions
Standard deviation excl. zero coverage regions
The partial sequences of cox1 (KC429149, AF474404, EU484431, EF414494, AM748975, DQ840121, AF510510, JX239087: Marescaux & Van Doninck, 2013) and lrn (DQ280038, KP052744, DQ333747: Giribet et al., 2006; Van der Velde et al., 2010) genes of D. polymorpha present in GenBank perfectly match the respective fragment of this genome, confirming its identity. BLAST searches with all annotated D. polymorpha mitochondrial genes against RefSeq database pointed at the mitogenome belonging to M. arenaria as the closest one. However, the gene order of D. polymorpha mitogenome is unique, not resembling that of M. arenaria and the sequence homology was also relatively weak, at 60% sequence identity for cox1, 55% for cytb and 35% for atp6.
There were three relatively long unassigned regions in the mitogenome. The largest one contained an array of AT-rich tandem repeats. The length of the repeat unit was 121–133 bp, but the number of repeats was difficult to infer reliably using short-reads NGS. The assembly suggested the presence of at least 6 repeats. There were also two short open reading frames (ORFs) present within the URs. The shorter one, 153 bp long, was located between trnS and nd4, while the longer one, 246 bp long, was located between trnQ and atp6. Neither of the predicted protein sequences encoded by these ORFs had any similarity to known proteins in BLAST and wise2 searches against nr and pfam databases.
Some bivalve species have an exotic system of mitochondria inheritance called DUI (doubly uniparental inheritance) resulting in male heteroplasmy for two, often very divergent mitogenomes (Skibinski et al., 1994; Zouros et al., 1994). Since DUI has been reported also for a venerid clam Ruditapes philipinarum (Passamonti & Scali, 2001), it could have been expected in D. polymorpha, too. However, based on the presented data, it can be positively concluded that zebra mussels do not have DUI as their male gonads are homopalsmic, both at the genetic and transcriptomic levels. The representative mitochondrial genome was characterized at unprecedented level by providing its nearly complete sequence. The incompleteness of this mitogenome of is unfortunate. There are several possible reasons for this failure, but the most likely one is the presence of long and complex repeats coupled with the apparently difficulty of melting secondary structures within the missing part of the mitogenome. The alternative hypothesis would be that the mitogenome is actually linear in this species. There are very few examples of linear mitogenomes in animals, most notably atypical structures have been reported in some crustaceans (Doublet et al., 2012). Either way the complete structure could probably be solved only by using modern NGS technique allowing much longer reads. Other than the structural uncertainty, this genome is reasonably complete, with the possible exception of atp8. This is not surprising given the difficulties associated with detection of this gene (Breton et al., 2010; Śmietanka et al., 2010). The possibility that one of the anonymous ORFs is in fact encoding atp8 cannot be dismissed; however, there is also no positive argument supporting such a claim. Similar sequences do not exist even in the closest mitogenome of M. arenaria. Ultimately, increasing taxonomic sampling of mitogenomes will allow better comparative analysis and may lead to correct identification of these ORFs. The missing and/or extra trn genes could well be the consequence of incompleteness or imprecise genome annotations. It can be argued that there are no numts (nuclear mitochondrial DNA segments) contaminating our data because transcript mapping shows typical pattern of expression and numts are generally not expressed. In fact, even taking into account that the overrepresentation of mitochondrial transcript is a typical feature of animal RNA-seq datasets (Smith, 2013), the observed proportion (more than 26%) of mitochondrial transcripts was very high. In this context, it is a little surprising that the expression of nad6 was much lower than that of the other genes. This may be due to the bicistronic nature of its transcript and the fact that library construction involves priming of the first strand cDNA synthesis from the oligo-dT-based primer which may contribute to the under-representation of 3′ ends of transcripts in the library. Bicistronic transcripts have been observed previously in other bivalves (Chatzoglou et al., 2013; Sańko & Burzyński, 2014).
The phylogenetic relationships within Bivalvia are a matter of ongoing debate. The early inferences based on poor molecular markers often conflicted with established views based on morphological characters, up to the point of questioning the monophyly of Bivalvia (Plazzi et al., 2013). Fortunately, these conflicts can usually be resolved by providing more molecular data. For Bivalvia, the recent analysis of transcriptomes representing all major lineages is such a keystone (González et al., 2015), solving most conflicts. It is therefore reassuring that the presented transcriptomic dataset is fully compatible with this methodology and supports the placement of Dreissenids in close proximity with Myidae, confirming the recent findings based on a much smaller dataset (Bieler et al., 2014) and in accordance also with the presented mitogenomic data. The only existing NGS-derived D. polymorpha dataset was targeted at STR (short tandem repeats, also known as microsatellites) marker development and contains relatively low number of unique contigs (Peñarrubia et al., 2015). Moreover, nearly all microsatellite sequences of D. polymorpha deposited in GenBank can be also found in the presented transcriptomic dataset, demonstrating that it can be also a good data source for STR marker development.
The presented set of transcriptomic and mitogenomic data should be beneficial for development of highly variable genetic markers that is needed for studying and managing of D. polymorpha invasions.
This work was partially supported by Polish Ministry of Higher Education Grant no. N 303 364 33 to MS and Polish National Science Center Grant no. UMO-2012/07/B/NZ2/01991 to BS. Computer intensive analyses were run on PLGRID high-performance computer clusters.
- Astanei, I. & E. Gosling, 2010. A microgeographic analysis of genetic variation in Dreissena polymorpha in Lough Key, Ireland. In Van der Velde, G., S. Rajagopal & A. Bij de Vaate (eds), The zebra mussels in Europe. Backhuys Publishers, Leiden/Margraf Publishers, Weikersheim.Google Scholar
- Bidwell, J. R., 2010. Range expansion of Dreissena polymorpha: a review of major dispersal vectors in Europe and North America. In Van der Velde, G., S. Rajagopal & A. Bij de Vaate (eds), The Zebra Mussels in Europe. Backhuys Publishers, Weikersheim.Google Scholar
- Bieler, R., P. M. Mikkelsen, T. M. Collins, E. A. Glover, V. L. Gonzalez, D. L. Graf, E. M. Harper, J. Healy, G. Y. Kawauchi, P. P. Sharma, S. Staubach, E. E. Strong, J. D. Taylor, I. Temkin, J. D. Zardus, S. Clark, A. Guzman, E. McIntyre, P. Sharp & G. Giribet, 2014. Investigating the Bivalve Tree of Life – an exemplar-based approach combining molecular and novel morphological characters. Invertebrate Systematics 28: 32–115.CrossRefGoogle Scholar
- Borcherding, J., 1991. The annual reproductive cycle of the freshwater mussel Dreissena polynorpha Pallas in lakes. Oekologia 87: 208–218.Google Scholar
- González, V. L., S. C. S. Andrade, R. Bieler, T. M. Collins, C. W. Dunn, P. M. Mikkelsen, J. D. Taylor & G. Giribet, 2015. A phylogenetic backbone for Bivalvia: an RNA-seq approach. Proceedings of the Royal Society B-Biological Sciences 282: UNSP 20142332.Google Scholar
- Karatayev, A. Y., L. E. Burlakova & D. K. Padilla, 1997. The effects of Dreissena polymorpha (Pallas) invasion on aquatic communities in Eastern Europe. Journal of Shellfish Research 16: 187–203.Google Scholar
- Kołodziejczyk, A., A. M. Łabęcka & A. Stańczykowska, 2011. Invasion of Corbicula and Dreissena – unexplained problems. In Głowaciński, Z. (ed.), Allien Species in the Fauna of Poland II Key Issues and Summary Conclusions. Instytut Ochrony Przyrody PAN, Kraków: 551–563.Google Scholar
- Lewandowski, K., 1991. The occurrence of Dreissena polymorpha (Pall.) in some mesotrophic lakes of the Masurian Lakeland (Poland). Ekologia Polska 39: 273–286.Google Scholar
- Nalepa, T. F. & D. W. Schlosser (eds), 1993. Zebra mussels: Biology, impact, and control. Lewis Publishers, Boca Raton, FL.Google Scholar
- Nowak, E., 1974. Animals in expansion. WP, Warszawa. (in Polish).Google Scholar
- Smith-Unna, R., C. Boursnell, R. Patro, J. Hibberd & S. Kelly, 2016. TransRate: reference free quality assessment of de novo transcriptome assemblies. Genome Research gr.196469.115.Google Scholar
- Soroka, M., 2010. Genetic differentiation of Dreissena polymorpha from East European countries. In Van der Velde, G., S. Rajagopal & A. Bij de Vaate (eds), The Zebra Mussels in Europe. Backhuys Publishers, Weikersheim.Google Scholar
- Stańczykowska, A., 1977. Ecology of Dreissena polymorpha (Pall.) (Bivalvia) in lakes. Polish Archives of Hydrobiology 24: 461–530.Google Scholar
- Stańczykowska, A., K. Lewandowski & M. Czarnoleski, 2010. Distribution and densities of Dreissena polymorpha in Poland – past and present. In Van der Velde, G., S. Rajagopal & A. Bij de Vaate (eds), The Zebra Mussels in Europe. Backhuys Publishers, Weikersheim.Google Scholar
- Therriault, T. W., M. F. Docker, M. I. Orlova, D. D. Heath & H. J. MacIsaac, 2004. Molecular resolution of the family Dreissenidae (Mollusca: Bivalvia) with emphasis on Ponto-Caspian species, including first report of Mytilopsis leucophaeata in the Black Sea basin. Molecular Phylogenetics and Evolution 30: 479–489.CrossRefPubMedGoogle Scholar
- Van der Velde, G., S. Rajagopal & A. Bij de Vaate (eds), 2010. The Zebra Mussels in Europe. Backhuys Publishers, Weikersheim.Google Scholar
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.