EST and Mitochondrial DNA Sequences Support a Distinct Pacific Form of Salmon Louse, Lepeophtheirus salmonis
Nuclear deoxyribonucleic acid sequences from approximately 15,000 salmon louse expressed sequence tags (ESTs), the complete mitochondrial genome (16,148 bp) of salmon louse, and 16S ribosomal ribonucleic acid (rRNA) and cytochrome oxidase subunit I (COI) genes from 68 salmon lice collected from Japan, Alaska, and western Canada support a Pacific lineage of Lepeophtheirus salmonis that is distinct from that occurring in the Atlantic Ocean. On average, nuclear genes are 3.2% different, the complete mitochondrial genome is 7.1% different, and 16S rRNA and COI genes are 4.2% and 6.1% different, respectively. Reduced genetic diversity within the Pacific form of L. salmonis is consistent with an introduction into the Pacific from the Atlantic Ocean. The level of divergence is consistent with the hypothesis that the Pacific form of L. salmonis coevolved with Pacific salmon (Oncorhynchus spp.) and the Atlantic form coevolved with Atlantic salmonids (Salmo spp.) independently for the last 2.5–11 million years. The level of genetic divergence coincides with the opportunity for migration of fish between the Atlantic and Pacific Ocean basins via the Arctic Ocean with the opening of the Bering Strait, approximately 5 million years ago. The genetic differences may help explain apparent differences in pathogenicity and environmental sensitivity documented for the Atlantic and Pacific forms of L. salmonis.
Keywords: Salmon lice; Lepeophtheirus salmonis; Expressed sequence tags (ESTs); Mitochondrial genome; 16S rRNA; Cytochrome oxidase subunit I (COI) gene
The salmon louse, Lepeophtheirus salmonis, is an economically important ectoparasite of farmed and wild salmon throughout the northern hemisphere (see reviews by Pike and Wadsworth 1999; Johnson and Fast 2004; Johnson et al. 2004; Boxaspen 2006; Costello 2006). Indirect and direct annual losses due to L. salmonis in the global salmonid aquaculture industry are estimated to exceed US$ 100 million (Johnson et al. 2004). In addition, elevated abundances of sea lice on wild salmon smolts in coastal waters occupied by salmon aquaculture have led to the hypothesis that wild populations of Atlantic salmonids have been negatively impacted by parasites derived from farmed salmon (Costello 2006). Uncertainty concerning the transmission of L. salmonis between farmed and wild salmon populations in British Columbia, Canada, has led to considerable research effort and scientific debate.
The development of the parasite includes two nonparasitic nauplii stages that facilitate dispersal in the plankton, an infective copepodid stage, four chalimus stages that are tethered to the host by a frontal filament, two preadult stages, and one adult stage. Preadults and adults are not tethered and are mobile on the surface of the fish. Following mating, adult females produce eggs that hatch to complete the life cycle (see reviews by Johnson and Fast 2004; Boxaspen 2006; Costello 2006). L. salmonis has become an important model for the study of ectoparasitic infestations on salmon. Disease due to L. salmonis on Atlantic salmon (Salmo salar) and sea trout (S. trutta) results from the feeding behavior and the secretion of bioactive compounds by the parasite (Pike and Wadsworth 1999; Dawson et al. 1997; Fast et al. 2007). The parasite feeds on mucus, epidermal cells, and underlying tissues causing physical damage, changes in the composition of blood electrolytes, physiological stress, immune dysfunction, impairment of swimming ability, and possibly death (see reviews by Johnson and Fast 2004; Boxaspen 2006; Costello 2006; Tully and Nolan 2002). However, physiological and immunological studies of L. salmonis remain limited (see review by Wagner et al. 2008).
Innate resistance to the salmon louse varies among the various species of salmon and trout (Jones 2001; Johnson and Albright 1992; Fast et al. 2003, 2006). Laboratory studies show that the heaviest infestations and greatest impacts are observed on sea trout (S. trutta) and Atlantic salmon (S. salar) followed by rainbow trout (Oncorhynchus mykiss), chinook (O. tshawytscha), and coho salmon (O. kisutch; Dawson et al. 1997; Johnson and Albright 1992; Fast et al. 2002). More recently, pink salmon (O. gorbuscha) were shown to rapidly reject L. salmonis and avoid the clinical consequences of infestation (Jones et al. 2007). Morphological and protein data suggest that the development of an inflammatory reaction, both systemically and at the site of parasite attachment, is a distinguishing feature of Oncorhynchus spp. that is not observed in the more susceptible Atlantic salmonids (e.g., S. salar; Fast et al. 2002). The kinetics of these inflammatory processes suggests they play a role in parasite rejection.
Oncorhynchus and Salmo species have been geographically isolated since the Miocene, approximately 18 to 30 million years ago (Devlin 1993; McKay et al. 1996). In light of differential responses of salmon species to lice, the question arises as to whether Atlantic and Pacific parasites such as L. salmonis have coevolved with Atlantic salmon and Pacific salmon, respectively, as distinct populations. Earlier L. salmonis microsatellite data based on six loci identified significant differentiation (fixation index = 0.0595) between one Pacific population and Atlantic forms but noted that only 6% of the overall variation was across oceans (Todd et al. 2004). In addition, a study of four mitochondrial genes noted clear differences between samples from a Japanese population and six Atlantic populations but excluded analysis and reporting of the Japanese data because of reduced length and numbers of sequences (Tjensvoll et al. 2006). In the present study, we examine the mitochondrial genome of the Pacific L. salmonis and compare it to the Atlantic form.
Genomic characterization of Atlantic salmon and rainbow trout (Rexroad et al. 2003; Rise et al. 2004; Govoroun et al. 2006; Adzhubei et al. 2007; Wynne et al. 2008) has enabled an expanded capacity for exploring the salmonid response to infectious disease and other environmental impacts (Rise et al. 2004; von Schalburg et al. 2005). In contrast, the present availability of sequence data from fewer than 200 salmon louse genes (GenBank: Nov 2007) limits our ability to measure and characterize parasite responses prior to and during infection. In the present study, we report on an expressed sequence tag (EST) analysis of L. salmonis collected from salmon in the Pacific Ocean as part of a larger effort to improve our understanding of the coincident expression of host and parasite genes during infection.
Materials and Methods
Salmon Lice Samples
mRNA Isolation and Construction of cDNA Libraries
Total ribonucleic acid (RNA) was extracted from frozen samples using TRIzol reagent (Invitrogen), and poly(A)+ RNA was purified using Poly(A) Purist™ (Ambion). The non-normalized complementary deoxyribonucleic acid (cDNA) libraries for different developmental stages (copepodid, chalimus I, III, and IV, preadult male, preadult female, adult male, and adult female) were constructed using pBluescript II XR cDNA Library Construction Kits (Stratagene). To obtain enough RNA, particularly from the early stages, several hundred individuals were pooled. For each cDNA library, 2.5 to 5 μg of poly(A)+ RNA was used, following methods previously described (Rise et al. 2004). A normalized library containing equal amounts of RNA from all eight developmental stages was also constructed (Evrogen).
Sequencing, Sequence Analysis, and Contig Assembly
cDNA libraries were manually arrayed in 384-well microtiter plates, and glycerol stocks of overnight cultures were prepared. Plasmid DNAs were extracted and sequenced on an ABI 3730 DNA analyzer (Applied Biosystems) with M13 Reverse or Forward primers. The resulting ESTs were assembled with CAP3 (Huang and Madan 1999) using default parameters. The full set of assembled contigs (clusters + singletons) was annotated using RPS-BLAST or BLASTX comparisons with the Conserved Domain Database (www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd) or SWISS-PROT (Bairoch and Boeckmann 1992). The best BLAST match (E value threshold of 1e−10) was used to identify contigs. Contigs not meeting this threshold were annotated as unknown.
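The annotation rule described above (best match per contig, otherwise "unknown") can be sketched in a few lines of Python. This is only an illustration: the function name, the `(contig, subject, e_value)` tuple layout, and the choice to treat an E value equal to the threshold as passing are assumptions, not details taken from the pipeline used in this study.

```python
from typing import Dict, Iterable, List, Tuple

E_VALUE_THRESHOLD = 1e-10  # threshold stated in the text

def annotate_contigs(
    contig_ids: List[str],
    hits: Iterable[Tuple[str, str, float]],
    threshold: float = E_VALUE_THRESHOLD,
) -> Dict[str, str]:
    """Assign each contig its best-scoring hit, or "unknown".

    `hits` holds hypothetical (contig_id, subject_id, e_value) tuples,
    e.g. parsed from a BLAST report. A hit passes if e_value <= threshold
    (an assumption; the text does not say whether the bound is inclusive).
    """
    best: Dict[str, Tuple[str, float]] = {}
    for contig_id, subject_id, e_value in hits:
        if e_value > threshold:
            continue  # too weak to use for annotation
        # Keep the hit with the smallest E value seen so far for this contig.
        if contig_id not in best or e_value < best[contig_id][1]:
            best[contig_id] = (subject_id, e_value)
    return {cid: best[cid][0] if cid in best else "unknown" for cid in contig_ids}
```

Contigs with no hits at all and contigs whose hits all fall above the threshold are treated the same way, which matches the statement that contigs not meeting the threshold were annotated as unknown.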
Genomic Characterization and Annotation for the Pacific Sea Lice mtDNA
The Screening for Sequence Variation in COI and 16S Genes
Total DNA extracts were obtained from fresh or ethanol-fixed samples as described above. The partial gene sequences of the 16S ribosomal RNA (rRNA) and cytochrome oxidase subunit I (COI) genes were amplified with the following primer sets: the 16S rRNA, LsPc-16S-F, and LsPc-16S-R; COI gene, LsPc-COI-F and LsPc-COI-R. PCR amplification was performed using 1.0μL of L. salmonis genomic DNA with an initial denaturation step of 5min at 95°C and then 40 cycles as follows: 30s of denaturation at 95°C, 30s of annealing at 55°C, and 2min of extension at 72°C. The PCR products were purified with QIAquick PCR purification kit (Qiagen) and directly sequenced with the internal sequencing primers, respectively. All primers used in this study are shown in Supplemental Table 1.
The partial sequences of the Pacific L. salmonis 16S rRNA (total of 67 samples) and COI gene (total of 63 samples) were obtained by the PCRs described above (for 16S rRNA: LsBa, 11 samples; LsSi, two samples; LsSo, five samples; LsUc, five samples; LsBs, 14 samples; LsJu, three samples; LsPm, eight samples; LsPk, seven samples; LsJp, 12 samples. For COI gene: LsBa, nine samples; LsSi, two samples; LsSo, five samples; LsUc, five samples; LsBs, 14 samples; LsJu, three samples; LsPm, seven samples; LsPk, six samples; LsJp, 12 samples; Fig. 1). The 16S rRNA and COI gene sequences that were originally identified by Tjensvoll et al. (2006) were used for the Atlantic form of L. salmonis (16S rRNA [GenBank: AY602770-AY602949] and COI gene [GenBank: AY602587-AY602766]). All sequences were trimmed to the same length (16S rRNA, 796 bp; COI gene, 1,300 bp) and aligned using CLUSTALW (Higgins and Sharp 1988). Distance matrices (Kimura two-parameter) and data for the phylogenetic trees were generated with the PHYLIP program package (Felsenstein 1989) using the neighbor-joining and unweighted pair group method with arithmetic mean (UPGMA) methods. To simplify the phylogenetic trees, identical sequences were grouped into clusters. The phylogenetic trees were drawn with NJplot software (Perrière and Gouy 1996).
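For readers unfamiliar with the Kimura two-parameter (K2P) model, the following Python sketch computes the distance for one pair of aligned sequences from the proportions of transitions (P) and transversions (Q), using d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)]. It is a minimal illustration rather than a reimplementation of PHYLIP: the function name is hypothetical, and skipping any column that contains a gap or ambiguous base is an assumption about how such sites are handled.

```python
import math

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}

def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns where either sequence has a gap or ambiguous base are skipped
    (an assumption made for this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Multiplying the returned value by 100 gives a percentage comparable to the K2P distances reported in this study.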
Results and Discussion
cDNA Libraries and ESTs
Independent cDNA libraries were constructed for copepodids, chalimus I, III, and IV stages, male and female preadult, and male and female adult stages of L. salmonis. Several hundred copepodid and chalimus stage individuals were pooled to obtain sufficient RNA quantities. Between 100 and 900 sequence reads were obtained from each library and assembled into contiguous sequences (contigs). An analysis of these contigs showed that more than 30% of the sequences were rRNA gene transcripts, indicating very active protein translation. There was also a very high level of transcript redundancy, making random sequencing strategies far too inefficient. To obtain the broadest possible representation of genes, equal amounts of messenger RNA from the different life stages listed above were combined, and a normalized cDNA library was constructed. Inserts of 5,760 random clones from this normalized library were sequenced from both the 5′ and 3′ ends, resulting in 11,252 total sequences. In the normalized library, the average contig had 1.33 sequences, with the largest contig consisting of ten individual sequence reads. A combined total of 14,994 EST sequences from all of the libraries were assembled into 5,256 unique contigs, of which 1,407 were composed of single sequences, 3,849 were composed of two or more sequences, and 1,326 of three or more sequences. Contigs were annotated by RPS-BLAST or BLASTX comparisons to known protein domain profiles and protein entries in public databases (Conserved Domain Database; www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd, and SWISS-PROT; Bairoch and Boeckmann 1992). Of the 5,256 contigs, 2,557 matched at least one entry in the databases, and the others remain unidentified. EST sequences are available in GenBank (EX475086–EX486337), and contigs along with their proposed annotation are available through the cGRASP website (www.uvic.ca/cbr/grasp).
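The assembly figures quoted above can be cross-checked with a short calculation. The sketch below uses only the counts given in the text; the function name and the derived quantities (mean reads per contig across all libraries, fraction of contigs annotated) are illustrative additions and are not values reported by the study.

```python
def assembly_summary(total_ests: int, contigs: int, singletons: int,
                     multi_read: int, annotated: int) -> dict:
    """Derive simple summary statistics from the reported assembly counts."""
    return {
        # Singletons plus multi-read contigs should account for every contig.
        "counts_consistent": singletons + multi_read == contigs,
        "mean_reads_per_contig": total_ests / contigs,
        "fraction_annotated": annotated / contigs,
    }

# Counts as stated in the text.
summary = assembly_summary(
    total_ests=14_994, contigs=5_256,
    singletons=1_407, multi_read=3_849, annotated=2_557,
)
```

With these inputs the singleton and multi-read counts sum to the reported contig total, and just under half of the contigs carry an annotation.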
The identification of 5,256 unique contigs provides a novel resource with which to study sea louse biology, as well as serving as the basis for a cDNA microarray. Efforts are currently underway to build a sea louse microarray that will complement existing salmonid microarrays (Rise et al. 2004; von Schalburg et al. 2005) to enable profiling of both host and parasite gene expression during infection.
Comparison of Atlantic and Pacific Form L. salmonis Genes
A total of 155 of the 5,256 contigs from Pacific L. salmonis matched (BLAST E value < 1e−100) at least one of the approximately 200 nuclear gene sequences from the Atlantic form of L. salmonis available in the public databases. These comparisons showed an average of 96.8% identity over an average of 765bp (data not shown). The importance of a 3.2% difference is difficult to determine without knowledge of gene duplications or establishing natural population variation for each gene, and as contig comparisons include 5′ (presumably genic) and 3′ (3′ untranslated region) sequences, they provide only a very rough estimate of overall sequence similarity. However, nuclear gene sequence comparisons do show clear genetic differences between Atlantic and Pacific forms of L. salmonis.
Nineteen of the 5,256 EST contigs were identified as mitochondrial sequences and spanned approximately 80% of the complete 14.5-kb mtDNA genome of a previously described Atlantic isolate (GenBank: AY625897; Tjensvoll et al. 2005). The EST contigs for Pacific L. salmonis mtDNA genes differed from Atlantic mtDNA by an average of 8%. Similar differences in mitochondrial contigs from the Pacific form were also apparent in comparisons to ATPase subunit 6, COI, cytochrome b, and 16S rRNA mtDNA gene sequences from 180 Atlantic Ocean isolates (Tjensvoll et al. 2006).
Characterization of the Pacific Form L. salmonis mtDNA Genome
Table: Summary of nucleotide and protein differences between the Pacific and Atlantic forms. Column headings: In nucleic sequence; In deduced amino acid sequence; Identities (amino acid residues). Rows: 12S ribosomal RNA; 16S ribosomal RNA; Similar to ATPase 8; ATP synthase F0 subunit 6; Cytochrome c oxidase subunits I, II, and III; NADH dehydrogenase subunits 1, 2, 3, 4, 4L, 5, and 6.
Distribution of Pacific Form of L. salmonis
Table: The divergence of 16S rRNA and COI genes within and between Pacific and Atlantic forms of L. salmonis. Recoverable headings: 16S rRNA (%); Atl vs. Pac.
The small number of Pacific L. salmonis samples (68) precluded a robust analysis of population structure in the biogeographical distribution of alleles. No evidence for structure among samples collected from the Atlantic was found in an earlier study (Tjensvoll et al. 2006). The overall average divergence between individuals within the Pacific population is 0.14% at the 16S rRNA locus and 0.62% at the COI gene locus. The intraspecific divergence values from the Pacific samples are consistently lower than those seen in the Atlantic form (Table 2) and indicate lower genetic variability within the Pacific salmon louse population. In general, these values are consistent with intraspecific variation found in many other species (Hebert et al. 2003). It is interesting to note that the two most distinct individuals were found at distant locations in the Pacific (LsBs, the mid-Bering Sea; LsBa, Broughton Archipelago, British Columbia; Fig. 4). These two Pacific sea lice differ by 1.8% from each other and by 3.4% from the other 61 Pacific isolates at the COI locus, which suggests possible population structure. More extensive sampling from various Pacific locations is required to determine whether population structure exists.
Table: Ranges of divergence based on Kimura two-parameter distance and crustacean molecular-clock calibrations. Column headings: Distance (K2P, %); Divergence range (Myr).
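The conversion from a K2P distance to a divergence-time range under a molecular clock is a simple division of the distance by a calibrated rate. The Python sketch below illustrates the arithmetic only. The function name is hypothetical, and the two rates in the usage example (1.0% and 2.0% pairwise divergence per million years) are placeholders chosen for illustration, not the crustacean calibrations used in this study.

```python
from typing import Tuple

def divergence_time_range(distance_percent: float,
                          rate_slow: float,
                          rate_fast: float) -> Tuple[float, float]:
    """Return (youngest, oldest) divergence time in millions of years.

    distance_percent: pairwise K2P distance between lineages, in percent.
    rate_slow, rate_fast: clock calibrations in percent pairwise sequence
    divergence per million years (rate_slow <= rate_fast).
    """
    if distance_percent < 0 or rate_slow <= 0 or rate_fast < rate_slow:
        raise ValueError("need distance >= 0 and 0 < rate_slow <= rate_fast")
    youngest = distance_percent / rate_fast  # faster clock, more recent split
    oldest = distance_percent / rate_slow    # slower clock, older split
    return youngest, oldest

# Usage with the 6.1% COI distance from the text and placeholder rates.
coi_range = divergence_time_range(6.1, rate_slow=1.0, rate_fast=2.0)
```

Because the result scales inversely with the rate, the width of the reported time range mostly reflects the spread among the available calibrations rather than uncertainty in the measured distance.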
The level of separation between Pacific and Atlantic salmon lice mitochondrial genomes, the estimated time of Pacific and Atlantic salmon louse separation coinciding with the opening of the Bering Strait, and the reduced overall variation found within the 16S rRNA and COI genes from Pacific salmon lice all support an Atlantic Ocean origin of L. salmonis followed by a limited introduction into the Pacific Ocean coincident with the opening of the Bering Strait, approximately 5 million years ago. Parallel coevolution of salmon lice on their respective hosts in the Pacific and Atlantic Oceans has resulted in nuclear and mitochondrial genetic changes that may help to explain apparent phenotypic differences observed between these forms. In recent work using Scottish L. salmonis specimens, Bricknell et al. (2006) provided evidence of reduced tolerance of copepodids for low salinity in comparison to similar studies using lice specimens from British Columbia (Johnson and Albright 1992). Similarly, Saksida et al. (2007) documented a lower incidence of disease and a reduced need to treat farmed S. salar for L. salmonis in British Columbia compared with farmed S. salar in Scotland and Norway. More research is required to test these hypotheses. The high level of sequence divergence between the Pacific and Atlantic L. salmonis indicates that a taxonomic revision of these forms may be warranted.
We would like to particularly acknowledge the efforts of Dr. Jim Seeb, Douglas Eggers, Dick Wilmont, and Mark Witteveen in collecting sea lice samples from all over Alaska, Professor Kazuo Ogawa, Kouki Miura, and Kazuya Nagasawa for samples from Japan, and Fisheries and Oceans for samples from the Bering Sea, western Vancouver Island, and Broughton Archipelago. We would also like to thank the Sequencing Team at the Michael Smith Genome Sciences Centre, Vancouver, British Columbia, for sequence support. Funding for this study was provided by Genome British Columbia, the province of British Columbia, Microtek International, Mainstream, Marine Harvest, and Grieg Seafoods.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.