Introduction

Identifying the source of invaders is a valuable tool in fauna crime and wildlife forensics, e.g. knowledge of possible introduction routes can help reduce further introductions (Geller et al. 2010). Wildlife DNA forensic methods have primarily been used to identify the species of collected evidence in wildlife crime (e.g. Linacre 2009). However, the expanding field of genetic methods and genetic markers (e.g. microsatellites and SNPs) offers a wide array of related applications in fauna crime-related questions (Ogden et al. 2009; Alacs et al. 2010; Geller et al. 2010; Ogden et al. 2013; Johnson et al. 2014).

There has also been an advance in the field of statistical inference with regard to interpreting patterns from genetic markers (Hansen et al. 2001; Beaumont and Rannala 2004; Drummond et al. 2005). Bayesian inference methods utilizing e.g. microsatellites provide an effective tool for natural scientists (Beaumont et al. 2002; Stauffer 2008; Stephens and Balding 2009), allowing for statistical genetic assignment and identification of a given individual to putative source populations (Pearse and Crandall 2004). Such methods are useful in identifying indigenous and introduced individuals (Primmer et al. 2000), and have been extensively used in a number of convictions, e.g. regarding illegal salmon fishing and trade (Withler et al. 2004). Furthermore, genetic software has been developed to infer past demographic history (Pybus et al. 2000; Heled and Drummond 2008; Guillemaud et al. 2009), making it theoretically possible to infer the most likely number of individuals translocated from one source population to a new locality (Anderson and Slatkin 2007).

Lake Storsjøen in Rendalen municipality, South-Central Norway, was recently discovered to have been exposed to a translocation event of the European smelt (Osmerus eperlanus L.) from an unknown source (County Governor of Hedmark 2011). European smelt (hereafter, smelt) is an osmerid species native to Norway, but has not previously been observed in Lake Storsjøen (Museth et al. 2008). It was first discovered by local fishermen in Lake Storsjøen in 2008 (Strømsmoen 2008), but the exact time of translocation is unknown.

Norwegian law prohibits the translocation of any freshwater species, both alive and as baitfish, to localities where they have not previously been known to occur (Innlandsfiskeloven [law relating to salmonids and freshwater fish 2014]: Omsetnings- og sykdomsforskriften for akvatiske dyr [Sale and disease regulation for aquatic animals, 2008]). The main objectives of this study were to identify the most likely source of the introduced smelt and to gain insight into the introduction history to Lake Storsjøen. To achieve this, genetic samples from several potential source locations were compared at 15 microsatellite loci against samples from the introduced smelt in Lake Storsjøen. By testing microsatellite data using multiple inference programs, we aim to pinpoint the most likely source population(s) and illustrate the value of population genetics as a tool in wildlife forensics.

Based on the likely assumption that the smelt in Lake Storsjøen was illegally translocated by humans, either intentionally or by accident when using smelt as bait, the following hypotheses were tested: (1) translocation of smelt occurred from a locality in geographic proximity to Lake Storsjøen, (2) the translocation of smelt to Lake Storsjøen occurred from only one source location, and (3) the translocated population will exhibit reduced genetic diversity compared to the source populations due to a limited number of individuals translocated, i.e. founder effects.

Material and methods

The European smelt is a small fish species of the family Osmeridae. It is widely distributed in the north east Arctic coastal waters, from the White and Barents Seas in the north to the Garonne estuary in France (Kottelat and Freyhof 2007). In Norway the smelt is naturally distributed in the south-eastern part of the country, mainly in large lakes (Sandlund and Næsje 2000).

As the main aim of this study was to identify the most likely origin of the translocated smelt in Lake Storsjøen, a subset of smelt populations was selected from the complete number of existing populations in order to test the specific hypotheses. A priori, the most closely situated smelt population is the most likely founder, and the source population is probably large and publicly well known. This corresponds well with Lake Mjøsa, which has a large population of smelt with several well-known spawning locations. Thus, two selected spawning locations from Lake Mjøsa were considered to be a likely source. Secondly, a set of six more southerly distributed smelt populations at an increasing distance from Lake Storsjøen was selected (Fig. 1; Table 1). A smelt population from Lake Vänern in southwestern Sweden was selected to be used as an outgroup for polarizing genetic assignments geographically. The large Lake Vänern and the ancient Lake Ancylus have likely been important with regard to colonization of the freshwater fishes in Norway (Borgstrøm 2000). In addition, one locality in Lake Mjøsa was sampled twice in different years (Mjøsa Lågen 2009, 2011, Table 1), offering an opportunity to compare temporal samples and thus test the assignment ability of the different software programs when using temporal samples from the same locality (see methods and details below). In total, 416 smelt from 10 localities were collected between 2009 and 2012 in Norway and Sweden and used in the analyses (Table 1; Fig. 1). Sampling was performed by both gill-net fishing and dip-netting at spawning sites until a minimum of 20–30 smelt was obtained from each location, a number of individuals assumed to be sufficient for population genetic assignment analyses when assuming a high genetic differentiation between populations (Cornuet et al. 1999; Hansen et al. 2001).
The sampled smelt were immediately either frozen or preserved in containers with 96 % ethanol (EtOH). In the laboratory, clips were taken from the pectoral fins and preserved in individually marked 2 ml Eppendorf tubes.

Table 1 Description of the sampled smelt (Osmerus eperlanus) locations including: locality, population codes, latitude and longitude, area of the lake (km2), mean depth of the lake (m), year sampled, size range of the smelt (cm), mean length of the smelt (cm), sample size (N) and method of collection
Fig. 1 The ten sampling locations of smelt (Osmerus eperlanus) in Norway and Sweden. Lake Storsjøen is the translocated smelt population. The map was created in ArcGIS version 10.1 (ESRI 2012)

DNA extraction and PCR

DNA extraction was performed at the fish genetics lab at Tromsø University using the E-Z 96 Tissue DNA kit following the manufacturer’s procedures (OMEGA Bio-tek), followed by Nanodrop quantification of DNA quality. A total of 15 microsatellite loci (see Supp. Table 5) were optimized. Subsequent polymerase chain reaction (PCR) was conducted in two multiplex panels. The PCR products were separated on an ABI 3130 XL Automated Genetic Analyzer (Applied Biosystems) and alleles were scored in the GENEMAPPER 3.7 software (Applied Biosystems). The scoring was verified twice by visual inspection. Replicate samples (5–10 samples per population) were also included as part of the quality assessment of the dataset. One locus (M-Omo4) was excluded due to poor amplification, leaving 14 microsatellites for further analyses.

Data analysis

The software MICRO-CHECKER 2.2.3 (Van Oosterhout et al. 2004) was used to check for genotyping errors, followed by the program FREENA (Chapuis and Estoup 2007; Chapuis et al. 2008), which corrects for allele-frequency bias. There was no systematic occurrence of homozygote excess within loci across populations, and no significant difference in Fst when comparing uncorrected and corrected loci.

LOSITAN (Beaumont and Nichols 1996; Antao et al. 2008) was run to test whether loci were candidates for directional or balancing selection. Analyses were run with 100 000 simulations under the “Force mean Fst” and “Neutral mean Fst” alternatives. One locus (Oep539) was a candidate for directional selection. Putative effects of directional selection at Oep539 on the genetic assignment tests were evaluated by both including and excluding the locus in the following analyses. No difference in assignment was detected when removing the candidate locus, so Oep539 was included in the subsequent analyses.

GENEPOP 4.0 (Raymond and Rousset 1995; Rousset 2008) was used to check for deviations from Hardy–Weinberg equilibrium and for linkage disequilibrium (LD; Guo and Thompson 1992). False discovery rate (FDR) corrections (Pike 2011) were used to adjust p-values for multiple tests. After FDR correction (threshold, p < 0.0005), a significant deviation was found at only one locus (Oep384, p < 0.0005) in one population (Lake Mjøsa/Lågen-11). This was in concordance with the results from MICRO-CHECKER, and the deviation was possibly due to null alleles. No significant LD was detected in any test following FDR correction. Thus, only one locus, Oep384, was removed, and a total of 13 loci were used in the following genetic analyses.
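The FDR adjustment of p-values can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure, which is one common FDR method; the text does not state which FDR variant was applied, and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH step-up adjustment; the study only states that an
    FDR correction was used, not which variant.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values from four tests.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A test is then called significant when its adjusted p-value falls below the chosen FDR level.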

Genetic diversity and population differentiation

Genetic diversity estimates were calculated using several software programs. Expected heterozygosity (Hexp) and genetic divergence between populations (pairwise Fst) were calculated using GENEPOP 4.0 (Raymond and Rousset 1995; Rousset 2008; Kalinowski 2004), GENALEX 6.5 (Peakall and Smouse 2012), and Fstat 2.9.3.2 (Goudet 1995; Weir and Cockerham 1984). Standardized private allelic richness (Ap) and standardized allelic richness (Ar), accounting for differences in sample size, were calculated with HP-RARE 1.0 (Kalinowski 2005) with rarefaction using 36 genes (i.e. the minimum gene number across samples). Pairwise Wilcoxon rank sum tests were performed in R (R Core Team 2014) to check for significant differences in Ap, Ar and Hexp between the introduced population and the potential source populations.
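The two per-locus diversity measures can be sketched as follows. This is an illustration under stated assumptions, not the code used by GENEPOP or HP-RARE: it assumes the small-sample-corrected (unbiased) estimator of expected heterozygosity and the standard rarefaction formula for the expected number of alleles in a subsample of g gene copies. The function names and the toy allele list are hypothetical.

```python
from collections import Counter
from math import comb


def expected_heterozygosity(alleles):
    """Unbiased expected heterozygosity at one locus.

    alleles: list of gene copies (two per diploid individual).
    Assumption: the estimator n/(n-1) * (1 - sum(p_i^2)).
    """
    n = len(alleles)
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def rarefied_allelic_richness(alleles, g):
    """Expected number of distinct alleles in a random subsample of g gene copies."""
    n = len(alleles)
    if g > n:
        raise ValueError("g cannot exceed the number of sampled gene copies")
    counts = Counter(alleles)
    # For each allele: 1 minus the probability that it is absent from the subsample.
    return sum(1.0 - comb(n - c, g) / comb(n, g) for c in counts.values())


if __name__ == "__main__":
    toy_locus = ["A"] * 6 + ["B"] * 2  # hypothetical gene copies at one locus
    print(expected_heterozygosity(toy_locus))
    print(rarefied_allelic_richness(toy_locus, g=2))
```

In the study the rarefaction size was 36 genes; the toy example uses g = 2 only to keep the numbers easy to check by hand.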

Population assignment

Different Bayesian inference programs may give deviating results (Frantz et al. 2009). Thus, comparing several software programs may lend stronger support to the assignment of individuals and minimize the risk of bias. Here, two different Bayesian assignment programs were applied: STRUCTURE 2.3.2 (Pritchard et al. 2000) and GENECLASS2 (Piry et al. 2004). STRUCTURE was run with an admixture model using 100,000 burn-in steps and 100,000 Markov Chain Monte Carlo repetitions with 10 iterations, using the LOCPRIOR function, which incorporates geographic sampling information, as recommended by Hubisz et al. (2009), as well as a hierarchical approach following the recommendation by Evanno et al. (2005). The most likely number of clusters (based on LnP(K) and ΔK) was determined using STRUCTURE HARVESTER (Earl and vonHoldt 2012). The software GENECLASS2 (Piry et al. 2004) was used to exclude or assign reference groups as possible sources, i.e. to determine which groups are likely source populations, and to significantly exclude unlikely sources (Pearse and Crandall 2004). This was done using all the criteria available for calculation: Bayesian, allele frequency, and distance based. The Bayesian and frequency based approaches in this software have the advantage that they do not assume that the source population is among the sampled populations. This makes it possible to ask whether the “true” source population is among the sampled populations, rather than asking which population has the highest likelihood as a potential source. Initial tests, using all criteria, were performed with GENECLASS2 by testing all spawning populations of known origin against all the potential source populations to determine the power, or consistency, of this analysis.
All spawning populations were consistently assigned back to their known origin (100 % for assignment of groups, 77.8–99.8 % for assignment of individuals), indicating a high power of the analysis. All computations were executed with an assignment threshold of p < 0.01.
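The idea behind a frequency based assignment test can be sketched in a few lines. The sketch assumes Hardy–Weinberg proportions within each candidate population and independence among loci, and it substitutes a small default frequency (the hypothetical parameter missing_freq) for alleles not observed in a population. It does not reproduce the resampling-based exclusion probabilities of GENECLASS2, and all names and toy frequencies are illustrative.

```python
from math import log


def log_likelihood(genotype, freqs, missing_freq=0.01):
    """Log-likelihood of a multilocus genotype in one population.

    genotype: list of (allele_a, allele_b) tuples, one per locus.
    freqs: list of dicts mapping allele -> frequency, one dict per locus.
    missing_freq: assumed frequency for alleles absent from the reference sample.
    """
    total = 0.0
    for (a, b), locus_freqs in zip(genotype, freqs):
        pa = locus_freqs.get(a, missing_freq)
        pb = locus_freqs.get(b, missing_freq)
        # Hardy-Weinberg genotype probability: p^2 or 2pq.
        total += log(pa * pa) if a == b else log(2.0 * pa * pb)
    return total


def assign(genotype, populations):
    """Return the best-scoring population and the score of every population."""
    scores = {name: log_likelihood(genotype, f) for name, f in populations.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical allele frequencies at two loci in two reference populations.
    populations = {
        "pop_a": [{"100": 0.8, "102": 0.2}, {"200": 0.5, "204": 0.5}],
        "pop_b": [{"100": 0.1, "102": 0.9}, {"200": 0.9, "204": 0.1}],
    }
    print(assign([("100", "100"), ("200", "204")], populations))
```

Ranking populations by likelihood answers which sampled population is the most likely source; deciding whether any sampled population is plausible at all requires the exclusion tests described above.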

Phylogenetic analysis

The program POPULATIONS 1.2.30 (Langella 1999) was used to create rooted phylogenetic neighbor-joining trees. The trees were created with bootstrap values from 100 permutations using Nei’s standard distance (Nei 1972), Nei’s DA distance (Nei et al. 1983; Supp. Fig. 7) and the Cavalli-Sforza and Edwards distance method. Results are shown with the Cavalli-Sforza and Edwards distance, as this method assumes that genetic differentiation occurs due to genetic drift and does not assume that population size remains constant (Cavalli-Sforza and Edwards 1967). As the translocated smelt in Lake Storsjøen most likely consisted of a limited number of individuals (where random genetic drift may be influential), this method seemed to be the most appropriate. The tree was visualized in TreeView32 (Page 1996) using the Swedish population Vänern as a geographical outgroup/root.
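For orientation, the chord distance between two populations can be sketched as below. The sketch assumes the commonly cited multilocus form, D = (2/π) × mean over loci of sqrt(2(1 − Σ sqrt(x_i y_i))), where x_i and y_i are the frequencies of allele i in the two populations; the exact scaling used by POPULATIONS is not stated in the text, so the constant should be treated as an assumption. The function name and toy frequencies are hypothetical.

```python
from math import pi, sqrt


def chord_distance(freqs_x, freqs_y):
    """Cavalli-Sforza and Edwards chord distance averaged over loci.

    freqs_x, freqs_y: lists of dicts mapping allele -> frequency,
    one dict per locus, in the same locus order for both populations.
    """
    per_locus = []
    for x, y in zip(freqs_x, freqs_y):
        alleles = set(x) | set(y)
        overlap = sum(sqrt(x.get(a, 0.0) * y.get(a, 0.0)) for a in alleles)
        # Clamp at zero to guard against rounding when populations are identical.
        per_locus.append(sqrt(2.0 * max(0.0, 1.0 - overlap)))
    return (2.0 / pi) * sum(per_locus) / len(per_locus)


if __name__ == "__main__":
    # Hypothetical one-locus example: no shared alleles gives the maximum distance.
    print(chord_distance([{"a": 1.0}], [{"b": 1.0}]))
```

Under this scaling the distance is 0 for identical allele frequencies and about 0.90 when two populations share no alleles.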

Genetic analyses for demographic events

BOTTLENECK (Piry et al. 1999) was used to evaluate whether the translocated smelt in Lake Storsjøen had gone through a bottleneck at the time of release. BOTTLENECK was run with 1000 iterations for all three mutation models (stepwise mutation, infinite alleles, and the two-phased model (TPM), the latter recommended by the authors), and with all statistical tests. The Wilcoxon sign-rank test under the TPM was given most emphasis, as this test was recommended by the authors.

To estimate the approximate number of individuals transferred from the most likely source population into Lake Storsjøen, two different packages were used: COLONIZE (Mergeay et al. 2007) and COALIT/NFCONE (Anderson and Slatkin 2007). COLONIZE was run 10 independent times with rare allele correction, a maximum of 10,000 colonizers, 10 batches, and 100 randomizations. The program calculates probabilities for the maximum and minimum, as well as a joint probability value (joint probability for minimum and maximum colonizers), for the potential number of colonizers. COLONIZE estimates the probability that a simulated founder event of a certain size would result in at least as many alleles as observed in the actual colonized population. The COALIT/NFCONE package uses a Monte Carlo approximation to the likelihood that allows for estimation of the number of founding individuals (or chromosomes) by calculating maximum likelihood estimates, and upper and lower support limits that correspond to a confidence interval (Anderson and Slatkin 2007). The software was run using the source and translocated populations as input, under a wide range of values of intrinsic growth rate (r) and carrying capacity (K), to establish how the analysis was affected by different assumptions about r and K, which are not well known for smelt. At r values above 2.0, further increase in r produced only negligible changes in the estimated number of founders. The intrinsic growth rates 0.5, 1.0, 2.0 and 3.0, and the carrying capacities 50,000, 250,000, 500,000, 1,000,000 and 5,000,000 diploid individuals, were therefore used for the final analysis. These scenarios are assumed to capture the range of likely demographic scenarios of the smelt during invasion. The scenario with an intrinsic rate of increase of 3 was only run with K = 50,000 and 250,000 due to extensive run-times.
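The logic of the allele-count approach can be illustrated with a simplified Monte Carlo sketch. It is not the COLONIZE algorithm: it omits the rare allele correction and the batch structure, and it simply assumes that each founder contributes two gene copies per locus drawn independently from the source allele frequencies. All function and parameter names, and the toy frequencies, are hypothetical.

```python
import random


def prob_enough_alleles(source_freqs, n_founders, observed_alleles,
                        n_sims=1000, seed=0):
    """Estimate P(simulated founders carry >= observed_alleles distinct alleles).

    source_freqs: list of dicts mapping allele -> frequency in the source, per locus.
    n_founders: number of diploid founders in the simulated event.
    observed_alleles: total allele count (summed over loci) in the colonized population.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        total = 0
        for locus in source_freqs:
            alleles = list(locus)
            weights = [locus[a] for a in alleles]
            # Two gene copies per diploid founder at this locus.
            draw = rng.choices(alleles, weights=weights, k=2 * n_founders)
            total += len(set(draw))
        if total >= observed_alleles:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Hypothetical source with one common and two rare alleles at a single locus.
    source = [{"a": 0.9, "b": 0.05, "c": 0.05}]
    for n in (1, 10, 100):
        print(n, prob_enough_alleles(source, n, observed_alleles=3))
```

Scanning n_founders upward and noting where the probability first exceeds a chosen level (for example 0.90 or 0.95) gives a lower bound on the number of founders in the same spirit as the minimum estimates reported by COLONIZE.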

To test for a demographic population expansion event of the smelt in Lake Storsjøen, the k- and g-tests of Reich et al. (1999), implemented in KG-TEST, were applied. The g-test significance level was checked against the recommended cutoff values in Table 1 (p. 455) of Reich et al. (1999). Most emphasis was put on the k-test, as it has maximum sensitivity for detecting expansions that happened within a few generations, while the g-test is more suitable for detecting expansions that happened further in the past (Donnelly et al. 2001).

Results

Genetic diversity and population differentiation

A total of 155 alleles were observed in the 11 populations across the 13 loci. Standardized private allelic richness (Ap) ranged from 0.04 to 0.92, but there was no statistically significant difference between any of the lakes (Wilcoxon rank test: p > 0.05). Lake Eikeren, Hurdal, Norsjø, Randsfjorden and Storsjøen exhibited the lowest private allelic richness and Lake Vänern exhibited the highest Ap. Lake Storsjøen had a slightly lower Ap (0.15) than the Lake Mjøsa populations: Lake Mjøsa/Snippsandodden (0.24), Lake Mjøsa/Lågen-11 (0.26), and Lake Mjøsa/Lågen-09 (0.29, Fig. 2a).

Fig. 2 a Private allelic richness (Ap), b allelic richness (Ar), and c expected heterozygosity (Hexp), in the 11 smelt populations; Lake Eikeren (Eik), Holingdal (Hol), Hurdal (Hur), Mjøsa/Lågen-09 (Lag09), Mjøsa/Snippsandodden (MjN), Mjøsa/Lågen-11 (Lag11), Norsjø (Nor), Randsfjorden (Ran), Storsjøen (Sto), Tyrifjorden (Tyr), Vänern (Van). Values are given with mean ± standard error of the mean (SEM)

Allelic richness varied between 2.92 and 5.98 across populations, but was not significantly different between any of the populations (Wilcoxon rank test: p > 0.05). Lake Storsjøen exhibited an allelic richness of 5.00, relatively similar to Lake Mjøsa/Lågen-09 (5.06), Lake Mjøsa/Lågen-11 (5.20) and Lake Mjøsa/Snippsandodden (4.97, Fig. 2b). The populations exhibiting the lowest allelic richness were the westerly distributed populations: Lake Eikeren (2.92), Lake Norsjø (3.08) and Lake Tyrifjorden (3.22), while Lake Vänern exhibited the highest allelic richness of 5.98 (Fig. 2b).

The mean expected heterozygosity (Hexp) ranged from 0.28 to 0.51, with statistically significant differences between some population pairs (Wilcoxon rank test, Table 2). The most distant population from Lake Storsjøen, Lake Norsjø, had a significantly lower Hexp than Lake Hurdal, the three Lake Mjøsa populations, Lake Storsjøen and Lake Vänern. Lake Eikeren exhibited a significantly lower Hexp than all populations except Lake Norsjø, Randsfjorden and Tyrifjorden. Lake Holingdal exhibited a significantly lower Hexp than Lake Vänern, and Lake Hurdal exhibited a significantly higher Hexp than Lake Norsjø, Randsfjorden and Tyrifjorden. The three Lake Mjøsa populations exhibited a significantly higher Hexp than Lake Norsjø (as well as between Lake Randsfjorden and Lake Mjøsa/Lågen-09) and a significantly lower Hexp than Lake Vänern. Lake Storsjøen had a significantly higher Hexp than Lake Eikeren, Lake Norsjø and Lake Randsfjorden, but no difference in Hexp to the Lake Mjøsa populations (Lake Mjøsa/Snippsandodden, Lake Mjøsa/Lågen-09, and Lake Mjøsa/Lågen-11; Fig. 2c; Table 2).

Table 2 Upper diagonal: Pairwise comparison of expected heterozygosity (Hexp) among the 11 smelt populations from Wilcoxon rank test

Pairwise comparisons of population differentiation (Fst) showed highly significant differentiation (p < 0.001) among most of the population pairs after adjustment of alpha (α < 0.0005, Table 2). The only non-significant Fst values were between the two locations within Lake Mjøsa (p = 0.67), between the temporal samples of Lake Mjøsa (p = 0.36), and between Lake Storsjøen and the two temporal samples of Lake Mjøsa. Lake Storsjøen was highly genetically divergent from all other populations except Lake Mjøsa. This indicates that Lake Storsjøen was most genetically similar to the two temporal samples from the same locality in Lake Mjøsa, making this location a candidate for the likely source of the smelt in Lake Storsjøen (Table 2).

Phylogenetic analysis

The phylogenetic neighbor-joining tree with Lake Vänern as the root, showed a pattern where Lake Storsjøen was only moderately separated from the three Lake Mjøsa samples with a bootstrap support of only 70 %. Even less bootstrap support (60 %) differentiated Lake Mjøsa/Lågen-09 from Lake Mjøsa/Lågen-11 and Lake Mjøsa/Snippsandodden. Finally, only a very low bootstrap support (32 %) differentiated Lake Mjøsa/Lågen-11 from Lake Mjøsa/Snippsandodden (Fig. 3).

Fig. 3 Plots from hierarchical approach in STRUCTURE (right side) with corresponding phylogenetic neighbor-joining tree from Cavalli-Sforza chord measure (left side). First plot presents all populations, round 1 without populations Eik, Ran and Tyr, round 2 without population Nor, round 3 without populations Hol and Van, and round 4 without population Hur, i.e. only populations Lag09, Lag11, MjN and Sto

Population assignment

The first STRUCTURE analysis resulted in two clusters according to the ΔK value (ΔK = 855.687, mean LnP(K) = −11693.26). However, the LnP(K) value suggested further structuring into seven different clusters (mean LnP(K) = −10911.67, ΔK = 24.71; Fig. 3, Supp. Fig. 6). Clustering all populations into two clusters resulted in one cluster containing Lake Eikeren, Tyrifjorden and Randsfjorden, while the remaining populations were assigned to the other cluster. Round 1 of the hierarchical approach resulted in further sub-structuring into K = 2 according to ΔK and K = 5 according to LnP(K), where ΔK grouped the Lake Norsjø population into a single cluster (Fig. 3, Round 1). Round 2 resulted in K = 2 according to ΔK and K = 4 according to LnP(K), where ΔK separated Lake Holingdal, Vänern and Hurdal (Fig. 3, Round 2). However, closer inspection of the q-values of the Lake Hurdal population revealed only a 0.036 higher q-value to the opposite cluster. Round 3 is thus shown with Lake Mjøsa/Lågen-09, Mjøsa/Lågen-11, Mjøsa/Snippsandodden, Storsjøen and Hurdal (K = 2 according to ΔK and K = 1 according to LnP(K); Fig. 3, Round 3), and Round 4 without Lake Hurdal (Fig. 3, Round 4). The three Lake Mjøsa populations also had the highest proportion of membership in the same cluster as Lake Storsjøen (Table 3). The most likely partition was thus a cluster containing all three Lake Mjøsa populations together with Lake Storsjøen.
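The difference between choosing K by the highest mean LnP(K) and by ΔK can be made concrete with a small sketch. It assumes ΔK is computed as the absolute second-order difference of the mean LnP(K) across replicate runs, divided by the standard deviation of LnP(K) at that K; this is the usual reading of the Evanno et al. (2005) statistic, but the exact implementation in STRUCTURE HARVESTER may differ in detail. The function name and the replicate values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_runs):
    """Compute ΔK for every K that has both neighbours K-1 and K+1.

    lnp_runs: dict mapping K -> list of LnP(K) values from replicate runs
    (at least two replicates per K, with non-zero spread).
    """
    result = {}
    for k in sorted(lnp_runs):
        if k - 1 not in lnp_runs or k + 1 not in lnp_runs:
            continue  # ΔK is undefined at the ends of the tested range
        second_diff = abs(
            mean(lnp_runs[k + 1]) - 2.0 * mean(lnp_runs[k]) + mean(lnp_runs[k - 1])
        )
        result[k] = second_diff / stdev(lnp_runs[k])
    return result


if __name__ == "__main__":
    # Hypothetical LnP(K) values from two replicate runs per K.
    runs = {1: [-100.0, -102.0], 2: [-50.0, -52.0],
            3: [-48.0, -50.0], 4: [-47.0, -49.0]}
    dk = delta_k(runs)
    print(dk, "best K by ΔK:", max(dk, key=dk.get))
```

In the toy data the mean LnP(K) keeps rising up to K = 4, while ΔK peaks at K = 2, the point where the gain in likelihood levels off. The same kind of disagreement underlies the two versus seven clusters reported for the first analysis.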

Table 3 Assignment of Lake Storsjøen smelt to potential sources using Bayesian clustering in STRUCTURE with prior population information (i.e. trained clustering), and three different approaches (Bayesian, frequency and distance based) in GENECLASS2 with eight different tests (Bayesian: Rannala and Mountain 1997; Baudouin and Lebrun 2001; frequency based: Paetkau et al. 1995; distance based: Nei’s standard distance (Nei’s SD; Nei 1972), Nei’s minimum distance (Nei’s MD; Nei 1973), Nei’s DA distance (Nei’s DA; Nei et al. 1983), Cavalli-Sforza and Edwards distance (Cavalli-Sforza and Edwards 1967) and Goldstein et al. distance (Goldstein et al. 1995))

All approaches in the software GENECLASS2 (Bayesian, frequency based and distance based), including the various simulation criteria, assigned Lake Mjøsa as the most likely source of the smelt in Lake Storsjøen. Seven of the eight tests ranked Lake Mjøsa/Lågen-11 as the most likely source, while one distance based method (Goldstein et al. 1995) suggested Lake Mjøsa/Snippsandodden as the most likely source (Table 3, Supp. Table 4).

Genetic analyses for demographic events

The simulations done by the program COLONIZE showed that a minimum number of 70 translocated smelt was necessary to have at least a 90 % chance of obtaining as many alleles in the Lake Storsjøen population as were observed. 100 or more translocated smelt were required to have more than a 95 % chance of observing as many alleles (Fig. 4). This was supported by the similar results obtained for the ten replicate runs, thus indicating that the original founding population in Lake Storsjøen likely consisted of at least 70–100 translocated smelt individuals. It was not possible to produce a reliable estimate for maximum number of colonizers to Lake Storsjøen, probably due to low sample size.

Fig. 4 Result from 10 independent runs (not separated) in COLONIZE estimating joint probability (0–1) for potential number of colonizers from Lake Mjøsa/Lågen-11 into Lake Storsjøen

The COALIT/NFCONE software yielded a maximum likelihood estimate of between 531 and 1053 founders with a minimum support limit between 76 and 149, varying with demographic assumptions (Fig. 5). The maximum support limit reached a peak at approximately 4000 founding individuals, but as the complete limit could not be calculated, only the estimates of maximum likelihood and lower support limits are depicted in Fig. 5.

Fig. 5 Maximum likelihood estimates from COALIT/NFCONE runs for the number of founders in Lake Storsjøen under different scenarios of intrinsic rate of increase (r) with corresponding lower support limits. Vertical bars represent carrying capacities ranging for different values of r

The Wilcoxon sign-rank test from BOTTLENECK did not detect significant heterozygote excess (p > 0.05) under any of the three mutation model scenarios, suggesting no sign of a recent bottleneck event in Lake Storsjøen. The mode-shift indicator from BOTTLENECK suggested a normal L-shaped mode distribution, indicating a demographically stable population.

The intralocus k-test from KG-TEST for detecting population expansions revealed a significant signal for a recent population expansion in Lake Storsjøen, with 12 of 13 loci exhibiting negative k-values (p = 0.005). The interlocus g-test, on the other hand, did not reveal significant signs of a population expansion in Lake Storsjøen (g = 1.3, compared with a cutoff value of 0.22 from Reich et al. 1999).

Discussion

The results suggested that Lake Mjøsa was the most likely source of the introduced smelt in Lake Storsjøen, supporting the initial hypothesis that the translocation of smelt occurred from a locality in geographic proximity to Lake Storsjøen. Thus, based on the findings, the most likely introduction history is that the translocation of smelt to Lake Storsjøen occurred from only one source location. The Lake Storsjøen smelt exhibited no reduction in heterozygosity levels compared to the putative source population, and no difference in private allelic richness or allelic richness compared to the remainder of the sampled populations.

The smelt introduction from Lake Mjøsa to Lake Storsjøen

Even though the assignment tests indicated that the smelt in Lake Storsjøen most likely originates from only one source location, it was not possible to deduce whether the translocation to Lake Storsjøen was a single introduction event or the result of several introductions from Lake Mjøsa. To address these unknowns, one option is to apply a larger set of higher-resolution genetic markers that can firmly differentiate between founders from the two Lake Mjøsa localities and the two temporal samples. However, determining whether the Lake Storsjøen smelt stems from multiple translocations from the very same population within Lake Mjøsa will be very hard, or even impossible, with any genetic marker, no matter the degree of resolution.

Interestingly, most tests were able to distinguish between the two temporal samples in Lake Mjøsa (Mjøsa/Lågen-09 and Mjøsa/Lågen-11), and the second sampling location in Lake Mjøsa; Mjøsa/Snippsandodden, with the majority of the tests assigning Lake Mjøsa/Lågen-11 as the most likely source. The 2011 sample from Lake Mjøsa/Lågen exhibited a higher similarity to Lake Storsjøen than the sample from 2009. This is possibly an artifact of the limited samples from 2009 (n = 26), compared to 2011 (n = 60), reflecting only a part of the genetic diversity of the population. This further illustrates that sampling effects may be an important issue in genetic assignment analyses. Thus, all of the performed analyses revealed a high genetic similarity between the Lake Mjøsa/Lågen population and Lake Storsjøen, and most analyses revealed a high differentiation of this assemblage to all of the other populations.

Population assignment programs use genotypes to calculate probabilistic inference of possible source populations (Piry et al. 2004). However, if the applied genetic markers do not have a high enough power to distinguish between putative sources with a similar genetic composition, they may not be able to reveal the real source (Huffman and Wallace 2012). Alternatively, there is a possibility that the Lake Storsjøen smelt may have originated from an un-sampled population that holds a genetic composition similar to that of the Lake Mjøsa/Lågen populations. However, the existence of a second population, identical in genetic composition to Lake Mjøsa/Lågen seems highly unlikely, especially because we included samples from the majority of the neighboring lakes. In addition, the combination of the high resolution of microsatellite markers, in conjunction with the ability of the majority of the analyses performed, to consistently distinguish between populations (even temporal and spatial samples from the same lake) makes this an unlikely scenario.

The origin of smelt in Lake Storsjøen

For all analyses the Lake Storsjøen population consistently had the highest likelihood of origin from Lake Mjøsa, and the majority of the tests assigned the spawning locality Lake Mjøsa/Lågen as the most likely source.

In this study, the Bayesian and frequency based approaches implemented in GENECLASS gave the most detailed interpretation through the ability to significantly exclude all other populations than Lake Mjøsa/Lågen as potential sources at a significance threshold of p < 0.01. This is in correspondence with previous simulation studies indicating a higher assignment success through Bayesian and frequency based methods compared to distance based approaches (Cornuet et al. 1999). Nevertheless, in this study, all analyses reached the same general conclusion making it very likely that Lake Mjøsa is indeed the true source population.

Introduction history of the Lake Storsjøen smelt

Founder populations will often consist of a small proportion of the individuals of the original population, comprising only a part of the original genetic diversity (Nei et al. 1975; Dlugosch and Parker 2008). Interestingly, there was no statistical difference in heterozygosity, and no difference in the level of allelic richness, between the invaders and the putative source population. Similar results were found by Clegg et al. (2002), who argued that the inability to detect strong founder effects in their study was due to large founder numbers (>100) increasing the likelihood of the founders being genetically representative of the original population. Accordingly, Nei et al. (1975) stated that the amount of genetic loss is dependent on the number of founding individuals. The lack of any genetic reductions in the Lake Storsjøen smelt may thus have been caused by a substantial number of founders. Indeed, this is supported by both the COLONIZE and COALIT/NFCONE tests, which estimated an initial translocation of a substantial number of smelt individuals (70–1000) from Lake Mjøsa/Lågen to Lake Storsjøen. The COLONIZE program provides the probability that a simulated founder event of a certain size would give at least as many distinct alleles as observed in the actual colonized population (Mergeay et al. 2007). As such, it provides a rather ad hoc method of estimating the number of founders. By contrast, COALIT/NFCONE uses more information than that available in the number of distinct alleles: since it is based on a sufficient statistic, COALIT/NFCONE uses all the information available in the sample for estimating the number of founders under the model (Anderson and Slatkin 2007). Accordingly, we place most weight on the COALIT/NFCONE results, which gave a maximum likelihood estimate for the number of founders in Lake Storsjøen of between 531 and 1053 individuals. Even if one assumes a very high intrinsic rate of increase (r < 3), the maximum likelihood estimate remains near 531.
Thus, the best estimate for the number of founders is 531 with a lower support limit around 75. In contrast, Kinziger et al. (2011) studying the speckled dace (Rhinichthys osculus), an introduced fish species, discovered a reduction in allelic richness relative to the source population. However, the estimated number of founding individuals in that study was much smaller (n = 7–17). In general it seems that a potential explanation for the lack of reduced genetic variation in reported translocated populations as compared with source populations may be that a large number of founder individuals preserve the main composition of the genetic diversity in the original population. Indeed, in our study, only a marginal difference in private allelic richness was discovered between the different lakes, and a post hoc Kruskal–Wallis test revealed no statistical difference between the Lake Storsjøen smelt, and the most likely source population.
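The allele-counting logic attributed to COLONIZE above can be illustrated with a small Monte Carlo sketch. This is not the COLONIZE program itself, and it is not the procedure used in this study: the function name, the single-locus allele frequencies and the observed allele count are all hypothetical, and the sketch ignores genetic drift after founding as well as sampling error in the colonized population. It only shows how the probability of recovering at least the observed number of distinct alleles rises with the assumed number of founders.

```python
import random


def prob_at_least_k_alleles(source_freqs, n_founders, k_observed,
                            n_sims=2000, seed=0):
    """Monte Carlo estimate of the probability that a founder group of
    n_founders diploid individuals, drawn at random from a source
    population with the given allele frequencies at one locus, carries
    at least k_observed distinct alleles."""
    rng = random.Random(seed)
    alleles = list(range(len(source_freqs)))
    hits = 0
    for _ in range(n_sims):
        # Each diploid founder contributes two gene copies from the source.
        copies = rng.choices(alleles, weights=source_freqs, k=2 * n_founders)
        if len(set(copies)) >= k_observed:
            hits += 1
    return hits / n_sims


# Hypothetical source allele frequencies at one locus (they sum to 1).
SOURCE_FREQS = [0.4, 0.2, 0.15, 0.1, 0.05, 0.04, 0.03, 0.02, 0.005, 0.005]

# Hypothetical number of distinct alleles seen in the colonized population.
K_OBSERVED = 9

for n in (5, 20, 75, 200, 500):
    p = prob_at_least_k_alleles(SOURCE_FREQS, n, K_OBSERVED)
    print(f"founders = {n:4d}  P(at least {K_OBSERVED} alleles) = {p:.3f}")
```

Under these made-up frequencies, small founder groups almost never carry the rare alleles, so observing nearly all source alleles in the colonized population is only plausible for large founder numbers. Because the count of distinct alleles discards the allele frequencies themselves, such an approach uses less of the data than a full-likelihood method.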

Lake Storsjøen is an attractive lake for fishing large-sized brown trout (Salmo trutta). There is extensive sport fishing, and a competition is held annually in which the angler who catches the largest trout is awarded 150,000 NOK. Last year, 303 fishermen took part in this competition, with a total of 170 kg of brown trout caught in 48 h (Storsjoen Fiskeforening 2014). The estimates of a substantial number of founders suggest that the translocation to Lake Storsjøen is unlikely to have happened by accident, e.g. by tipping over a bucket of live bait. Smelt is an important forage fish for brown trout (Krause and Palm 2008), and its potential to support a population of the highly desired, large-sized trout (Sandlund and Næsje 2000) may be a possible explanation for the translocation to Lake Storsjøen. Another possibility is that the smelt were released repeatedly in small numbers from the same source population over a long time period, through accidental release while being used as bait. This, however, seems less likely, because smelt were not detected during an extensive survey fishing in Lake Storsjøen in 2007 (Museth et al. 2008) and were first discovered by local fishermen in 2008 (Strømsmoen 2008), leaving only a short time period for numerous releases of a small number of fish. In addition, it seems unlikely that smelt would be systematically collected for use as bait from Lake Mjøsa and not from any other lakes. Regardless, translocation of freshwater fish, both alive and as baitfish, to new localities where they do not already exist is prohibited by Norwegian law (Innlandsfiskeloven [law relating to salmonids and freshwater fish 2014]; Omsetnings- og sykdomsforskriften for akvatiske dyr [sale and disease regulation for aquatic animals 2008]).

Population expansion of the smelt in Lake Storsjøen

The smelt in Lake Storsjøen may have had an initial advantage in establishment due to the moderate to large number of translocated individuals and the associated high level of genetic variation. The signal of a recent population expansion after translocation to Lake Storsjøen strongly supports this and indicates that the smelt have had high reproductive success in their new environment. A similar scenario has been shown for vendace (Coregonus albula), which appeared as a highly successful invader with limited signatures of founder effects (Amundsen et al. 2012; Præbel et al. 2013). However, although the smelt population in Lake Storsjøen seems to be increasing rapidly, only a few generations have passed, as the colonization was likely recent. No smelt were caught during an extensive survey of the Lake Storsjøen fish community in 2007 (Museth et al. 2008), and the first registered observation was made in 2008 by local fishermen (Strømsmoen 2008). In contrast, during field sampling in 2011 and 2012, smelt were caught at several different localities in the lake, which indicates that the smelt in Lake Storsjøen have undergone a recent population expansion.

Management implications

The results suggest that the smelt in Lake Storsjøen have experienced rapid population growth following the translocation from Lake Mjøsa. The population is thus likely to continue to expand rapidly and to proliferate into available niches in Lake Storsjøen. Studies of introduced rainbow smelt (Osmerus mordax), a close relative of the European smelt, have revealed diverse effects on the local community in North American lakes and rivers (Hrabik et al. 1998), e.g. through predation and interspecific resource competition (Evans and Loftus 1987; Mercado-Silva et al. 2007). Similarly rapid effects have been observed for other systems and species, such as the intentional introduction of Coregonus albula in the Pasvik watercourse (Northern Norway) (Mutenia and Salonen 1992; Bøhn et al. 2008; Præbel et al. 2013). Long-term population genetic and demographic monitoring of the smelt and the ecosystem in Lake Storsjøen is thus crucial, since the introduction of smelt is likely to have implications for the food web. Common whitefish (Coregonus lavaretus), the most abundant fish species in Lake Storsjøen (Museth et al. 2008), is an important resource with traditions for domestic use, as well as for commercial and recreational purposes (H. B. Sundet, advisor for Hedmark County Governor, pers. comm., May 2013). As whitefish and smelt may have overlapping niches (Sandlund and Næsje 2000; Sandlund et al. 2005), the whitefish population may be affected, subsequently leading to socioeconomic consequences for the local community. On the other hand, the smelt may increase the size of the local trout by providing a new food source. The question now is how, and to what degree, the introduced smelt will affect the ecosystem in Lake Storsjøen, and whether these effects will have negative impacts on the fish community, or positive or negative economic and socioeconomic consequences for the local human population.

With regard to fauna crime, this study has provided a unique opportunity to study an introduction event at an early stage and to monitor the future course of the affected ecosystem, potentially illustrating alternative applications in the framework of invasive species management and fauna crime. It further demonstrates the applicability of multilocus genetic markers as an effective tool for inferring the source population and assessing the introduction history of an invasive population. The methods used were effective in identifying Lake Mjøsa as the most likely source of the introduced smelt in Lake Storsjøen, and in indicating that the smelt were most likely translocated from the spawning location Lågen. Thus, this study demonstrates an efficient tool for discovering and evaluating illegal introductions, which can be used in law enforcement when addressing fauna crime. These methods may be especially useful as a means to stop further introductions in cases where the introduction route is unknown, regardless of whether the translocation was intentional or unintentional. In Norway, illegal fish translocations seem to be widespread, and our genetic methods can be used to address various aspects of such cases. Thus, the application of these methods can help authorities and law enforcement regulate the spread of the invasive organism through the detected route of transmission. These methods also have the potential to aid wildlife forensic prosecution, such as uncovering illegal poaching, exposing cheating in e.g. fishing competitions by unveiling the true origin of the organism in question, and stopping the escape of farmed fish by identifying the fish pen from which they are escaping. The ability to confidently ascertain from where and how an introduction happened may also illustrate that illegal introductions can, in principle, be exposed, thus acting as a cautionary note for the future.