Introduction

Most species are spatially structured, with local populations linked by varying levels of connectivity. When gene flow between local populations is limited, genetic differences between populations arise and are maintained by random drift and local adaptation. Knowledge of genetic population structure is thus a general prerequisite for the identification of conservation and management units within species (e.g. Palsbøll et al. 2007; Funk et al. 2012; Östman et al. 2017).

Atlantic salmon (Salmo salar L. 1758, Salmonidae) is an anadromous fish, renowned for its long marine feeding migration and subsequent ‘homing’ back to its natal river to breed (Scheer 1939). This homing behaviour is reflected in genetically distinct populations among rivers (e.g. Ståhl 1987; O’Reilly et al. 1996; Fontaine et al. 1997; McConnell et al. 1997; Spidle et al. 2003; Verspoor et al. 2005; Griffiths et al. 2010). Even between neighbouring salmon rivers with only minor genetic divergence, levels of gene flow may be low enough to allow for a high degree of demographic independence (see Waples and Gaggiotti 2006). Rivers have therefore been considered as natural starting points when defining management units for Atlantic salmon (Dionne et al. 2009).

However, major river systems have been shown to host within-river genetic population structure. For example, Vähä et al. (2007, 2008) used microsatellites and found temporally stable genetic substructure in the large subarctic Teno River system, generating over 20 genetically distinct subpopulations of Atlantic salmon. These subpopulations were subsequently shown to harbour differentiation at the genomic level, associated with important life history diversity, including the age at which adults mature and return to their natal river (‘sea age’ or ‘age at maturity’), and the seasonal timing of when adult salmon enter the river to spawn (‘run timing’) (Pritchard et al. 2018). Substantial genetic population structure has also been found within the eastern Canadian Romaine, Moisie, Restigouche and Sainte-Marguerite Rivers (Garant et al. 2000; Landry and Bernatchez 2001; Dionne et al. 2009), the Moy and Foyle River systems in the island of Ireland (Dillane et al. 2007, 2008; Ensing et al. 2011), and the Penobscot River in Maine, USA (Spidle et al. 2001). Conversely, only weak genetic population structure occurs within the large Atlantic salmon stock spawning in the Miramichi River in eastern Canada (Dionne et al. 2009; Wellband et al. 2018). Furthermore, the Varzuga River in the Kola Peninsula in northern Russia shows weak within-river differentiation despite a clear pattern of ‘isolation-by-distance’ (IBD) (Primmer et al. 2006).

The long-term sustainability and ecosystem services of fish populations are buffered and enhanced by genetic and life history diversity, through a portfolio effect (Schindler et al. 2010). It is therefore important to maintain such diversity by avoiding the depletion of certain subcomponents of fish stocks (Hilborn et al. 2003; Hutchinson 2008; Vähä et al. 2017; Jacobson et al. 2019; Nordahl et al. 2019; Tamario et al. 2019). Thus, taking within-river population substructure into account can benefit salmon management, even though this can be difficult to achieve in practice (Potter et al. 2003). Population genetic studies are useful for this aim as they can help to identify the appropriate spatial scale at which to manage salmon, and thus to define conservation and management units and priorities (Dionne et al. 2009; Vähä et al. 2017).

The Baltic Sea supports a distinct evolutionary lineage of Atlantic salmon (Nilsson et al. 2001; Bourret et al. 2013). Wild Baltic salmon populations have been extirpated from almost three quarters of their native rivers due to the construction of hydropower dams, overfishing, habitat loss and pollution (e.g. Karlsson and Karlström 1994). Currently only 27 rivers out of around 90 in the Baltic contain self-sustaining salmon stocks considered to be wild (Koljonen 2001; ICES 2019). Earlier studies have found varying degrees of genetic differentiation among the remaining Baltic salmon rivers (e.g. Ståhl 1987; Koljonen et al. 1999; Säisä et al. 2005; Verspoor et al. 2005; Koljonen 2006), with a general pattern of IBD and three main population groups, assumed to mirror post-glacial colonization events from different refugia (Koljonen et al. 1999; Nilsson et al. 2001; Säisä et al. 2005; Tonteri et al. 2005).

The neighbouring northern Baltic Rivers Tornio and Kalix are currently estimated to produce more than 70% of all wild Baltic salmon smolts (juveniles migrating from rivers to the sea) (ICES 2019). The Tornio River (Torne in Swedish; length 522 km, watershed area 40,157 km2, c. 50,000–150,000 returning spawners annually in recent years, ICES 2019) flows on the border of Finland and Sweden, with its mainstem and tributaries spanning both countries. The Kalix River (length 461 km, watershed area 23,600 km2, c. 30,000–60,000 returning spawners annually in recent years, ICES 2019) is located entirely in Sweden. These two unregulated major rivers have their mouths located just 50 km apart, at the northernmost rim of the Gulf of Bothnia. The rivers are connected by a large natural bifurcation (Tärendö River) where more than 50% of the annual discharge of the Swedish Torne River tributary flows into the Kalix River main stem (Fig. 1). As in most other Baltic rivers, the Tornio-Kalix salmon stock underwent a severe population decline over the twentieth century, and was considered close to extinction in the late 1980s (Pruuki 1993; Romakkaniemi et al. 2003). In response to this decline, the lower Tornio mainstem and the mid to upper Muonio tributary (Fig. 1) were heavily stocked between 1977 and 2002, with hatchery-reared 1st–3rd generation juvenile offspring of returning adults captured at the Tornio river mouth (Romakkaniemi et al. 2003; Anttila et al. 2008). Little or no supplementary stocking was performed in other parts of the Tornio or Kalix Rivers (Romakkaniemi 2008; ICES 2019). Since the late 1990s, the Tornio-Kalix salmon stock has recovered rapidly, but continues to be subject to harvest pressure from offshore, coastal and riverine fisheries (ICES 2019). Its importance as the main source of wild Baltic salmon means that sustainable management of the stock is essential. 
The cross-border nature of the river system, coupled with the bifurcation connecting the two rivers, poses a special management challenge. However, the bifurcation has not yet been considered particularly important in the assessment (e.g. ICES 2019) or management of the stocks.

Fig. 1

Kalix and Tornio Rivers, showing the 45 parr electrofishing locations (dots), combined into 16 named sampling sites for statistical analyses (dashed lines), catch area of returning adults (grey hatching spanning sites T2 and T3), and location of the smolt trap

The Tornio and Kalix Rivers and their salmon stocks are of equivalent size and importance to the extensively studied and geographically close Teno River. However, so far no comprehensive genetic survey of juvenile salmon from different parts within both the Tornio and Kalix Rivers has been carried out, and currently salmon in the two rivers are assessed (e.g. ICES 2019) and managed separately with the assumption of no structuring within each river (Palm et al. 2020). Previous studies, focusing mostly on differences between the Tornio and Kalix and sampling only a few sites within the rivers, have reported minor genetic differences among salmon parr (juveniles feeding in freshwater), smolts, and adults sampled from the system (e.g. Ståhl 1981; Koljonen and McKinnell 1996; Koljonen et al. 1999; Nilsson et al. 2001; Säisä et al. 2005; Verspoor et al. 2005; Koljonen 2006, but see Jansson 1993).

Here, we apply 18 microsatellite markers to samples of salmon from different life stages to investigate the fine-scale population genetic structure within the Tornio-Kalix River system. We further examine whether this genetic structure is associated with variation in freshwater and marine life history traits.

Materials and methods

Samples

Parr: A total of 772 salmon parr were sampled between August and October in 2012 (n = 725) and 2013 (n = 47) at 45 electrofishing locations in all major branches of the Tornio-Kalix River system, including the bifurcation (Tärendö River, Fig. 1). For all sites except T2, fin-clips of sampled parr were taken in the field and stored in 95% ethanol in individual tubes. For site T2, scale samples were taken in the field and stored in paper envelopes. To allow for genetic comparisons of different age groups, tissue samples from yearlings (age 0+, total length 30–60 mm) and older parr (age > 0+, total length > 60 mm) were collected as evenly as possible. The proportions of yearlings and older parr in the total material were 38.1% and 61.9%, respectively. For statistical analyses, the 45 locations were combined into 16 broader sampling sites based on geographical proximity and genetic homogeneity among locations within sites (results not shown) (Fig. 1).

Smolts: Tissue samples were collected from out-migrating smolts in 2011, using a trap placed close to the Tornio River mouth (Fig. 1). The trap was operated during the whole smolt migration season as part of a long-term monitoring program (starting 1 week after local ice break-up and ending when the daily smolt captures fell to below 0.1% of the total captures). Smolts from all upstream production areas were therefore assumed to be included in the collection. Samples were collected from every 320th trapped individual to a maximum of 5 fish per day from May 14th to June 28th, making a total of 196 smolts (107 females, 89 males). All smolts genotyped in this study had been killed at capture, measured, weighed, sexed, and subsequently aged by trained experts on the basis of scale growth rings, following international guidelines for Atlantic salmon scale reading (ICES 2011).

Adults: We genetically analyzed scale samples from 287 ascending adults that were caught by anglers 110–180 km from the Tornio River mouth (Fig. 1). The samples were collected in 2009 (87 females, 52 males, 6 unreported sex) and 2010 (106 females, 34 males, 2 unreported), throughout the entire fishing season (June 4th to August 15th). A maximum of 21 individuals (mean 4) per day were collected. Adults had been killed at capture, measured, weighed and sexed, and life history data (age of smolting in years, age at maturity in sea winters, number of previous spawnings) had been determined from scale growth rings as above (ICES 2011).

Microsatellite analyses

Total DNA was extracted following a Chelex extraction protocol (Walsh et al. 1991). The following 18 microsatellite loci were genotyped in two multiplexes: Ssa407 (Cairney et al. 2000), SSsp3016 (Gilbey et al. 2004), SSaD157 (King et al. 2005), Ssa14, Ssa289 (McConnell et al. 1995), Ssa85, Ssa171, Ssa197, Ssa202 (O’Reilly et al. 1996), SSsp1605, SSsp2201, SSsp2210, SSsp2216, SSspG7 (Paterson et al. 2004), SsOsl85, SsOsl311, SsOsl417, and SsOsl438 (Slettan et al. 1995). For each PCR reaction we used 4 µL of Type-it Multiplex PCR Master Mix (Qiagen), 4 µL multiplex primer mix, and 0.6 µL template with approximately 100 ng DNA. Uniform signal intensity among loci was achieved by adjusting primer concentrations. PCR was run with an initial step of 5 min at 95 °C followed by 25 cycles of 30 s at 95 °C, 90 s at 56 °C, 30 s at 72 °C and a final step of 15 min at 60 °C. Electrophoresis was performed on an ABI 310 with Liz 500 size standard and allele sizes were determined using the ABI Genotyper 3.7 software.

Statistical analyses

In order to avoid potentially biased allele frequency estimates due to family sampling of juvenile salmon (Hansen et al. 1997; Östergren et al. 2020), Colony 2.0.6.5 (Wang 2004; Jones and Wang 2010) was used to identify full siblings within the parr dataset. A total of 16 analyses (corresponding to the 16 sample sites) were performed with Colony, assuming polygamous mating patterns in both sexes as is typical for salmonids (Fleming 1996). We used the following threshold parameters for identifying full-sibs: Best (ML) full-sib family Prob. (Inc.) > 0.90 AND Prob. (Exc.) > 0.90. We retained only one parr from each identified full-sib family for the analyses below.
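The thinning step described above can be sketched in Python as follows. This is an illustration only: it assumes the Colony output has already been reduced to one family label per individual (None for fish not placed in an accepted full-sib family), and it keeps the first listed member of each family, which is an arbitrary choice. The function name and input layout are hypothetical and are not part of Colony.

```python
def keep_one_per_family(individuals):
    """Return IDs after dropping all but the first member of each full-sib family.

    individuals: iterable of (individual_id, family_id) pairs, where family_id
    is None for fish not placed in any accepted full-sib family. This layout is
    a hypothetical convenience, not a Colony file format.
    """
    seen_families = set()
    kept = []
    for individual_id, family_id in individuals:
        if family_id is None:
            kept.append(individual_id)      # no accepted family: always retained
        elif family_id not in seen_families:
            seen_families.add(family_id)    # first member represents the family
            kept.append(individual_id)
    return kept
```

Applied to 772 parr that include 11 families of sizes 2 (seven families), 3, 4, 5 and 8, this rule removes 23 fish and leaves 749, matching the sample size reported in the Results.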

The programs Genepop 4.7.3 (Rousset 2008), Arlequin 3.5.1.2 (Excoffier and Lischer 2010) and Fstat 2.9.4 (Goudet 2003) were used for computing population genetic parameters. Exact tests in Genepop were used to identify statistically significant deviations from Hardy–Weinberg proportions across loci (within sites) and sites (within loci). We also used Genepop to identify deviations from genotypic linkage equilibrium, in all pairs of loci and within all sites. We used the Markov chain parameters of 10,000 steps of dememorization, 1000 batches and 10,000 iterations per batch for all analyses, and assessed the significance of the results by applying the Bonferroni correction for multiple tests (Rice 1989).

Arlequin was used for computing the number of alleles per population and loci, F-statistics (global and pairwise FST estimates between sites, and significance assessed with 100,000 permutations; Weir and Cockerham 1984), and for performing an analysis of molecular variance (AMOVA) to assess genetic differences within versus between the rivers, with 100,000 permutations. Fstat was used to compute allelic richness (locus allele number standardized to the smallest sample size of 25 individuals).
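For readers unfamiliar with allelic richness, the sketch below shows the standard rarefaction estimator: the expected number of distinct alleles in a random subsample of fixed size. It assumes that standardizing to 25 diploid individuals corresponds to subsampling 50 gene copies per locus; the text does not spell out the formula, so this is a plausible reading and not a description of Fstat internals. The function name is illustrative.

```python
from math import comb


def allelic_richness(allele_counts, n_genes):
    """Expected number of distinct alleles in a subsample of n_genes gene copies.

    allele_counts: observed copy number of each allele at one locus in one sample.
    n_genes: rarefaction size in gene copies (assumed 2 x 25 = 50 for diploids).
    """
    total = sum(allele_counts)
    if n_genes > total:
        raise ValueError("rarefaction size exceeds the number of sampled gene copies")
    denom = comb(total, n_genes)
    # For each allele: 1 minus the probability that it is absent from the subsample.
    return sum(1 - comb(total - c, n_genes) / denom for c in allele_counts)
```

Rare alleles contribute less than one to the sum because they are likely to be missed in a small subsample, which is what makes samples of unequal size comparable.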

Genetic isolation-by-distance (IBD) among parr samples was analyzed using the Mantel test with 100,000 permutations in Arlequin, with FST/(1 − FST) as genetic distance and the shortest waterway (either via the bifurcation or the sea) as geographical distance between the sites.
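The logic of this test can be sketched in Python as follows: pairwise FST is linearized as FST/(1 − FST), correlated with waterway distance, and compared with correlations obtained after randomly relabelling the sites. This is a generic one-sided permutation Mantel test written for illustration; it is not Arlequin's implementation, and the function names, the default number of permutations and the one-sided P value are assumptions.

```python
import random
from math import sqrt


def linearized_fst(fst):
    """Transform FST to FST / (1 - FST)."""
    return fst / (1.0 - fst)


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def mantel_test(genetic, geographic, n_perm=9999, seed=1):
    """One-sided Mantel test on two symmetric square distance matrices.

    Returns (observed r, P), where P is the share of permutations (plus the
    observed arrangement) with r at least as large as the observed value.
    """
    n = len(genetic)
    pairs = [(i, j) for i in range(n) for j in range(i)]
    geo = [geographic[i][j] for i, j in pairs]
    r_obs = pearson([genetic[i][j] for i, j in pairs], geo)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)  # relabel sites, permuting rows and columns together
        permuted = [genetic[order[i]][order[j]] for i, j in pairs]
        if pearson(permuted, geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)
```

Rows and columns are permuted together because the pairwise distances are not independent observations; shuffling individual matrix cells would overstate significance.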

We used Poptreew (Takezaki et al. 2014) to estimate Nei’s DA (Nei et al. 1983) between the 16 parr sampling sites, for constructing a neighbour-joining tree (Saitou and Nei 1987), and estimated node support using 5000 bootstraps (Felsenstein 1985). We edited the tree with FigTree 1.4.3 (Rambaut 2012).

We used Structure 2.3.4 (Pritchard et al. 2000; Falush et al. 2003) to examine population genetic structure across the two rivers. All Structure models (burn-in of 50,000 steps, followed by 100,000 MCMC replicates) were run without prior information about sampling locations, and assuming admixture and correlated allele frequencies between genetic clusters. The number of clusters (K) was increased from 1 to 16 (for the total parr material, and separate runs for the two different parr age classes), or from 1 to 5 (for the total parr, smolt and adult material), always with 5 replicate runs per K. The most likely K was inferred with the ΔK method of Evanno et al. (2005), as implemented in Structure harvester (Earl and vonHoldt 2012), and the results were visualized by Clumpak 1.1 (Kopelman et al. 2015). Because studies relying on the ΔK method may show a dramatic overrepresentation of K = 2 (Janes et al. 2017; Cullingham et al. 2020), we also examined the estimated log probability of the data for each value of K (Ln Pr(X|K)) with Structure harvester.
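The ΔK statistic of Evanno et al. (2005) can be illustrated with a few lines of Python. The sketch assumes that the second-order difference is taken on the mean Ln Pr(X|K) across replicate runs and divided by the standard deviation among replicates at that K; this is a common way of summarizing replicate runs but should be read as an assumption about the exact computation. The function name and input layout are hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Delta K from replicate Ln Pr(X|K) values.

    lnp_by_k: dict mapping each K to a list of Ln Pr(X|K) values, one per
    replicate run (at least two replicates per K).
    Returns a dict mapping K to Delta K for every K with neighbours K - 1 and K + 1.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # second difference is undefined at the ends of the K range
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # Delta K is undefined when replicate runs are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta
```

Because ΔK needs both neighbours, it cannot be evaluated at K = 1, which is one reason to inspect Ln Pr(X|K) alongside it.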

We performed genetic mixed stock analyses (MSA) for the Tornio River smolts and adults using the R (R Core Team 2020) package rubias (Moran and Anderson 2019). Because of the bifurcation between the Tornio and Kalix Rivers, smolts and adults sampled in the lower Tornio could potentially originate from spawning areas in either the Kalix or the Tornio. We therefore used all 16 parr sampling sites as a genetic baseline, and split them into two reporting groups, ‘Lower’ and ‘Upper’ reaches (Table 1), based on geographical location and the results of our population genetic analyses (see below). First, we used a leave-one-out reassignment procedure (Anderson et al. 2008) to evaluate expected accuracy of assignment to the 16 parr sites and two reporting groups. Each fish was assigned to the site to which it had the highest posterior probability of assignment (‘scaled likelihood’ in rubias), and reporting group accuracy was estimated from combined site assignments. Subsequently, we performed separate genetic mixture analyses for the smolts and adults to estimate the proportion of the total mixture originating from each parr site and reporting group, and the posterior mean of reporting group membership for each individual. We qualitatively explored changes over the sampling season by dividing the data into 1-week intervals for smolts and 2-week intervals for adults and repeating the mixture analysis (see above) for these subsets. We used the default rubias maximum likelihood approach with 2000 MCMC sweeps and discarded 200 sweeps as burn-in.
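Reporting-group membership for an individual is obtained by summing its site-level posterior assignment probabilities over the sites in that group, as stated for P(Upper) in the Results. A minimal Python sketch follows; the site labels in the usage example are placeholders, because the actual allocation of sites to reporting groups is given in Table 1 and is not reproduced here.

```python
def reporting_group_probability(site_posteriors, group_sites):
    """Sum one individual's site-level posterior probabilities over a group.

    site_posteriors: dict mapping site code to posterior assignment probability
    for one fish (values are expected to sum to 1 across all baseline sites).
    group_sites: set of site codes that make up the reporting group.
    """
    return sum(p for site, p in site_posteriors.items() if site in group_sites)


# Hypothetical example: the group membership below is illustrative only.
example_upper_sites = {"K4", "T6"}
example_fish = {"K1": 0.50, "K4": 0.20, "T2": 0.05, "T6": 0.25}
example_p_upper = reporting_group_probability(example_fish, example_upper_sites)
```

Summing over sites is what allows reporting groups to be assigned more reliably than individual sites: uncertainty between two neighbouring sites in the same group does not reduce the group-level probability.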

Table 1 Basic genetic statistics for samples of salmon parr (16 groups of closely located electrofishing locations combined; n = 749; Fig. 1): reporting group (RG) used for Mixed Stock Analysis, number of individuals analysed (n) after full-sib removal (see text), mean number of alleles per locus (A), allelic richness per locus (AR, adjusted for the smallest sample size, 25 individuals), observed and expected mean heterozygosity (HO and HE, 18 loci), FIS estimates (mean over all loci), and P values for exact tests of Hardy–Weinberg equilibrium

We used chi-squared tests to test for independence between sex, sea age, and smolt age, using the categories described below. To investigate whether there were differences in smolt age or sea age between different parts of the river system, we first performed multinomial logistic regression using the nnet package in R (Venables and Ripley 2002), with each individual’s posterior mean of membership to the ‘Upper’ reporting group (hereafter ‘P(Upper)’) as the explanatory variable. For smolt age (assessed from both smolts and adults), we defined three categories for the response variable: 2 years (n = 31), 3 years (n = 289), and 4–5 years (n = 162). For sea age (assessed from adults only), we defined five categories for the response variable: 1 sea winter (1SW; 16 males, 1 female, 1 unknown), 2 sea winter (2SW; 45 males, 114 females, 5 unknown), 3 sea winter (3SW; 19 males, 31 females, 1 unknown), 4–5 sea winter (4–5SW; 15 females, 2 males) and repeat spawners (RS; fish that had returned to spawn at least once previously, independent of their age at first return; 4 males, 32 females, 1 unknown). We assessed statistical significance using a two-tailed z test.
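The sex-by-sea-age counts listed above are sufficient to work through a Pearson chi-squared test of independence. The Python sketch below assumes that fish of unknown sex are excluded and that no continuity correction is applied; both are assumptions, as the text does not state them. The closed-form tail probability used here is valid only for even degrees of freedom, which covers both tests in this study (df = 2 and df = 4).

```python
from math import exp, factorial

# Rows: 1SW, 2SW, 3SW, 4-5SW, repeat spawners; columns: males, females.
SEX_BY_SEA_AGE = [[16, 1], [45, 114], [19, 31], [2, 15], [4, 32]]


def chi2_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable; even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires positive, even degrees of freedom")
    half = x / 2.0
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


statistic, dof = chi2_independence(SEX_BY_SEA_AGE)
p_value = chi2_sf_even_df(statistic, dof)
```

With these counts the statistic is approximately 43.08 on 4 degrees of freedom, in agreement with the value reported in the Results. Most of it comes from the 1SW row, where 16 of 17 sexed fish are male.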

Because of the highly skewed nature of our smolt age dataset (both in terms of age and P(Upper)), we further investigated the robustness of any relationship between smolt age and P(Upper) using a randomization approach. We created 5000 datasets with no association between P(Upper) and smolt age by assigning each individual a random age generated by sampling with replacement from the true age distribution. We calculated median P(Upper) for each of the three smolt age categories for each of the 5000 randomized datasets, and compared the true median P(Upper) to the distribution of the simulated median P(Upper).
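A minimal Python sketch of this randomization is given below. Ages are resampled with replacement from the observed age vector while P(Upper) values stay in place, and the observed median of each age category is compared with the simulated medians. How the reported P values were derived from the simulated distribution is not stated, so returning the share of simulations at or below, and at or above, the observed median is an assumption; the function name is illustrative.

```python
import random
from statistics import median


def median_randomization(p_upper, ages, n_sim=5000, seed=1):
    """Compare observed per-category medians of P(Upper) with a null distribution.

    Returns a dict mapping each age category to a tuple:
    (observed median, share of simulated medians <= observed,
     share of simulated medians >= observed).
    """
    rng = random.Random(seed)
    categories = sorted(set(ages))

    def group_medians(labels):
        result = {}
        for category in categories:
            values = [p for p, a in zip(p_upper, labels) if a == category]
            if values:  # a category can be empty in a resampled dataset
                result[category] = median(values)
        return result

    observed = group_medians(ages)
    n_low = dict.fromkeys(categories, 0)
    n_high = dict.fromkeys(categories, 0)
    for _ in range(n_sim):
        # Sampling ages with replacement breaks any link to P(Upper).
        simulated = group_medians(rng.choices(ages, k=len(ages)))
        for category, value in simulated.items():
            if value <= observed[category]:
                n_low[category] += 1
            if value >= observed[category]:
                n_high[category] += 1
    return {
        c: (observed[c], n_low[c] / n_sim, n_high[c] / n_sim) for c in categories
    }
```

With 5000 simulated datasets the smallest resolvable proportion is 1/5000 = 0.0002, so an observed median more extreme than every simulated value can only be reported as P < 0.0002.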

To investigate the relationship of P(Upper), smolt age (smolts only), sea age (adults only), year (adults only), and sex (smolts and adults) to seasonal migration timing, we used the MASS package in R (Venables and Ripley 2002) to fit a negative binomial generalized linear model (GLM with a log link) to catch date (coded as number of days since May 1 for smolts, and days since June 1 for adults). In order to compare the Akaike information criterion (AIC) between models fitted to the same set of individuals, we removed 8 individuals with unknown sex. We included all of the explanatory variables listed above as fixed effects without interactions in the full model. We then simplified the model by stepwise reduction (using the step function in R), to select the model with the lowest AIC.

Again, because of the highly skewed nature of our smolt dataset in terms of P(Upper), we further investigated the robustness of any relationship between P(Upper) and smolt capture date using a randomization approach. Because running large numbers of GLMs would be inefficient, we instead used the Spearman rank correlation coefficient (ρ) between the two variables as our exploratory measure. To retain the temporal structure of our dataset (maximum 5 individuals sampled per day), we created 5000 datasets by assigning each individual a random P(Upper) generated by sampling with replacement from the true P(Upper) distribution. We compared the true ρ to the distribution of ρ across the simulated datasets.
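The corresponding randomization for capture date can be sketched in Python as follows. Capture dates are held fixed so that the daily sampling pattern is preserved, P(Upper) values are resampled with replacement, and the Spearman coefficient is recomputed for each simulated dataset. Ties receive average ranks. The one-sided comparison and the function names are assumptions made for illustration.

```python
import random
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho as the Pearson correlation of ranks (0.0 if no variance)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def spearman_randomization(p_upper, capture_day, n_sim=5000, seed=1):
    """Return (observed rho, share of simulated rho >= observed rho)."""
    rng = random.Random(seed)
    rho_obs = spearman(p_upper, capture_day)
    hits = 0
    for _ in range(n_sim):
        resampled = rng.choices(p_upper, k=len(p_upper))  # with replacement
        if spearman(resampled, capture_day) >= rho_obs:
            hits += 1
    return rho_obs, hits / n_sim
```

A rank-based coefficient is a reasonable stand-in for the GLM here because it is cheap to recompute thousands of times and is insensitive to the strong skew in P(Upper).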

Results

Genetic variation

Colony identified 34 putative full-sibs from 11 families (7 families with 2 full-sibs, 1 family of each of the following full-sib family sizes: 3, 4, 5 and 8). Removal of all but one individual in each full-sib family resulted in a final total parr sample of 749 individuals.

No parr sampling sites (i.e. combinations of nearby electrofishing locations) or microsatellite loci (across sites) deviated significantly from Hardy–Weinberg genotypic proportions after Bonferroni correction for multiple tests (α = 0.05) (Table 1; Online Resource 1, Table S1). Furthermore, none of the loci showed consistent signs of linkage after pairwise tests within sites and Bonferroni correction (α = 0.05) (not shown). Therefore, we retained all 18 microsatellite loci in our analyses.

The number of alleles per locus observed in all samples combined (parr, smolts, adults; n = 1232) ranged from 3 to 29 (mean 14.6). Mean HE across loci was 0.71, with a range from 0.21 to 0.93 (Online Resource 1, Table S1), and 0.70 over all sites, ranging from 0.67 in K4 (Kaitum River) to 0.73 in T7 (Muonio River) (Table 1). Mean HE in the ‘Upper’ reporting group of the Kalix River system ranged from 0.67 to 0.69, and from 0.70 to 0.72 in the ‘Lower’ reporting group. Mean HE in all sites of the ‘Upper’ reporting group of the Tornio River system was 0.70, and ranged from 0.70 to 0.73 in the ‘Lower’ reporting group. Mean allelic richness was 8.11, ranging from 6.84 in K4 (Kaitum River) to 8.98 in T2 (Tornio River) (Table 1).

Within-river genetic structure

Global mean FST among sampling sites was low but statistically significant, both across and within rivers (both rivers: FST = 0.015, P < 0.001; Kalix only: FST = 0.015, P < 0.001; Tornio only: FST = 0.015, P < 0.001). A hierarchical AMOVA revealed no overall genetic differentiation between salmon parr in the Kalix and Tornio rivers (Fbetween rivers = −0.0004, P = 0.411), but clear differentiation among sites within the two rivers (Fwithin rivers = 0.015, P < 0.001) (Online Resource 2, Table S2).

Pairwise FST values between the 16 parr sites are shown in Table 2. Out of 120 pairwise comparisons, 90 (75%) were statistically significant (P < 0.0004) following Bonferroni correction (α = 0.05). Low and non-significant pairwise FST values were observed among samples from the lowest parts of the two rivers (sites K1, K2, T1, T2 and T3; 9 of 10 comparisons non-significant). The highest estimated FST (0.062) was found between sites K4 and K6 from different branches of the Kalix River system (tributaries Kaitum River and Ängesån River, respectively). Parr from the tributaries Ängesån River (sites K6 and K7) and Lainio River (T6) differed significantly from parr from all other sites.
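The number of pairwise comparisons and the Bonferroni-adjusted significance threshold quoted above follow directly from the number of sites, as this short Python sketch shows. The function name is illustrative.

```python
from math import comb


def bonferroni_pairwise(n_sites, alpha=0.05):
    """Number of unordered site pairs and the per-test Bonferroni threshold."""
    n_tests = comb(n_sites, 2)
    return n_tests, alpha / n_tests


n_tests, threshold = bonferroni_pairwise(16)  # 120 tests, threshold near 0.0004
share_significant = 90 / n_tests              # 0.75
```

The same calculation gives 10 comparisons among the five lowermost sites (K1, K2, T1, T2 and T3).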

Table 2 Pairwise FST estimates among the 16 parr sampling sites (below diagonal), with asterisks marking statistically significant (P < 0.0004) pairwise comparisons after Bonferroni correction (k = 120, α = 0.05)

A neighbour-joining tree based on Nei’s genetic distance (Nei et al. 1983) further supported the occurrence of larger genetic differences within than between the rivers: upstream sites tended to group with other upstream sites, and downstream sites with other downstream sites, regardless of river (Fig. 2). The strongest bootstrap support (77 to 93%) was found for the branches containing sites K6 and K7 (Kalix River tributary Ängesån River), the three upstream Kalix River sites K3–K5, and a larger group including the upstream Kalix sites (K3–K5) and all three uppermost sites from the different Tornio River tributaries (T4, T6 and T8; Fig. 2).

Fig. 2

Unrooted neighbour-joining dendrogram based on Nei’s DA among the 16 sample sites in the Tornio-Kalix River system, with the ‘Lower’ (downstream) and ‘Upper’ (upstream) reporting groups illustrated. Bootstrap support values ≥ 75% are shown. Pie charts illustrate mean ancestry from the two genetic clusters identified using Structure

We observed a pattern of isolation-by-distance (IBD) when measuring distance among all sampling sites either via the bifurcation (r = 0.487, P < 0.001, Fig. 3), or via the sea (r = 0.226, P = 0.044). Signals of IBD were also obtained when performing the analysis separately within each river (Kalix, bifurcation excluded: r = 0.721, P = 0.005; Tornio: r = 0.492, P = 0.046).

Fig. 3

Relationship between pairwise genetic distance and geographical distance (km; measured as the shortest waterway between sites via the bifurcation connecting the rivers) in the Tornio-Kalix River system

Structure results suggested K = 2 genetic clusters, both within the sampled parr and for all sampled fish (parr, smolts and adults combined), supported by Ln Pr(X|K) values and/or the ΔK statistic (Fig. 4; Online Resource 3, Fig. S1, S2, Table S3). The clusters did not correspond to the two rivers; instead, parr from the lower parts of both rivers exhibited more ancestry from one cluster, and parr from the upper parts of both rivers more ancestry from the second cluster (Fig. 4). Separate Structure runs for parr of different ages produced similar results supporting K = 2 (Online Resource 3, Fig. S3).

Fig. 4

Inferred ancestry of salmon parr from the Kalix (K1–K8) and Tornio Rivers (T1–T8) to two genetic clusters (K = 2) according to Structure. Each vertical line represents an individual fish. Horizontal lines underneath the codes for sampling sites illustrate ‘Lower’ (downstream) and ‘Upper’ (upstream) reporting groups

Mixture analysis

In the baseline reassignment test, 84.8% of parr from the ‘Lower’ reaches and 78.8% from the ‘Upper’ reaches of the Tornio-Kalix system were reassigned to their correct reporting groups. Estimated P(Upper) for each of the baseline individuals, calculated as the sum of posterior assignment probabilities to each of the ‘Upper’ collection sites, is shown in Fig. 5. Mixture analysis inferred the majority of sampled smolts and adults to originate from sites in the ‘Lower’ reporting group (estimated mixture proportions from ‘Upper’ reporting group: Smolts 2011, mean = 0.066, 95% CI 0.021–0.121; Adults 2009, mean = 0.294, 95% CI 0.213–0.385; Adults 2010, mean = 0.450, 95% CI 0.356–0.543).

Fig. 5

The posterior mean of membership to the ‘Upper’ reporting group (P(Upper)) for all baseline individuals, as estimated using rubias, calculated as the sum of posterior assignment probabilities to each of the ‘Upper’ collection sites

Assignment accuracy at the level of the 16 parr sites was much lower than for the two reporting groups, with > 50% of baseline individuals correctly reassigned only for sites K4, K7, T6 and T8 (Online Resource 4, Table S4). Due to this, we collapsed the individual sites to the ‘Lower’ and ‘Upper’ reporting groups (Online Resource 4, Fig. S4, S5). In both years, sites from the ‘Upper’ reporting group contributed a large proportion of the earliest returning adults. The large majority of smolts appeared to originate from sites in the ‘Lower’ reporting group (Online Resource 4, Fig. S4).

Smolt age (examined in smolts and adults) was independent of sex, but sea age (examined in adults only) was not: 1SW fish were almost exclusively males, while 4–5SW fish and repeat spawners were disproportionately female (sex vs. smolt age: χ2 = 3.84, df = 2, P = 0.147; sex vs. sea age: χ2 = 43.078, df = 4, P < 0.0001). Multinomial logistic regression demonstrated P(Upper) to be related to smolt age: fish with a higher P(Upper) tended to spend a longer time in freshwater before smolting (Table 3, Fig. 6a; Online Resource 4, Fig. S6a). This relationship was further confirmed by the randomization analysis: observed median P(Upper) for 2 year and 3 year old smolts was in the lowest part of the simulated neutral distribution, while for 4–5 year old smolts it was higher than any simulated value (2Y, median P(Upper) = 0.0006, P = 0.0076; 3Y, median P(Upper) = 0.0083, P = 0.0426; 4–5Y, median P(Upper) = 0.1200, P < 0.0002). P(Upper) was also related to sea age, which was driven by almost all of the 1SW adults having a very low probability of assignment to the ‘Upper’ reporting group. P(Upper) was not related to age of return for fish older than 1SW (Table 3, Fig. 6b; Online Resource 4, Fig. S6b).

Table 3 Results of multinomial logistic regression, showing the relationship between P(Upper) and smolt age (adults and smolts) and between P(Upper) and sea age (adults only)

Fig. 6

a Seasonal timing of out-migrating smolts from the Tornio River in 2011. Capture date shows days since May 1. P(Upper) shows each individual’s posterior mean of membership to the ‘Upper’ reporting group. Each panel includes original data (data points) and model predicted values with 95% confidence bands. Membership to the ‘Upper’ reporting group significantly delays capture date (see Table 4). b Seasonal timing of returning adults caught in the Tornio River in 2009 and 2010. Capture date shows days since June 1. P(Upper) shows each individual’s posterior mean of membership to the ‘Upper’ reporting group. The panels include original data (data points) and model predicted values for different age classes, with 95% confidence bands. Genetic cluster, sea age and year significantly affect capture date (see Table 4)

The negative binomial GLM that provided the best fit for smolt capture date included only P(Upper) as an explanatory variable. Smolts with a higher P(Upper) arrived at the river mouth later in the smolt migration season (Table 4, Fig. 6a). The randomization analysis examining Spearman rank correlation coefficient between P(Upper) and smolt capture date supported this relationship: observed ρ (0.358) was higher than all 5000 simulated ρ (P < 0.0002). The model that provided the best fit for adult catch date included the following explanatory variables: year, sea age, and P(Upper) (Table 4). In both years, adult salmon returning earlier in the season had a higher P(Upper) than individuals returning later. The median catch dates of 2–5SW fish (19 days since June 1) and repeat spawners (13 days since June 1) were also early compared to 1SW adults (median catch date 56.5 days since June 1), none of which were caught in the first half of the fishing season (Fig. 6b). However, this later migration date of 1SW fish was essentially based on data from 2009 only, as only one 1SW individual was sampled in 2010.

Table 4 Results of negative binomial GLM, showing estimated model coefficients for capture date of smolts as a function of P(Upper), and capture date of adults as a function of P(Upper), sea age, and year

Discussion

We found population genetic structuring within the wild Atlantic salmon stocks of the Tornio and Kalix Rivers in the northern Baltic. Interestingly, we did not find evidence of overall genetic divergence between these two large interconnected rivers, but rather parallel genetic structuring within each one. In the Tornio River, this internal structure was associated with differences in migration timing of emigrating smolts and adults returning to the river to spawn. The link between genetic structure and observed life history variation in the river system indicates that our findings are important for the management of these largest remaining wild Baltic salmon rivers.

Upstream–downstream isolation-by-distance within rivers

We observed clear genetic differentiation between the upstream and downstream sections of both the Kalix and the Tornio Rivers, associated with an underlying pattern of isolation-by-distance (IBD). By comparing parr of different ages we found that this genetic structure was stable across at least two years. Similar upstream–downstream structuring in the Kalix was also observed in the 1980s by Jansson (1993). Additionally, the Jokkfall waterfall (just north of site K2) in the Kalix River was a natural, partial migration obstacle that salmon could pass only during low water flow, until a fish ladder was built beside it in 1980 (Jansson 1993). This historical barrier has likely contributed to the upstream–downstream genetic differentiation in the Kalix. Furthermore, the presence of a strong IBD pattern implies population genetic stability over a longer period of time, as has been found in other large Atlantic salmon populations (Vähä et al. 2008; Ozerov et al. 2013). It is unclear whether the identification of two genetic clusters by the Bayesian clustering algorithm indicates a true genetic transition zone within the rivers, or whether it merely reflects an artefact caused by the IBD pattern (Frantz et al. 2009; Perez et al. 2018). However, both scenarios support a model of fine-scale natal homing with salmon returning to relatively specific spawning sites within the river system.

Large-scale supplementary stocking using salmon of local origin took place in the Tornio River system from 1977 to 2002 (close to sites T1, T2, T7 and T8; Romakkaniemi 2008). Although extensive, these releases have not been considered a key factor in the population recovery of the Tornio-Kalix salmon (Romakkaniemi et al. 2003). This past stocking has apparently not entirely obscured genetic differentiation in the system, despite its potential to artificially increase gene flow between the upper and lower reaches. However, it is still unknown if within-river genetic structuring has decreased over time. Genetic analysis of historic samples collected before the stocking activities would clarify the effects of this practice.

Our results underline the difficulty of generalizing about within-river genetic structuring of large Atlantic salmon populations (see Dionne et al. 2009). The multiple large tributaries of the Tornio-Kalix River system (Fig. 1, global FST = 0.015) might be expected to host strongly differentiated subpopulations, as in the geographically adjacent Teno River (Vähä et al. 2007, 2008, global FST = 0.067). Instead, the amount of genetic differentiation among sampling sites and the IBD pattern observed in the Tornio and Kalix mirror that seen in the large Varzuga River system of northern Russia (Primmer et al. 2006, global FST = 0.014), and in the Teno River main stem (Vähä et al. 2017). Conversely, the degree of population genetic structuring in the Tornio and Kalix is higher than that of the large Miramichi system and salmon stock in Canada (global FST = 0.004; Dionne et al. 2009; Moore et al. 2014; Wellband et al. 2018).

The relatively low level of genetic substructure observed at the present microsatellite markers does not rule out the possibility of adaptive genetic differentiation in the Tornio-Kalix system at other loci. For example, by using a genomic dataset, Wellband et al. (2018) found support for adaptive processes associated with summer precipitation being the primary force in shaping the genetic structure in the Miramichi system, with very low population differentiation at presumably neutral markers (Dionne et al. 2009; Wellband et al. 2018). Despite weak neutral differentiation, the tributaries and different parts of the Tornio-Kalix system may contain adaptive variation associated with environmental characteristics. One possible selective agent is the Atlantic salmon flatworm parasite Gyrodactylus salaris. Lumme et al. (2016, see also Anttila et al. 2008; Kuusela et al. 2009) found stable upstream–downstream genetic divergence of G. salaris in the Tornio, and coadaptation between the local parasite populations and local salmon host populations is one mechanism that could maintain upstream–downstream structuring in the river by reducing the fitness of inter-population hybrids (see Karvonen and Seehausen 2012).

Genetic homogeneity between rivers

In contrast to the upstream–downstream structure in the Kalix and Tornio, we observe little overall genetic differentiation between the two rivers. Instead, downstream sites in one river are genetically more similar to downstream sites in the other river than to upstream sites in the same river, and vice versa. This suggests the occurrence of homogenizing gene flow between equivalent lower and upper sections of the two different rivers, and/or colonization/recolonization of the upper and lower river sections by different ancestral groups.

Genetic homogenization among different parts of the Tornio-Kalix River system is likely facilitated by the unusual bifurcation tributary (Tärendö River, K8, Fig. 1) connecting the two drainages about 180 to 250 km from the river mouths. It provides a potential route for adult salmon ascending the middle part of the Kalix to access Tornio headwaters, and vice versa. We observed no significant genetic differentiation between parr in the bifurcation and parr in most parts of the Kalix mainstem or the mid to lower Tornio (Table 2). This, together with the stronger IBD signal obtained when the route through the bifurcation is treated as the shortest waterway rather than excluded, supports a hypothesis of gene flow via this route. Additionally, the bifurcation mixes water of the two rivers, potentially causing returning salmon navigating via olfactory cues (Petersson 2016) to stray between river mouths. Tagging studies have indeed found that around 7% of smolts originating from the Tornio River are recaptured as adults in the Kalix River (A. Romakkaniemi, unpublished data). This is, however, within the range of typical Atlantic salmon straying rates between watersheds estimated in other studies (3–10%, Stabell 1984; Jonsson et al. 2003; Keefer and Caudill 2014).
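To make the comparison of waterway routes concrete, the sketch below computes the shortest waterway distance between two headwater sites on a small river network, with and without a connecting bifurcation. The network layout, node names and all distances are hypothetical placeholders chosen for illustration, not measurements from the Tornio-Kalix system.

```python
import heapq

def shortest_waterway(edges, start, goal):
    """Shortest path length (km) along undirected river segments (Dijkstra)."""
    graph = {}
    for a, b, km in edges:
        graph.setdefault(a, []).append((b, km))
        graph.setdefault(b, []).append((a, km))
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue                      # stale queue entry
        for nxt, km in graph.get(node, []):
            cand = dist + km
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(queue, (cand, nxt))
    return float("inf")                   # no connection

# Hypothetical segment lengths in km (illustrative only).
segments = [
    ("kalix_mouth", "tornio_mouth", 60),   # sea route between the river mouths
    ("kalix_mouth", "kalix_mid", 200),
    ("kalix_mid", "kalix_upper", 150),
    ("tornio_mouth", "tornio_mid", 220),
    ("tornio_mid", "tornio_upper", 180),
]
bifurcation = [("kalix_mid", "tornio_mid", 50)]

via_sea = shortest_waterway(segments, "kalix_upper", "tornio_upper")
via_bifurcation = shortest_waterway(segments + bifurcation, "kalix_upper", "tornio_upper")
print(via_sea, via_bifurcation)
```

Computing one pairwise distance matrix with the bifurcation edge and one without, and then correlating each with genetic distance, is one way the two IBD signals could be compared.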

Cauwelier et al. (2018b) examined 11 Atlantic salmon rivers in Scotland and observed a similar pattern of upstream–downstream rather than among-river genetic structuring, associated with exceptionally high rates of straying between rivers (27.4%). This suggests an alternative hypothesis that factors independent of the bifurcation are maintaining the parallel upstream–downstream structure in the Kalix and Tornio. In particular, a genetic basis to timing of freshwater entry and/or migration duration in the river could cause strays originating from the upper section of one river to preferentially spawn in the upper reaches of the alternative river, thus mediating gene flow between these distant headwater sites.

Life history variation within the river system

Our results imply substantial life history variation among salmon spawning in different parts of the Tornio River system. Specifically, salmon with a higher P(Upper), which were more likely to originate from stretches higher up in the river, spent more years feeding in freshwater before smoltification, entered the marine environment later during their smolt migration, rarely matured after just one year at sea, and returned to freshwater at an earlier date, compared to fish with a lower P(Upper), which were more likely to originate from the downstream reaches. Such variation in Atlantic salmon life history timing is present across and within most river systems, and is known to have both an environmental and genetic component (e.g. Thorstad et al. 2011; Barson et al. 2015).

In line with our findings, many studies have found that adult salmon entering rivers early in the season tend to be older (e.g. Jokikokko et al. 2004; Quinn et al. 2006; Harvey et al. 2017) and originate from higher up in the systems than later-returning individuals (Shearer 1990; Økland et al. 2001; Stewart et al. 2002; Niemelä et al. 2006; Östergren 2006). Vähä et al. (2011a) studied 1SW salmon entering the Teno River, and found that individuals from subpopulations in smaller tributaries entered freshwater earlier. They proposed that the earlier ascending adults have an advantage in situations with competition for limited spawning sites. Ascending the river early may also allow large salmon to reach their spawning sites before water becomes too shallow for them in the course of the season (Niemelä et al. 2006). It is also possible that salmon returning to the upper reaches of the Tornio enter the river earlier simply because they have a longer way (up to 450 km) to swim to their spawning grounds than downstream salmon. Conversely, the long distance could be the reason for upstream smolts to reach the sea later than their downstream counterparts. However, if smolts throughout the river aim to reach the sea at the same optimal conditions and date, smolts from the upper reaches could be expected to initiate their migration earlier in the season (Stewart et al. 2006). It should also be noted that, as only one 1SW salmon was sampled in 2010, it is difficult to draw fully robust conclusions about differences in the age structure of returning adults between the reporting units.

Variation in seasonal run timing of adults in multiple populations is associated with genetic differences, including a possible locus of large effect (e.g. Stewart et al. 2002; Cauwelier et al. 2018a; Pritchard et al. 2018). Differences in smolt migration timing among populations have also been suggested to be adaptive and have a genetic basis (Stewart et al. 2006; Thorstad et al. 2012; Harvey et al. 2020). In addition, the age at smoltification seems to be an adaptive and highly heritable feature influenced by genetics (e.g. Páez et al. 2011; Pedersen et al. 2013), while also being controlled by environmental cues and conditions (e.g. Otero et al. 2014). For example, a shorter growing season and subsequently worse conditions for juvenile growth increase smolt age of Atlantic salmon (Metcalfe and Thorpe 1990). The uppermost sampling sites in the Tornio River system experience around 10% more days with ice cover than the lowermost sites (220 days vs. 200 days, Korhonen 2006), which could partially explain why smolts with higher P(Upper) were on average emigrating at an older age.

Implications for conservation and management

Currently, salmon in the Tornio and Kalix Rivers are managed as two separate stocks. Our analyses suggest that this may not be warranted from the conservation genetic perspective. Instead, an overall aim should be to retain the genetic and life history diversity present in both rivers, by ensuring the security of populations throughout these large river systems across Finland and Sweden. Therefore, our findings highlight the importance of cross-border cooperation in the management of these two salmon rivers. However, before a final recommendation on the most appropriate management strategy can be made, it would be advisable to gain a clearer picture of possible genetic structure at potentially adaptive loci, and also assess how well evolutionary and ecological processes match in the system. As discussed for example by Waples and Gaggiotti (2006), ecological and evolutionary population concepts do not always go “hand in hand”, and in some situations a high rate of gene flow (evolutionary process) can still be accompanied by relatively low demographic exchange (ecological process). In the present case, straying between the lower Tornio and Kalix appears high enough to prevent genetic differentiation, but may still be low enough to allow largely independent demographic dynamics in the separate rivers. Hence additional research is warranted before recommending changes with respect to practical fisheries management (i.e. data collection and stock assessment).

Our results suggest that the salmon populations in the upper reaches of the Tornio-Kalix system harbour important genetic and life history diversity. These populations may be particularly vulnerable to the effects of harvesting both at sea and in the rivers due to their longer generation time and longer riverine migrations, increasing the chance of mortality before reproduction (e.g. Garant et al. 2003). Moreover, similar to Jansson (1993), we observed slightly lower allelic richness and mean heterozygosity in the upper than the lower sections of the Tornio-Kalix system. As genetic diversity is considered a prerequisite for evolutionary potential and population health, reduced genetic variation may indicate that the upstream populations in the Tornio-Kalix system are more vulnerable to environmental changes.

In the sea, Baltic salmon are harvested throughout their feeding and spawning migrations. Harvesting on the marine feeding grounds has markedly decreased over the last three decades, partly because lower catch quotas have reduced overall sea fishing pressure (ICES 2019). This is expected to have increased survival of later maturing salmon in particular, as natural and fishery-related mortality at sea for Baltic salmon is high (ICES 2019), and may thus have benefitted salmon from the upper reaches of the Tornio and Kalix. Coastal fishing targets adult salmon migrating back to their natal river to spawn, and has also decreased due to quota restrictions. Additionally, delayed opening of the coastal fishing season of Baltic salmon since the 1980s is thought to have contributed to the recovery of the Tornio-Kalix salmon stock (Romakkaniemi et al. 2003). Salmon of the Tornio and Kalix are particularly targeted by fisheries along the Finnish coast (Whitlock et al. 2018). After two decades of delayed opening of the coastal fishing season, recent regulatory changes in the Finnish coastal fisheries now allow limited harvesting of salmon in the early season. Our results show that salmon entering the Tornio River early appear to be largely on the way to the upper river reaches, and allowing early fishing may thus increase fishing pressure on the upstream populations. Therefore, it is particularly important to closely monitor and study the consequences that the changed temporal regulation of coastal fishing may have for the Tornio and Kalix salmon populations.

In the Tornio River, salmon assigned to the ‘Upper’ reporting group are strongly over-represented in the adult samples caught by anglers, compared to the smolt samples caught in the smolt trap (29.4% of adults in 2009, 45.0% of adults in 2010; 6.6% of smolts). This observation could be driven by two phenomena relevant to management of the salmon stocks. First, upstream fish may have an increased rate of survival in the marine environment compared to downstream fish, for unknown reasons. Second, there may be disproportionate riverine fishing pressure on Tornio fish originating from the upstream reaches. Our adult sampling approximates total riverine fishery catches well along the main fishing area, where angling pressure and catches per km appear to be the highest: Finnish anglers catch a large majority of the total salmon catch in the Tornio River system, and about one-third of their total catch was caught from this 70 km long stretch in 2009–2010 (Vähä et al. 2010, 2011b; Palm et al. 2020). However, we caution that this observation may also be due to biased sampling, for five reasons. First, the samples were taken in different years (adults 2009–2010; smolts 2011). Second, while smolt samples were collected over the entire known emigration period, adult samples were restricted to the legal fishing season, which may not overlap the entire return migration period. Third, if the temporal window for migration of upstream smolts is much narrower than that of smolts from lower reaches, taking a maximum of 5 smolt samples per day would under-sample the former. Fourth, the adult fishing location was above the lowest Tornio sites. Finally, the distribution of our adult samples was mildly biased to the early fishing season, compared to the total catch. We recommend further investigation to confirm whether upstream adults are disproportionately targeted in the riverine fishery.
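The third source of potential bias, a fixed daily cap on smolt samples, can be illustrated with a small deterministic calculation. The sketch below assumes that the capped daily sample is drawn in proportion to the fish passing the trap on that day; the daily counts, the season length and the function name `upstream_share` are hypothetical and serve only to show the direction of the effect, not its magnitude in the Tornio River.

```python
def upstream_share(daily_upstream, daily_downstream, daily_cap=5):
    """Return (true share, expected sampled share) of upstream smolts."""
    total_up = sum(daily_upstream)
    total_all = total_up + sum(daily_downstream)
    sampled_up = 0.0
    sampled_all = 0.0
    for up, down in zip(daily_upstream, daily_downstream):
        passing = up + down
        if passing == 0:
            continue                       # nothing to sample on this day
        taken = min(daily_cap, passing)    # the cap limits the daily sample
        sampled_all += taken
        sampled_up += taken * up / passing # expected upstream fish in the sample
    return total_up / total_all, sampled_up / sampled_all

# Hypothetical 30-day season: downstream smolts pass steadily,
# upstream smolts pass in a narrow three-day pulse.
downstream = [20] * 30
upstream = [0] * 30
upstream[10:13] = [100, 100, 100]

true_share, sampled_share = upstream_share(upstream, downstream)
print(f"true upstream share: {true_share:.3f}, sampled share: {sampled_share:.3f}")
```

Under these made-up numbers the sampled share of upstream smolts falls well below their true share, because the cap discards most fish on the few days when upstream smolts dominate. Weighting each day's sample by the trap count for that day would be one way to correct for this, provided daily counts are available.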

Run timing differences associated with different river sections make it possible to manage specific stocks in river systems via temporal fishing closures (Vähä et al. 2011a; Cauwelier et al. 2018a), which could also be utilized in the local fishery management of the Tornio-Kalix. Such measures targeted on harvest timing have the potential to alter population trajectories and permit conservation or expansion of certain stock subcomponents (Kallio-Nyberg et al. 2011; Harvey et al. 2017; Erkinaro et al. 2019). Furthermore, variation in migration patterns is an important part of the genetic and phenotypic diversity of fish stocks, and thus essential for the resilience of the populations that fisheries depend on (Quinn et al. 2016; Jacobson et al. 2019; Tamario et al. 2019). Conserving the observed life history diversity could therefore be of great benefit for the long-term survival of the Tornio-Kalix salmon stock. To this end, understanding the evolutionary forces potentially promoting the upstream–downstream differentiation in the Tornio and Kalix Rivers would help in defining relevant management and conservation units in the system. Temporal studies incorporating genomic data from the river system could provide useful further information for the conservation of these wild salmon populations.