Background

Understanding the complex population structure and dynamics of marine pelagic fishes requires discerning the relative influence of life-history traits and historical processes in shaping present-day population patterns (e.g. [17]). Marine pelagic fishes exhibit great dispersal capability that enhances gene flow, as well as large effective population sizes that limit genetic drift (e.g. [8–11]). The combination of both life-history traits acts as a major homogenizing force, which hampers genetic differentiation and may ultimately lead to panmixia (e.g. [2, 12, 13]). In contrast, other life-history traits such as philopatric behavior or local larval retention and recruitment promote isolation by distance and local adaptation, which eventually produce low but significant levels of genetic differentiation in marine pelagic fish populations (e.g. [2, 10, 14]). Moreover, population structuring and dynamics of marine fishes are also heavily influenced by the physical peculiarities of the marine environment, where connectivity, and thus dispersal, is greatly dependent on ocean fronts and currents, as well as on bathymetry. For instance, the Agulhas current [15] seems to promote migration of bigeye tuna across the Cape of Good Hope from the Indian Ocean into the Atlantic Ocean (e.g. [9, 16, 17]) whereas the Almeria-Oran front [18] acts as a major barrier to gene flow between the Mediterranean Sea and the Atlantic Ocean for some species such as the mackerel [2], the anchovy [3] or the swordfish [7]. In addition, historical factors including past changes in the direction and sense of ocean currents, vicariant events caused by both climatic and eustatic sea level changes [7, 9, 19], as well as climate-associated periodical extinctions and recolonizations [20] have also decisively contributed to shaping present-day population genetic differentiation and geographic distribution.

Comparative analyses of both nuclear and mitochondrial genetic markers offer the most powerful approach to characterizing population genetic structure and diagnosing the evolutionary processes responsible for genetic differentiation in marine pelagic fishes (e.g. [6, 17, 21]). However, genetic studies including both types of molecular markers are still largely wanting.

The European sardine (Sardina pilchardus, Walbaum 1792) is a small pelagic fish that inhabits the coasts of the eastern North Atlantic Ocean (from the North Sea to Senegal), as well as the Mediterranean Sea, the Sea of Marmara, and the Black Sea [22, 23]. Adults usually swim close to the littoral zone, and display daily vertical movement capacity [23, 24]. Spawning occurs in open waters and larvae remain in plankton for long periods of time [24]. In spite of the relatively great dispersal capability of sardines both at the larval and adult stages, tagging and egg production data suggest that total interannual displacement may be restricted by changes in the ocean water temperature and productivity, as well as by hydrogeographic boundaries [24–26]. Based on these life-history traits, sardine populations in close geographic proximity are expected to show modest genetic differentiation. That is the case in the Aegean Sea [27], the Spanish Mediterranean coast [28], and the Adriatic Sea [29]. At a larger scale, isolation by distance, and the existence of potential past or present-day barriers may promote higher levels of genetic differentiation.

Morphological studies based on gill raker counts and head length [22, 30] found enough phenotypic variation to differentiate two subspecies, S. p. pilchardus (Eastern Atlantic Ocean, from the North Sea to Southern Portugal) and S. p. sardina (Mediterranean Sea and Northwest African coast). Although no private mitochondrial control region sequence haplotype could be found for each proposed subspecies, they were suggested to be genetically distinct based on significant pairwise haplotype frequency differences [31]. Moreover, some subspecies pairwise comparisons involving locations around the Atlantic Ocean region off the Gibraltar Strait showed no significant haplotype frequency differences, which suggested that this area could be a contact zone of both subspecies [31]. According to mitochondrial evidence, the well-known Almeria-Oran oceanographic front [18] between the Atlantic Ocean and the Mediterranean Sea is not a phylogenetic break for sardines.

The sardine is heavily fished all over its distribution with global catches of 1,600,000 tons per year (Fishery statistics 2003, [32]). In particular, Spain and Morocco are the countries with the largest captures (representing about 77% of the total annual catch of sardines), and collapse of a sardine stock was reported off the Safi coast (Morocco) during the 1970s [33, 34]. Population genetic and historical demographic analyses of sardines from Safi based on mitochondrial sequence data showed strong genetic differentiation of this population sample, and the signature of an early genetic bottleneck. The genetic singularity of the sardines at Safi (also detected with allozyme data [35]) could have enhanced the effects of the historical collapse of the sardine stock [31].

In this study, we analyzed allele size variation of eight polymorphic microsatellite loci in Atlantic and Mediterranean sardines. We used coalescent-based approaches for the estimation of the actual number of populations, and employed hierarchical AMOVA and isolation by distance tests to study population genetic differentiation. Our main objective was to explore whether microsatellites provide concordant genetic differentiation patterns with respect to mitochondrial control region sequence data [31]. Comparative analyses of mitochondrial and nuclear multilocus data were used to further understand the historical and contemporary (i.e. life-history) components of sardine population structure. In addition, we tested whether the genetic singularity of the Safi population sample could be confirmed with nuclear data, and whether any signature of a genetic bottleneck was detected in this or other population samples.

Results

Microsatellite diversity among loci

Microsatellite polymorphism levels were high at the eight genotyped loci with 41 to 94 alleles per locus (mean NA value ± standard deviation was 60 ± 18.60), and mean observed and expected heterozygosities of HO = 0.738 ± 0.13 and HE = 0.748 ± 0.14, respectively (Table 1). The inbreeding coefficient varied between 0.003 at locus SARA2F and 0.450 at locus SAR1.12 (mean FIS = 0.202 ± 0.18), and only two loci (SAR2.18 and SARA2F) were in Hardy-Weinberg (HW) equilibrium over all population samples (Table 1). Tests for linkage disequilibrium showed a very low proportion (3.6%) of significant pairwise comparisons, which suggests independence of all examined loci.

Table 1 Summary statistics for eight microsatellite loci and each population sample of Sardina pilchardus*

Genetic diversity among sardine population samples and between subspecies

The amount of genetic variability was homogeneous among sardine population samples as indicated by the low standard deviations associated with the estimated mean number of alleles (NA = 29.3 ± 1.4), by mean allelic richness after rarefaction (NS = 27.3 ± 0.95), and by mean observed (HO = 0.747 ± 0.04) and expected (HE = 0.948 ± 0.00) heterozygosities (Table 1). The overall proportion of private alleles for the analyzed population samples was high (32.1%). The inbreeding coefficient FIS within population samples across all loci was on average 0.224 ± 0.04. Sardine population samples at four locations (Larache, Quarteira, Pasajes, and Nador) showed significant mean FIS values, indicating significant departures from HW equilibrium due to homozygote excess (Table 1). A non-significant bimodal test indicated no evidence of unspecific locus amplification or genotyping errors, which could have resulted in null alleles. In addition, a null allele test based on expected homozygote and heterozygote allele size difference frequencies [36] detected that 55% of the pairwise comparisons presented HW disequilibrium, mainly involving loci SAR19B3, SAR19B5 and SARA3C (Additional file 1). We found that correcting for null allele frequencies [37] did not qualitatively affect the results (49% of the pairwise comparisons were still significant, data not shown). This suggests that putative null alleles had a very low effect on the average genetic diversity of our data, and hence the complete data set was included in further analyses.

The differences between S. p. pilchardus and S. p. sardina (as represented by Pasajes and the remaining population samples, respectively) genetic diversity measures were non-significant. The ANOVA test showed no differences for the mean number of alleles (NA) (F1,7 = 0.33, P = 0.58), the mean allelic richness after rarefaction (NS) (F1,7 = 0.85, P = 0.39), mean observed (HO) (F1,7 = 0.08, P = 0.78) and expected (HE) (F1,7 = 3.14, P = 0.12) heterozygosities, and the mean inbreeding coefficient (FIS) (F1,7 = 0.14, P = 0.71) between subspecies (Table 1).

Estimation of the number of possible populations and assignment of individuals

Bayesian clustering analyses [38] detected the highest likelihood for the model with K = 5. However, the modal value of ΔK was shown at K = 4 (Fig. 1). A Bayesian inference under a Dirichlet process prior [39, 40] estimated that the number of populations with the highest posterior probability was K = 3 (P = 1.0).

Figure 1

Number of sardine populations with the highest posterior probability expressed as ΔK, for each of the nine assumed numbers of sardine populations (K). ΔK is calculated as the mean of the absolute values of the second derivative of L(K), L''(K), averaged over five runs and divided by the standard deviation of L(K) [71].
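The ΔK statistic described in this caption can be sketched in a few lines. The following is a minimal illustration, not the procedure actually run for Fig. 1: the function name `delta_k`, the input layout, and the pairing of replicate runs by index are assumptions, and the log-likelihood values in the usage note are invented.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Return {K: deltaK} from replicate L(K) values.

    log_likelihoods maps each K to a list of L(K) values, one per run
    (hypothetical layout). K values lacking a neighbour K-1 or K+1 are
    skipped because the second difference is undefined there.
    """
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in log_likelihoods or k + 1 not in log_likelihoods:
            continue
        runs = zip(log_likelihoods[k - 1],
                   log_likelihoods[k],
                   log_likelihoods[k + 1])
        # |L''(K)| = |L(K+1) - 2 L(K) + L(K-1)|, run by run (assumed pairing)
        second = [abs(nxt - 2 * cur + prev) for prev, cur, nxt in runs]
        sd = stdev(log_likelihoods[k])
        result[k] = mean(second) / sd if sd > 0 else float("inf")
    return result
```

For example, `delta_k({1: [-100, -102], 2: [-90, -92], 3: [-60, -64], 4: [-58, -60]})` returns values for K = 2 and K = 3 only, and the K with the largest value would be read as the modal ΔK.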

The probability of assignment of individuals to the four or five possible populations as inferred using Bayesian clustering analyses [38] was generally low (P < 0.8). The Bayesian assignment test correctly assigned 20.1% of the individuals to their own source location (22.4% of the individuals could not be assigned to any of the reference populations).

Measures of genetic differentiation

The null hypothesis of no contribution of the Stepwise Mutation model (SMM; [41]) to genetic differentiation (ρRST = FST) was rejected (P < 0.001) based on the multilocus data set (Table 2), suggesting that RST should be preferred over FST for the calculation of genetic differentiation between sardine population samples [42]. Three out of eight loci showed significant differences in the allele permutation test (Table 2). The significant global RST test (0.024, P < 0.001, 95% C.I. 0.026 – 0.047) over all loci suggested population structuring in sardines. Of 36 pairwise comparisons, only nine comparisons involving Nador, Barcelona and Kavala locations revealed significant values after correction for multiple tests (Table 3). Interestingly, all pairwise comparisons between Pasajes (representing S. p. pilchardus) and the rest of the sampling sites (representing S. p. sardina) were non-significant.

Table 2 Summary statistics of the allele size permutation test [42] for each locus and the 95% confidence for the simulated RST values*
Table 3 Multilocus estimates for FST (below diagonal) and RST (above diagonal) between sample pairs from eight microsatellite loci in common sardine

The hierarchical AMOVA revealed overall significant genetic structuring of the analyzed samples (P < 0.001) (Table 4). A two gene pool structure separating the subspecies S. p. pilchardus (Pasajes sampling site) versus S. p. sardina samples was not significant (P = 0.44). An a priori hypothesis of geographic structuring (organized as Atlantic Ocean versus Mediterranean Sea samples) was also not supported by the AMOVA (P = 0.07) (Table 4). The Atlantic Ocean versus Mediterranean Sea comparison was repeated excluding the Pasajes population sample, which could mask small genetic differentiation. Potential geographic structuring between the two areas remained non-significant (Table 4). According to the Mantel test, correlation between genetic distance determined as RST and geographical distance (log km) was significant (correlation coefficient r = 0.51, R2 = 0.26, P < 0.009) (Fig. 2). The Mantel test correlating FST and geographic distances was not significant (not shown). Similarly, we found no significant correlation when using the Bayesian assignment DLR distances (correlation coefficient r = 0.09, R2 = 0.01, P = 0.59) (Fig. 2).

Table 4 Analysis of molecular variance (AMOVA) of spatial genetic variation in common sardine for eight microsatellite loci *
Figure 2

Genetic isolation by distance of all S. pilchardus population samples inferred from multilocus estimates of RST (solid circles) and DLR (solid squares) genetic distances versus geographical distance (Mantel test). Correlation coefficients: for RST r = 0.51, R2 = 0.26, P < 0.009; for DLR r = 0.09, R2 = 0.01, P = 0.59.

The Wilcoxon test detected recent bottlenecks in two population samples from the Mediterranean Sea corresponding to the Nador and Kavala sampling sites (two-tailed P = 0.031 for both tests) under the SMM. No trace of a genetic bottleneck was detected in Safi. Additionally, the test was performed using the Two Phase model (TPM; [43]) and the Infinite Allele model (IAM; [44]). Under the TPM, the test rendered non-significant results in all population samples. However, Safi, Larache, Pasajes, Nador, and Kavala rendered significant results under the IAM.

Levels and patterns of gene flow among populations

The estimates of the population size parameter (Θ) ranged from 0.38 to 0.81 (0.51 ± 0.13) (Table 5) and translated to an average effective population size (Ne) of 1,282 ± 325 sardine individuals (assuming a microsatellite mutation rate of 10⁻⁴ per locus per generation [45]). Migration rates between population samples were all of the same order, and no preferential directionality of the migrants was observed. The mixed model-nested ANOVA test showed no significant variation of the number of emigrants and immigrants between the Atlantic Ocean and the Mediterranean Sea (F1,8 = 0.44, P = 0.53; F1,8 = 0.12, P = 0.74). The test also showed no significant variation of the number of emigrants among population samples (F1,8 = 0.00, P = 1.0). However, a significant variation of immigrants among population samples (F1,8 = 3.14, P = 0.01) was detected. A one-way ANOVA was applied to test the null hypothesis of equal rate of immigrants between population samples. The analyses rendered a significant difference in the immigration rates among population samples (F1,8 = 3.67, P = 0.001), with Barcelona-Quarteira being the only pairwise comparison that was significantly different (t > 1.998).

Table 5 Maximum likelihood estimates of the population size parameter, Θ (Θ = 4 × effective population size, Ne × mutation rate, μ, per generation and site) and the scaled migration rate, M (M = immigration rate per generation, m, divided by μ) for all population samples of Sardina pilchardus. Θ values are displayed on the diagonal. All values are within the bounds of the 95% confidence interval

Discussion

The study of population genetic variation of marine pelagic fish species has proven to be particularly challenging because of the biological peculiarities of these fishes including large effective population sizes and high dispersal capacities, as well as because of the apparent lack of physical barriers to gene flow in the marine realm [6, 46–48]. Mitochondrial DNA is maternally inherited, lacks recombination, and shows relatively fast evolutionary rates, which make this molecular marker particularly suitable for inferring phylogeographic patterns [49]. This molecular marker is particularly appropriate for detecting historical vicariant or genetic bottleneck events, and has been very useful in describing present day phylogeography of taxa with relatively low dispersal capacity [49]. However, mitochondrial genetic variation is less helpful when tackling questions on present-day genetic structuring of taxa with large population sizes and high levels of gene flow within their distribution such as marine pelagic fishes (e.g. [6, 21]). Microsatellites are nuclear markers with higher mutation rates [50] that have proved to be more efficient and informative for detecting fine-scale population structure in marine pelagic fishes [17, 21, 51]. Overall, comparative analyses of nuclear and mitochondrial data should provide insights not achieved by each type of data separately, and should help in disentangling historical versus ecological factors involved in shaping contemporary population genetic structure of marine pelagic fishes [21].

Population genetic structure in sardines

The eight species-specific microsatellite loci used in this study showed high levels of polymorphism [52] and no significant linkage disequilibrium. All but two of the analyzed loci showed departures from HW equilibrium expectations due to homozygote excess. The null allele test [36] indicated that these departures could be due to the presence of null alleles, which seem to be rather common in large marine fish populations [53]. Nevertheless, since adjusting frequencies to take into account null alleles did not affect inbreeding coefficient estimates, all loci were used in the analyses.

Overall RST detected weak but significant genetic structuring among sardine population samples. Pairwise estimates of RST varied between 0.001 and 0.083, and were of the same order of magnitude as those reported for other marine fishes [53–56]. These relatively low RST values could be attributed to high levels of size homoplasy, as expected when using polymorphic microsatellites with high mutation rates in species with large effective population sizes [53, 57, 58]. However, the observed relatively high number (32.1%) of private alleles, and their even distribution among population samples indicate that allele sharing between sardines at the different locations is rather limited and thus, that the effects of size homoplasy are minimal. Alternatively, it is more likely that the high levels of locus polymorphism are responsible for only weak genetic structuring being detected [53, 58].

The difficulty in detecting genetic structuring is further evidenced by Bayesian clustering and assignment tests, as well as by hierarchical AMOVA and migration rate analyses. Although the different assayed Bayesian clustering analyses agree in rejecting the null hypothesis of panmixia, they failed to predict the exact number of inferred populations, which ranges from 3 to 5. Furthermore, assignment of individuals to the inferred populations was poor regardless of the method used. In addition, none of the tested a priori hypotheses of genetic structuring rendered significant results in the AMOVA. Maximum likelihood estimates of migration rates showed that gene flow among population samples is high and even. All these results together support that sardine population samples are acting as a single significant evolutionary unit. The Mantel test detected positive and significant correlation between genetic differentiation (only when using RST) and geographical distance suggesting that a model of isolation by distance could explain the subtle genetic structuring of sardines within the evolutionary unit. Isolation by distance seems to be a rather common pattern in small-medium pelagic marine fishes (e.g. [2, 19, 51]). It is important to note here that temporal replicates at the studied locations are needed to test whether the observed population genetic patterns are stable over time.

Relative effects of life-history traits and historical factors on genetic differentiation in sardines

All significant RST pairwise comparisons involved Mediterranean Sea versus Central Atlantic Ocean population samples. These results could reflect the existence of a phylogeographic discontinuity between the Atlantic Ocean and Mediterranean Sea, around the Gibraltar Strait and the Almeria-Oran front, as has been postulated previously for different marine pelagic fish species (e.g. [2, 3, 7]). However, this hypothesis was rejected for sardines at the nuclear level because the hierarchical AMOVA failed to detect significant geographical structuring between the Atlantic and the Mediterranean sardine population samples, and high and even migration rates were observed between both basins. These results are congruent with those derived from population genetic analyses based on mitochondrial control region sequence data that also failed to find a barrier to gene flow for sardines between the Atlantic Ocean and the Mediterranean Sea [31]. The inferred genetic pattern for sardine is in agreement with the present-day gene flow exhibited by other marine pelagic fish species such as Scomber japonicus [2] or Thunnus thynnus [7] through the Atlantic-Mediterranean transition. The fact that the Gibraltar Strait and the Almeria-Oran front may or may not act as a barrier to gene flow for different marine pelagic species has been attributed to differences in life-history traits (e.g. dispersal capacity [2]) and, for other marine fish species, to the existence of distinct past demographic events (e.g. bottlenecks [7]). More comparative studies on the biology and population dynamics of marine pelagic fishes distributed at both sides of the Gibraltar Strait, as well as additional population genetic analyses including temporal series are needed to further understand the factors that promote or prevent gene flow of these species across the Atlantic-Mediterranean transition.

The existence of two different subspecies (S. p. pilchardus and S. p. sardina) as previously reported based on meristic studies [22, 30], and mitochondrial control region sequence haplotype frequency differences [31] was not supported by population genetic analyses (RST pairwise comparisons, AMOVA test, and estimations of migration rates) based on microsatellite data. However, these results need to be taken with caution since one of the subspecies (S. p. pilchardus) was only represented by a single location (Pasajes). A more thorough sampling of sardine at North Atlantic locations would be mandatory to further test the validity of the two subspecies using microsatellite allele frequency data.

The discordant genetic structuring patterns inferred based on mitochondrial and microsatellite data could indicate that the two different classes of molecular markers may be reflecting different and complementary aspects of the evolutionary history of sardine. The significant genetic structuring evidenced by mitochondrial data might be reflecting past isolation of sardine populations into two distinct groupings during the Pleistocene [31]. Afterwards, sardine populations expanded and secondary contact was re-established around the Gibraltar Strait. Microsatellite data reveal the existence of a present day single evolutionary unit that shows weak genetic structuring due to isolation by distance. At micro geographical scale, genetic drift is supposed to overcome gene flow as geographical distance increases [59] because of the effect of different life-history traits such as larval retention, homing behavior, or reduced dispersal capacity, that need to be further studied in sardines.

Periodic population extinctions and recolonizations at the regional level are common in sardines and other clupeids and may be responsible for the shallow coalescence of mitochondrial genealogies [20]. In this regard, mitochondrial and nuclear markers exhibit different performance in detecting instances of genetic bottlenecks. Mitochondrial control region sequence data support the existence of a past (Pleistocene) genetic bottleneck of sardines in Safi that is only detected at the nuclear level using the IAM. In addition, analyses of microsatellite data under both the SMM and IAM revealed potential genetic bottlenecks at Kavala and Nador, which would be too recent to be detected by mitochondrial data.

Different types of genetic markers occasionally may render contrasting population genetic structure patterns for a given species [21, 60]. In some instances, discordance among marker classes may result from methodological biases, which when appropriately corrected allow obtaining reconciled patterns [21, 60]. In other cases, conflicting results in describing population genetic structure may arise from the differential effects of genetic drift and mutation on a marker class [21]. In such cases, discordance could be interpreted as a source of alternative and complementary information useful for investigating how evolutionary processes at different time scales shape patterns of genetic heterogeneity. In this study, the comparison of two classes of molecular markers with different mutation rates and modes of inheritance has allowed us to gain complementary and broader insights on sardine historical and contemporary population genetics and dynamics, which ultimately could serve to improve fishery management of this commercially important marine pelagic fish species.

Conclusion

The discordant genetic structuring patterns inferred based on mitochondrial and microsatellite data appear to be pointing to complementary aspects of the evolutionary history of sardine. Past isolation of sardine populations into two distinct groupings is supported by mitochondrial data whereas current gene flow within a single evolutionary unit and a weak genetic structuring due to isolation by distance are evidenced by microsatellite data. This study shows that only the combination of molecular markers with different modes of inheritance and mutation rates is able to disentangle the complex patterns of population structure and dynamics of a small marine pelagic fish such as the sardine.

Methods

Sample collection

We extended the sample collection of a previous study [61] from about 25–30 to nearly 50 mature sardine specimens per landing port. Overall, population genetic analyses included 433 individuals from six localities (Dakhla, Tantan, Safi, Larache, Quarteira and Pasajes) in the Atlantic Ocean (N = 293) and three localities (Nador, Barcelona and Kavala) in the Mediterranean Sea (N = 140) (Fig. 3). The sardines from the Pasajes location were assigned to the subspecies S. p. pilchardus based on distribution area and mitochondrial haplotype frequencies. The sardines from the remaining locations were assigned to the subspecies S. p. sardina based on the same criteria.

Figure 3

Locations of sardine samples collected in the Atlantic Ocean and Mediterranean Sea (red circles). The yellow colored area shows the distribution of S. pilchardus. Details for sample sizes are listed in Table 1.

Microsatellite genotyping

Genomic DNA of newly analyzed specimens was extracted from fresh muscle following standard phenol-chloroform procedures as previously reported [61]. Specific polymorphic microsatellites (SAR1.5, SAR1.12, SAR2.18, SAR9, SAR19B3, SAR19B5, SARA2F and SARA3C) of S. pilchardus were PCR amplified following optimized reaction conditions [62]. Forward primers were labeled with fluorescent dye (Invitrogen), and PCR amplified products were genotyped on an ABI 3730 automated sequencer (Applied Biosystems). Data collection and sizing of alleles were carried out using GeneMapper v3.7 software (Applied Biosystems). Approximately 10% of the samples were re-run to assess repeatability in scoring.

Statistical analyses

Microsatellite genetic diversity was quantified per locus and per sampling site as the observed and expected heterozygosities [63], number of alleles (NA), and number of alleles standardized to those of the population sample with the smallest size (NS) [64], using both GENETIX 4.02 [65] and FSTAT 2.9.3 [66] (Additional file 2). Deviations from HW equilibrium (by estimating the inbreeding coefficient, FIS) and linkage disequilibrium for each locus and sardine sampling site were assessed using GENEPOP version 3.3 [67]. Significance of both analyses was tested with a Markov chain Monte Carlo (MCMC) that was run for 1000 batches of 2000 iterations each, with the first 500 iterations discarded before sampling [68]. P values from multiple comparisons were corrected using a Bonferroni correction [69]. Significant differences of genetic diversity measures between S. pilchardus subspecies were tested using a one-way ANOVA test.
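The diversity measures named above can be illustrated for a single locus in a single sample. This is a minimal sketch and not the GENETIX, FSTAT or GENEPOP implementations: it assumes Nei's unbiased estimator for HE, the simple ratio FIS = 1 − HO/HE rather than a variance-component estimator, and a standard hypergeometric rarefaction for allelic richness. The function names and the input layout are hypothetical.

```python
from math import comb


def locus_stats(genotypes):
    """Diversity statistics for one locus in one population sample.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid
    individual, with alleles coded as fragment sizes (assumed layout).
    """
    n = len(genotypes)
    counts = {}
    for a, b in genotypes:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    total = 2 * n  # number of gene copies sampled
    h_obs = sum(a != b for a, b in genotypes) / n
    # Unbiased expected heterozygosity (assumed estimator)
    homozygosity = sum((c / total) ** 2 for c in counts.values())
    h_exp = (total / (total - 1)) * (1 - homozygosity)
    # Simple inbreeding coefficient: positive values mean homozygote excess
    f_is = 1 - h_obs / h_exp if h_exp > 0 else 0.0
    return {"NA": len(counts), "HO": h_obs, "HE": h_exp,
            "FIS": f_is, "counts": counts}


def rarefied_richness(counts, g):
    """Expected number of alleles in a random subsample of g gene copies."""
    total = sum(counts.values())
    return sum(1 - comb(total - c, g) / comb(total, g)
               for c in counts.values())
```

Calling `rarefied_richness(stats["counts"], g)` with g set to the number of gene copies in the smallest sample standardizes NA across sampling sites, which is the role NS plays in the text.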

A bimodal test for each locus and sampling site was performed to detect possible genotyping errors due to preferential amplification of one of the two alleles, misreading of bands or transcription errors, using the program DROPOUT [70]. Additionally, MICRO-CHECKER v2.23 [36] was used to explore the existence of null alleles, and to evaluate their impact on the estimation of genetic differentiation.

Genetic and spatial variation between populations

Several alternative methods were used to determine sardine population genetic structure. The program STRUCTURE 2.0 [38] uses a model-based Bayesian clustering approach to determine the number of populations (K) with the highest posterior probability and to estimate admixture proportions. Simulations were conducted using an admixture model and correlated allele frequencies between populations (MCMC consisted of 5 × 10⁵ burn-in iterations followed by 2 × 10⁶ sampled iterations). Additionally, the inference of the best value of K was also based on the modal value of ΔK [71]. The range of tested K values was from one to nine, and five trial runs of STRUCTURE were carried out for each putative K.

The program STRUCTURAMA [72] infers population genetic structure from genetic data by allowing the number of populations to be a random variable that follows a Dirichlet process prior [39, 40]. We ran 1 × 10⁶ MCMC cycles, and we let α (the prior mean of the number of populations) be a random variable. The first 1 × 10⁵ cycles were discarded as burn-in.

We finally applied a Bayesian assignment test as implemented in the program GENECLASS 2.0 [73], which provides the probability for each individual of belonging to the reference population. The computation followed the partial exclusion method [74], and simulation consisted of 10,000 individuals.

The relative contributions of mutation and genetic drift to genetic differentiation of sardine populations can be determined by comparing the variance in allelic identity (FST, IAM [44]) and allelic size (RST, SMM [41]). The program SPAGEDI 1.1 [75] generates a simulated distribution of RST values (ρRST) for testing the null hypothesis of no contribution of the SMM to genetic differentiation (ρRST = FST), and the alternative hypothesis that genetic differentiation is caused mainly by SMM-like mutation (ρRST > FST) [42]. The test rendered a significant result (P < 0.001), and thus, further analyses of genetic differentiation between samples were mostly based on RST pairwise comparisons, as estimated by the program RST-CALC [76].

To determine the amount of genetic variability partitioned within and among populations, an analysis of molecular variance (AMOVA) [77] was performed with ARLEQUIN v3.0 [78]. For all calculations, significance was assessed by 20,000 permutations, and reported P-values were Bonferroni adjusted [69]. The Mantel test was used to test correlation between geographical and genetic distances as implemented in GENEPOP version 3.3 [67]. The logarithm of geographical distance in kilometers was regressed against either RST as estimated in RST-CALC [76] or genetic distances based on Bayesian assignment values (DLR) as computed in SPASSIGN [79].
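The logic of a Mantel test for isolation by distance can be sketched as a permutation test on two distance matrices. The version below is an illustration only and is not the GENEPOP implementation: it assumes a Pearson correlation, a one-tailed test for positive association, and square symmetric matrices given as nested lists. The function names are hypothetical, and geographic distances would be log-transformed by the caller.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper(matrix, order):
    """Upper-triangle entries after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel(genetic, geographic, permutations=9999, seed=0):
    """Return (r, one-tailed P) for genetic versus geographic distance."""
    n = len(genetic)
    identity = list(range(n))
    geo = upper(geographic, identity)
    r_obs = pearson(upper(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # shuffle population labels, not single cells
        if pearson(upper(genetic, perm), geo) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)
```

Shuffling whole population labels rather than individual matrix cells preserves the dependence among pairwise distances that share a population, which is why an ordinary regression P value is not used for this kind of data.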

To detect possible genetic bottlenecks (i.e. significant heterozygote excess) in any of the analyzed population samples, we assumed the SMM, IAM, and TPM, and applied the Wilcoxon sign-rank test as implemented in the software BOTTLENECK [80].
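The sign-rank step of this test can be sketched as follows. This is only an illustration of an exact one-tailed Wilcoxon signed-rank test for heterozygosity excess and is not the BOTTLENECK software: the per-locus equilibrium heterozygosities expected under a given mutation model (SMM, IAM or TPM) are assumed to be supplied by the caller, whereas BOTTLENECK obtains them by simulation. The function name and argument names are hypothetical.

```python
from itertools import product


def wilcoxon_excess(h_observed, h_equilibrium):
    """Exact one-tailed signed-rank test for heterozygosity excess.

    h_observed / h_equilibrium: per-locus heterozygosity in the sample
    and its expectation at mutation-drift equilibrium (assumed inputs).
    Returns (W+, P), where W+ sums the ranks of loci showing an excess.
    """
    diffs = [o - e for o, e in zip(h_observed, h_equilibrium) if o != e]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Null distribution: every locus is equally likely to show excess or deficit
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus:
            extreme += 1
    return w_plus, extreme / 2 ** len(ranks)
```

With only eight loci the null distribution has 256 sign patterns, so the exact enumeration is cheap and the attainable P values are coarse.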

Gene flow among sardine populations

The program MIGRATE v 2.1.0 [81] was used to infer the population size parameter Θ (i.e. 4 Ne μ, where Ne is the effective population size and μ is the mutation rate per site) and the migration rate, M (M = m/μ, where m is the immigration rate per generation) among sardine population samples based on the maximum likelihood method [82]. A subset of 20 individuals per population sample was analyzed due to computational constraints. The analyses were carried out under the SMM. FST estimates and a UPGMA tree were used as starting parameters for the estimation of Θ and M. The MCMC run consisted of ten short and two long chains with 5,000 and 50,000 recorded genealogies respectively, after discarding the first 100,000 genealogies (burn-in). One of every 20 and 200 reconstructed genealogies was sampled for the short and long chains, respectively. To test the null hypothesis that the numbers of emigrants and immigrants between the Atlantic Ocean and Mediterranean Sea have equal rates, a nested mixed-model ANOVA was performed using two variables (basin and location of origin), with emigrant and immigrant rates as repeated measurements.
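The two scaled parameters defined above convert to demographic quantities by simple rearrangement. The sketch below only restates Θ = 4 Ne μ and M = m/μ; the function names are hypothetical and the default mutation rate is an assumed value that the caller should set explicitly.

```python
def effective_size(theta, mutation_rate=1e-4):
    """Ne = theta / (4 * mu) for a diploid nuclear locus.

    mutation_rate is per locus per generation (assumed default).
    """
    return theta / (4 * mutation_rate)


def migrants_per_generation(theta, m_scaled):
    """4 * Ne * m = theta * M, because theta = 4 Ne mu and M = m / mu.

    The mutation rate cancels, so this product does not depend on mu.
    """
    return theta * m_scaled
```

Because μ cancels in the second function, the number of immigrants per generation is less sensitive to the assumed mutation rate than Ne itself, which scales inversely with μ.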