Background

The success of any breeding programme depends critically on how the base population of breeders is built, since the genetic variability that is initially available in the founders will affect the genetic progress achieved in the subsequent selection programme [1,2,3]. This is particularly important in aquaculture because, given the high fecundity of fish species, base populations can be created from very few individuals, which would lead to small effective population sizes (Ne) and therefore, to high rates of loss of genetic variability, high rates of inbreeding and restricted long-term selection responses.

With the rapid development of genomic tools, temporal series of Ne can be estimated for generations before pedigree recording began. This is of great importance in aquaculture species to determine the impact of domestication on the genetic variability present in the base populations and the potential long-term response to selection. Genomic estimates of Ne are obtained based on the linkage disequilibrium (LD) approach [4], and different methods have been developed to estimate this parameter across generations. These methods have assumed that the Ne of a particular generation in the past can be estimated from LD between pairs of single nucleotide polymorphisms (SNPs) separated by a specific distance [5]. However, this assumption implies that the demographic events that occurred in that particular generation do not affect subsequent generations, and the method only holds for linear changes in population size [5]. To circumvent this problem, Santiago et al. [6] have recently developed an approach where the LD spectrum for the whole range of recombination rates between all pairs of SNPs is taken into account for estimating Ne in consecutive generations, and this allows the detection of drastic changes in population size.

In spite of the importance of estimating Ne, estimates of this parameter are scarce for most aquaculture species. In this study, we used genomic information that was recently produced for important fish species in European aquaculture (turbot, gilthead seabream, European seabass and common carp) to obtain recent and historical estimates of Ne for commercial populations, using the novel method developed by Santiago et al. [6]. These estimates are useful to evaluate the current genetic status of the populations and to identify past changes in Ne potentially associated with domestication or with the establishment of selective breeding programmes.

Methods

Data

Data were derived from broodstock (and their offspring) sampled in 2014 from different European breeding programmes for turbot, gilthead seabream, European seabass and common carp within the framework of the FISHBOOST project (www.fishboost.eu) (Table 1). Unrelated broodstock were mated and their offspring were used for different experimental purposes. Genomic information was available for both parents and their offspring. Genotypes were obtained using a reduced representation genotyping approach (specifically, RAD sequencing; RAD-seq). The species’ linkage maps and reference genomes were used to map the SNPs [7,8,9,10]. Details on the number of samples and SNPs available for each population analysed are summarised in Table 1. Genotyping and filtering details are described elsewhere for turbot [7], seabream [8], seabass [9] and carp [10]. Imputation of missing genotypes, which was only performed for turbot, was carried out using the software BEAGLE 4.1 [11].

Table 1 Description of samples and genomic information for the populations analysed

Turbot samples were obtained from an experimental population of Atlantic origin maintained at CETGA (Aquaculture Cluster of Galicia, Spain) through hierarchical matings. For gilthead seabream, data came from one of the four genetically linked yearly cohorts of the breeding nuclei of the Andromeda Group SL (Greece) and Ferme Marine de Douhet (FMD, France), where the main breeding objectives in the selection programmes are growth and body shape. The Andromeda programme applies mass spawning, while the FMD programme applies partial factorial mating designs [8]. European seabass samples also came from one of the four linked yearly cohorts of the FMD breeding nucleus, where the breeding objectives are growth and body shape. In this programme, partial factorial matings are also applied [9]. Finally, for common carp, samples were obtained from the Amur Mirror carp (Vodňany line), which was recently created at the University of South Bohemia in České Budějovice. For this line, F1 offspring were obtained from crosses between females from a cultured population (originating from Hungary and Germany) with a mirror phenotype for scaliness and males from a wild population (from the Amur river, Siberia) with a scaly phenotype. The Amur Mirror strain was founded from F2 crosses by selecting offspring that had the mirror phenotype. The population used in this study was obtained by artificial fertilization that involved four blocks of full factorial crosses, each comprising five dams and ten sires [10].

Estimation of linkage disequilibrium and effective population size

Estimates of LD between pairs of loci and temporal estimates of Ne were obtained using the software GONE and its auxiliary programs developed by Santiago et al. [6] (available at https://github.com/esrud/GONE). Squared correlations between allele frequencies of pairs of SNPs (r2; [12]) were obtained for all pairs of SNPs within each linkage group (chromosome). Category bins for different ranges of genetic distances (in Morgans) between SNPs were built and the average values of d2 (the average of r2 values between pairs of SNPs weighted by their variance in allele frequencies; [13]) were obtained for each bin. The method involves a genetic algorithm to infer the historical series of Ne in the population that minimises the sum of the squared differences between the observed values of d2 of the bins and those predicted considering different demographic histories. The analyses assumed that phase is unknown and the genetic distances between SNPs were corrected by Haldane’s function. For the remaining software options, the default values were used. In order to compare our results with those of other studies, patterns of LD measured as r2 across physical distance were represented for the populations for which the physical position of SNPs was available on the reference genome assemblies (i.e. turbot GCA_003186165.1 and seabass GCA_000689215).
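As an illustration of the quantities described above, the following minimal Python sketch computes r2 between unphased genotypes, converts map distances to recombination rates with Haldane's function, and averages r2 within bins of recombination rate. It is not the GONE implementation: the function names, the 0/1/2 genotype coding and the bin edges are assumptions made for this example, and the sketch takes a plain average of r2 and omits the allele-frequency weighting that defines d2.

```python
import numpy as np
from itertools import combinations


def haldane_c(d_morgans):
    """Haldane's map function: recombination rate c for a map distance in Morgans."""
    return 0.5 * (1.0 - np.exp(-2.0 * np.asarray(d_morgans, dtype=float)))


def genotype_r2(g1, g2):
    """Squared correlation between two unphased genotype vectors coded 0/1/2."""
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    if g1.std() == 0.0 or g2.std() == 0.0:
        return np.nan  # monomorphic SNP: correlation is undefined
    return np.corrcoef(g1, g2)[0, 1] ** 2


def mean_r2_by_bin(genotypes, map_pos, edges):
    """Average r2 of all SNP pairs on one linkage group, per recombination-rate bin.

    genotypes: (n_individuals, n_snps) array of 0/1/2 codes (assumed coding)
    map_pos:   SNP map positions in Morgans
    edges:     increasing bin edges expressed as recombination rates
    """
    genotypes = np.asarray(genotypes, dtype=float)
    sums = np.zeros(len(edges) - 1)
    counts = np.zeros(len(edges) - 1, dtype=int)
    for i, j in combinations(range(genotypes.shape[1]), 2):
        c = haldane_c(abs(map_pos[j] - map_pos[i]))
        k = np.searchsorted(edges, c, side="right") - 1  # bin index for this pair
        r2 = genotype_r2(genotypes[:, i], genotypes[:, j])
        if 0 <= k < len(sums) and not np.isnan(r2):
            sums[k] += r2
            counts[k] += 1
    # Bins without any SNP pair are returned as NaN
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
```

In this sketch, pairs are binned by recombination rate rather than by physical distance, which mirrors the use of genetic distances in the analysis described above.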

For the sake of comparison, temporal estimates of ancestral Ne were also obtained using the previous method of Hayes et al. [5] as implemented by Saura et al. [14]. Although both the GONE method and that of Hayes et al. [5] are based on the well-known relationship between LD and Ne [4], the main difference between them is that the latter assumes constant Ne or linear changes in Ne.
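The relationship between LD and Ne that underlies both methods can be sketched with the classical expectation E[r2] = 1/(1 + 4Ne·c), where c is the recombination rate between two loci. Under constant or linearly changing population size, LD at recombination rate c is commonly taken to reflect Ne about 1/(2c) generations ago. The Python sketch below inverts this expectation; the function names are hypothetical, no sample-size correction is applied, and the implementation of Saura et al. [14] may differ in its details.

```python
def ne_from_r2(mean_r2, c):
    """Invert E[r2] = 1 / (1 + 4 * Ne * c) to obtain Ne.

    mean_r2: average r2 of SNP pairs at recombination rate c (0 < mean_r2 < 1)
    c:       recombination rate between the loci (c > 0)
    """
    return (1.0 / mean_r2 - 1.0) / (4.0 * c)


def generations_ago(c):
    """Approximate number of generations in the past reflected by LD at rate c."""
    return 1.0 / (2.0 * c)


# Example with made-up values: r2 = 0.2 between SNPs 1 cM apart (c ~ 0.01)
ne = ne_from_r2(0.2, 0.01)   # about 100
t = generations_ago(0.01)    # 50 generations ago
```

Because each distance class is mapped to a single past generation, a sudden change in population size is smoothed across distance classes, which is consistent with the limitation of this approach noted above.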

Results

The pattern of LD decay with physical distance that was computed with offspring data for turbot and seabass is represented in Fig. 1. Overall, the average LD (r2) between SNPs separated by short distances (< 0.01 Mb) was moderately low (0.15 for turbot and 0.24 for seabass) and decreased rapidly with physical distance. The average r2 was reduced by half in both cases for distances shorter than 0.02 Mb.

Fig. 1

Decay of average linkage disequilibrium across chromosomes measured as r2 against physical distance. Physical distance in terms of fragment length is indicated in Mb for the species for which a physical map is available; i.e. turbot (left panels) and seabass (right panels). Three different distance categories are represented: a from 0.0 to 0.5 Mb; b from 0.5 to 5 Mb; c from 5 to 20 Mb

Estimates of recent Ne were equal to or less than 50 fish in all cases. Using offspring data, the estimates of Ne were 31 for turbot, 46 for seabream_A, 32 for seabream_F, 40 for seabass and 33 for carp; using parental data, the corresponding values were 26, 50, 30, 32 and 15, respectively.

Estimates of historical Ne were larger than 1000 fish for about 20 generations ago in all species. However, important drops were observed about five generations ago for turbot and seabream and about eight to nine generations ago for seabass, using data from parents or from offspring (Fig. 2 and see Additional file 1: Fig. S1). The two populations of seabream showed a similar pattern of Ne decay. Estimates of ancestral Ne are not provided for carp since the Amur Mirror strain comes originally from crosses of several strains, and under a scenario of strong and recent population admixture, the method to estimate historical Ne is not conceptually applicable. However, estimates of contemporary Ne can be obtained in this case, although the estimates are likely to be biased downwards because of population admixture [15].

Fig. 2

Estimates of Ne (logarithmic scale) across the last 20 generations for each population analysed. Solid lines represent estimates obtained using data from parents and dashed lines represent estimates obtained using data from offspring

The LD method of Hayes et al. [5] led to linear trends in historical Ne, as expected (see Additional file 2: Fig. S2), which contrasts with the drastic drops shown in Fig. 2. Historical values estimated with this method for the earliest generation shown (generation 100) were smaller than 1000 individuals, i.e. much smaller than those obtained by GONE in Fig. 2. However, recent Ne values obtained with the same method (44 for turbot, 33 for seabass, 51 for seabream_A and 49 for seabream_F) were of the same order of magnitude as those obtained with the method of Santiago et al. [6], which are shown in Fig. 2.

Discussion

In this study, recent and historical Ne estimates were obtained for farmed populations of important European aquaculture species (turbot, gilthead seabream, European seabass and common carp), using genome-wide SNP data from RAD-seq, and a novel accurate method based on LD measures [6]. Our results revealed that recent Ne for all the analysed populations were small and that important drops in ancestral Ne occurred in these populations about five to nine generations ago.

Recent Ne estimates for all the analysed populations were equal to or less than 50 fish. A value of around 50 is considered the minimum recommended to avoid severe inbreeding depression and retain fitness in the short term [16,17,18]. However, our Ne estimates for seabream and seabass could be slightly underestimated given that the data used came from breeding schemes with overlapping generations and the method assumes discrete generations [6, 19].
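To put these values in context, Ne can be translated into an expected rate of inbreeding per generation through the standard relationship ΔF = 1/(2Ne), so that Ne = 50 corresponds to ΔF = 1%. This relationship is not restated in the text above and is used here as an assumption; the Python sketch below, with hypothetical function names, applies it to the offspring-based estimates reported in the Results.

```python
def rate_of_inbreeding(ne):
    """Expected increase in inbreeding per generation: delta_F = 1 / (2 * Ne)."""
    return 1.0 / (2.0 * ne)


def expected_inbreeding(ne, generations):
    """Expected inbreeding after t generations at constant Ne: 1 - (1 - delta_F)**t."""
    return 1.0 - (1.0 - rate_of_inbreeding(ne)) ** generations


# Offspring-based recent Ne estimates reported in the Results
recent_ne = {"turbot": 31, "seabream_A": 46, "seabream_F": 32, "seabass": 40, "carp": 33}
for population, ne in recent_ne.items():
    print(f"{population}: delta_F = {100 * rate_of_inbreeding(ne):.2f}% per generation")
```

Under this relationship, every estimate below 50 implies an expected rate of inbreeding above 1% per generation, assuming that Ne remains constant.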

In general, the magnitude of our recent estimates of Ne was within the range of those found in other farmed fish populations of different species [20,21,22,23,24,25,26,27,28,29,30], although there are exceptions [31]. For instance, the estimate of Ne in the GIFT (Genetically Improved Farmed Tilapia) selection programme, in which the creation of the base population was carefully planned, was equal to 88 after seven generations of selection for growth rate [31]. The small estimates of Ne obtained for the farmed populations analysed here contrast with the large estimates (> 1000) found for wild populations of turbot, seabass and seabream [32,33,34]. Although estimates of Ne for wild common carp are not available, analyses of genetic variability have shown that variability is lower in farmed than in wild strains [35, 36].

Estimates of historical Ne for all the analysed populations revealed important drops occurring about five to nine generations ago. We obtained similar results using data from the reduced number of parental samples or from the more extensive number of offspring samples (Fig. 2). The power of the method to detect fluctuations in Ne is proportional to the product of the sample size and the square root of the number of markers divided by Ne, and the minimum value to ensure accurate estimations of Ne is 100 [6]. Using parental samples, the value was much larger than 100, and thus estimates obtained from parents were as reliable as those obtained from offspring.

A drop in Ne and the consequent drop in genetic variability in farmed populations can occur during the establishment of the base population (founder effect) but also in subsequent generations of selection if there is no optimal inbreeding control. Some caution must be taken in the interpretation of the drops observed as they could also be a consequence of population admixture or of the use of inaccurate genetic maps [6]. Nevertheless, our results are highly consistent with the information about the origin of broodstock and how these programmes have been run. Although limited, the available information suggests that the domestication of turbot, gilthead seabream and European seabass started around the 1970s, and that selective breeding programmes for increasing growth rate started in the 1990s [37], with approximately four to six generations of selection to date for the populations analysed here. Under this broad context, our estimates of historical Ne suggest that the combination of both domestication and the start of selection programmes is the most likely explanation for the important recent drops inferred in the populations analysed. Both events may have occurred too close in time to be disentangled by the method.

Our results reflect a moderately low LD between SNPs that are separated by very short distances in turbot and seabass populations. In addition, a very fast LD decay with physical distance was observed in both populations. In fact, r2 decreased by half at distances shorter than 0.02 Mb and remained at that level when the distance increased by one order of magnitude. At distances longer than 10 Mb, r2 reached values lower than 0.05. Similar LD values have been reported for coho salmon [29] and Nile tilapia [38] at short distances, but the rate of decrease in LD was much slower than that observed here for turbot and seabass. Much higher values of LD (> twofold for short distances) have been reported in farmed populations of Atlantic salmon [28, 39, 40] and rainbow trout [30], with LD also remaining higher over much longer distances. These results may suggest that higher LD values are observed in populations with a longer history of artificial selection.

As already mentioned, an important limitation of the LD method of Hayes et al. [5] to estimate historical Ne is that it only holds for linear changes in population size. Indeed, previous studies applying this method have observed linear trends in Ne over time [28,29,30,31], as we observed when reanalysing our data with this method (see Additional file 2: Fig. S2). As reflected in our results, the method by Santiago et al. [6] provides, in this case, a more precise view of drastic changes in historical Ne, such as those observed in Fig. 2. Another difference between the results of the two methods concerns the large discrepancy between the historical estimates of Ne (see Additional file 1: Fig. S1 and Additional file 2: Fig. S2). In order to shed some light on this issue, we carried out computer simulations under a scenario that mimics the pattern observed in Fig. 2 (see Additional file 3: Fig. S3 for results and simulation details). In the simulations, a large population with a constant size N of 1000 or 10,000 suddenly drops to N = 100 or 50 individuals in the last ten or five generations, respectively. We repeated this simulation 20 times and carried out analyses with the methods of Santiago et al. [6] (using GONE) and Hayes et al. [5]. The simulations show that the method of Hayes et al. [5] does not reflect the sudden drop in population size, and that it gives very downwardly biased estimates of the historical size. The simulations also show that the ancestral Ne obtained by GONE can be overestimated, particularly when the size of the ancestral population is large. Thus, the large observed values of ancestral Ne shown in Fig. 2 and Additional file 1: Fig. S1 should be taken with caution, since they can be overestimations. In any case, GONE is able to detect the drastic change in Ne as reflected in both figures.

Conclusions

In summary, our results suggest that the current Ne values of the commercial populations analysed here are, in general, below the critical value of 50 individuals that is recommended to ensure short-term sustainability of selection programmes. Series of historical Ne reveal important drops that are probably due to domestication and the start of breeding programmes. Our findings highlight the need for broadening the genetic composition of base populations from which selection programmes start and suggest that measures to increase Ne within all the farmed populations analysed here should be implemented. These measures include increasing the number of parents selected, conducting artificial fertilization and applying single-pair rather than mass spawning [41], and, if possible, implementing optimal contribution selection [42, 43] to maximise genetic gain while restricting the rate of inbreeding. In cases where these interventions are not sufficient to increase Ne above the critical value, another option could be to exchange genetic material between different genetically improved stocks.