Background

The reduction in performance due to inbreeding (i.e. inbreeding depression) has long been documented in plant and animal populations [1]. In general, inbreeding depression is most severe for traits that are closely related to fitness, but other traits can also be affected [2]. The standard approach for estimating inbreeding depression is to regress the phenotype of the trait of interest on the inbreeding coefficient (F). Typically, F, defined as the probability that both alleles at any locus within an individual are identical by descent (IBD), has been computed from pedigree information. However, the current availability of very large numbers of single nucleotide polymorphisms (SNPs) offers new opportunities to improve the accuracy of F estimates [3] and to develop more detailed approaches for detecting inbreeding depression [4-6]. Several potential advantages of using genomic F rather than pedigree-based F (F ped ) have been highlighted [4,5]. Genomic F measures homozygosity directly and thus can more accurately reflect the actual percentage of the genome that is homozygous, whereas F ped is only an expectation of that percentage. Another critical difference is that genomic F allows us to estimate inbreeding and inbreeding depression for specific genomic regions, which is not possible with F ped . In addition, genomic F can be estimated in populations where pedigree recording is difficult or impossible.

Several alternative estimates of genomic F based on SNP genotypes have been proposed. A simple estimate can be obtained on a SNP-by-SNP basis as the proportion of homozygous genotypes [3-6]. However, the drawback of this estimate is that it does not differentiate between IBD and identity by state (IBS). An alternative approach for quantifying individual homozygosity that better reflects IBD is based on runs of homozygosity (ROH). The idea is that autozygous genotypes are not evenly distributed throughout the genome but are distributed in runs that are inherited together [7,8]. This occurs because consanguineous matings lead to the inheritance of haplotypes that are IBD, which results in homozygous stretches along the genome of the offspring [9]. It has been shown that these runs provide a good measure of individual genome-wide autozygosity (F roh ) and also allow us to distinguish between recent and ancient inbreeding [10].

The aim of this study was to detect genomic regions responsible for inbreeding depression for two reproductive traits in a highly inbred strain of Iberian pigs (the Guadyerbas strain) using different measures of genomic F.

Methods

Animals, pedigree and phenotypic data

In this study, data originated from Guadyerbas pigs maintained in a small isolated herd at the CIA ‘El Dehesón del Encinar’ (Oropesa, Toledo, Spain). The herd was founded by four males and 20 females and has been maintained in isolation under a genetic conservation program [11]. A complete and very accurate genealogy is available from the foundation of the herd in 1944 onwards. It comprises about 25 generations and includes 1178 animals born in the herd between 1947 and 2011. The effective population size (N e ), estimated from the rate at which pedigree-based or SNP-based coancestry increases, is about 10 [3].

Phenotypic data for total number of piglets born (TNB) and number of piglets born alive (NBA) in successive parities from pedigreed sows were available. Means (standard deviations) for TNB and NBA were 7.39 (2.34) and 7.06 (2.25), respectively. Farrowing facilities were improved in 2000 with a new building where piglets had ad libitum access to creep food from seven days of age onwards. No creep food was supplied before 2000. Thus, there were eight levels of management by combining season of farrowing (four seasons) and farrowing facilities (two). Boars from two strains of Iberian pigs (Guadyerbas and Torbiscal) were used in the matings. Offspring fathered by Torbiscal boars were never maintained in the herd.

Genotyping data

All animals born in the herd between 1992 and 2011 (about six generations) were genotyped. They included 86 males and 141 females. Of these females, 113 had phenotypic records for litter size. Genomic DNA extracted from blood samples was hybridized with the Illumina PorcineSNP60 BeadChip v1 and images were scanned by an external service (Universidad Autónoma de Barcelona, Spain). The SNP chip comprises 62 163 probes that are distributed across 18 autosomes and the two sex chromosomes according to the latest version of the porcine gene annotation (Sscrofa10.2). Genotype calls were obtained with the Genotyping Module of the GenomeStudio Data Analysis software (Illumina Inc.). For the purpose of increasing the power of genotype calling (i.e. to correctly determine the genotype for each individual at each SNP), we included samples from other strains of Iberian pigs. In total, 468 genotyped samples were used in this step. These comprised the 227 Guadyerbas samples (including the 113 sows with phenotypic records) and 241 samples from other strains. The extra samples were not used in any further analysis.

Quality control procedures were applied to identify problematic SNPs and samples. First, SNPs were removed if they had a Call Frequency < 0.99, a GenTrainScore < 0.70, an AB R Mean < 0.35 or more than 9 inconsistencies with the genealogy (see Saura et al. [3] for further details on the filtering criteria applied). Unmapped SNPs and SNPs mapped to sex chromosomes were also excluded. A total of 51 127 SNPs remained and were used in subsequent analyses. Note that monomorphic SNPs (i.e., those with a minor allele frequency (MAF) of 0) were not removed. After filtering SNPs, the data were reanalysed and samples with a Call Rate < 0.96 and with a large number of inconsistencies with the genealogy were removed. Four samples were excluded, so the final number of genotyped Guadyerbas females available was 109. The total number of litter records from these 109 genotyped sows was 265.

Inbreeding coefficients

Different estimates of F for the sows were used for the inbreeding depression analyses:

  1. Genealogical inbreeding coefficients (F ped ) were obtained using all pedigree information that had been recorded since the foundation of the herd.

  2. Genomic SNP-by-SNP inbreeding coefficients (F snp ) were obtained based on the excess of SNP homozygosity, as in Keller et al. [4]. The inbreeding coefficient for individual i (F snp(i) ) was computed as \( F_{snp(i)} = (OH_i - EH)/(s - EH) \), where s is the number of SNPs, \( OH_i = \sum_{j=1}^{s} X_{ij} \), where \( X_{ij} \) is an indicator variable taking values of 1 if individual i is homozygous for SNP j, and 0 if individual i is heterozygous for SNP j, and EH is the expected homozygosity in the population. The expected homozygosity was computed as \( EH = \sum_{j=1}^{s} \left[ 1 - 2 p_j (1 - p_j) \right] \), where \( p_j \) is the MAF for SNP j. We also computed SNP-by-SNP inbreeding coefficients as the proportion of SNPs that are homozygous for the individual (F snp_r ). Note that F snp_r ranges from 0 to 1 but F snp can be negative (when EH is higher than \( OH_i \)).

  3. Genomic inbreeding coefficients were also estimated based on ROH (F roh ). For a given individual i, F roh(i) was defined as the proportion of its genome that is in ROH [12]. We used our own Fortran code to detect ROH [13] that were defined using the following criteria: (i) a maximum of two missing genotypes and one heterozygous genotype were permitted in a ROH; (ii) the minimum SNP density required to define a ROH was 1 SNP per 100 kb; (iii) the maximum distance allowed between two consecutive homozygous SNPs in a ROH was 1 Mb; and (iv) the minimum number of SNPs that constitute a ROH was 30. We also performed analyses based on short and long ROH. We defined the inbreeding coefficient based on short ROH for individual i (F roh_short(i) ) as the proportion of its genome that was in ROH of lengths between 0.5 and 5 Mb and the inbreeding coefficient based on long ROH (F roh_long(i) ) as the proportion of its genome that is in ROH of lengths > 5 Mb. These thresholds were applied to assess the relative importance of distant (F roh_short ) versus recent (F roh_long ) inbreeding [13]. Long ROH are expected to be autozygous segments that originated from recent common ancestors, while short ROH are likely to have originated from more remote common ancestors [9].
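
The genomic coefficients defined in items 2 and 3 can be illustrated with a minimal Python sketch. It is not the Fortran code used in the study [13]: the function names, the 0/1/2 genotype coding (with -1 for missing), the greedy left-to-right scan and the 0.5 Mb minimum length (taken from the lower bound of the short-ROH class) are all assumptions made for illustration.

```python
import numpy as np

def f_snp(genotypes, maf):
    """F_snp = (OH_i - EH) / (s - EH); genotypes is (individuals x SNPs), coded 0/1/2."""
    genotypes = np.asarray(genotypes)
    maf = np.asarray(maf, dtype=float)
    oh = (genotypes != 1).sum(axis=1)            # observed homozygous SNPs, OH_i
    eh = np.sum(1.0 - 2.0 * maf * (1.0 - maf))   # expected homozygosity, EH
    s = genotypes.shape[1]
    return (oh - eh) / (s - eh)

def f_snp_r(genotypes):
    """Raw proportion of homozygous SNPs per individual."""
    return (np.asarray(genotypes) != 1).mean(axis=1)

def find_roh(geno, pos, max_het=1, max_missing=2, max_gap=1_000_000,
             min_snps=30, min_density=1 / 100_000, min_len=500_000):
    """Greedy ROH scan along one chromosome (simplified, hypothetical).

    geno: genotypes coded 0/1/2, -1 for missing; pos: sorted positions in bp.
    Returns a list of (start_bp, end_bp) tuples.
    """
    runs, n, i = [], len(geno), 0
    while i < n:
        if geno[i] not in (0, 2):      # a run must start at a homozygous SNP
            i += 1
            continue
        het = miss = 0
        last_hom = i
        j = i + 1
        while j < n:
            g = geno[j]
            if g == 1:                 # criterion (i): heterozygous allowance
                if het == max_het:
                    break
                het += 1
            elif g < 0:                # criterion (i): missing allowance
                if miss == max_missing:
                    break
                miss += 1
            else:                      # criterion (iii): gap between homozygous SNPs
                if pos[j] - pos[last_hom] > max_gap:
                    break
                last_hom = j
            j += 1
        n_snps = last_hom - i + 1
        length = pos[last_hom] - pos[i]
        # criteria (iv) and (ii), plus the assumed minimum length
        if n_snps >= min_snps and length >= min_len and n_snps / length >= min_density:
            runs.append((pos[i], pos[last_hom]))
        i = last_hom + 1
    return runs

def f_roh(runs, genome_length, lo=0.0, hi=float("inf")):
    """Proportion of the genome in ROH with lo < length <= hi (bp)."""
    return sum(e - s for s, e in runs if lo < e - s <= hi) / genome_length
```

With this sketch, `f_roh(runs, L, hi=5e6)` and `f_roh(runs, L, lo=5e6)` would correspond to F roh_short and F roh_long, respectively, where `L` is the genome length covered by the SNPs.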

Inbreeding depression analyses

Inbreeding depression was estimated by regressing the phenotype of the reproductive trait (TNB and NBA) on F. This regression was performed by including F as a covariate in a bivariate animal model analysis. The model equation for both traits was:

$$ \mathbf{y}=\mathbf{X}\beta +{\mathbf{Z}}_{\mathbf{1}}\mathbf{a}+{\mathbf{Z}}_{\mathbf{2}}\mathbf{p}+\mathbf{e}, $$

where \( \mathbf{y} \) is the vector of observations for TNB or NBA, \( \beta \) is the vector of fixed effects, including the combination of season of farrowing and farrowing facilities (eight levels), parity (four levels), strain of boar (two levels) and the (linear) regression on F, \( \mathbf{a} \) is the vector of additive genetic effects, \( \mathbf{p} \) is the vector of permanent environmental effects associated with the sows, \( \mathbf{e} \) is the vector of random residual effects, and \( \mathbf{X} \), \( \mathbf{Z}_1 \) and \( \mathbf{Z}_2 \) are incidence matrices relating fixed and random effects to observations. The expectation of \( \mathbf{y} \) was assumed to be \( E[\mathbf{y}] = \mathbf{X}\beta \), and the variances and covariances of the random effects were assumed to be \( V(\mathbf{a}) = \mathbf{A}\sigma_a^2 \), \( V(\mathbf{p}) = \mathbf{I}_m\sigma_p^2 \) and \( V(\mathbf{e}) = \mathbf{I}_n\sigma_e^2 \), where \( \mathbf{A} \) is the pedigree-based numerator relationship matrix of order N (number of animals in the pedigree), \( \mathbf{I}_m \) and \( \mathbf{I}_n \) are identity matrices of order m (number of sows with litter size records) and n (number of records), respectively, and \( \sigma_a^2 \), \( \sigma_p^2 \) and \( \sigma_e^2 \) are the variances of additive genetic effects, permanent environmental effects, and residual effects, respectively. The analyses were performed using the REML/VCE 6.0 [14] and PEST [15] software.
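
As a brief sketch of what these assumptions imply for a single trait, and assuming in addition that the additive, permanent environmental and residual effects are mutually uncorrelated (standard for this type of model, although not stated explicitly above), the variance of the observations can be written as:

$$ V\left(\mathbf{y}\right)={\mathbf{Z}}_{\mathbf{1}}\mathbf{A}{\mathbf{Z}}_{\mathbf{1}}^{\prime }{\sigma}_a^2+{\mathbf{Z}}_{\mathbf{2}}{\mathbf{Z}}_{\mathbf{2}}^{\prime }{\sigma}_p^2+{\mathbf{I}}_n{\sigma}_e^2 $$

Under the same assumption, and using notation introduced here only for illustration (\( \sigma_P^2 \) for the phenotypic variance, \( h^2 \) for heritability, \( c^2 \) for the permanent environmental coefficient and \( r \) for repeatability), the usual variance ratios are:

$$ {\sigma}_P^2={\sigma}_a^2+{\sigma}_p^2+{\sigma}_e^2,\kern1em {h}^2=\frac{\sigma_a^2}{\sigma_P^2},\kern1em {c}^2=\frac{\sigma_p^2}{\sigma_P^2},\kern1em r={h}^2+{c}^2 $$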

Different analyses were performed by using the different inbreeding coefficients in the model (F ped , F snp , F roh , F roh_short and F roh_long ). The analysis using F ped was carried out using all available performance and pedigree records, which included 823 sows with reproductive records on 2712 litters and a total pedigree file with 1032 animals. Analyses using F snp and F roh included only records for the 109 genotyped females and used estimates of variance and covariance components obtained from the F ped analysis. Three analyses were implemented with the genomic inbreeding coefficients: (i) using F coefficients across the whole genome; (ii) using F coefficients for each autosome; and (iii) using F coefficients for specific regions within chromosomes.

Ethical statement

The current study was carried out under a Project License from the INIA Scientific Ethic Committee. Animal manipulations were performed according to the Spanish Policy for Animal Protection RD1201/05, which meets the European Union Directive 86/609 about the protection of animals used in experimentation. We hereby confirm that the INIA Scientific Ethic Committee, which is the named IACUC for the INIA, specifically approved this study.

Results

Descriptive statistics for F ped , F snp , F snp_r and F roh are summarized in Table 1. The average F snp_r was very high, but this is simply an effect of scale, since F snp_r is not corrected for homozygosity in the base population and therefore includes IBS. The average inbreeding coefficient based on long ROH was very close to F ped , whereas the average F roh_short was about four times lower than the average F ped . Although short ROH were more abundant (about double the number) than long ROH (Figure 1), their total contribution to the autosomal genome was relatively small (11.1% for short versus 34.2% for long ROH). Chromosomes SSC1 (SSC for Sus scrofa) and SSC13 contained the longest ROH, with sizes greater than 170 Mb, while the maximum size for ROH in the other autosomes was 120 Mb (Figure 2). This was as expected, given the negative correlation between physical (Mb) chromosome size and recombination rate in pigs [16]. Fisher [17] noted that the length of a DNA segment that is IBD follows an exponential distribution with mean equal to 1/(2g) Morgans, where g is the number of generations since the common ancestor. Given this, long ROH (>5 Mb) reflect the expected inbreeding from common ancestors that lived fewer than 10 generations ago. In our data, the average F roh_long was about three times as large as the average F roh_short , thus suggesting that most inbreeding was recent.
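
The link between the 5 Mb threshold and 10 generations can be made explicit with a short worked calculation. It assumes an average recombination rate of roughly 1 cM per Mb, which is a common rule of thumb and an assumption here, not a value reported in this study:

$$ E\left[L\right]=\frac{1}{2g}\ \mathrm{Morgans}=\frac{100}{2g}\ \mathrm{cM},\kern2em g=10\Rightarrow E\left[L\right]=5\ \mathrm{cM}\approx 5\ \mathrm{Mb} $$

Under this assumption, IBD segments longer than about 5 Mb are expected, on average, to trace back to common ancestors fewer than 10 generations ago.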

Table 1 Mean, standard deviation (SD) and range (Min, Max) of pedigree and SNP-derived inbreeding coefficients

Figure 1 Distribution of short and long ROH.

Figure 2 Distribution of ROH according to their size within each autosome. The number of SNPs per autosome is indicated in brackets.

Pearson correlations between the different inbreeding coefficients across individuals were positive and high (Table 2), except for correlations with F roh_short , which were smaller and negative. These negative correlations should be interpreted with caution because, although short ROH are likely to have originated from remote common ancestors, they could be masked by, or included within, some of the longer ROH. The high positive correlations of F ped and F snp (which are measures of overall inbreeding) with F roh_long also support the conclusion that most of the inbreeding is recent. Correlations computed for each autosome followed the same pattern as those computed for the whole genome [See Additional file 1]. In magnitude, the highest positive correlations across autosomes were consistently those between F snp and F roh_long and the highest negative correlations were those between F snp and F roh_short .

Table 2 Pearson correlations (SE) between different inbreeding coefficients measured at the whole-genome level

As indicated above, estimates of genetic parameters were obtained from the pedigree-based analysis and then used in all the subsequent analyses that used genomic inbreeding coefficients. Estimates of heritability (0.052 ± 0.025 for NBA and 0.077 ± 0.029 for TNB) and permanent environmental coefficients (0.073 ± 0.023 for NBA and 0.068 ± 0.024 for TNB) were of the same order of magnitude as those previously reported for Iberian pigs [18]. These estimates led to a repeatability estimate of 0.15 for both traits. The estimated genetic correlation between traits was very high (0.964 ± 0.024) and also close to previous estimates [19,20].

A significant reduction in both NBA and TNB with increases in the inbreeding coefficient was observed when performing the pedigree-based analysis using all available data (i.e. using records from 823 sows). Estimates of inbreeding depression were −0.197 ± 0.092 for NBA and −0.211 ± 0.104 for TNB per 10% increase in F ped . The effect of genomic F at the whole-genome level (i.e., using F snp and F roh ) on litter size was not significant (−0.267 ± 0.186 and −0.253 ± 0.194 for NBA and TNB, respectively, when using F snp , and −0.415 ± 0.374 and −0.335 ± 0.391 for NBA and TNB, respectively, when using F roh ). However, significant (p < 0.05 for both traits, see Table 3) inbreeding depression was found for both traits when the genomic analyses were carried out at the autosomal level for SSC13. Estimates of inbreeding depression that were obtained from the analyses performed for each autosome using F snp are presented in Figure 3 for NBA and TNB. Only SSC13 showed a significant effect. The estimated inbreeding depression for this chromosome was −0.121 ± 0.047 for NBA and −0.117 ± 0.049 for TNB per 10% increase in F snp .
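
To see how the reported estimates and standard errors relate to the 5% significance level, the sketch below computes a two-sided Wald-type p-value under a normal approximation. The test actually used in the study is not described in the text, so the normal approximation and the function name `wald_test` are assumptions made for illustration only.

```python
import math

def wald_test(estimate, se):
    """Return (z, two-sided p-value) assuming estimate / se is approximately N(0, 1)."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail area of the standard normal
    return z, p

# Estimates (per 10% increase in F) and standard errors quoted in the text
reported = {
    "F ped, NBA": (-0.197, 0.092),
    "F ped, TNB": (-0.211, 0.104),
    "F snp whole genome, NBA": (-0.267, 0.186),
    "F roh whole genome, NBA": (-0.415, 0.374),
    "F snp SSC13, NBA": (-0.121, 0.047),
    "F snp SSC13, TNB": (-0.117, 0.049),
}

for label, (est, se) in reported.items():
    z, p = wald_test(est, se)
    print(f"{label}: z = {z:.2f}, p = {p:.3f}")
```

Under this approximation, the pedigree-based and SSC13 estimates fall below the 0.05 threshold, whereas the whole-genome genomic estimates do not, mainly because their standard errors are larger.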

Table 3 Estimates of inbreeding depression on SSC13 for NBA and TNB

Figure 3 Inbreeding depression estimates expressed as the change in phenotypic mean per 10% increase in F snp and 95% confidence intervals across autosomes for number of piglets born alive (NBA) (a) and total number of piglets born (TNB) (b).

Reductions in the number of piglets per 10% increase in F snp , F roh and F roh_long for SSC13 were all significant (see Table 3) and of the same order of magnitude as those derived from the pedigree-based analyses, which were based on the entire genome. There was no significant effect of F roh_short on either NBA or TNB. This may be explained by purging of deleterious alleles in ancient generations, or by a bottleneck.

In order to detect specific genomic regions that cause inbreeding depression, all autosomes were divided into segments of equal size (three, five and eight segments per autosome) and three additional analyses per chromosome were carried out using F snp . When autosomes were divided into three segments, inbreeding depression was only significant for the first region of SSC13 (0–73 Mb) for both traits (Table 4). When they were divided into five segments, significance was found for the first (0–44 Mb) and the second (44–88 Mb) regions of SSC13 for both traits. Finally, when autosomes were divided into eight segments, only the second region of SSC13 showed a significant result for both traits. This region is 32.4 Mb in size and is located between 27 and 55 Mb. No other regions showed a significant result for any of the traits.
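
The segment-based analysis can be sketched as follows: a chromosome is split into k segments of equal physical length and F snp is recomputed from the SNPs that fall in each segment. This is only an illustrative sketch. The function names are hypothetical, and the SSC13 length of 219 Mb used in the usage note is an assumption inferred from the segment boundaries quoted above, not a value given in the text.

```python
import numpy as np

def segment_edges(chrom_length, k):
    """Boundaries (bp) of k equal-length segments spanning [0, chrom_length]."""
    return np.linspace(0.0, chrom_length, k + 1)

def f_snp_by_segment(genotypes, pos, maf, chrom_length, k):
    """F_snp per individual and segment; genotypes coded 0/1/2 (individuals x SNPs).

    Segments without SNPs, or where s equals EH, are returned as NaN.
    """
    genotypes = np.asarray(genotypes)
    pos = np.asarray(pos, dtype=float)
    maf = np.asarray(maf, dtype=float)
    edges = segment_edges(chrom_length, k)
    # Index of the segment containing each SNP (last edge folded into the last segment)
    seg = np.clip(np.searchsorted(edges, pos, side="right") - 1, 0, k - 1)
    out = np.full((genotypes.shape[0], k), np.nan)
    for j in range(k):
        cols = seg == j
        s = int(cols.sum())
        if s == 0:
            continue
        oh = (genotypes[:, cols] != 1).sum(axis=1)             # observed homozygosity
        eh = np.sum(1.0 - 2.0 * maf[cols] * (1.0 - maf[cols]))  # expected homozygosity
        if s > eh:
            out[:, j] = (oh - eh) / (s - eh)
    return out
```

For example, `segment_edges(219e6, 8) / 1e6` gives boundaries at about 27.4 and 54.8 Mb for the second of eight segments, in line with the region between 27 and 55 Mb reported above. Each column of the returned matrix could then be used as the covariate F in the regression.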

Table 4 Estimates of inbreeding depression for different regions of SSC13 for NBA and TNB

We also performed a bootstrap test by generating 1000 bootstrap replicates of the individual phenotypes and genotypes for the region that was significant in the inbreeding depression analysis. Bootstrap replicates were created by randomly resampling the individual phenotypes, and the inbreeding depression analyses were repeated for each replicate. When dividing SSC13 into eight segments, the average of the estimates for the effect across bootstrap replicates was 0, with a standard error that was lower than that obtained from the real data (0.020) in all cases except one. This is equivalent to a bootstrap p-value of 0.001. These results provide further evidence that the signals detected are not spurious.
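
The general logic of such a resampling test can be sketched in a few lines. The sketch makes several simplifying assumptions that are not taken from the study: phenotypes are resampled with replacement independently of the inbreeding coefficients, the effect is estimated by ordinary least squares instead of the animal model, and the empirical p-value is the proportion of replicates whose absolute slope is at least as large as the observed one. All names are hypothetical.

```python
import numpy as np

def resampling_test(y, f, n_rep=1000, seed=0):
    """Compare the observed regression slope of y on f with a resampled null distribution.

    Returns (observed_slope, null_slopes, empirical_p).
    """
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    rng = np.random.default_rng(seed)

    def slope(values):
        # Slope of the simple linear regression of values on f
        return np.polyfit(f, values, 1)[0]

    observed = slope(y)
    # Resampling phenotypes independently of f breaks any phenotype-F association
    null = np.array([
        slope(rng.choice(y, size=y.size, replace=True)) for _ in range(n_rep)
    ])
    p = float(np.mean(np.abs(null) >= abs(observed)))
    return observed, null, p
```

With 1000 replicates, a single replicate as extreme as the observed estimate corresponds to an empirical p-value of 1/1000 = 0.001, which matches the value reported above.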

Discussion

We have detected inbreeding depression associated with a specific region of SSC13 for two reproductive traits in a highly inbred strain of Iberian pigs using different measures of genome-wide inbreeding coefficients. It is important to note that the signal detected was significant despite the small sample size available for the study (109 sows). This indicates that the effects detected here are very unlikely to be spurious.

Also, our results are consistent with those obtained by Noguera et al. [19], who performed one of the first genome-wide scans for prolificacy traits in pigs. They used data from a Guadyerbas x Meishan F2 intercross using SNPs and microsatellite markers and detected a quantitative trait locus (QTL) on SSC13 for both NBA and TNB. This QTL region extends from about 38 to 194 Mb and overlaps with the region identified here. Specifically, the region detected in the inbreeding depression analysis is shorter (it spans from 27 to 54 Mb) and overlaps with the first part of the QTL region detected by Noguera et al. [19]. We examined the gene content of this common region by using the porcine genome annotation Sscrofa10.2 in the BioMart tool of Ensembl (ensembl.org/biomart) and the Ensembl Genes 69 database and found 271 annotated genes. Interestingly, three of these genes, the inter-alpha-trypsin inhibitor heavy chains 1, 3 and 4 (ITIH-1, ITIH-3 and ITIH-4), map to 38 Mb on SSC13 and play several important roles in maintaining the uterine surface glycocalyx during placental attachment in pigs [20]. Moreover, these genes have been previously associated with NBA and TNB [21]. Specifically, using the same material as Noguera et al. [19], Balcells et al. [21] analyzed the porcine ITIH-1, −3 and −4 gene sequences in order to identify polymorphisms that could explain differences in prolificacy of sows. Their results revealed significant associations with NBA and TNB for two SNPs in ITIH-1, four SNPs in ITIH-3, and four SNPs in ITIH-4. Thus, the studies of Noguera et al. [19] and Balcells et al. [21] support our findings, since genes that affect both NBA and TNB are located in the region identified on SSC13. Another recent whole-genome association study identified QTL regions for NBA and TNB that partially overlapped with the region identified here [22].

Several studies that compared the same inbreeding coefficients as used here indicated that F roh is a better measure of IBD than F snp [4,5,23]. However, Keller et al. [4] showed that as N e decreases, the similarity of both measures of molecular inbreeding (F roh and F snp ) increases. Based on this, and given an N e as low as 10 (the estimate for this herd [3]), we can expect the measures of F snp and F roh to be highly correlated (0.97 ± 0.02). In fact, the correlation between F ped and F roh found in our study was similar to that between F ped and F snp (0.63 ± 0.05 and 0.66 ± 0.05, respectively). All this information validates the use of F snp as a measure of IBD in this population and, therefore, as a suitable coefficient to perform the inbreeding depression analyses.

Previous studies aimed at detecting inbreeding depression using SNPs have focused on the whole-genome level [6,23-25]. These include human studies that investigated the association between inbreeding and particular diseases [5,11,13,26,27]. Only a few of these have attempted to identify specific genomic regions that cause inbreeding depression [11,13,27]. Keller et al. [8] used a ROH mapping approach to analyze the association of F roh with schizophrenia risk. The approach consisted of dividing the autosomes into a large number of segments of equal size and regressing disease status on whether or not individuals had a ROH in each segment. They found significant associations between specific genomic regions and disease status. Recently, Pryce et al. [28] followed a similar approach to detect inbreeding depression for fertility and milk production traits in dairy cattle.

In order to compare our approach for detecting genomic regions associated with inbreeding depression using F snp with the approach of Keller et al. [8], we divided the autosomes into segments of approximately 2 Mb and recorded the presence or absence of ROH within each segment. The results from these regressions showed a significant effect in the same region on SSC13 as detected with the analysis using F snp [See Additional file 2]. Although this significant effect was lost when a multiple-testing correction was applied, the results support our previous finding. Additional file 2 shows results for NBA but the same pattern was found for TNB. It should be noted that our analysis using a continuous variable (i.e., F snp ) instead of a dichotomous variable (i.e., presence/absence of ROH) has more power to detect associations between phenotypes and inbreeding within specific regions.
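
The dichotomous covariate used in this comparison can be sketched as below. The text does not specify how the presence of a ROH in a segment was defined, so the sketch assumes that any overlap between a ROH and a segment counts as presence. The function name and the integer base-pair coordinates are also assumptions.

```python
import math

def roh_presence(runs, chrom_length, seg_size=2_000_000):
    """Return a 0/1 list with one entry per segment of seg_size bp.

    runs: list of (start_bp, end_bp) ROH with integer coordinates on one chromosome.
    A segment is scored 1 if at least one ROH overlaps it (assumed definition).
    """
    n_seg = math.ceil(chrom_length / seg_size)
    present = [0] * n_seg
    for start, end in runs:
        first = start // seg_size
        # end - 1 keeps a ROH that stops exactly on a boundary out of the next segment
        last = min(max(end - 1, start) // seg_size, n_seg - 1)
        for k in range(first, last + 1):
            present[k] = 1
    return present
```

Each segment then contributes a single 0/1 covariate per individual, which carries less information than the continuous F snp computed over the same region.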

We also performed an association analysis for the region involved in inbreeding depression. It was conducted using the same data (265 phenotypic records on 109 sows) and statistical model as described above for the inbreeding depression analyses, but also included the SNP genotype as a fixed effect. Each SNP was tested separately for association with the trait and both additive and dominance effects were estimated. We found several SNPs that showed significant dominance effects and non-significant additive effects for both traits (see Additional file 3, which shows the results for NBA only; similar results were found for TNB). This result is consistent with the findings from the inbreeding depression analyses and with directional dominance for the genes involved.
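
A common way to separate additive and dominance effects at a single SNP is sketched below. The genotype coding actually used in the study is not given, so the coding here (additive covariate −1, 0, 1 and dominance covariate equal to 1 for heterozygotes) is an assumption, and ordinary least squares stands in for the animal model described in the Methods.

```python
import numpy as np

def additive_dominance_fit(genotypes, y):
    """Least-squares estimates of mean, additive and dominance effects at one SNP.

    genotypes: allele counts coded 0/1/2; y: phenotypes.
    All three genotype classes must be present for the effects to be estimable.
    """
    g = np.asarray(genotypes, dtype=float)
    additive = g - 1.0                        # -1, 0, 1 for the three genotypes
    dominance = (g == 1.0).astype(float)      # 1 for heterozygotes, 0 otherwise
    X = np.column_stack([np.ones_like(g), additive, dominance])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return {"mean": coef[0], "additive": coef[1], "dominance": coef[2]}
```

With this coding, a SNP at which heterozygotes outperform the average of the two homozygotes yields a positive dominance estimate even when the additive estimate is close to zero, which is the pattern expected under directional dominance.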

Conclusions

In summary, using genome-wide SNP information, we detected inbreeding depression in a specific region that contains genes associated with litter size in the isolated population of Guadyerbas Iberian pigs. The availability of dense SNP platforms has created opportunities to estimate homozygosity without using pedigree relationships and to obtain patterns of homozygosity along an individual’s genome. Our results highlight the importance of SNP chips for providing new insights into where genes causing inbreeding depression are located in the genome. SNP chips thus offer a complementary tool to QTL analysis for mapping studies.