Background

As one of the earliest centers of domestication for chickens, China has the most abundant chicken genetic resources in the world, with 107 indigenous chicken breeds. These breeds play an essential role in the Chinese poultry industry due to the popularity of traditional cuisine. Because these chickens typically exhibit high adaptability to variable environments, strong disease-resistance, and produce high-quality meat and eggs, they are an important breeding resource to meet future market demands [1]. However, a sizeable fraction of indigenous chicken breeds (21.3% in the world) are under threat of extinction because breeding populations are too small for genetic sustainability [2]. Twenty-three Chinese indigenous chicken breeds have been listed in the national conservation catalogue and are currently managed under what is thought to be an optimal conservation scheme. However, the effectiveness of the scheme has never been evaluated over the long term.

An efficient in situ conservation scheme relies on an effective population size, as well as an effective selection and mating strategy [1]. The recommended effective population size is 50, which is not only sufficient to maintain population fitness, but is also small enough to be monitored and managed easily [1, 3]. The principle of a selection and mating strategy is to minimize the average kinship between selected parents [1]. The mating systems that are in use are: (i) random mating and random selection (R: R), (ii) random mating within families, with one son kept per sire family and one daughter kept per dam family (R: F), and (iii) family rotational mating (F: R) [4].

Results from a study that compared the effectiveness of these three mating systems suggest that F: R can sustain 90% of genetic diversity in a livestock population for more than 100 years [4]. Other studies demonstrate that F: R can reduce inbreeding in populations [5, 6]; however, implementation of the F: R scheme requires substantial effort. The contemporary conservation scheme in China relies mainly on the R: F mating system. Although simulation experiments suggest that R: F and F: R perform similarly in maintaining genetic diversity in a conserved population [4], few empirical studies have evaluated their effectiveness.

The genetic diversity dynamics in successive generations within a conserved population directly reflect the effectiveness of a conservation scheme. DNA markers can be used efficiently to estimate the genetic diversity within and between conserved populations [7]. Various DNA marker systems have been used to assess chicken genetic diversity, such as RAPDs [8], AFLPs [9], and microsatellites [10,11,12]. SNPs, which are densely distributed across genomes, have also been used to estimate genetic diversity and population structure with high accuracy [13, 14].

In order to investigate the effectiveness of the schemes that are currently used to conserve chicken genetic resources in China, we evaluated the genetic diversity of three conserved indigenous chicken breeds. SNPs, identified using high-throughput DNA sequencing in samples obtained from three generations per breed, were used to estimate diversity and track changes across time. These data can be used to evaluate and improve ongoing conservation efforts.

Methods

Ethics statement

Sample collection procedures strictly followed protocols approved by the Animal Welfare Committee of China Agricultural University (Approval Number: XK257).

Sampling

Three Chinese indigenous chicken breeds that have been enrolled in conservation programs were used in this study: Baier Yellow Chicken (BEC), Beijing You Chicken (BYC), and Langshan Chicken (LSC). Different geographical and environmental factors have contributed to the unique characteristics of these breeds. The Baier Yellow chicken, mainly produced in Jiangxi Province, has a distinct appearance with white ears and yellow feathers, beak, and shanks. It is a rare early-maturing, egg-type breed. The Langshan chicken is a classic dual-purpose breed that originated in Jiangsu Province. This is one of the oldest breeds and is unusually tall, with long legs and a tail carried at a high angle. The Beijing You chicken is an ancient breed that originated during the Qing Dynasty in Beijing. Known as “royal chickens”, they are valued for their high-quality meat and eggs, and are uniquely marked by a crest on the head, a beard under the lower jaw, and feathers on both shanks [15,16,17].

These breeds have been managed for optimal conservation as part of the National Chicken Genetics Resources program (Jiangsu, NCGR). Briefly, the conservation goals are that population sizes should be kept constant across generations (30 males and 300 females), and random mating should be enforced within families, with one son kept per sire family and one daughter kept per dam family (R: F). Samples from 270 individuals were collected from NCGR, with three generations per breed and 30 individuals per generation. Samples are identified by breed abbreviation and the last two digits of the year in which samples were obtained (e.g., samples collected from BEC in 2007 are designated BEC07). Breed and sampling information is summarized in Table 1. Blood samples were collected from the wing vein and stored at −20 °C. Genomic DNA was extracted following the protocol accompanying the DNeasy Blood & Tissue Kit (Qiagen Inc., Valencia, California, USA). For each sample, 3 μg of high-quality DNA was used to construct sequencing libraries.

Table 1 Samples obtained from conserved populations of three indigenous chicken breeds

Genotyping and data preparation

All DNA samples were subjected to genotyping by sequencing (GBS) using an Illumina HiSeq 4000 sequencer (Illumina, San Diego, CA, USA) after double enzyme digestion (MseI and HaeIII). The initial data set was filtered to exclude low-quality reads, and then aligned to the chicken genome (version: Gallus_gallus 4.0) using BWA (v0.7.8) [18]. PCR duplicates were removed using SAMtools rmdup (v0.1.19) [19]. Sequencing variants were identified using SAMtools mpileup (v0.1.19, arguments: -q 1 -C 50 -S -D -m 2 -F 0.002) and BCFtools view (arguments: -Q 20 -d 1 -D 8000). Variants satisfying all of the following criteria were retained for further analysis: coverage depth ≥ 1 and ≤ 8000, RMS mapping quality > 20, and distance between adjacent SNPs ≥5 bp. Variants were annotated using ANNOVAR with default parameters [20].

Quality control procedures were implemented using PLINK 1.90 [21]. SNPs were required to meet the following criteria: call rate ≥ 95%, minor allele frequency (MAF) ≥ 0.05, missing rate ≤ 0.01, and Hardy-Weinberg equilibrium test P-value >10e-6.

Preparation of data prior to calculation of genetic diversity

Before genetic diversity was estimated, linkage disequilibrium (LD) “pruning” was conducted using PLINK (v1.90, arguments: --indep-pairwise 50 5 0.2).

Nine generation-based sample pools (BEC07, BEC10, BEC15; BYC07, BYC10, BYC15; LSC10, LSC12, LSC15) were used to calculate genetic diversity, as reflected by expected heterozygosity (He), observed heterozygosity (Ho), proportion of polymorphic markers (PN), and allelic richness (AR). He, Ho and PN were calculated using PLINK 1.90 with the default settings. AR estimates were determined using ADZE v1.0 [22].
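To make the three PLINK-derived measures concrete, the following minimal Python sketch computes Ho, He and PN from a toy genotype matrix. It is an illustration only and not the PLINK implementation: the function name `diversity_stats`, the 0/1/2 genotype coding, and the use of the simple 2p(1 − p) expectation (without a small-sample correction) are assumptions made here.

```python
def diversity_stats(genotypes):
    """Compute Ho, He and PN for one sample pool.

    genotypes: list of per-SNP lists; each entry is 0, 1 or 2 (copies of the
    alternate allele) or None for a missing call. This coding is an
    assumption of the sketch, not a format used in the study.
    """
    ho_values, he_values, polymorphic = [], [], 0
    for snp in genotypes:
        called = [g for g in snp if g is not None]
        n = len(called)
        if n == 0:
            continue  # skip SNPs with no called genotypes
        p = sum(called) / (2 * n)  # alternate-allele frequency
        ho_values.append(sum(1 for g in called if g == 1) / n)  # observed heterozygotes
        he_values.append(2 * p * (1 - p))  # Hardy-Weinberg expectation
        if 0 < p < 1:
            polymorphic += 1  # both alleles are present in the pool
    n_snps = len(ho_values)
    if n_snps == 0:
        raise ValueError("no SNP with called genotypes")
    return {
        "Ho": sum(ho_values) / n_snps,
        "He": sum(he_values) / n_snps,
        "PN": polymorphic / n_snps,
    }


# Toy data: three SNPs typed in four birds (one missing call).
toy = [[0, 1, 2, 1], [0, 0, 0, 0], [1, 1, None, 1]]
stats = diversity_stats(toy)
```

In this toy pool the second SNP is monomorphic, so it lowers both the mean heterozygosity and PN.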

Inbreeding coefficient

Two measures of inbreeding coefficient were calculated for each chicken population.

Inbreeding coefficient based on the mating plan (FES): The estimation of effective population size (Ne) was based on the number of sires and dams, following Wright’s model [23]. Computation of Ne requires the numbers of males (Nm) and females (Nf) in each population that participated in the R: F program, and is calculated using the equation: \( Ne=\frac{16 Nm Nf}{3 Nf+ Nm} \). The increment of hypothetical inbreeding (ΔF) was calculated using the equation: \( \Delta F=\frac{1}{2 Ne} \).
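As a worked example, the sketch below evaluates Ne and ΔF for the target population sizes stated in the Sampling section (30 males and 300 females). It assumes the form Ne = 16·Nm·Nf / (3·Nf + Nm), which is equivalent to ΔF = 3/(32·Nm) + 1/(32·Nf) and yields Ne in units of individuals; the function names are hypothetical.

```python
def effective_size_rf(n_males, n_females):
    """Ne for a scheme keeping one son per sire and one daughter per dam.

    Assumed form: Ne = 16 * Nm * Nf / (3 * Nf + Nm).
    """
    return 16 * n_males * n_females / (3 * n_females + n_males)


def inbreeding_increment(ne):
    """Per-generation increase in inbreeding, dF = 1 / (2 * Ne)."""
    return 1 / (2 * ne)


# Target sizes of the conserved populations described in the Sampling section.
ne = effective_size_rf(30, 300)
delta_f = inbreeding_increment(ne)
print(f"Ne = {ne:.1f}, dF = {delta_f:.5f} per generation")
```

Under these assumptions the expected Ne is roughly 155 and ΔF is roughly 0.0032 per generation; realised values depend on the numbers of birds that actually contributed to each generation.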

Inbreeding coefficient based on runs of homozygosity (FROH): A run of homozygosity is defined as a region > 100 Kb containing > 50 SNPs. FROH was calculated using PLINK v1.90 (with parameters --file BEC07_qc --ibc --allow-extra-chr --chr-set 28 --out BEC07) and is the fraction of the genome spanned by runs of homozygosity [24].
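The definition of FROH as a genome fraction can be illustrated with a short sketch. It is not the PLINK procedure; the segment tuples, the function name `froh`, and the genome length used in the example are hypothetical, and the thresholds simply mirror the definition given above (> 100 kb and > 50 SNPs).

```python
def froh(segments, genome_length_bp, min_length_bp=100_000, min_snps=50):
    """Fraction of the genome covered by qualifying runs of homozygosity.

    segments: iterable of (start_bp, end_bp, n_snps) tuples for one individual.
    A segment counts only if it is longer than min_length_bp and contains
    more than min_snps SNPs.
    """
    covered = 0
    for start, end, n_snps in segments:
        length = end - start
        if length > min_length_bp and n_snps > min_snps:
            covered += length
    return covered / genome_length_bp


# Hypothetical individual: only the first segment passes both thresholds.
example_segments = [
    (0, 500_000, 120),          # 500 kb, 120 SNPs: counted
    (1_000_000, 1_050_000, 80), # 50 kb: too short
    (2_000_000, 2_300_000, 40), # 300 kb but only 40 SNPs: too sparse
]
example_froh = froh(example_segments, genome_length_bp=10_000_000)
```

In practice the denominator would be the length of the autosomal genome covered by the SNP set, which is an analysis choice not specified here.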

Analysis of population structure

To reduce noise due to linkage disequilibrium, SNPs with a pair-wise genotype r2 value ≥0.2 were removed from the data set. A principal component analysis (PCA) [25] was conducted using PLINK and visualized with the SNPRelate R package [26]. A neighbor-joining (NJ) tree was constructed with Nei’s genetic distances [27] using the phylogeny program MEGA v7.0 [28] and displayed with FigTree v1.4.3 [29]. The genetic structures of the 9 sub-populations described above were analyzed with STRUCTURE v2.3.4 [30], using admixture and a correlated allele model [30, 31]. Ten independent runs were performed with K ranging from 1 to 10, with a burn-in period length of 10,000, followed by 100,000 Markov chain Monte Carlo (MCMC) repetitions, and 20 replications for each K value. STRUCTURE HARVESTER [32] was utilized to determine the optimal K value by comparing the log-likelihood of the data, lnP(X|K), for different values of K and by examining the second-order rate of change of lnP(X|K), ΔK [33, 34]. Results for K = 2 to K = 9 are included in this report.

Estimation of genetic differentiation

The unbiased genetic differentiation estimate, FST [35], was calculated using VCFtools v0.1.14 [36] with the quality-controlled SNP dataset to estimate genetic differentiation between populations (with parameters --vcf chicken_qc.vcf --weir-fst-pop BEC.txt --weir-fst-pop NLS.txt --out BEC_NLS).

Estimation of nucleotide diversity

Genome-wide nucleotide diversity (π) was computed for each breed using VCFtools v0.1.14 [36] in 100-kb windows advanced in 10-kb steps (parameters --vcf BEC_qc.recode.vcf --window-pi 100000 --window-pi-step 10000 --out BEC).
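The sliding-window calculation can be sketched as follows. This is a simplified illustration rather than the VCFtools algorithm: per-site π is taken as the proportion of chromosome pairs that differ at a biallelic site, positions are treated as 0-based, and the function names and input layout are assumptions of the sketch.

```python
def site_pi(alt_count, n_chromosomes):
    """Proportion of chromosome pairs that differ at one biallelic site."""
    ref_count = n_chromosomes - alt_count
    pairs = n_chromosomes * (n_chromosomes - 1) / 2
    return alt_count * ref_count / pairs


def windowed_pi(sites, n_chromosomes, window=100_000, step=10_000):
    """Sliding-window nucleotide diversity on one chromosome.

    sites: list of (position_bp, alt_allele_count) for variable sites.
    Returns (window_start, window_end, pi) tuples, where pi is the summed
    per-site diversity divided by the window length in bp.
    """
    if not sites:
        return []
    last_position = max(pos for pos, _ in sites)
    results = []
    start = 0
    while start <= last_position:
        end = start + window
        total = sum(site_pi(count, n_chromosomes)
                    for pos, count in sites if start <= pos < end)
        results.append((start, end, total / window))
        start += step
    return results


# Tiny example: two chromosomes, two segregating sites, 10-bp windows.
toy_windows = windowed_pi([(5, 1), (15, 1)], n_chromosomes=2, window=10, step=10)
```

With 30 diploid birds per generation, `n_chromosomes` would be 60 for a site with no missing calls.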

Linkage disequilibrium decay

LD was evaluated as the correlation coefficient (r2) between alleles at two separate SNP loci [37]. Within each population, all pairs of autosomal SNPs with MAF > 0.05 and Hardy-Weinberg equilibrium P-value >10E-6 were used to calculate r2 with Haploview [38]. Inter-SNP distances from 0 kb to 500 kb were consolidated into 5 bins.
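For orientation, the sketch below computes r2 for one SNP pair as the squared correlation of genotype dosages. This is a swapped-in approximation, not the Haploview calculation, which works from inferred haplotypes; the function name and the 0/1/2 coding are assumptions made for the example.

```python
def genotype_r2(snp_a, snp_b):
    """Squared Pearson correlation between genotype dosages at two SNPs.

    snp_a, snp_b: equal-length lists of 0/1/2 dosages (None = missing).
    Returns None when either SNP is monomorphic among the shared calls.
    """
    pairs = [(a, b) for a, b in zip(snp_a, snp_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return None
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    if var_a == 0 or var_b == 0:
        return None
    return cov ** 2 / (var_a * var_b)


# Perfectly associated and unassociated toy SNP pairs.
r2_linked = genotype_r2([0, 1, 2, 1], [2, 1, 0, 1])
r2_unlinked = genotype_r2([0, 0, 2, 2], [0, 2, 0, 2])
```

Pairwise values of this kind would then be averaged within inter-SNP distance bins to draw a decay curve.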

Effective population size

Effective population size (Ne) was estimated according to the random mating model of linkage disequilibrium, using default parameters in NEESTIMATOR v2.01 [39]. Ne estimates for each breed were calculated as the average of the estimates for macrochromosomes (gga1-gga5) [40] (Axelsson et al., 2005).

Runs of homozygosity

Runs of homozygosity (ROH) were identified for each of the 9 sub-populations using PLINK v1.90. The ROH program slides a moving window of 1 Mb across the genome to estimate homozygosity. One heterozygous and five missing calls per window were allowed to avoid false negatives caused by occasional genotyping errors or missing genotypes. The minimum length and SNP counts required for each ROH were 100 kb and 50 SNPs, respectively. Additional statistical significance tests were conducted to detect differences in genome-wide homozygosity levels among populations using three measures (NSEG, KB, KBAVG).

Results

Descriptive statistics

To assess genetic diversity, DNA samples from three indigenous chicken breeds (BEC, BYC, and LSC) were subjected to high-throughput DNA sequencing. Two hundred seventy individuals, representing three generations per breed and 30 individuals per generation, yielded 120 Gb of high-quality sequence data. About 99.7% of the reads mapped to the reference genome (Gallus_gallus 4.0) for each individual, providing ~8X average genome coverage. 6,950,965 SNPs were identified in the initial screen. 6,234,592 SNPs were excluded because they deviated from Hardy-Weinberg equilibrium (1,244,248 SNPs), exhibited minor allele frequency ≤ 0.05 (4,959,232 SNPs), or were located on non-autosomal or small chromosomes (31,112 SNPs). 716,373 SNPs met criteria for inclusion in the final data set. The average physical distance between neighboring SNPs was 1.34 kb, ranging from 1.10 kb on GGA6 to 5.15 kb on GGA25 (Additional file 1: Table S1). The distribution of SNPs across all chromosomes is shown in Additional file 2: Figure S1.
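The SNP counts reported above can be checked with a few lines of arithmetic. The dictionary keys are labels chosen here for readability; the numbers are those given in the text.

```python
# SNP counts reported in the descriptive statistics.
initial_snps = 6_950_965
removed = {
    "hwe_deviation": 1_244_248,
    "maf_at_or_below_0.05": 4_959_232,
    "non_autosomal_or_small_chromosome": 31_112,
}

total_removed = sum(removed.values())
retained = initial_snps - total_removed
retained_fraction = retained / initial_snps

print(f"removed {total_removed:,}; retained {retained:,} "
      f"({retained_fraction:.1%} of the initial call set)")
```

The three exclusion categories sum to the stated 6,234,592 removed SNPs, leaving the 716,373 SNPs of the final data set, roughly one tenth of the initial calls.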

Genetic diversity within the BEC, BYC and LSC breeds

All three breeds maintained relatively high genetic diversity in the R: F conservation scheme (see scheme definitions in Methods). LSC exhibited the highest genetic diversity as measured by Ho (0.2348), He (0.2250), AR (1.235), and PN (0.8130) (Table 2). An analysis across three generations within the same breed showed that genetic diversity was highest for LSC15 with Ho (0.2379), He (0.2281) and AR (1.238). The highest proportion of polymorphic markers (PN) was observed in BEC07 (82.41%), while BYC15 exhibited the lowest genetic diversity. As expected, BEC and BYC showed decreasing levels of diversity from the initial generation (BEC07/BYC07) to the current generation (BEC15/BYC15). Conversely, LSC displayed increasing diversity with the implementation of a conservation program (Fig. 1). However, dynamic changes in genetic diversity within breeds were less than 10% throughout the sampled generations (Additional file 3: Figure S2).

Table 2 Genetic diversity measurements for nine sub-populations from three chicken breeds
Fig. 1 Dynamic changes between different generations within breeds. Ho, observed heterozygosity; He, expected heterozygosity; PN, proportion of polymorphic markers; AR, allelic richness; FROH, inbreeding coefficients based on ROH segments; FES, inbreeding coefficient based on pedigree

Estimation of inbreeding coefficients

Estimated inbreeding coefficients varied between breeds and conservation methods. Average FES ranged from 0.0789 in BEC to 0.2010 in BYC. As expected, FES values across generations increased as conservation procedures were maintained. In contrast, average FROH tended to be lower than FES, ranging from 0.0511 in BEC to 0.0745 in BYC, and FROH did not exhibit the steady increase that was observed for FES. Maximum inbreeding was observed in BYC15 (FES = 0.2010 and FROH = 0.0925). Correlation between FES and FROH was strongly positive (r2 = 0.76).

Population structure analysis

Population structures of the three native chicken breeds, comprising 9 conservation sub-populations, were analyzed using PCA, NJ tree, and STRUCTURE. PCA showed that the first two principal components account for 17.77% (PC1) and 15.01% (PC2) of the total variability. Individuals from the 9 sub-populations clearly group into their respective breeds (Fig. 2a). The results of the NJ tree analysis were consistent with those obtained by PCA (Fig. 2b).

Fig. 2 a Population structures of conserved populations revealed by principal component analysis. b Neighbor-joining tree constructed using genetic sharing distances. c Admixture plot for breeds analyzed based on different number of assumed ancestors (K). BEC, Baier Yellow Chicken; BYC, Beijing You Chicken; LSC, Langshan Chicken

Figure 2c shows an admixture plot representing the 270 sampled chickens, generated using a model-based clustering approach. At a low value of K (K = 2), two distinct ancestors are apparent (BYC and LSC). BEC appears to include both LSC (as the majority component) and BYC. At K = 3, individuals cluster strongly into the three corresponding breeds, consistent with the PCA and NJ tree results (as shown in Fig. 2a and b). All generations within each breed show the same pattern. The optimum population structure inferred using the admixture model in STRUCTURE was subdivided into three sub-populations based on both LnP(D) and Evanno’s ∆K method (K = 3; Fig. 3). At K = 4, BEC splits into its two main ancestors. For K = 5 to 8, LSC appears to include two or more distinct ancestors, but at K = 9, it groups again into one common ancestor. Finally, the BYC breed always exhibits homogeneity, except for K = 9.

Fig. 3 L(K) and ΔK values using different values of K, as calculated by STRUCTURE Harvester. a Average likelihood of runs in STRUCTURE L(K) along with number of K clusters. b ΔK, estimator of the optimal number of clusters (K)

Population differentiation analysis

To investigate the extent of population differentiation between different generations within breeds and between breeds, FST values were calculated using the filtered genotype data (Additional file 4: Table S2). The FST values for all pair-wise population comparisons are shown in Fig. 4a. For the entire population, FST values varied from 0.0046 to 0.1530, and FST values between breeds ranged from 0.1127 to 0.1243 (Fig. 4b). FST values are expected to be significantly higher between breeds than between generations within a breed. All FST values between generations within a breed were below 0.05 (from 0.0046 to 0.0423), indicating that no obvious genetic differentiation appeared within any breed (Fig. 4a). However, these FST values increased during conservation. For example, for BEC, the FST value was 0.0046 between BEC07 and BEC10, and increased to 0.0329 between BEC07 and BEC15. Similar trends were observed between BEC07 vs BEC15 (0.0329) and BEC10 vs BEC15 (0.0285).

Fig. 4 a Matrix showing pairwise differentiation estimates (FST) between nine breed sub-populations. b Nucleotide diversity (π) and genetic differentiation (FST) across the three breeds. The value in each circle represents a measure of nucleotide diversity for this breed, and the value on each line indicates genetic differentiation between the two breeds

Linkage disequilibrium decay (LD decay) and effective population size

LD for each population was estimated as the physical genomic distance at which the genotypic association (r2) decays to less than half of its maximum value. Short-range LD was always observed in each of the three different generations within same breed (Fig. 5). As expected, LD values tended to increase as conservation continued. For example, LD values for LSC10, 12 and 15 were 13.19 kb, 16.99 kb, and 20.10 kb, respectively. The BEC pattern was similar: 11.56 kb (BEC07), 13.55 kb (BEC10), and 20.77 kb (BEC15). The highest LD value in BYC occurred in the last generation (BYC15, 24.61 kb), but BYC07 (23.99 kb) exceeded BYC10 (17.95 kb).
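The half-decay criterion described above can be sketched as follows. The binned distances and r2 values in the example are invented, the linear interpolation between bins is an assumption of this sketch, and the function name is hypothetical; the exact procedure used to obtain the reported distances is not specified in the text.

```python
def half_decay_distance(distances_kb, mean_r2):
    """Distance at which mean r2 first falls to half of its maximum value.

    distances_kb: bin distances in increasing order.
    mean_r2: mean r2 per bin, in the same order.
    Uses linear interpolation between the two bins that bracket the
    threshold; returns None if r2 never drops that far.
    """
    threshold = max(mean_r2) / 2
    points = list(zip(distances_kb, mean_r2))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 > threshold >= r1:
            # Interpolate between the bracketing bins.
            return d0 + (r0 - threshold) * (d1 - d0) / (r0 - r1)
    return None


# Invented decay curve: maximum r2 is 0.4, so the threshold is 0.2.
example_distance = half_decay_distance([0, 10, 20, 30], [0.4, 0.3, 0.1, 0.05])
```

A larger half-decay distance indicates that LD extends further along the chromosome, which is the pattern expected when inbreeding or drift increases.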

Fig. 5 Linkage disequilibrium between generations and within each breed as a function of inter-SNP distance. Physical distance is measured in kb

Effective population size (Ne) was estimated for autosomal chromosomes gga1 through gga28 based on linkage disequilibrium (Table 3). Average Ne differed amongst the 9 sub-populations (from 19.33 to 34.85). Within macrochromosomes (gga1–5), BEC07 had the highest estimated Ne (81.52) within a breed, and Ne declined in BEC as conservation continued. In contrast, Ne was lower (55.36) in LSC12 than in LSC10 (70.28) or LSC15 (73.74). Ne estimates for BYC fluctuated, with a high value (75.04) in BYC10 and lower values in BYC07 and BYC15.

Table 3 Effective population sizes (Ne) for three Chinese indigenous chicken populations

Runs of homozygosity (ROH)

Runs of homozygosity (ROH) were identified in the genomes of the 9 sub-populations from all three breeds (Table 4). A genome-wide survey for autozygosity was conducted to identify regions with signatures that reflect ancient or recent inbreeding effects. We estimated FROH, and found that the maximum values occurred in the BYC breed. All three BYC generations exceeded 0.05 (Table 2). BYC15 had the highest level of inbreeding (0.0925), while the BEC and LSC breeds had similar and lower inbreeding levels (~ 0.05).

Table 4 Statistical summary for runs of homozygosity in sub-populations of three chicken breeds

ROH was then assessed to determine whether any populations exhibited evidence of recent inbreeding. All three generations in BYC had higher ROH values, suggesting that recent inbreeding had occurred in this breed (Table 4 and Fig. 6). Consistent with the LD decay analysis, the highest ROH was observed in BYC15 (r2 = 0.1248 at 24.61 kb), followed by BYC07 and BYC10. We speculate that BYC inbreeding might reflect the small effective population size of this breed (Table 3). The ROH values for BEC were highest in BEC15, followed by BEC07 and BEC10, which parallels the trend observed for LD decay and FROH. Similar patterns were observed for LSC. Homozygosity was also measured between the individuals in each sub-population using three measures (NSEG, KB, and KBAVG) (Additional file 5: Table S3). All three measures varied significantly between BYC generations (P<0.01). NSEG and KB also showed significant differences (P<0.0001) between generations in BEC and LSC.

Fig. 6 Homozygosity frequency distribution derived from runs of homozygosity (ROH) for each generation and breed

Discussion

The chicken, one of the first animals to be domesticated, has been subjected to long-term natural selection, artificial selection, and genetic drift for diverse specific traits [41, 42]. A variety of factors accelerated the generation of phenotypic differences and genetic variability [43, 44]. However, this variability has been threatened due to ecosystem damage and commercial breeding. In China, gene pool (live) conservation and protected regions for both ex situ and in situ conservation have been established for the management of poultry genetic resources, as exemplified by the National Chicken Genetics Resources Program (Jiangsu). Because live conservation is typically implemented using small populations, it is necessary to monitor the status of each population to evaluate the effectiveness of the management strategy. In this study, we performed genotyping by sequencing (GBS) to assess the genomic diversity of different generations from three conserved breeds (Baier Yellow Chicken, Beijing You Chicken and Langshan Chicken).

The majority of studies have estimated genetic variability in Chinese indigenous chickens using microsatellites [45, 46] or mtDNA [47, 48], but genome-wide SNPs have seldom been used. We estimated the genetic diversity in three chicken populations using SNP markers. The results showed that all three chicken breeds have maintained rich genetic diversity in terms of heterozygosity (Ho, He), proportion of polymorphic markers (PN), and allelic richness (AR), consistent with previous studies [45, 46, 49,50,51]. In the most recent generations sampled, LSC15 ranked first in genetic diversity, followed by BEC15 and BYC15 (Table 2). A study of the same populations in 2008 indicated that genetic diversity measured using microsatellites was highest in BYC, followed by LSC and BEC [45], suggesting that genetic diversity in BYC has decreased more rapidly than in the other breeds. We also observed this declining trend in the BYC breed. BYC diversity decreased from 2007 to 2015 (Table 2), while genetic variability in the BEC breed fluctuated, and LSC exhibited a slight increase. The BYC breed has been under conservation (~ 39 generations) for a longer period than either LSC or BEC, which have been conserved only since 1998 (~ 17 generations). The long-term practice of conservation in a small population may reduce genetic diversity. Furthermore, all three breeds were subjected to ex situ live conservation in Jiangsu. The BYC breed, which originated in Beijing, might have adapted poorly to the environment, resulting in a loss of genetic diversity. In contrast, the LSC and BEC breeds might have adapted more easily.

The genetic diversity in all breeds changed no more than 10% between generations (Additional file 3: Figure S2). The conservation goal is to maintain 90% of the genetic diversity from the initial population and an inbreeding coefficient less than 0.1 for 100 years [52, 53]. According to our results, the genetic diversity of the three chicken populations meets conservation criteria under the current program (R: F). In particular, inbreeding events have been effectively avoided under the R: F mating system, based on assessments of population structure, genetic differentiation, LD decay, and ROH.

Nevertheless, the decline of genetic diversity should not be ignored (Fig. 1 and Table 2). The significant differences in ROH that we observe between generations in all three breeds also suggest that these populations have not reached the desired level of genetic stability during conservation. Both the decline in genetic diversity and the high heterozygosity across generations are indicative of genetic drift, which can be reduced by enlarging the population size. In our study, the estimated effective population sizes (Ne), based on whole-genome SNPs for the conserved populations, were far below the required threshold of 50 individuals [1, 3]. We also evaluated Ne according to chromosome size, using the classification proposed by the International Chicken Genome Sequencing Consortium [54]: large macrochromosomes (gga1–5), intermediate chromosomes (gga6–10) and micro-chromosomes (gga11–28). Because micro-chromosomes have high rates of recombination, we estimated Ne based on the macrochromosome class (gga1–5). The maximum Ne was 81.52 in BEC07 and the minimum was 41.56 in BEC15 (Table 3), suggesting these conserved populations are relatively stable but also at risk. We therefore recommend that ex situ live and in situ live conservation efforts be combined to help maintain high levels of genetic diversity in the long term.

Conclusions

In summary, we collected 270 samples from three successive generations of three conserved chicken breeds. We estimated dynamic changes in genetic diversity using genome-wide SNPs, making it possible to comprehensively evaluate the current conservation scheme (R: F). The results demonstrated that the conserved Chinese chicken populations have sustained high levels of genetic variability under current conservation practices. We also compared successive generations within each breed to characterize trends in genetic diversity, allowing us to assess the effects of conservation over time. Overall, this study demonstrates an efficient strategy for assessing the success of a conservation program and for improving conservation and management practices.