Introduction

Prunus sibirica L. (Siberian apricot) is a member of the Rosaceae family and is the dominant species on mountain dunes and dry steppes (Yu 1979). P. sibirica is an ecologically and economically important tree species in China, distributed mainly in Inner Mongolia, Heilongjiang, Jilin, Liaoning, Gansu, Hebei, and Shanxi, and the P. sibirica forest in Inner Mongolia covers 47.44 hectares.

P. sibirica shows strong cold and drought resistance and has high nutritional and medicinal value (Dong et al. 1991). However, yields of P. sibirica in China vary considerably and are very unstable because of long-term propagation by seed, delayed breeding, and the influence of climatic factors. In addition, frost can severely reduce the yield of P. sibirica, greatly restricting its industrialization (Ma et al. 2007; Yao et al. 2007; Jin et al. 2018). P. sibirica is self-incompatible, which results in large genetic differences among individual genetic resources (Li et al. 2011a, b).

Biodiversity refers to all of the variation among living things on earth and is the basis of survival and biological development (Rao et al. 2002). Genetic diversity is an important component of biological diversity and refers to the genetic differentiation of all living organisms (Lu 2018). Dong et al. (2018) studied the diversity of 19 quantitative traits of P. sibirica from different populations and found that the coefficients of variation of the economic traits of fruit and kernel were large and that the coefficient of variation of fruit yield was the largest, indicating that P. sibirica has rich germplasm resources with great potential for breeding good varieties. Wang et al. (2019) studied the seed traits of P. sibirica from 19 populations and found rich genetic variation in seed traits among the populations. Li (2014) studied the genetic diversity of P. sibirica and Prunus armeniaca in North China using inter-simple sequence repeat (ISSR), sequence-related amplified polymorphism (SRAP), and simple sequence repeat (SSR) analyses. The results showed that the natural population of P. sibirica in China exhibits a high level of genetic diversity, with high genetic differentiation within populations but low genetic differentiation and moderate gene flow among populations.

The start codon targeted (SCoT) marker is a molecular marker based on a single-primer amplification reaction and was first proposed by Collard and Mackill (2009) in rice. It is a gene-targeted molecular marker. The strategy is to amplify the genome with a single primer designed from the conserved sequences flanking the ATG translation start codon of plant genes. The goal is to reveal dominant polymorphic markers in candidate functional gene regions with a procedure that is easy to operate and detects rich polymorphism. Compared with random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), SSR, and ISSR markers, SCoT markers are produced efficiently and can be linked to traits, making them convenient for use in molecular marker-assisted breeding. SCoT markers have been used in various plant species, such as Phoenix dactylifera (Somayeh Saboori et al. 2020), Mangifera indica (Li Zhou et al. 2020) and Diospyros kaki (Changfei Guana et al. 2020).

Genetic resources for P. sibirica are extremely abundant and have great developmental potential, and they can provide an important genetic basis for the improvement of Chinese P. sibirica and the breeding of new varieties. In recent years, research on the genetic diversity of P. sibirica has mostly been conducted in northeastern and northern China. Although P. sibirica is mainly distributed in Inner Mongolia, few studies (Bao et al. 2016) have been conducted on the genetic diversity of P. sibirica in this region. Therefore, we used SCoT molecular markers to analyze the genetic diversity of 278 individuals from 10 populations of P. sibirica in Inner Mongolia and to reveal the genetic diversity and genetic structure among populations and individuals. This study provides a scientific basis for the cultivation and exploitation of the abundant genetic resources of P. sibirica in Inner Mongolia.

Materials and methods

Plant materials

The materials were obtained from the P. sibirica Germplasm Resource Garden of the Inner Mongolia Fine Variety Breeding Center (Fig. 1). A total of 278 individuals were collected from 10 populations (Table 1). Fresh young leaves were placed in a bucket containing liquid nitrogen, brought back to the laboratory, and stored in a -80 °C freezer for further experiments.

Fig. 1 Geographic location map

Table 1 Site conditions and sampling information for P. sibirica populations

DNA extraction

Genomic DNA was extracted from fresh leaves using a Plant DNA Extraction Kit (TIANGEN, China). The quality and purity of the extracted DNA were tested by spectrophotometry (A260/A280 absorbance ratio) using a BioPhotometer (Thermo Fisher Scientific, USA) and by 1% agarose gel electrophoresis (with a nucleic acid stain). The DNA was stored in a -20 °C freezer for further experiments.

SCoT-PCR

The 80 primer sequences were selected by referring to Collard and Mackill (2009) and were synthesized by Shanghai Sangon Biological Engineering Technology and Services. A total of 23 highly polymorphic and repeatable primers were selected from 80 SCoT primers to evaluate the selected P. sibirica accessions (Table 2).

Table 2 Screened SCoT primer sequence information

A 96-well gradient PCR instrument was used for the amplification reactions. PCR was performed in a 20 μL volume containing 10 μL of 2 × Taq Mix, 1.6 μL (30 ng/μL) of genomic DNA, 0.8 μL (10 μmol·L−1) of primer, and 7.6 μL of ddH2O. SCoT-PCR amplification was conducted with initial denaturation for 5 min at 95 °C, followed by 40 cycles of 45 s at 94 °C, 45 s at 50 to 54 °C (depending on the primer) and 2 min at 72 °C, with a final extension of 7 min at 72 °C and holding at 4 °C. The PCR products were separated in a 1.5% agarose gel using 1 × TBE running buffer, stained, and photographed, and the records were preserved.
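
The reaction composition above can be checked and scaled with a few lines of Python. This is only an illustrative sketch: the per-reaction volumes are taken from the text, while the function name `master_mix` and the 10% pipetting overage are assumptions introduced here and are not part of the original protocol.

```python
# Per-reaction volumes in microliters, as stated in the text.
REACTION_UL = {
    "2x Taq Mix": 10.0,
    "genomic DNA (30 ng/uL)": 1.6,
    "SCoT primer": 0.8,
    "ddH2O": 7.6,
}


def master_mix(n_reactions, overage=0.10):
    """Scale the shared components for n reactions.

    The template DNA is left out because it differs per sample.
    The 10% overage is an assumed allowance for pipetting loss.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in REACTION_UL.items()
        if not name.startswith("genomic DNA")
    }


total_ul = sum(REACTION_UL.values())   # should equal the 20 uL reaction volume
dna_ng = 1.6 * 30                      # template DNA per reaction, in ng
plate = master_mix(96)                 # shared mix for one 96-well plate

print(f"total volume per reaction: {total_ul:.1f} uL")
print(f"template DNA per reaction: {dna_ng:.0f} ng")
for name, volume in plate.items():
    print(f"{name}: {volume} uL")
```

The component volumes sum to the stated 20 μL, and each reaction receives about 48 ng of template DNA.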

Data analysis

SCoT amplicons were scored in a binary matrix as present (1) or absent (0) for each sample based on the corresponding standard size. Vague bands that could not be reliably detected were not scored. POPGEN32 software was used to calculate the number of polymorphic bands (NPB) and the percentage of polymorphic bands (PPB) obtained with the SCoT primers and to calculate the genetic diversity indices of P. sibirica: Nei's gene diversity index (H), Shannon's information index (I), the observed number of alleles (Na), the effective number of alleles (Ne), total genetic diversity (Ht), within-population genetic diversity (Hs), the coefficient of genetic differentiation (Gst), and gene flow (Nm). Ntsys-2.0 software was used to calculate Nei's genetic distances from the binary (0, 1) matrices, and the unweighted pair-group method with arithmetic means (UPGMA) in the SHAN program was used for the cluster analysis of the 10 populations and to construct the dendrogram. MEGA7 software was used for the UPGMA cluster analysis of the 278 individuals based on Nei's genetic similarity matrix. An analysis of molecular variance (AMOVA) was performed with GenAlEx software to assess genetic differentiation among and within populations. A principal coordinate analysis (PCoA) was performed on the binary matrix using GenAlEx software. STRUCTURE software was used to determine the genetic structure of the studied populations, with K tested from 1 to 10 and ten replicates per K. IBD software was used to perform Mantel tests of the correlations between genetic distance and geographic distance and between genetic distance and altitude.
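
To make the link between the binary band matrix and the diversity indices concrete, the following Python sketch computes NPB, PPB, Na, Ne, H, and I for a small made-up matrix. It is not the POPGEN32 implementation. It assumes the usual treatment of dominant markers, in which the null-allele frequency is estimated as the square root of the band-absence frequency under Hardy-Weinberg equilibrium; the function names and the toy data are hypothetical.

```python
import math


def locus_stats(scores):
    """Return (polymorphic, Na, Ne, H, I) for one band position.

    scores is a list of 0/1 band scores, one per individual.
    Assumption: two alleles per locus and Hardy-Weinberg equilibrium,
    so the null-allele frequency q is sqrt(frequency of band absence).
    """
    n = len(scores)
    absent = scores.count(0) / n
    q = math.sqrt(absent)              # null (band-absent) allele frequency
    p = 1.0 - q                        # band-present allele frequency
    polymorphic = 0 < sum(scores) < n  # band seen in some but not all samples
    na = 2 if polymorphic else 1       # observed number of alleles
    homozygosity = p * p + q * q
    ne = 1.0 / homozygosity            # effective number of alleles
    h = 1.0 - homozygosity             # Nei's gene diversity
    i = -sum(f * math.log(f) for f in (p, q) if f > 0)  # Shannon's index
    return polymorphic, na, ne, h, i


def summarize(matrix):
    """matrix: rows are individuals, columns are band positions."""
    loci = [locus_stats(list(column)) for column in zip(*matrix)]
    k = len(loci)
    npb = sum(1 for s in loci if s[0])
    return {
        "NPB": npb,
        "PPB": 100.0 * npb / k,
        "Na": sum(s[1] for s in loci) / k,
        "Ne": sum(s[2] for s in loci) / k,
        "H": sum(s[3] for s in loci) / k,
        "I": sum(s[4] for s in loci) / k,
    }


# Hypothetical scores: 4 individuals x 3 band positions.
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]
result = summarize(toy)
print(result)
```

In this toy matrix the first band is present in every individual and is therefore monomorphic, so two of the three bands are polymorphic.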

Results

SCoT polymorphism analysis

In the SCoT analysis, 23 SCoT primers were used for marker amplification in 278 P. sibirica individuals. A total of 292 scorable bands were produced, of which 289 were polymorphic, with a mean of 12.6 polymorphic bands per primer. The polymorphism percentage was 98.87%. Among the 23 SCoT primers, SCoT21, SCoT32, and SCoT53 amplified the most bands (17 each), with a polymorphism percentage of 100%. The SCoT25 primer amplified the fewest bands (9), with a polymorphism percentage of 90%. The 23 SCoT primers produced Na values ranging from 1.90 to 2.00, with a mean of 1.99; Ne values ranging from 1.28 to 1.60, with a mean of 1.44; H values ranging from 0.18 to 0.34, with a mean of 0.26; and I values ranging from 0.30 to 0.50, with a mean of 0.41 (Table 3).

Table 3 Polymorphism information obtained from SCoT primer amplification in P. sibirica

Population genetic diversity and genetic differentiation

The genetic diversity and genetic differentiation analyses of the 10 populations of P. sibirica in Inner Mongolia showed that NPB ranged from 110 to 180, with a mean of 156.40 bands; PPB ranged from 37.67% to 61.64%, with a mean of 53.56%; Na ranged from 1.40 to 1.62, with a mean of 1.54; Ne ranged from 1.20 to 1.31, with a mean of 1.28; H ranged from 0.12 to 0.19, with a mean of 0.17; and I ranged from 0.18 to 0.29, with a mean of 0.26. Across the 278 individuals, the overall PPB, Na, Ne, H, and I values were 97.94%, 1.98, 1.43, 0.26, and 0.40, respectively (Table 4), implying considerable genetic diversity among the P. sibirica accessions. Genetic diversity was highest in the KYZ population, with an H value of 0.19 and a PPB of 61.64%. The second-highest H value, 0.18, was observed for KZH, for which PPB was 60.62%. In contrast, genetic diversity was lowest in WJG, with an H value of 0.12 and a PPB of 37.67%.

Table 4 Genetic diversity indices of P. sibirica populations

Across the 278 individuals, Ht was 0.26 (variance 0.03), Hs was 0.17 (variance 0.01), Gst was 0.36, indicating high genetic differentiation among populations (Gst > 0.25), and Nm was 0.89, indicating low gene flow (Nm < 1) (Table 5). The AMOVA showed 9 degrees of freedom among the populations and 267 degrees of freedom within the populations. The sum of squares was 4,622.89 among the populations and 7,385.47 within the populations. The estimated variance component was 17.61 among populations and 27.56 within populations. The AMOVA thus indicated that 39% of the total genetic variance occurred among the populations, while 61% occurred within the populations (Table 6). This result indicates that most of the genetic variation resides within populations, although the differentiation among populations is substantial.

Table 5 Genetic differentiation within P. sibirica populations
Table 6 Analysis of molecular variance (AMOVA) of P. sibirica populations
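
The differentiation statistics reported above are linked by simple formulas, which the short Python sketch below reproduces from the published values. It assumes the standard definitions Gst = (Ht − Hs)/Ht and Nm = 0.5(1 − Gst)/Gst; the function names are hypothetical, and the small difference between the recomputed and reported Gst presumably reflects rounding of Ht and Hs to two decimals.

```python
def gst(ht, hs):
    """Coefficient of genetic differentiation from total (Ht) and
    within-population (Hs) gene diversity."""
    return (ht - hs) / ht


def nm_from_gst(g):
    """Gene flow estimate, assuming Nm = 0.5 * (1 - Gst) / Gst."""
    return 0.5 * (1.0 - g) / g


def amova_percent(var_among, var_within):
    """Share of total variance (in %) among and within populations."""
    total = var_among + var_within
    return 100.0 * var_among / total, 100.0 * var_within / total


g_rounded = gst(0.26, 0.17)                      # from the rounded Ht and Hs
nm = nm_from_gst(0.36)                           # from the reported Gst
among, within = amova_percent(17.61, 27.56)      # reported variance components

print(f"Gst from rounded Ht and Hs: {g_rounded:.3f}")
print(f"Nm from Gst = 0.36: {nm:.2f}")
print(f"AMOVA: {among:.1f}% among, {within:.1f}% within populations")
```

With the reported Gst of 0.36, the formula returns Nm ≈ 0.89, and the two variance components give approximately 39% and 61%, in agreement with Tables 5 and 6 as summarized in the text.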

Analysis of the among-population genetic structure

POPGEN32 was used to analyze the genetic distance and genetic similarity of the 10 P. sibirica populations in Inner Mongolia. Nei's genetic distance among populations ranged from 0.03 to 0.25, while Nei's genetic similarity ranged from 0.77 to 0.97. The genetic similarity between WJG and KSK was the highest, at 0.97, and their genetic distance was the lowest, at 0.03. The genetic similarity between ZLT and WJG was the lowest, at 0.77, and their genetic distance was the greatest, at 0.25 (Table 7). Based on Nei's genetic distance, the UPGMA method was used for cluster analysis. The results showed that the 10 populations were divided into two major groups at a genetic distance threshold of 0.22 (Fig. 2). Group I included KSK, WJG, LC, KYZ, KZH, and ZLA. Group II included BLY, AL, WLS, and ZLT. At a genetic distance of less than 0.17, group I was divided into subgroup I and subgroup II. Subgroup I included KSK, while subgroup II included WJG, LC, KYZ, KZH, and ZLA. Group II was divided into subgroup III and subgroup IV. Subgroup III included BLY, AL, and WLS, while subgroup IV included ZLT.

Table 7 Genetic similarity (above diagonal) and genetic distance (below diagonal) among populations
Fig. 2 UPGMA dendrogram of P. sibirica populations based on Nei's genetic distance
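
The UPGMA procedure behind such a dendrogram can be sketched in plain Python. This is a minimal illustration of average-linkage clustering on a distance matrix, not the NTSYS or MEGA7 implementation. The population labels P1 to P4 and their distances are hypothetical and are not taken from Table 7.

```python
def upgma(names, dist):
    """Average-linkage (UPGMA) clustering of a symmetric distance matrix.

    Returns the merges in order as (members_a, members_b, distance),
    where distance is the average distance at which the clusters join.
    """
    n = len(names)
    clusters = {i: (names[i],) for i in range(n)}          # id -> member labels
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        size_a, size_b = len(clusters[a]), len(clusters[b])
        # Distance from the new cluster to every other cluster is the
        # size-weighted mean of the distances to its two parts.
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop all distances that involve the two merged clusters.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        merges.append((clusters[a], clusters[b], d_ab))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Hypothetical genetic distances among four populations.
names = ["P1", "P2", "P3", "P4"]
dist = [
    [0.00, 0.03, 0.20, 0.25],
    [0.03, 0.00, 0.22, 0.23],
    [0.20, 0.22, 0.00, 0.10],
    [0.25, 0.23, 0.10, 0.00],
]
merges = upgma(names, dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Cutting the resulting tree at a chosen distance threshold, as done in the text at 0.22 and 0.17, corresponds to keeping only the merges that occur below that threshold.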

Analysis of genetic structure within populations

Based on Nei's genetic similarity, MEGA7 software was used to perform the clustering analysis of the 278 individuals (Fig. 3), and the PCoA was performed using GenAlEx software (Fig. 4). Both analyses divided the 278 individuals into two groups. The UPGMA clustering analysis divided the 278 individuals into group I and group II when the genetic similarity was 0.17. Group I included 144 individuals, while group II included 134 individuals. Group I included 23 WJG individuals, 46 KSK individuals, 26 KYZ individuals, 38 KZH individuals, and 11 ZLA individuals. Group II included 27 AL individuals, 14 WLS individuals, 26 ZLT individuals, 25 BLY individuals, 31 LC individuals, and 11 ZLA individuals. When the genetic similarity was ≤ 0.14, group I and group II were divided into four subgroups. A total of 23 WJG individuals and 46 KSK individuals were clustered into subgroup I; 26 KYZ individuals, 38 KZH individuals and 11 ZLA individuals were clustered into subgroup II; 27 AL individuals, 12 ZLT individuals, and 9 WLS individuals were clustered into subgroup III; and 14 ZLT individuals, 5 WLS individuals, 11 ZLA individuals, 26 BLY individuals, and 30 LC individuals were clustered into subgroup IV. The PCoA likewise divided the 278 P. sibirica individuals into groups I and II, which were further divided into four subgroups, consistent with the UPGMA clustering analysis. Group I included all individuals in the KSK, WJG, KZH, and KYZ populations and some individuals from ZLA. Group II included all individuals from AL, WLS, ZLT, BLY, and LC and some individuals from ZLA.

Fig. 3 UPGMA dendrogram of P. sibirica populations based on the SCoT markers

Fig. 4 Principal coordinate analysis (PCoA) of P. sibirica populations based on SCoT markers

To further explore the genetic relationships among individuals of P. sibirica, the population structure of the 278 P. sibirica individuals was evaluated using STRUCTURE version 2.3.4 software (Beaumont et al. 2001). The ∆K values computed for the tested K values showed a strong signal at K = 2 (Fig. 5), and K = 2 provided the most reasonable grouping of P. sibirica from the different regions. The STRUCTURE analysis therefore divided the 278 P. sibirica individuals into 2 groups (Fig. 6). Group I included all individuals in the KSK, WJG, KZH, and KYZ populations and some individuals from ZLA. Group II included all individuals from AL, WLS, ZLT, BLY, and LC and some individuals from ZLA. This result is similar to the clusters identified using UPGMA and PCoA.

Fig. 5 Evanno's test based on ΔK for the P. sibirica individuals studied

Fig. 6 Population structure of the 278 individuals at K = 2. The same color indicates the same group
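
The ΔK criterion used to select K can be illustrated with a short Python sketch. It assumes the common form ΔK = |L(K+1) − 2L(K) + L(K−1)| / s[L(K)], where L(K) is the mean log-likelihood over replicate runs and s is its standard deviation across runs. The log-likelihood values below are invented for illustration and are not the STRUCTURE output of this study.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Compute Evanno's delta-K for every K that has both neighbours.

    lnp maps K to the list of log-likelihood values, ln P(D|K),
    from the replicate runs at that K.
    """
    result = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue                       # delta-K is undefined at the ends
        spread = stdev(lnp[k])
        if spread == 0:
            continue                       # avoid division by zero
        second_diff = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
        result[k] = second_diff / spread
    return result


# Hypothetical replicate log-likelihoods for K = 1..4.
runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4500.0, -4501.0, -4499.0],
    3: [-4450.0, -4460.0, -4440.0],
    4: [-4430.0, -4445.0, -4415.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, "best K:", best_k)
```

Because ΔK needs both neighbouring K values, it cannot be evaluated at the smallest or largest K tested, so a run over K = 1 to 10 yields ΔK values for K = 2 to 9.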

Discussion

In this study, a genetic diversity analysis was performed using SCoT markers to assess the genetic relationships among 278 P. sibirica individuals from 10 different populations in Inner Mongolia. Twenty-three SCoT primers were used for the amplification of genomic DNA from P. sibirica. A total of 292 clear bands were obtained, 289 of which were polymorphic. The average number of polymorphic bands per primer was 12.6, with a polymorphism percentage of 98.87%. In previous studies, researchers used SRAP (Ai et al. 2011), AFLP (Wang 2008), RAPD (Lu 2008), and ISSR (Liu et al. 2011, 2007; Duan et al. 2010; Li et al. 2009) molecular markers to analyze the genetic diversity of P. sibirica in different regions, and the polymorphism ranged from 58 to 90%. In comparison, the SCoT markers were more polymorphic than these other molecular markers. Therefore, SCoT markers may be suitable for studying genetic diversity in P. sibirica. A previous study of the genetic diversity of Prunus armeniaca (Li 2014) reported H = 0.17, I = 0.26 and PPB = 50.07%, and a study of P. sibirica (Ma 2014) reported H = 0.23, I = 0.35 and PPB = 76.74%. In the present study, the corresponding values were H = 0.26, I = 0.40 and PPB = 97.94%. Thus, the genetic diversity of the P. sibirica populations in Inner Mongolia is higher than that previously reported for Prunus armeniaca and P. sibirica. This result is consistent with the conclusion of other scholars that the genetic diversity of outbred species is higher (Hamrick 1989; Zheng et al. 2008).

As a perennial wild resource, P. sibirica is widely distributed in Inner Mongolia and exhibits a large distribution area, long-term evolution and considerable diffusion as well as ecological diversity, leading to rich genetic diversity. According to Wright (1972), when the FST (GST, genetic differentiation coefficient) value is 0 to 0.05, the genetic differentiation among populations is low; when the FST value is 0.05 to 0.15, the genetic differentiation is moderate; when the FST value is 0.15 to 0.25, the genetic differentiation is high; and when the FST value is greater than 0.25, the genetic differentiation is great. In our research, the GST was 0.36. This indicates great genetic differentiation among the P. sibirica populations in Inner Mongolia, with 39% of the genetic variation occurring among populations and 61% occurring within populations. Wang (2019), Liu et al. (2012) and Ma (2013), who used SSR and ISSR markers for genetic diversity analyses of P. sibirica and wild apricot, showed that within-population variation dominated and that the among-population variation was much lower than the within-population variation, which is consistent with the results of the present study.

The high genetic differentiation of P. sibirica in Inner Mongolia may have the following causes. (1) Gene flow is an important factor affecting genetic differentiation among populations: when Nm > 1, gene flow can effectively prevent the differentiation caused by genetic drift, whereas when Nm < 1 it is too weak to do so. The Nm value for P. sibirica in Inner Mongolia was 0.89, indicating that some gene flow exists among the P. sibirica populations in Inner Mongolia but that its intensity is relatively low. (2) The main breeding mode of P. sibirica is outcrossing. Self-incompatibility prevents self-pollination and inbreeding, and reproduction relies mainly on natural pollination by visiting insects such as bees, flies and a few butterflies (Liu et al. 2010; Liu 2010). Although pollen can be spread by insects and by wind over long distances, pollen dispersal is restricted by the large distribution area and by the discontinuous population distribution that results from habitat fragmentation (Wu et al. 2015), thus weakening the gene flow among populations of P. sibirica. (3) Some populations have been destroyed or their habitats have been degraded, resulting in a gradual fragmentation of the population distribution and further limiting gene flow among the populations. Therefore, P. sibirica in Inner Mongolia shows high genetic differentiation, and topography, landforms, and human activities could be potential factors influencing the genetic differences.

In the present study, the genetic structure of P. sibirica populations in Inner Mongolia was analyzed. The clustering diagram, principal coordinate analysis, and STRUCTURE analysis showed that most populations with a similar geographic distribution clustered together. However, the Mantel test showed no significant correlation between genetic distance and geographic distance (r = 0.04, P ≤ 0.48) or between genetic distance and altitude (r = 0.03, P ≤ 0.43). Therefore, the genetic differences among the P. sibirica populations are not explained by their geographic distance or altitude. The clustering diagram, principal coordinate analysis and STRUCTURE analysis of the 278 individuals divided the populations into two groups (I and II). KSK, WJG, KYZ, and KZH were clustered in group I; LC, BLY, WLS, AL, and ZLT were clustered in group II; and only the 22 individuals of the ZLA population were divided between the two groups. A certain degree of gene flow exists among the subgroups, and gene flow also affects the population structure of P. sibirica (Liu et al. 2012). The observed structure may have two causes. First, Inner Mongolia contains large areas of sandy land, and its topography and landforms are very complex, which reduces the opportunities for gene flow among populations. Second, habitat fragmentation is also an important factor that affects the composition and structure of the ecosystem and the genetic structure of a species.
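
The logic of the Mantel test mentioned above can be sketched as a simple permutation procedure in Python. This is a generic illustration, not the algorithm of the IBD software: it assumes a one-tailed test on the Pearson correlation between the upper triangles of two distance matrices, with the rows and columns of one matrix permuted jointly. The matrices and function names are hypothetical.

```python
import math
import random


def upper_triangle(m):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p) for a one-tailed permutation Mantel test."""
    n = len(dist_a)
    y = upper_triangle(dist_b)
    r_obs = pearson(upper_triangle(dist_a), y)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of the first matrix together.
        permuted = [[dist_a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical data for five populations.
positions = [0, 1, 3, 6, 10]                      # made-up locations on a line
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [
    [0.00, 0.05, 0.20, 0.12, 0.18],
    [0.05, 0.00, 0.15, 0.22, 0.09],
    [0.20, 0.15, 0.00, 0.07, 0.25],
    [0.12, 0.22, 0.07, 0.00, 0.11],
    [0.18, 0.09, 0.25, 0.11, 0.00],
]
r, p = mantel(gen, geo)
print(f"r = {r:.2f}, p = {p:.3f}")
```

A small r with a large p, as reported in the text, means that randomly relabeled populations produce a correlation at least as strong as the observed one in a large share of permutations.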

Conclusion

This study revealed that P. sibirica shows high genetic diversity in Inner Mongolia. Therefore, improving the preservation and utilization of the specific germplasm of different populations is necessary to avoid the loss of a large number of valuable genetic resources. Moreover, expanding the collection of resources for ex situ conservation will benefit scientific research and resource protection in P. sibirica, increase the gene flow among populations, and improve the genetic diversity of P. sibirica. Based on resource collection, genetic diversity analyses, and genetic structure analyses, the breeding of P. sibirica varieties with high yields and frost resistance is being actively pursued to meet the increasing demand for the industrial production of P. sibirica resources.