Transcriptome-derived SSR markers for DNA fingerprinting and inter-populations genetic diversity assessment of Atractylodes chinensis

Atractylodes chinensis (fam. Asteraceae) is an important medicinal plant owing to its pharmacological activity. The species is widely distributed across northern China, and its populations are difficult to tell apart because they are similar in characteristics. This study is the first to assess the genetic diversity of A. chinensis from different counties of northern China using simple sequence repeat (SSR) markers. Of the 106 SSR primers in clusters assigned to the sesquiterpenoid biosynthesis pathway in the transcriptomic database of A. chinensis, ten with high polymorphism were used to analyze interpopulation genetic diversity and to construct DNA fingerprints of 19 A. chinensis populations. A total of 78 alleles were detected, with an average of 6.5 alleles per primer. The PIC value ranged from 0.4748 to 0.8918 with a mean of 0.6265. A neighbor-joining tree grouped the 19 populations into three clusters. DNA fingerprints were constructed from these ten SSR markers. The results revealed that geographic origin does not correspond exactly to genetic grouping, as populations from different provinces fell into the same cluster. The results confirm that SSR markers are effective for genetic diversity analysis. The interpopulation genetic diversity and fingerprints of A. chinensis reported here could provide a scientific basis for species identification and selective breeding.


Introduction
Atractylodes chinensis (DC.) Koidz (typically referred to as "Bei Cang Zhu" in Chinese) is a major medicinal plant whose rhizome, known as rhizome atractylodes, is used to treat digestive disorders, rheumatic diseases and night blindness [5]. Modern pharmacological studies have reported that rhizome atractylodes also has anti-inflammatory, antibacterial [10,17] and anti-tumor properties [11]. A. chinensis is widely distributed throughout most areas of northern China and is mainly produced in Hebei, Inner Mongolia, Liaoning and other provinces of China [35]. The content of atractylodin in rhizome atractylodes, an important quality standard in the Chinese Pharmacopoeia, varies among provinces and even counties [13], although the plants are similar in characteristics. The utilization of and research on A. chinensis have received little attention worldwide. Additionally, A. chinensis faces an unprecedented threat, even of extinction, because of the sharp reduction in wild resources and increasing medicinal demand. Although cultivation has relieved some of this pressure over the past ten years, varieties of stable and consistent quality have not yet been cultivated because the genetic basis remains unclear. China is very rich in genetic variability of A. chinensis. Therefore, it is critical to adopt an effective methodology to assess the interpopulation genetic diversity of wild A. chinensis populations.
High-performance liquid chromatography (HPLC) fingerprinting [13], ITS [9,12] and trnL-F [8,22] sequences and chloroplast genome variation [30,33] have been used to analyze interspecific phylogenetic relationships of Atractylodes species. However, these methods are not effective for intraspecific diversity analysis [34]. Simple sequence repeats (SSRs) are ideal markers for this purpose because of their high polymorphism, codominance and low cost. SSR markers have been widely used in variety identification, fingerprint construction and intraspecific genetic diversity analysis [14,25,34]. Sets of core SSR primers for germplasm identification and genetic diversity analysis have been selected for many medicinal plants, such as Glehnia littoralis [27], Glycyrrhiza [16], and Euryale ferox [15]. However, no such marker toolkit is presently available for A. chinensis.
In this paper, we screened SSR loci in clusters assigned to the sesquiterpenoid biosynthesis pathway based on the transcriptomic database of A. chinensis. Ten SSRs with high polymorphism were used to analyze the interpopulation genetic diversity and to construct fingerprints of 19 A. chinensis populations. The interpopulation genetic diversity and fingerprints will provide a scientific basis for species identification and selective breeding in A. chinensis.

DNA extraction and PCR amplification
A. chinensis rhizomes were collected from different counties of northern China (Table 1), including Hebei, Shandong, Inner Mongolia and Jilin Provinces. No permission was required to collect wild resources of A. chinensis. All of the samples used in this study were identified as A. chinensis by Professor Qiaosheng Guo of Nanjing Agricultural University (Nanjing, Jiangsu Province, China). Professor Guo identified the experimental species by comparison with specimens from the Institute of Botany, Jiangsu Province and Chinese Academy of Sciences. All the samples were planted in the experimental farm of Hebei Normal University of Science & Technology (Qinhuangdao, Hebei, China). In the Chinese herbal medicine market, the quality and price of rhizome atractylodes are established according to the county of origin. Young leaves of ten randomly selected plants from each population were pooled as one sample, immediately frozen in liquid nitrogen and stored at −80 °C prior to DNA extraction.
The total DNA of A. chinensis was extracted by the improved CTAB method using plant genomic extraction kits (Cat. No. 0419-50 CB, Huayueyang, Beijing, China, http://www.huayueyang.com.cn/product/276782043). The purity of the extracted DNA samples was checked by electrophoresis on a 2.0% agarose gel using a horizontal electrophoresis apparatus (DYCP-31DN, Liuyi, Beijing, China) and a gel-imaging system (GBOX-HR, Syngene, UK). The OD260/280 ratios of the DNA were measured with a spectrophotometer (Synergy HT, Gene Company Limited, Hong Kong, China).
For SSR amplification, a 10 μL reaction mixture included 50 ng/μL DNA, 2.5 mM dNTPs, 10× buffer (Mg2+ included), 5.0 U/μL Taq enzyme, 10 μM of each primer, and ddH2O. SSR amplification was carried out in a thermal cycler (BIO-RAD S1000 PCR, California, USA) with the following program: an initial 4 min pre-denaturation at 94 °C, followed by 35 cycles of 30 s denaturation at 94 °C, 30 s annealing at 55 °C and 1 min extension at 72 °C, and a final extension at 72 °C for 10 min. The PCR products were kept at 4 °C and separated by polyacrylamide gel electrophoresis (6%) at a constant voltage (130 V) for 3 h. A 1000 bp DNA marker (TaKaRa, Japan) was used to determine allele size.

RNA sequencing and core SSR marker screening
RNA extraction and sequencing were performed as described by Zhao et al. (2021) [36]. RNA of A. chinensis was extracted using TRIzol Reagent (Invitrogen). Transcriptome data of A. chinensis were acquired on the Illumina HiSeq X Ten PE150 platform by Novogene Co. (Beijing, China). All SSR primers used in this study were designed from the transcriptomic database reported by Zhao et al. (2021) [36], which is available in the SRA (BioProject ID PRJNA698794, https://www.ncbi.nlm.nih.gov/sra/PRJNA698794). SSR marker detection, identification and primer design were performed as described previously [31].
This study analyzed the interpopulation genetic diversity using markers in clusters assigned to the sesquiterpenoid biosynthesis pathway. Twenty-five SSR primers in clusters annotated as terpene skeleton biosynthesis and eighty-one primers in clusters annotated as the sesquiterpenoid biosynthesis pathway were screened for polymorphism (Supplementary Table S1). SSR primers were synthesized by Shanghai Invitrogen Biotechnology Company (Shanghai, China). The core primers, with high allelic frequencies (> 2), were screened by amplification with DNA extracted from 8 A. chinensis populations from different counties. Only ten SSR primers with distinct bands and high polymorphism were used to analyze interpopulation genetic diversity in this study (Table 3).

Data analysis
The amplified bands with good resolution from the 10 SSR primers were counted and scored as 1 (presence) or 0 (absence). Several genetic diversity parameters, namely the observed number of alleles, the effective number of alleles, Nei's (1973) gene diversity (h) and Shannon's information index (I), were determined using POPGENE version 1.32 [19]. The polymorphism information content (PIC) was calculated as described by Botstein et al. (1980) [3]. Similarity coefficients were calculated using the similarity program in POPGENE version 1.32.
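As an illustration of the parameters named above, the following Python sketch computes the effective number of alleles, Nei's gene diversity (h), Shannon's information index (I) and PIC for a single locus from its allele frequencies, using the standard textbook definitions. It is not the POPGENE implementation: the function name `diversity_stats` is hypothetical, and the sketch assumes allele frequencies that sum to 1, whereas how frequencies were derived from the 1/0 band scores is not stated in the text.

```python
import math
from itertools import combinations


def diversity_stats(freqs):
    """Return (Ne, h, I, PIC) for one locus.

    freqs: allele frequencies at the locus (assumed to sum to 1).
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    sum_sq = sum(p * p for p in freqs)
    ne = 1.0 / sum_sq                      # effective number of alleles
    h = 1.0 - sum_sq                       # Nei's gene diversity
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)
    # Botstein et al. (1980): PIC = 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    pic = h - cross
    return ne, h, shannon, pic


# Hypothetical locus with four equally frequent alleles.
ne, h, shannon, pic = diversity_stats([0.25, 0.25, 0.25, 0.25])
print(f"Ne={ne:.2f} h={h:.4f} I={shannon:.4f} PIC={pic:.4f}")
```

Under these definitions PIC can never exceed h, which is why a locus needs several alleles at intermediate frequencies to pass the 0.5 threshold used for highly informative markers.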
The 19 A. chinensis populations were clustered from the similarity matrix using the unweighted pair group method with arithmetic average (UPGMA) algorithm in the SAHN module of NTSYS version 2.10. The phylogenetic tree was constructed by the neighbor-joining method in MEGA version 7.0.21.
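The clustering step can be sketched in a few lines of Python. This is a minimal illustration, not the NTSYS SAHN module: the text does not state which similarity coefficient was used, so the simple matching coefficient serves here as a stand-in, and the function names and the three toy band profiles are hypothetical.

```python
def simple_matching(a, b):
    """Share of band positions at which two 0/1 profiles agree."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def upgma(names, profiles):
    """Cluster profiles by UPGMA and return the tree as nested tuples."""

    def dist(i, j):
        return 1.0 - simple_matching(profiles[i], profiles[j])

    # Each cluster is (subtree, list of member indices).
    clusters = [(name, [i]) for i, name in enumerate(names)]
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                mx, my = clusters[x][1], clusters[y][1]
                # UPGMA: mean of all pairwise distances between members.
                d = sum(dist(i, j) for i in mx for j in my) / (len(mx) * len(my))
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = ((clusters[x][0], clusters[y][0]), clusters[x][1] + clusters[y][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters[0][0]


# Toy example with three hypothetical populations and four bands.
names = ["A", "B", "C"]
profiles = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 1, 1]]
print(upgma(names, profiles))
```

In the toy data, A and B differ at one band and are joined first; C is then attached at the average of its distances to A and B.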
The size of the amplified fragments was estimated against a DNA ladder covering the expected size range (100-1000 bp). SSR locus diversity data from the ten SSR primers are summarized in Table 4. The amplified fragments varied from 200 to 1000 bp. A total of 65 loci in 78 alleles (80.33%) were detected, revealing substantial variation. The number of polymorphic alleles per SSR locus ranged from 2 (S4) to 13 (S2) with an average of 6.5 alleles per locus (Table 4), showing that the 19 A. chinensis populations exhibited a high level of genetic diversity. The average number of alleles in this study was higher than that reported for several other crop species, namely 3.7 in Euryale ferox [15], 5.1 in Lactuca sativa var capitata [37] and 4.5 in Sesamum indicum [2].
SSRs with PIC values > 0.5 are considered highly informative markers [24]. The PIC values among the 19 A. chinensis populations varied from 0.4748 (S54) to 0.8918 (S2) with an average of 0.6265 (Table 4), which is above the 0.5 threshold. The average PIC value (0.6265) in A. chinensis was also higher than that in some crops, namely 0.495 in Camellia sinensis [7], 0.32 in Gossypium hirsutum [23] and 0.5619 in Sorghum bicolor ssp. bicolor [21], indicating that the markers are highly informative. The PIC value depends on the number and relative frequency of alleles [24] and increased with the number of polymorphic alleles at a locus.
Primer S2 had both the highest number of polymorphic alleles (13) and the highest PIC value (0.8918). Eight of the ten markers (80.00%) had a PIC value > 0.5, the exceptions being S4 (0.4908) and S54 (0.4748), indicating that the marker set was suitable for genetic diversity and fingerprinting studies.

Genetic diversity and relatedness
A dendrogram elucidating the genetic relationships among the 19 A. chinensis populations was constructed using the neighbor-joining method in MEGA version 7.0.21. To better understand their relationships, we divided the 19 populations into three clusters (Fig. 1). Population P11 alone formed cluster I. Cluster II consisted of 9 populations distributed into two subgroups: one population, P6, formed the first subgroup, and the remaining 8 populations formed the second. Cluster II contained populations from different provinces. For example, P5 (Shandong Province) and P19 (Jilin Province) were grouped with populations from Hebei Province. Similarly, cluster III consisted of 9 populations derived from different provinces. Populations P2 and P15 from Inner Mongolia were grouped with populations from Hebei Province.
The 15 populations from Hebei Province were divided between two clusters, grouping either with populations from Shandong and Jilin Provinces or with those from Inner Mongolia. The three clusters in the dendrogram show that geographic origin does not correspond exactly to genetic grouping. The same pattern has been reported in many SSR marker-based genetic diversity studies, for example in Sesamum indicum [2,20], Camellia oleifera [4], Vicia amoena [31] and Trifolium repens [32]. Wu et al. analyzed the genetic diversity of Trifolium repens using PCoA, UPGMA and STRUCTURE, and indicated that UPGMA analysis, being based on genetic distance, provided more detailed relationships [32]. In this study, we used MEGA software to determine the genetic diversity of A. chinensis based on UPGMA. Weak genetic differentiation among geographical regions has also been observed in Pennisetum glaucum.
The similarity coefficients among the 19 A. chinensis populations, based on the amplification results of the ten SSR primers, ranged from 0.46 to 0.90 (Table 5). P8 and P10 showed the highest similarity (0.90), and P2 and P11 the lowest (0.46).
SSR marker analysis is an effective method for genetic diversity analysis and molecular marker-assisted selection breeding [6,28]. In the present study, we used ten well-chosen SSR markers in clusters annotated as sesquiterpenoid biosynthesis to analyze 19 A. chinensis populations in northern China. The results showed that these markers were highly polymorphic. The SSR marker analyses revealed genetic diversity among the 19 A. chinensis populations, which could be helpful for selective breeding in the future.

Establishment of DNA fingerprinting
The amplification results show that the set of SSR markers used here can produce unique DNA profiles of A. chinensis populations. The ten SSR markers were able to differentiate the 19 A. chinensis populations. DNA fingerprints of the 19 populations were constructed from the original data matrix of amplification results (Supplementary Table S2).

SSR primers (No.; primer; repeat motif; primer sequences):
(TC)12; TGC CGA GTC TTA CTC ATG CTC AGC AAA GCC AAA AAC GGT GG
2; S4; (T)10; ATC ATG CAT AGC CAG ACG CA; TGG GCA CTT GGG GAA TAT CG
3; S52; (AG)6; TCC GCC CCT GAG CTA CTA TC; TGG CGA CAC ATT TTC GTG AA
4; S53; (AG)6; CCG CCC CTG AGC TAC TAT CT; TGG CGA CAC ATT TTC GTG AA
5; S54; (AG)6; CCG CCC CTG AGC TAC TAT CT; TTG GCG ACA CAT TTT CGT

DNA fingerprinting is a popular technique for identifying species. The genus Atractylodes comprises perennial herbs used as important crude drugs in Chinese, Japanese, Korean and Thai traditional medicine, including Atractylodes lancea, A. chinensis, Atractylodes japonica and Atractylodes macrocephala [35]. A. lancea and A. chinensis are known as Cangzhu in Chinese and Sojutsu in Japanese. A. japonica is recorded in the Japanese and Korean Pharmacopoeias but not in the Chinese Pharmacopoeia. Plants of the genus Atractylodes have similar stems, leaves and rhizomes, which has led to disagreement over whether they are distinct species and to their frequent misuse in medical products [29]. DNA fingerprinting is particularly helpful for distinguishing populations with high similarity. The results of the present study show that SSR marker-based fingerprinting databases are useful for detecting genetic polymorphisms and offer a method for identifying unique populations. Marker-based fingerprinting provides a useful reference for species and germplasm identification in the genus Atractylodes.
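One simple way to turn a 1/0 band matrix into fingerprints is to concatenate each population's scores into a string and check that no two populations share the same string. The Python sketch below assumes that layout; the function names and the toy matrix are hypothetical and do not reproduce Supplementary Table S2.

```python
def build_fingerprints(band_matrix):
    """Map each population to a 0/1 string built from its band scores."""
    return {
        pop: "".join("1" if band else "0" for band in bands)
        for pop, bands in band_matrix.items()
    }


def indistinguishable_groups(fingerprints):
    """Return groups of populations that share an identical fingerprint."""
    groups = {}
    for pop, fingerprint in fingerprints.items():
        groups.setdefault(fingerprint, []).append(pop)
    return [sorted(pops) for pops in groups.values() if len(pops) > 1]


# Toy matrix: three hypothetical populations scored at three bands.
toy = {"P1": [1, 0, 1], "P2": [1, 0, 1], "P3": [0, 1, 1]}
fingerprints = build_fingerprints(toy)
print(fingerprints)
print(indistinguishable_groups(fingerprints))
```

A marker set differentiates all populations exactly when `indistinguishable_groups` returns an empty list; in the toy data P1 and P2 would need an additional marker to be told apart.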

Unique alleles
SSR markers, in contrast to morphological markers, have strong species specificity [26]. Seventeen populations produced unique bands with certain SSR markers (Table 6). For P11, five SSR markers generated unique bands, and three markers generated unique bands for P14. Four SSR markers generated unique bands for P6, and three markers generated unique bands for P7.
Among the ten SSR primers used in the present study, seven generated unique fragments in certain populations (Table 7). The numerous specific SSR loci enabled us to select markers that independently yield highly specific amplifications (Supplementary Fig. S1, Table 7). Unique fragments, which are generated through natural selection [18], can be used for the evaluation of germplasm resources and for molecular marker-assisted selection breeding.
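Unique bands of this kind can be located in a 1/0 band matrix by looking for bands scored as present in exactly one population. The Python sketch below takes that reading of "unique", which is an assumption (a band absent from only one population could also be treated as diagnostic); the function name, band labels and toy matrix are hypothetical.

```python
def unique_bands(band_matrix, band_labels):
    """Return {population: [labels of bands present only in that population]}."""
    result = {}
    for idx, label in enumerate(band_labels):
        carriers = [pop for pop, bands in band_matrix.items() if bands[idx] == 1]
        if len(carriers) == 1:
            result.setdefault(carriers[0], []).append(label)
    return result


# Toy matrix: three hypothetical populations, two bands for each of two markers.
labels = ["S2_a", "S2_b", "S4_a", "S4_b"]
toy = {
    "P1": [1, 0, 1, 0],
    "P2": [1, 0, 0, 0],
    "P3": [1, 1, 0, 0],
}
print(unique_bands(toy, labels))
```

Populations that do not appear in the returned dictionary carry no band of their own and can only be identified by their combined profile across markers.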

Conclusion
The selection of a set of core SSR primers is a crucial step for genetic diversity analysis, DNA fingerprinting and germplasm identification. The ten SSR markers used in this study support conclusions about the overall polymorphism and number of alleles observed in the 19 studied A. chinensis populations but do not relate explicitly to functional diversity or specific traits. Combining agronomic traits (such as yield and quality traits) with SSR marker-based genetic diversity can be a key source of information for exploiting superior A. chinensis germplasm resources in selective breeding.

Conflict of interest
The authors declare that they have no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Populations and the SSR markers that generated unique bands (Table 6):
P1: S99
P2: S99
P3: S53, S74
P4: S4
P5: S99
P6: S53, S54, S63, S99
P7: S52, S74, S99
P8: S52
P10: S62, S99
P11: S52, S53, S54, S63, S99
P12: S99
P13: S74
P14: S52, S54, S63
P15: S74, S99
P16: S99
P17: S4
P19: S74