Population genetic analysis of the GlobalFiler STR loci in 748 individuals from the Kazakh population of Xinjiang in northwest China

Abstract

The six-dye GlobalFiler™ Express PCR amplification kit incorporates 21 commonly used autosomal short tandem repeat (STR) loci and three gender determination loci. In this study, we analyzed the GlobalFiler STR loci in 748 unrelated individuals from the Kazakh population of Xinjiang, China. No significant deviations from Hardy-Weinberg equilibrium were observed within the 21 autosomal STR loci, and no significant linkage disequilibrium was observed between them. SE33 showed the greatest power of discrimination in the Kazakh population. The combined power of discrimination in Kazakh was 99.999999999999999999999996797 %. No significant differences in allele frequencies were observed between Kazakh and Uyghur, or between Kazakh and Mongolian, at any of the 15 tested STR loci. Between Kazakh and the other Chinese populations, significant differences were observed only at TH01. Multiple STR loci showed significant differences between Kazakh and Arab, as well as between Kazakh and South Portuguese. The multidimensional scaling (MDS) plot and neighbor-joining tree also showed that Kazakh is genetically close to Uyghur.

The introduction of a set of highly polymorphic short tandem repeat (STR) loci for human individual identification (HID) has proven successful in forensic investigations [1]. A total of 24 autosomal STR loci are commonly used in forensics [2] and are embedded in several multiplex amplification kits. Recently, the six-dye GlobalFiler™ Express PCR Amplification kit was developed by Thermo Fisher Scientific [3]. It includes 21 of these 24 autosomal STR markers and three gender determination loci (Amelogenin, Y indel, and DYS391). The autosomal STR loci in the GlobalFiler kit are D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA, D12S391, D1S1656, D2S441, D10S1248, D22S1045, and SE33. Previous studies have shown that the GlobalFiler kit supports reliable DNA typing and enhanced discrimination power [4, 5], and population data for the GlobalFiler STRs have recently been released for a few populations [6, 7].

Kazakh is one of the ethnic populations of the Xinjiang Autonomous Region in northwest China. According to the 2010 census, the Kazakh population had reached 1.462 million, living mainly in the Ili Kazak Autonomous Prefecture, Mori Kazak Autonomous County, and Barkol Kazak Autonomous County [8]. To evaluate the performance of the GlobalFiler kit in a Chinese Kazakh population, we typed the GlobalFiler STR loci in samples from 748 unrelated Kazakh individuals collected in Xinjiang. Peripheral blood samples were collected from these individuals after acquiring their informed consent. Amplification of the 24 loci was performed using the GlobalFiler™ Express kit (Thermo Fisher Scientific, Carlsbad, USA) in the GeneAmp PCR System 9700 (Thermo Fisher Scientific) according to the manufacturer’s recommendations. PCR products were separated by capillary electrophoresis in an ABI PRISM 3730xL genetic analyzer (Thermo Fisher Scientific). The GeneMapper® ID-X software v1.4 (Thermo Fisher Scientific) was used for genotype assignment. DNA typing and assignment of nomenclature were based on the ISFG recommendations [9, 10].

Allele frequencies were estimated from the corresponding genotype counts. Exact tests of Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) were performed using Arlequin v3.5 [11], which was also used to estimate the observed heterozygosity (Ho) and expected heterozygosity (He). Match probability (MP), power of discrimination (PD), and power of exclusion (PE) were estimated using Modified-Powerstats [12]. The exact test of population differentiation between the Kazakh of the present study and the other populations was performed using Arlequin v3.5 [11]. We also estimated pairwise Fst as described by Weir and Hill [13] to measure the genetic distance between populations. Results were visualized in a multidimensional scaling (MDS) plot using R v3.1.2 (http://www.r-project.org), and a neighbor-joining tree was constructed using MEGA 6 [14].
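The per-locus parameters named above can be illustrated with a minimal sketch. It assumes the commonly used PowerStats-style definitions (MP as the sum of squared observed genotype frequencies, PD = 1 − MP, PE = h²(1 − 2hH²) with h the observed heterozygosity and H = 1 − h) and an unbiased He with a 2N/(2N − 1) correction; the exact formulas implemented in Arlequin and Modified-Powerstats may differ in detail. The function name `locus_statistics` and the toy genotypes are hypothetical and are not data from this study.

```python
from collections import Counter


def locus_statistics(genotypes):
    """Summary statistics for one STR locus.

    genotypes: list of (allele, allele) tuples, one per individual.
    Formulas are assumed PowerStats-style definitions, not taken from the paper.
    """
    n = len(genotypes)

    # Gene counting: every individual contributes two allele copies.
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}

    # Observed heterozygosity: share of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n

    # Expected heterozygosity with a small-sample correction (assumption).
    he = (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs.values()))

    # Match probability: sum of squared observed genotype frequencies.
    genotype_counts = Counter(tuple(sorted(g)) for g in genotypes)
    mp = sum((c / n) ** 2 for c in genotype_counts.values())

    # Power of discrimination is the complement of the match probability.
    pd = 1 - mp

    # Power of exclusion from heterozygosity h and homozygosity H = 1 - h.
    h, hom = ho, 1 - ho
    pe = h * h * (1 - 2 * h * hom * hom)

    return {"freqs": freqs, "Ho": ho, "He": he, "MP": mp, "PD": pd, "PE": pe}


# Hypothetical toy genotypes for a single locus (not data from this study).
toy = [(6, 9), (9, 9), (6, 7), (7, 9)]
stats = locus_statistics(toy)
print(stats)
```

With real data, the function would be called once per locus on the list of observed genotypes; rare alleles and microvariants (for example 9.3) need no special handling because alleles are treated as opaque labels.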

To evaluate whether the random match probability can be estimated using the simple product rule [15], we performed HWE tests within STR loci and LD tests between STR loci in the GlobalFiler kit. No significant deviations from HWE were observed after Bonferroni correction (P > 0.002381) (see Table S1). We also found no significant LD between pairs of STR loci after Bonferroni correction (P > 0.0002381) (see Table S2). These results indicate that the match probability in the Kazakh population can be estimated by multiplying allele frequencies within and across the 21 STR loci.
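As a minimal sketch of the product rule referred to above: under HWE a homozygous genotype has expected frequency p² and a heterozygous genotype 2pq, and in the absence of LD the per-locus genotype frequencies are multiplied across loci. The locus names, allele frequencies, and function names below are hypothetical placeholders, not values from Table S1.

```python
def genotype_frequency(allele_freqs, genotype):
    """Expected genotype frequency at one locus under HWE."""
    a, b = genotype
    p, q = allele_freqs[a], allele_freqs[b]
    # Homozygote: p^2; heterozygote: 2pq.
    return p * p if a == b else 2 * p * q


def profile_match_probability(freqs_by_locus, profile):
    """Multiply per-locus genotype frequencies (assumes no LD between loci)."""
    prob = 1.0
    for locus, genotype in profile.items():
        prob *= genotype_frequency(freqs_by_locus[locus], genotype)
    return prob


# Hypothetical allele frequencies for two made-up loci.
freqs_by_locus = {
    "LOCUS_A": {8: 0.5, 9: 0.2, 11: 0.3},
    "LOCUS_B": {6: 0.1, 7: 0.2, 9: 0.7},
}
# Hypothetical profile: homozygous at LOCUS_A, heterozygous at LOCUS_B.
profile = {"LOCUS_A": (8, 8), "LOCUS_B": (6, 7)}

rmp = profile_match_probability(freqs_by_locus, profile)
print(rmp)  # 0.25 * 0.04, up to floating-point rounding
```

The sketch omits refinements often used in casework, such as subpopulation (theta) corrections or minimum allele frequencies; it only shows the multiplication that the HWE and LD tests are meant to justify.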

The allele frequencies and forensic statistical parameters of the 21 autosomal STR loci are shown in Table S1. Ho ranged from 0.6350 (TPOX) to 0.9398 (SE33), whereas He ranged from 0.6239 (TPOX) to 0.9461 (SE33). PD ranged from 0.7987 (TPOX) to 0.9932 (SE33), and PE ranged from 0.3350 (TPOX) to 0.8773 (SE33). SE33 therefore had the greatest power of discrimination in the Kazakh population. The combined power of discrimination (CPD) was 99.999999999999999999999996797 %, and the combined power of exclusion (CPE) was 99.9999998683861 %. The combined match probability (CMP) was 3.203 × 10^−26. Thus, the GlobalFiler kit has the greatest HID power among the currently available commercial multiplex STR kits.
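The combined parameters can be sketched as follows, assuming the usual definition for independent loci, in which the combined value is one minus the product of the per-locus complements (CPD = 1 − Π(1 − PD_i), and likewise for CPE). Whether Modified-Powerstats uses exactly this form is an assumption. Only the TPOX and SE33 values and the CMP are taken from the text; `combined_power` is a hypothetical helper name. Decimal arithmetic is used because ordinary floating point cannot represent a value this close to 1.

```python
from decimal import Decimal, getcontext

# Enough significant digits to resolve 1 - 3.203e-26.
getcontext().prec = 50


def combined_power(per_locus_values):
    """1 - product(1 - x_i) over independent loci (assumed definition)."""
    remaining = Decimal(1)
    for value in per_locus_values:
        remaining *= Decimal(1) - Decimal(str(value))
    return Decimal(1) - remaining


# Illustration with only the two extreme PD values reported in the text
# (TPOX and SE33); the other 19 loci would be appended from Table S1.
two_locus_cpd = combined_power(["0.7987", "0.9932"])
print(two_locus_cpd)

# Consistency check: the CPD expressed in percent should equal 100 * (1 - CMP).
cmp_reported = Decimal("3.203E-26")
cpd_percent = ((Decimal(1) - cmp_reported) * 100).normalize()
print(cpd_percent)
```

Passing the values as strings avoids binary floating-point rounding before they reach `Decimal`.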

The exact test of population differentiation was performed between the Kazakh of the present study and populations from earlier reports (Table S3). In total, allele frequency data were obtained for 12 populations: ten Chinese populations typed with different amplification kits [16], and the populations from South Portugal [6] and the United Arab Emirates [7], both of which were typed with the GlobalFiler kit. The Chinese populations comprise six Han populations from different areas (Jilin, Gansu, Qinghai, Shandong, Jiangsu, and Guangdong) and four ethnic groups (Tibetan, Dai, Mongolian, and Uyghur). After Bonferroni correction (P = 0.05/192 = 0.00026, where 192 is the number of tests), no significant differences were observed between Kazakh and Uyghur, or between Kazakh and Mongolian, at any of the 15 tested STR loci. These two populations have geographic distributions similar to that of Kazakh, and Uyghur in particular has been shown to have a close genetic relationship with Kazakh in previous studies [8, 17]. In the comparison between Kazakh and the other Chinese populations, significant differences were observed only at TH01. These results suggest that, with the exception of TH01, the allele frequencies of the 15 tested GlobalFiler STR loci do not differ significantly between Kazakh and the other Chinese populations. The other six STR loci remain to be tested in China as more GlobalFiler STR data are generated.
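The Bonferroni thresholds quoted in the text can be reproduced with a short sketch. The breakdown of the 192 differentiation tests as ten Chinese populations at 15 shared loci plus two populations at 21 loci is an inference from the description above, not something the authors state explicitly; `bonferroni_threshold` is a hypothetical helper name.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


alpha = 0.05
n_loci = 21

# HWE: one test per autosomal locus.
hwe_tests = n_loci
# LD: one test per unordered pair of loci.
ld_tests = n_loci * (n_loci - 1) // 2
# Population differentiation: assumed breakdown of the 192 tests
# (10 populations x 15 loci + 2 populations x 21 loci).
diff_tests = 10 * 15 + 2 * 21

for label, n in [("HWE", hwe_tests), ("LD", ld_tests), ("differentiation", diff_tests)]:
    print(label, n, bonferroni_threshold(alpha, n))
```

Under these assumptions the three thresholds round to 0.002381, 0.0002381, and 0.00026, matching the values given in the text.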

We also performed the exact test of population differentiation between Kazakh and the two non-Chinese populations. Significant differences were observed at multiple STR loci between Kazakh and Arab from the United Arab Emirates, as well as between Kazakh and South Portuguese. Only the frequencies of CSF1PO, D3S1358, D8S1179, and FGA showed no significant differences. All six STR loci that are not among the AmpFlSTR Identifiler markers (D10S1248, D12S391, D1S1656, D22S1045, D2S441, and SE33) showed significant differences between Kazakh and these two populations.

Unbiased Fst values were estimated based on the 15 STR loci shared among the studied populations. As shown in Fig. 1, Kazakh is genetically close to Uyghur, and both are clearly separated from the other Chinese populations and from the Arab and South Portuguese populations. The same conclusion can be drawn from the neighbor-joining tree (Fig. 2). These results indicate that the Kazakh population may be an admixture of ancestral groups from Western and Eastern Eurasia.

Fig. 1 Multidimensional scaling (MDS) plot based on pairwise Fst values between Kazakh and other populations

Fig. 2 The neighbor-joining trees constructed with pairwise Fst

In conclusion, we report the allele frequencies and forensic statistical parameters of the GlobalFiler STR loci in the Chinese Kazakh population, which may serve as a forensic reference database for Chinese populations.

Change history

  • 26 February 2020

    ‘Concerns have been raised about the ethics approval and informed consent procedures related to the research reported in this paper. The paper includes the following author declarations: “Peripheral blood samples were collected from these individuals after acquiring their informed consent”. Editorial action will be taken as appropriate once an investigation of the concerns is complete and all parties have been given an opportunity to respond in full.’

References

  1. Hammond HA, Jin L, Zhong Y, Caskey CT, Chakraborty R (1994) Evaluation of 13 short tandem repeat loci for use in personal identification applications. Am J Hum Genet 55(1):175–189

  2. Gettings KB, Aponte RA, Vallone PM, Butler JM (2015) STR allele sequence variation: current knowledge and future issues. Forensic Sci Int-Genet 18:118–130. doi:10.1016/j.fsigen.2015.06.005

  3. Wang DY, Gopinath S, Lagacé RE, Norona W, Hennessy LK, Short ML, Mulero JJ (2015) Developmental validation of the GlobalFiler® Express PCR Amplification Kit: a 6-dye multiplex assay for the direct amplification of reference samples. Forensic Sci Int-Genet 19:148–155. doi:10.1016/j.fsigen.2015.07.013

  4. Flores S, Sun J, King J, Budowle B (2014) Internal validation of the GlobalFiler (TM) Express PCR Amplification Kit for the direct amplification of reference DNA samples on a high-throughput automated workflow. Forensic Sci Int-Genet 10:33–39. doi:10.1016/j.fsigen.2014.01.005

  5. Martin P, de Simon LF, Luque G, Farfan MJ, Alonso A (2014) Improving DNA data exchange: validation studies on a single 6 dye STR kit with 24 loci. Forensic Sci Int-Genet 13:68–78. doi:10.1016/j.fsigen.2014.07.002

  6. Almeida C, Ribeiro T, Oliveira AR, Porto MJ, Santos JC, Dias D, Dario P (2015) Population data of the GlobalFiler® Express loci in South Portuguese population. Forensic Sci Int-Genet 19:39–41. doi:10.1016/j.fsigen.2015.06.001

  7. Alhmoudi OA, Jones RJ, Tay GK, Alsafar H, Hadi S (2015) Population genetics data for 21 autosomal STR loci for United Arab Emirates (UAE) population using next generation multiplex STR kit. Forensic Sci Int-Genet 19:190–191. doi:10.1016/j.fsigen.2015.07.009

  8. Yuan JY, Wang XY, Shen CM, Liu WJ, Yan JW, Wang HD, Pu HW, Wang YL, Yang G, Zhang YD, Meng HT, Jing H, Zhu BF (2014) Genetic profile characterization and population study of 21 autosomal STR in Chinese Kazak ethnic minority group. Electrophoresis 35(4):503–510. doi:10.1002/elps.201300398

  9. Bar W, Brinkmann B, Budowle B, Carracedo A, Gill P, Lincoln P, Mayr W, Olaisen B (1997) DNA recommendations—further report of the DNA Commission of the ISFH regarding the use of short tandem repeat systems. Int J Legal Med 110(4):175–176. doi:10.1007/s004140050061

  10. Olaisen B, Bar W, Brinkmann B, Budowle B, Carracedo A, Gill P, Lincoln P, Mayr WR, Rand S (1998) DNA recommendations 1997 of the International Society for Forensic Genetics. Vox Sang 74(1):61–63. doi:10.1046/j.1423-0410.1998.7410061.x

  11. Excoffier L, Lischer HEL (2010) Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour 10(3):564–567. doi:10.1111/j.1755-0998.2010.02847.x

  12. Zhao F, Wu X, Cai G, Xu C (2003) The application of Modified-Powerstates software in forensic biostatistics (in Chinese). Chinese Journal of Forensic Medicine 18(5):297–298

  13. Weir BS, Hill WG (2002) Estimating F-statistics. Annual Review of Genetics 36:721–750

  14. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30(12):2725–2729

  15. Chakraborty R, Kidd KK (1991) The utility of DNA typing in forensic work. Science 254(5039):1735–1739. doi:10.1126/science.1763323

  16. Li L, Xu J, Liu X, Chen W, Xia M, Yang S, Jiang P, Ma T, Yang Y, Qian J, Sun H, Hu R, Miqin, Feng Z, Zuo Y, Zhou R, Ping Y, Zhou H, Zhao Z, Jin L, Li S (2015) Population data of 15 short tandem repeat loci in 1084 individuals from six Han and four ethnic populations in China. Forensic Sci Int-Genet 19:146–147. doi:10.1016/j.fsigen.2015.06.015

  17. Lou HY, Li SL, Jin WF, Fu RQ, Lu DS, Pan XW, Zhou HG, Ping Y, Jin L, Xu SH (2015) Copy number variations and genetic admixtures in three Xinjiang ethnic minority groups. Eur J Hum Genet 23(4):536–542. doi:10.1038/ejhg.2014.134

Acknowledgments

This study was supported by grants from the National Science Foundation of China (31271338), Project of Chinese Ministry of Education (113022A) and the National High Technology Research and Development Program (2012AA021802).

Author information

Corresponding authors

Correspondence to Liming Li or Shilin Li.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Honghua Zhang and Shuping Yang contributed equally to this work.

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM 1

(XLSX 18 kb)

ESM 2

(XLSX 11 kb)

ESM 3

(XLSX 12 kb)

ESM 4

(XLSX 11 kb)

Cite this article

Zhang, H., Yang, S., Guo, W. et al. Population genetic analysis of the GlobalFiler STR loci in 748 individuals from the Kazakh population of Xinjiang in northwest China. Int J Legal Med 130, 1187–1189 (2016). https://doi.org/10.1007/s00414-016-1319-2

Keywords

  • Short tandem repeat
  • GlobalFiler
  • Kazakh
  • Xinjiang