Abstract
When traditional short tandem repeat profiling fails to provide information useful for identifying a perpetrator, forensic ancestry inference from biological samples left at the crime scene may offer investigative leads and facilitate the investigation. For this reason, there have been sustained efforts to develop ancestry inference panels in forensic science. In this study, a 30-plex next-generation sequencing-based assay was developed by assembling well-differentiated single nucleotide polymorphisms for ancestry assignment of unknown individuals from three continental populations (African, European and East Asian). Meanwhile, relatively balanced population-specific differentiation values were maintained to avoid over-estimation or under-estimation of co-ancestry proportions in individuals with admixed ancestry. Principal component analysis and STRUCTURE analysis of the reference populations, test populations and the studied Mongolian group indicated that the novel assay was efficient enough to determine the ancestral origin of an unknown individual from the three continental populations. In addition, ancestry membership proportion estimates for the Mongolian group revealed that a large fraction of the ancestry was contributed by the East Asian genetic component (approximately 83.9%), followed by the European (approximately 12.6%) and African (approximately 3.5%) genetic components. The next-generation sequencing technology applied in this study also offers the possibility of incorporating more single nucleotide polymorphisms for individual identification and phenotype prediction into the same assay, to provide as many investigative clues as possible in the future.
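The abstract does not state how the group-level ancestry proportions were summarized. One common approach, assumed here purely for illustration, is to average the per-individual membership coefficients that a clustering program such as STRUCTURE reports. The sketch below uses a hypothetical toy matrix (`q_matrix`) and a hypothetical helper (`group_proportions`); none of the values are data from the study.

```python
from statistics import mean

# Column order of the membership coefficients (assumed for this sketch).
POPULATIONS = ("African", "European", "East Asian")

# Hypothetical per-individual membership coefficients; each row sums to 1.
# These are toy values for illustration, NOT results from the study.
q_matrix = [
    (0.02, 0.10, 0.88),
    (0.05, 0.15, 0.80),
    (0.03, 0.12, 0.85),
]


def group_proportions(q_rows):
    """Average each ancestry component across all individuals."""
    for row in q_rows:
        # Each individual's coefficients should form a proportion vector.
        if abs(sum(row) - 1.0) > 1e-6:
            raise ValueError("membership coefficients must sum to 1")
    columns = zip(*q_rows)  # transpose: one tuple per ancestry component
    return {pop: mean(col) for pop, col in zip(POPULATIONS, columns)}


proportions = group_proportions(q_matrix)
for pop, value in proportions.items():
    print(f"{pop}: {value:.1%}")
```

Because every row sums to 1, the averaged components also sum to 1, so they can be read directly as group-level ancestry fractions.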
Funding
This project was supported by the National Natural Science Foundation of China (No. 81525015) and GDUPS (2017).
Author information
Contributions
BZ designed the study and constructed the panel. QL performed the sequencing and statistical analyses and wrote the manuscript. YF and SM assisted with the statistical analyses. TX, YL, XJ and GY collected the samples. All authors revised and approved the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Stefan Hohmann.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Lan, Q., Fang, Y., Mei, S. et al. Next generation sequencing of a set of ancestry-informative SNPs: ancestry assignment of three continental populations and estimating ancestry composition for Mongolians. Mol Genet Genomics 295, 1027–1038 (2020). https://doi.org/10.1007/s00438-020-01660-2
DOI: https://doi.org/10.1007/s00438-020-01660-2