Abstract
Single nucleotide polymorphisms (SNPs) are likely to play a fundamental role in both human identification and description in the near future. However, because allele frequencies can vary greatly among populations, a critical issue is the population genetics underlying the calculation of the probability that unrelated individuals have identical multi-locus genotypes. Here we report on progress in identifying SNPs that show little allele frequency variation among a worldwide sample of 40 populations, i.e., have a low Fst, while remaining highly informative. Such markers have match probabilities that are nearly uniform irrespective of population and are therefore candidates for a universally applicable individual identification panel for forensics and paternity testing. They are also immediately useful for efficient sample identification/tagging in large biomedical, association, and epidemiologic studies. Using our previously described strategy for both identifying and characterizing such SNPs (Kidd et al. in Forensic Sci Int 164:20–32, 2006), we have now screened a total of 432 SNPs likely a priori to have high heterozygosity and low allele frequency variation. From these we selected the markers with the lowest Fst in our set of 40 populations to produce a panel of 40 low-Fst, high-heterozygosity SNPs. Collectively these SNPs give average match probabilities of less than 10⁻¹⁶ in most of the 40 populations and less than 10⁻¹⁴ in all but one small isolated population; the range is 2.02 × 10⁻¹⁷ to 1.29 × 10⁻¹³. These 40 SNPs constitute excellent candidates for the global forensic community to consider for a universally applicable SNP panel for human identification. The relative ease with which these markers could be identified also provides a cautionary lesson for investigations of possible balancing selection.
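The link between allele frequencies, Fst, and match probability can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes Hardy-Weinberg proportions, independence among loci, and an unweighted variance-based form of Wright's Fst, and the function names are hypothetical.

```python
from math import prod


def genotype_match_probability(p):
    """Chance that two unrelated people share a genotype at one biallelic
    locus with allele frequency p (assumes Hardy-Weinberg proportions)."""
    q = 1.0 - p
    # Sum of squared genotype frequencies: AA, Aa, aa.
    return (p * p) ** 2 + (2 * p * q) ** 2 + (q * q) ** 2


def panel_match_probability(freqs):
    """Match probability for a panel, assuming the loci are independent."""
    return prod(genotype_match_probability(p) for p in freqs)


def wright_fst(pop_freqs):
    """Unweighted Fst for one allele: variance of its frequency across
    populations divided by p_bar * (1 - p_bar). Ignores sample sizes."""
    n = len(pop_freqs)
    p_bar = sum(pop_freqs) / n
    variance = sum((p - p_bar) ** 2 for p in pop_freqs) / n
    return variance / (p_bar * (1.0 - p_bar))


if __name__ == "__main__":
    # Illustrative inputs only, not data from the study.
    print(genotype_match_probability(0.5))      # 0.375, the per-locus minimum
    print(panel_match_probability([0.5] * 40))  # idealized 40-SNP panel
    print(wright_fst([0.4, 0.6]))               # small frequency spread
```

Under these assumptions a SNP with both alleles at frequency 0.5 has the lowest possible per-locus match probability, 0.375, so 40 such loci give a floor of roughly 10⁻¹⁷. This is the same order of magnitude as the lower end of the range reported above.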
References
Amorim A, Pereira L (2005) Pros and cons in the use of SNPs in forensic kinship investigation: a comparative analysis with STRs. Forensic Sci Int 150:17–21
Balding DJ (2003) Likelihood-based inference for genetic correlation coefficients. Theor Popul Biol 63:221–230
Budowle B, Moretti TR, Niezgoda SJ, Brown BL (1998) CODIS and PCR-based short tandem repeat loci: law enforcement tools. Second European symposium on human identification, Promega Corporation, Madison, pp 73–88
Calafell F, Shuster A, Speed WC, Kidd JR, Black FL, Kidd KK (1999) Genealogy reconstruction from short tandem repeat genotypes in an Amazonian population. Am J Phys Anthropol 108:137–146
Cavalli-Sforza LL, Menozzi P, Piazza A (1994) The history and geography of human genes. Princeton University Press, Princeton
Cotterman CW (1954) Estimation of gene frequencies in nonexperimental populations. In: Kempthorne O, Bancroft TA, Gowen JW, Lush JL (eds) Statistics and mathematics in biology. Iowa State College Press, Ames, pp 449–465
Devlin B, Risch N (1995) A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics 29:311–322
Dixon LA, Murray CM, Archer EJ, Dobbins AE, Koumi P, Gill P (2005) Validation of a 21-locus autosomal SNP multiplex for forensic identification purposes. Forensic Sci Int 154:62–77
Gill P, Werrett DJ, Budowle B, Guerrieri R (2004) An assessment of whether SNPs will replace STRs in national DNA databases—joint considerations of the DNA working group of the European network of forensic science institutes (ENFSI) and the scientific working group on DNA analysis methods (SWGDAM). Sci Justice 44:51–53
Gill P, Fereday L, Morling N, Schneider PM (2005) The evolution of DNA databases—recommendations for new European STR loci. Forensic Sci Int 156:242–244
Gill P, Fereday L, Morling N, Schneider PM (2006) Letter to the Editor: new multiplexes for Europe—amendments and clarification of strategic development. Forensic Sci Int 163:155–157
Inagaki S, Yamamoto Y, Doi Y, Takata T, Ishikawa T, Imabayashi K, Yoshitome K, Miyaishi S, Ishizu H (2004) A new 39-plex analysis method for SNPs including 15 blood group loci. Forensic Sci Int 144:45–57
International HapMap Consortium (2003) The international HapMap project. Nature 426:789–796
International HapMap Consortium (2005) A haplotype map of the human genome. Nature 437:1299–1320
Kidd KK, Pakstis AJ, Speed WC, Kidd JR (2004) Understanding human DNA sequence variation. J Hered 95:406–420
Kidd KK, Pakstis AJ, Speed W, Grigorenko E, Kajuna SLB, Karoma N, Kungulilo S, Kim J-J, Lu R-B, Odunsi A, Okonofua F, Parnas J, Schulz L, Zhukova O, Kidd JR (2006) Developing a SNP panel for forensic identification of individuals. Forensic Sci Int 164:20–32
Kidd JR, Pakstis AJ, Kidd KK (1993) Global levels of DNA variation. Proceedings of the fourth international symposium on human identification 1993 (Promega), pp 21–30
Lee HY, Park MJ, Yoo J-E, Chung U, Han G-R, Shin K-J (2005) Selection of 24 highly informative SNP markers for human identification and paternity analysis in Koreans. Forensic Sci Int 148:107–112
Li L, Li C-T, Li R-Y, Liu Y, Lin Y, Que T-Z, Sun M-Q, Li Y (2006) SNP genotyping by multiplex amplification and microarrays assay for forensic application. Forensic Sci Int 162:74–79
National Research Council Committee on DNA Technology in Forensic Science (1996) The evaluation of forensic DNA evidence/Committee on DNA forensic science: an update. National Academy Press, Washington D.C.
Osier MV, Pakstis AJ, Goldman D, Edenberg HJ, Kidd JR, Kidd KK (2002) A proline-threonine substitution in codon 351 of ADH1C is common in native Americans. Alcohol Clin Exp Res 26:1759–1763
Peltonen L, Jalanko A, Varilo T (1999) Molecular genetics of the Finnish disease heritage. Hum Mol Genet 8:1913–1923
Petkovski E, Keyser-Tracqui C, Hienne R, Ludes B (2005) SNPs and MALDI-TOF MS: tools for DNA typing in forensic paternity testing and anthropology. J Forensic Sci 50:535–541
Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, Zhivotovsky LA, Feldman MW (2002) Genetic structure of human populations. Science 298:2381–2385
Sanchez JJ, Phillips C, Borsting C, Balogh K, Bogus M, Fondevila M, Harrison CD, Musgrave-Brown E, Salas A, Syndercombe-Court D, Schneider PM, Carracedo A, Morling N (2006) A multiplex assay with 52 single nucleotide polymorphisms for human identification. Electrophoresis 27:1713–1724
Shriver MD, Mei R, Parra EJ, Sonpar V, Halder I, Tishkoff SA, Schurr TG, Zhadanov SI, Osipova LP, Brutsaert TD, Friedlaender J, Jorde LB, Watkins WS, Bamshad MJ, Guiterrez G, Loi H, Matsuzaki H, Kittles RA, Argyropoulos G, Fernandez JR, Akey JM, Jones KW (2005) Large-scale SNP analysis reveals clustered and continuous patterns of human genetic variation. Hum Genomics 2:81–89
Syvanen AC, Sajantila A, Lukka M (1993) Identification of individuals by analysis of biallelic DNA markers, using PCR and solid-phase minisequencing. Am J Hum Genet 52:46–59
Teare MD, Dunning AM, Durocher F, Rennart G, Easton DF (2002) Sampling distribution of summary linkage disequilibrium measures. Ann Hum Genet 66:223–233
Tishkoff SA, Kidd KK (2004) Implications of biogeography of human populations for race and medicine. Nat Genet 36(suppl):S21–S27
Vallone PM, Decker AE, Butler JM (2005) Allele frequencies for 70 autosomal SNP loci with US Caucasian, African–American, and Hispanic samples. Forensic Sci Int 149:279–286
Varilo T, Peltonen L (2004) Isolates and their potential use in complex gene mapping efforts. Curr Opin Genet Dev 14:316–323
Wright S (1951) The genetical structure of populations. Ann Eugenics 15:323–354
Acknowledgments
This work was funded primarily by NIJ Grant 2004-DN-BX-K025 to KKK awarded by the National Institute of Justice, Office of Justice Programs, US Department of Justice. Points of view in this document are those of the authors and do not necessarily represent the official position or policies of the US Department of Justice. We thank Applied Biosystems for making their allele frequency database available to us. We also want to acknowledge and thank the following people who helped assemble the samples from the diverse populations: F. L. Black, B. Bonne-Tamir, L. L. Cavalli-Sforza, K. Dumars, J. Friedlaender, L. Giuffra, E. L. Grigorenko, S. L. B. Kajuna, N. J. Karoma, K. Kendler, J-J. Kim, W. Knowler, S. Kungulilo, R-B. Lu, A. Odunsi, F. Okonofua, F. Oronsaye, J. Parnas, L. Peltonen, L. O. Schulz, D. Upson, K. Weiss, and O. V. Zhukova. In addition, some of the cell lines were obtained from the National Laboratory for the Genetics of Israeli Populations at Tel Aviv University, Israel, and the African American samples were obtained from the Coriell Institute for Medical Research, Camden, NJ. Special thanks are due to the many hundreds of individuals who volunteered to give blood samples for studies of gene frequency variation.
Cite this article
Pakstis, A.J., Speed, W.C., Kidd, J.R. et al. Candidate SNPs for a universal individual identification panel. Hum Genet 121, 305–317 (2007). https://doi.org/10.1007/s00439-007-0342-2
DOI: https://doi.org/10.1007/s00439-007-0342-2