Abstract
Significant efforts have been made to determine the correlation structure of common SNPs in the human genome. One approach has been to identify sets of tagSNPs that capture most of the genetic variation. Here, we evaluate the transferability of tagSNPs between populations using a population sample of Sami, the indigenous people of Scandinavia. Array-based SNP discovery in a 4.4 Mb region of 28 phased copies of chromosome 21 uncovered 5,132 segregating sites, 3,188 of which had a minimum minor allele frequency (mMAF) of 0.1. Because of the population structure and the resulting high linkage disequilibrium (LD), the number of tagSNPs needed to capture all SNP variation in Sami is much lower than that for the HapMap populations. TagSNPs identified from the HapMap data perform only slightly better in the Sami than tagSNPs chosen at random from the same set of common SNPs. Surprisingly, tagSNPs defined from the HapMap data do not perform better than the same number of SNPs selected at random from all SNPs discovered in Sami. Nearly half (46%) of the Sami SNPs with a mMAF of 0.1 are not present in the HapMap dataset. Among sites overlapping between Sami and HapMap populations, 18% are not tagged by the European American (CEU) HapMap tagSNPs, while 43% of the SNPs that are unique to Sami are not tagged by the CEU tagSNPs. These results point to serious limitations in the transferability of common tagSNPs to capture random sequence variation, even between closely related populations, such as CEU and Sami.
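As an illustration of what "tagged" means in this kind of evaluation, the following minimal sketch computes the pairwise r² between biallelic SNPs on phased haplotypes and reports the fraction of SNPs captured by a given tag set. It is not the authors' pipeline. The function names, the toy data, and the r² ≥ 0.8 threshold are assumptions made for the example; the abstract does not state the threshold that was used.

```python
def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic SNPs.

    a and b are equal-length lists of 0/1 alleles, one entry per phased
    chromosome copy.
    """
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    if pa in (0.0, 1.0) or pb in (0.0, 1.0):
        return 0.0  # monomorphic site: r^2 is undefined, count it as not tagged
    pab = sum(x & y for x, y in zip(a, b)) / n  # frequency of the 1-1 haplotype
    d = pab - pa * pb                           # LD coefficient D
    return d * d / (pa * (1 - pa) * pb * (1 - pb))


def fraction_tagged(snps, tag_indices, threshold=0.8):
    """Fraction of SNPs that are tags or reach r^2 >= threshold with a tag.

    threshold=0.8 is an assumed, commonly used cutoff, not a value taken
    from the abstract.
    """
    tagged = 0
    for i, snp in enumerate(snps):
        if any(i == t or r_squared(snp, snps[t]) >= threshold
               for t in tag_indices):
            tagged += 1
    return tagged / len(snps)


# Hypothetical toy data: 4 SNPs typed on 8 phased chromosome copies.
snps = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # SNP 0
    [0, 0, 0, 0, 1, 1, 1, 1],  # SNP 1: identical to SNP 0
    [0, 1, 0, 1, 0, 1, 0, 1],  # SNP 2: uncorrelated with SNP 0
    [1, 1, 1, 1, 0, 0, 0, 0],  # SNP 3: complement of SNP 0
]

print(fraction_tagged(snps, [0]))     # SNP 0 alone as the tag
print(fraction_tagged(snps, [0, 2]))  # adding SNP 2 as a second tag
```

Under these assumptions, transferability could be assessed by choosing `tag_indices` in one population's haplotypes and computing `fraction_tagged` on another population's haplotypes for the same region.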
Acknowledgments
The study was supported by grants from the Swedish Natural Science Research Council (UG), The Foundation for Strategic Research (Genome Research Program) (UG) and The Swedish Medical Science Research Council (UG). Åsa Johansson is affiliated to The Linnaeus Centre for Bioinformatics, Uppsala University, Sweden. We are grateful for the participation of the members of the Sami communities.
Cite this article
Johansson, Å., Vavruch-Nilsson, V., Cox, D.R. et al. Evaluation of the SNP tagging approach in an independent population sample—array-based SNP discovery in Sami. Hum Genet 122, 141–150 (2007). https://doi.org/10.1007/s00439-007-0379-2
DOI: https://doi.org/10.1007/s00439-007-0379-2