Abstract
It is widely believed that a subset of single nucleotide polymorphisms (SNPs) can capture most of the information for genotype-phenotype association studies that is contained in the complete complement of genetic variation. The question remains: how should one select that subset of SNPs to maximize the power to detect a significant association? In this study, we used a simulation approach to compare three competing methods of site selection: random selection, selection based on pair-wise linkage disequilibrium, and selection based on maximizing haplotype diversity. The results indicate that site selection based on maximizing haplotype diversity is preferable to both random selection and selection based on pair-wise linkage disequilibrium. The results also indicate that increasing the sample size is a more prudent way to improve a study's power than continually increasing the number of SNPs. These results have direct implications for the design of gene-based and genome-wide association studies.
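To make the idea of selecting sites by maximizing haplotype diversity concrete, the following is a minimal sketch and not the procedure used in the study, which the abstract does not specify. It assumes phased haplotypes encoded as equal-length strings of alleles, measures diversity as 1 minus the sum of squared haplotype frequencies, and uses a simple greedy search. The names `haplotype_diversity` and `greedy_select` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def haplotype_diversity(haplotypes: Sequence[str], sites: Sequence[int]) -> float:
    """Diversity (1 - sum of squared frequencies) of the haplotypes
    obtained by keeping only the alleles at the given site indices."""
    n = len(haplotypes)
    counts = Counter(tuple(h[s] for s in sites) for h in haplotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def greedy_select(haplotypes: Sequence[str], k: int) -> List[int]:
    """Greedily pick up to k sites, each time adding the site that most
    increases diversity of the reduced haplotypes (assumed heuristic)."""
    n_sites = len(haplotypes[0])
    chosen: List[int] = []
    for _ in range(min(k, n_sites)):
        remaining = [s for s in range(n_sites) if s not in chosen]
        # Ties are broken in favor of the lowest site index.
        best = max(
            remaining,
            key=lambda s: haplotype_diversity(haplotypes, chosen + [s]),
        )
        chosen.append(best)
    return sorted(chosen)


if __name__ == "__main__":
    # Toy sample: sites 0 and 1 are fully redundant, as are sites 2 and 3.
    sample = ["0000", "0011", "1100", "1111"]
    picked = greedy_select(sample, 2)
    print(picked, haplotype_diversity(sample, picked))
```

In the toy sample, a site that is in complete linkage disequilibrium with one already chosen adds no diversity, so the greedy search takes one site from each redundant pair. A greedy search is not guaranteed to find the best subset; it is used here only to keep the illustration short.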
Acknowledgements
This work is supported by grants from the National Heart, Lung, and Blood Institute and the National Institute of General Medical Sciences.
Cite this article
Huang, Q., Fu, Yx. & Boerwinkle, E. Comparison of strategies for selecting single nucleotide polymorphisms for case/control association studies. Hum Genet 113, 253–257 (2003). https://doi.org/10.1007/s00439-003-0965-x
DOI: https://doi.org/10.1007/s00439-003-0965-x