Abstract
In recent years, genome-wide association approaches have proven a powerful and successful strategy for identifying genetic contributors to complex traits, including a number of endocrine disorders. As a result, genome-wide association studies (GWAS) are fast becoming the default study design for discovering new genetic variants that influence a clinical trait or phenotype. This chapter focuses on key elements that require consideration for the successful conduct of a GWAS. Although many of the considerations are common to any genetic study, the greater cost, extreme multiple testing, and greater openness to data sharing require specific awareness and planning by investigators. In the section on designing a GWAS, we reflect on ethical considerations, study design, selection of phenotype(s), power considerations, sample tracking and storage issues, and genotyping product selection. During execution, important considerations include DNA quantity and preparation, genotyping methods, quality control checks of genotype data, in silico genotyping (imputation), tests of association, and replication of association signals. Although the field of human genetics is evolving rapidly, recent experience can help guide an investigator in making the practical and methodological choices that will eventually determine the overall quality of GWAS results. Given the investment needed to recruit patient populations or cohorts that are adequately powered for a GWAS, and the still substantial cost of genotyping, awareness of these aspects helps maximize the likelihood of success, especially where they can be implemented prospectively.
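To make the point about extreme multiple testing concrete, the following is a minimal sketch rather than a method taken from the chapter. It assumes a Bonferroni correction and a simple 1-degree-of-freedom allelic chi-square test; the function names, the figure of one million tests, and the allele counts are hypothetical and chosen only for illustration.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def allelic_chi_square(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases and controls, columns are the two alleles.
    Returns (statistic, p_value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("every row and column needs a nonzero total")
    stat = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical scan of one million independent tests at a 5% family-wise level.
    threshold = bonferroni_threshold(0.05, 1_000_000)
    # Hypothetical allele counts for a single SNP.
    stat, p = allelic_chi_square(60, 40, 40, 60)
    print(f"threshold={threshold:.1e} chi2={stat:.2f} p={p:.4f}")
    print("significant genome-wide:", p < threshold)
```

Under these assumptions, an association that looks convincing in isolation can still fall far short of the genome-wide threshold, which is why sample size and power need to be planned before genotyping begins.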
Acknowledgments
We wish to thank Andrew Singleton, Ph.D., National Institute on Aging, Kathleen H. Day, University of Virginia and Fang-Chi Hsu, Ph.D., Wake Forest University School of Medicine, for helpful discussion and comments.
Copyright information
© 2009 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Sale, M.M., Mychaleckyj, J.C., Chen, WM. (2009). Planning and Executing a Genome Wide Association Study (GWAS). In: Park-Sarge, OK., Curry, T. (eds) Molecular Endocrinology. Methods in Molecular Biology, vol 590. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60327-378-7_25
DOI: https://doi.org/10.1007/978-1-60327-378-7_25
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-60327-377-0
Online ISBN: 978-1-60327-378-7
eBook Packages: Springer Protocols