Genome-Wide Association Studies

  • Mark M. Iles
Part of the Methods in Molecular Biology book series (MIMB, volume 713)


Genome-wide association (GWA) studies are best understood as an extension of candidate gene association studies, scaled up to cover hundreds of thousands of markers across the genome, usually in samples of several thousand cases and controls. The GWA approach allows the detection of much smaller effect sizes than was possible with earlier linkage-based genome-wide studies. However, this sensitivity makes GWA studies vulnerable to false positive findings caused by subtle differences between cases and controls, which may arise from issues such as genotyping errors, population stratification, and sample mix-ups, as well as from the more obvious issue of multiple testing. After some background and an introduction to GWA studies, this chapter considers them stage by stage, with particular focus on quality control, as this is by far the most time-consuming and complex issue in a GWA study.
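To give a sense of the scale of the multiple-testing issue mentioned above, the following is a minimal sketch, not the chapter's own method. It assumes a hypothetical panel of 500,000 markers and uses a simple Bonferroni correction, which treats the tests as independent and is therefore conservative when markers are correlated. The function names and the marker count are illustrative assumptions.

```python
def expected_false_positives(n_tests: int, per_test_alpha: float) -> float:
    """Expected number of false positives if every tested marker is truly null."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return n_tests * per_test_alpha


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    n_markers = 500_000  # hypothetical marker count, chosen only for illustration
    print("Expected false positives at p < 0.05:",
          expected_false_positives(n_markers, 0.05))
    print("Bonferroni per-marker threshold:",
          bonferroni_threshold(n_markers))
```

Under these assumptions, testing every marker at p < 0.05 would be expected to yield about 25,000 false positives, whereas the corrected per-marker threshold is about 1 in 10 million.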

Key words

Genetics, Epidemiology, Genome-wide, Statistics, Association



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Mark M. Iles (1)
  1. Section of Epidemiology and Biostatistics, Leeds Institute for Molecular Medicine, University of Leeds, Leeds, UK
