The Fundamentals of Modern Statistical Genetics pp 175-189
Genome Wide Association Studies
Abstract
Linkage disequilibrium (LD), the key requirement for genetic association, is a short-distance property: it extends only over a limited physical distance across the human genome. As we showed in Chapter 7, if LD between the genotyped marker and the disease susceptibility locus (DSL) is low, there will be low power to detect association between the disease and the DSL using that marker. In the early years of association testing, the strategy was mainly used to test specific regions, e.g., genes selected on the basis of their function relative to the biology of the disease, or on the basis of linkage analysis. By restricting testing to a small enough region, markers can be selected that should be in LD with a DSL anywhere in the region. In particular, SNPs in the coding region of a gene are often chosen as markers. With Genome Wide Association Studies (GWAS), the idea is instead to cover the entire genome with a set of SNPs dense enough that all untyped polymorphisms (including DSLs) are in reasonably high LD with a tested SNP. For this reason, GWAS are sometimes called ‘unbiased’, because every region of the genome is searched, not just those meeting predetermined selection criteria.
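The link between LD and power can be made concrete with a small sketch. It is not taken from the chapter: the function names are hypothetical, and the 1/r² sample-size rule is the standard approximation for a single marker tagging a single causal variant, assumed here for illustration only.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation r^2 between a marker and a DSL (both biallelic).

    p_ab : frequency of the haplotype carrying allele A (marker) and B (DSL)
    p_a  : frequency of allele A at the marker
    p_b  : frequency of allele B at the DSL
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def equivalent_sample_size(n_at_dsl: float, r2: float) -> float:
    """Approximate sample size needed at the marker to match the power
    of n_at_dsl subjects genotyped directly at the DSL (assumed n / r^2 rule)."""
    if r2 <= 0:
        raise ValueError("r2 must be positive; with no LD the marker is uninformative")
    return n_at_dsl / r2


# Illustrative (made-up) frequencies: allele A at 0.3, allele B at 0.2,
# and a joint haplotype frequency of 0.15.
r2 = ld_r2(p_ab=0.15, p_a=0.3, p_b=0.2)
needed = equivalent_sample_size(1000, r2)
```

Under this approximation, halving r² doubles the sample size required at the marker, which is why a GWAS panel aims for high LD between every untyped polymorphism and at least one tested SNP.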
Keywords
Genome Wide Association Study; Association Test; Parental Genotype; Genotyping Error; Linkage Disequilibrium Pattern