Bioinformatics, pp 175-190

Adjusting for Familial Relatedness in the Analysis of GWAS Data

  • Russell Thomson
  • Rebekah McWhirter
Part of the Methods in Molecular Biology book series (MIMB, volume 1526)


Relatedness within a sample can be of ancient (population stratification) or recent (familial structure) origin, and can either be known (pedigree data) or unknown (cryptic relatedness). All of these forms of familial relatedness have the potential to confound the results of genome-wide association studies. This chapter reviews the major methods available to researchers to adjust for the biases introduced by relatedness and maximize power to detect associations. The advantages and disadvantages of different methods are presented with reference to elements of study design, population characteristics, and computational requirements.

Key words

Genome-wide association studies, GWAS, Relatedness, Confounding, Population stratification, Cryptic relatedness, Familial structure




Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Centre for Research in Mathematics, School of Computing, Engineering and Mathematics, Western Sydney University, Parramatta, Australia
  2. Menzies Institute for Medical Research, University of Tasmania, Hobart, Australia
