Detecting Multiethnic Rare Variants

Part of the Methods in Molecular Biology book series (MIMB, volume 1666)


Genome-wide association studies have identified many common genetic variants that are associated with specific diseases. These common variants, however, explain only a small portion of the heritability of a complex disease phenotype. This missing heritability has motivated researchers to test the hypothesis that rare variants influence common diseases, and next-generation sequencing technologies have made studies of rare variants practicable. Many statistical tests have been developed to exploit the cumulative effect of a set of rare variants on a phenotype. The best known of these, the sequence kernel association tests (SKATs), were developed for rare-variant analysis of homogeneous genomes. In this chapter, we illustrate applications of the SKATs and offer several caveats regarding their use. In particular, we address how to modify the SKATs to integrate local allele ancestries and to account for the cryptic relatedness and population structure of admixed genomes.
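To make the idea of a cumulative rare-variant test concrete, the following is a minimal sketch of a SKAT-type statistic for a quantitative trait in unrelated individuals from a homogeneous population. It is not the chapter's implementation. It assumes an intercept-only null model, a linear kernel with unnormalized Beta(1, 25) weights on the minor allele frequency, and a permutation p-value in place of the asymptotic mixture of chi-square distributions normally used for SKAT. The function names `skat_statistic` and `skat_permutation_pvalue` are hypothetical, and the sketch does not handle covariates, relatedness, or local ancestry.

```python
import numpy as np


def skat_statistic(y, G, a=1.0, b=25.0):
    """SKAT-type score statistic Q = sum_j w_j^2 * (G_j' r)^2.

    y : (n,) quantitative trait values
    G : (n, m) genotype matrix coded as minor-allele counts 0/1/2
    a, b : shape parameters of the Beta weight on minor allele frequency
    """
    y = np.asarray(y, dtype=float)
    G = np.asarray(G, dtype=float)
    # Residuals under the null model (assumption: intercept only, no covariates).
    resid = y - y.mean()
    # Sample minor allele frequency of each variant.
    maf = G.mean(axis=0) / 2.0
    # Unnormalized Beta(a, b) density; rarer variants receive larger weights.
    w = maf ** (a - 1.0) * (1.0 - maf) ** (b - 1.0)
    # Weighted per-variant score contributions, then their sum of squares.
    score = (G * w).T @ resid
    return float(score @ score)


def skat_permutation_pvalue(y, G, n_perm=999, seed=0):
    """Permutation p-value obtained by shuffling the trait across individuals."""
    rng = np.random.default_rng(seed)
    q_obs = skat_statistic(y, G)
    exceed = sum(
        skat_statistic(rng.permutation(y), G) >= q_obs for _ in range(n_perm)
    )
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated example data (illustrative only, not from the chapter).
    rng = np.random.default_rng(1)
    G = rng.binomial(2, 0.01, size=(500, 20))
    y = rng.normal(size=500)
    print(skat_statistic(y, G), skat_permutation_pvalue(y, G, n_perm=199))
```

Permuting the trait is valid here only because the null model has no covariates and the individuals are assumed to be exchangeable. Related or admixed samples violate that assumption, which is the setting the modified tests in this chapter are meant to address.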

Key words

Next-generation sequencing; Common disease–rare variants hypothesis; Linear mixed-effect models; Unrelated individuals; Sib pair designs; Family designs; Homogeneous population; Admixed population; Global ancestry; Local ancestry; Cryptic relatedness; Population structure



This work was funded in part by NIH grant HG003054 to X.Z. and by Tulane’s Committee on Research fellowship (600890) and Carol Lavin Bernick Faculty Grant (632119) to H.Q.



Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. Department of Global Biostatistics and Data Science, Tulane University School of Public Health and Tropical Medicine, New Orleans, USA
  2. Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, USA
