Behavior Genetics, Volume 48, Issue 1, pp 55–66

Adaptive SNP-Set Association Testing in Generalized Linear Mixed Models with Application to Family Studies

  • Jun Young Park
  • Chong Wu
  • Saonli Basu
  • Matt McGue
  • Wei Pan
Original Research


In genome-wide association studies (GWAS), it is increasingly recognized that analyzing a group of functionally related SNPs together can usefully complement standard single-SNP analyses. Among the existing population-based SNP-set association tests, two adaptive tests, the aSPU test and the aSPUpath test, offer a powerful and general approach at the gene and pathway levels: they data-adaptively combine results across multiple SNPs (and genes) so that high statistical power is maintained across a wide range of scenarios. We extend the aSPU and aSPUpath tests to familial data under the framework of generalized linear mixed models (GLMMs), which can account for both subject relatedness and possible population structure. As in population-based GWAS, the proposed aSPU and aSPUpath tests require fitting only a single, common GLMM (under the null hypothesis) for all SNPs, and are thus computationally efficient and feasible for large GWAS data. We illustrate our approach by identifying genes and pathways associated with alcohol dependence in the Minnesota Twin Family Study. The aSPU test detected a gene associated with the trait, whereas the standard single-SNP analysis detected none. The aSPU test also controlled Type I error satisfactorily in a small simulation study. We provide R code to conduct the aSPU and aSPUpath tests for familial and other correlated data.
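To make the adaptive combination concrete, the following is a minimal sketch of the population-based aSPU idea for unrelated subjects. It is not the authors' GLMM-based implementation or their R code. It assumes a quantitative trait, an intercept-only null model, and permutation of residuals to obtain null score vectors; it does not model subject relatedness. The function names `score_vector`, `spu`, and `aspu` are hypothetical and chosen only for illustration.

```python
import random


def score_vector(resid, geno):
    """Score for each SNP j: U_j = sum_i resid_i * geno[i][j]."""
    n_snps = len(geno[0])
    return [sum(r * row[j] for r, row in zip(resid, geno)) for j in range(n_snps)]


def spu(u, gamma):
    """SPU(gamma) = sum_j U_j**gamma; gamma=None denotes SPU(inf) = max_j |U_j|."""
    if gamma is None:
        return max(abs(x) for x in u)
    return sum(x ** gamma for x in u)


def aspu(y, geno, gammas=(1, 2, 3, 4, None), n_perm=200, seed=0):
    """Return (per-gamma SPU p-values, aSPU p-value) by permutation.

    Assumption: subjects are unrelated and the null model has only an
    intercept, so residuals are y minus its mean and are exchangeable.
    """
    rng = random.Random(seed)
    mean_y = sum(y) / len(y)
    resid = [v - mean_y for v in y]

    # Observed |SPU(gamma)| for every gamma, from a single null fit.
    u_obs = score_vector(resid, geno)
    obs = [abs(spu(u_obs, g)) for g in gammas]

    # Null statistics: permute residuals, reuse the same genotypes.
    null = []
    for _ in range(n_perm):
        perm = resid[:]
        rng.shuffle(perm)
        u = score_vector(perm, geno)
        null.append([abs(spu(u, g)) for g in gammas])

    n_gamma = len(gammas)
    p_spu = [
        (1 + sum(t[k] >= obs[k] for t in null)) / (n_perm + 1)
        for k in range(n_gamma)
    ]
    min_p_obs = min(p_spu)

    # Minimum p-value of each permutation, judged against the other permutations.
    min_p_null = []
    for b in range(n_perm):
        ps = [
            (1 + sum(null[c][k] >= null[b][k] for c in range(n_perm) if c != b)) / n_perm
            for k in range(n_gamma)
        ]
        min_p_null.append(min(ps))

    # aSPU p-value: how often a null minimum p-value is as small as the observed one.
    p_aspu = (1 + sum(m <= min_p_obs for m in min_p_null)) / (n_perm + 1)
    labels = ["inf" if g is None else g for g in gammas]
    return dict(zip(labels, p_spu)), p_aspu


# Illustrative usage on simulated data (40 subjects, 5 SNPs, no true association).
demo_rng = random.Random(1)
demo_geno = [[demo_rng.choice([0, 1, 2]) for _ in range(5)] for _ in range(40)]
demo_y = [demo_rng.gauss(0.0, 1.0) for _ in range(40)]
demo_p_spu, demo_p_aspu = aspu(demo_y, demo_geno)
```

The same set of permuted score vectors serves both layers of the procedure (the per-gamma p-values and the p-value of their minimum), so only one null model fit is needed for all SNPs and all values of gamma. In this sketch the null fit is just the sample mean; for family data the residuals are correlated and cannot simply be permuted, which is the gap the GLMM framework described above addresses.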


Keywords: Alcohol dependence · aSPU · GEE · GLMM · GWAS · Score test



The authors are grateful to two reviewers and an editor for many helpful comments.


This research was supported by National Institutes of Health Grants R37DA05147, R01AA09357, R01AA11886, R01DA13240, U01DA024417, R01MH066140, R01GM113250, R01HL105397, R01HL116720, and by the Minnesota Supercomputing Institute.

Compliance with ethical standards

Conflicts of interest

Jun Young Park, Chong Wu, Saonli Basu, Matt McGue, and Wei Pan declare that they have no conflict of interest.

Human and Animal Rights and Informed Consent

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2017

Authors and Affiliations

  • Jun Young Park (1)
  • Chong Wu (1)
  • Saonli Basu (1)
  • Matt McGue (2)
  • Wei Pan (1), corresponding author

  1. Division of Biostatistics, University of Minnesota, Minneapolis, USA
  2. Department of Psychology, University of Minnesota, Minneapolis, USA
