The impact of disregarding family structure on genome-wide association analysis of complex diseases in cohorts with simple pedigrees
Generalized linear mixed models (GLMMs) are the standard framework for genome-wide association studies (GWAS) of complex diseases in family-based cohorts. Fitting GLMMs in very large cohorts, however, can be computationally demanding, and modified versions of GLMMs that use faster algorithms may underperform, for instance when a single nucleotide polymorphism (SNP) is correlated with fixed-effects covariates. We investigated the extent to which disregarding family structure may compromise GWAS in cohorts with simple pedigrees by contrasting logistic regression models (i.e., with no family structure) with three models based on linear mixed models (LMMs). Our analyses showed that the logistic regression models generally produced smaller P values than the LMM-based models; however, the differences in P values were mostly minor. Disregarding family structure had little impact on identifying disease-associated SNPs at the genome-wide level of significance (i.e., P < 5E-08), as the four P values obtained from the tested methods for any given SNP were either all below or all above 5E-08. Nevertheless, larger discrepancies between logistic regression and LMM-based models were detected at the suggestive level of significance (i.e., 5E-08 ≤ P < 5E-06). The SNP effects estimated by the logistic regression models were not statistically different from those estimated by GLMMs that implemented Wald’s test. However, several SNP effects were significantly different from their counterparts in the LMM analyses. We suggest that fitting GLMMs with Wald’s test on a pre-selected subset of SNPs obtained from logistic regression models can balance the speed of analyses against the accuracy of parameter estimates.
Keywords: Complex diseases; Family-based GWAS; Logistic regression; GLMMs framework
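The screening step suggested in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`logistic_wald`, `tier`, `concordant`, `preselect`) are hypothetical, the model has no covariates, and using the suggestive bound as the pre-selection cut-off is an assumption, since the abstract does not state which cut-off was used. The sketch fits a single-SNP logistic regression that ignores relatedness, derives a two-sided Wald P value, and sorts P values into the significance bands quoted above.

```python
import math

GENOME_WIDE = 5e-08  # genome-wide significance threshold quoted in the abstract
SUGGESTIVE = 5e-06   # upper bound of the suggestive band quoted in the abstract


def logistic_wald(genotypes, status, n_iter=25):
    """Fit logit(P(case)) = b0 + b1 * genotype by Newton-Raphson.

    Returns (b1, standard error of b1, two-sided Wald P value).
    Relatedness is ignored, as in plain logistic regression. Assumes the
    genotype varies across subjects and does not perfectly separate cases
    from controls.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(genotypes, status):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p          # score for the intercept
            g1 += (y - p) * x    # score for the SNP effect
            h00 += w             # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += I^{-1} * score (2x2 inverse written out)
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Variance of b1 from the information matrix of the last iteration
    se = math.sqrt(h00 / det)
    z = b1 / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return b1, se, p_value


def tier(p):
    """Classify a P value into the bands used in the abstract."""
    if p < GENOME_WIDE:
        return "genome-wide"
    if p < SUGGESTIVE:
        return "suggestive"
    return "none"


def concordant(p_values, threshold=GENOME_WIDE):
    """True when every method's P value falls on the same side of threshold."""
    below = [p < threshold for p in p_values]
    return all(below) or not any(below)


def preselect(p_by_snp, threshold=SUGGESTIVE):
    """SNP identifiers to pass on to a GLMM refit (assumed cut-off)."""
    return [snp for snp, p in p_by_snp.items() if p < threshold]
```

In such a workflow, `preselect` would be applied to the logistic regression P values and only the returned SNPs would be refitted with a GLMM, while `concordant` could be used to check whether the methods agree on a given SNP at a chosen threshold.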
Funding support for the Late Onset Alzheimer’s Disease Family Study (LOADFS) was provided through the Division of Neuroscience, NIA. The LOADFS includes a genome-wide association study funded as part of the Division of Neuroscience, NIA. Assistance with phenotype harmonization and genotype cleaning, as well as with general study coordination, was provided by the Genetic Consortium for Late Onset Alzheimer’s Disease.
The Framingham Heart Study (FHS) is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University (Contract No. N01-HC-25195 and HHSN268201500001I). Funding for SHARe Affymetrix genotyping was provided by NHLBI Contract N02-HL-64278. SHARe Illumina genotyping was provided under an agreement between Illumina and Boston University. Funding for CARe genotyping was provided by NHLBI Contract N01-HC-65226. Funding support for the Framingham Dementia dataset was provided by NIH/NIA grant R01 AG08122. Funding support for the Framingham Inflammatory Markers was provided by NIH grants R01 HL064753, R01 HL076784, and R01 AG028321. Funding support for the Framingham C-reactive protein dataset was provided by NIH grants R01 HL064753, R01 HL076784, and R01 AG028321. Funding support for the Framingham Adiponectin dataset was provided by NIH/NHLBI grant R01-DK-080739. Funding support for the Framingham Interleukin-6 dataset was provided by NIH grants R01 HL064753, R01 HL076784, and R01 AG028321.
The authors’ responsibilities were as follows: A.N. and A.M.K. designed the study, K.G.A. and A.N. prepared and analyzed data, A.N. and A.M.K. wrote the manuscript, and all authors read and approved the final manuscript.
This research was supported by grants from the National Institute on Aging (P01AG043352 and R01AG047310). The funders had no role in study design, data collection and analysis, decision to publish, or manuscript preparation.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval and consent to participate
This study is a secondary analysis of data obtained from dbGaP upon approval by the local Institutional Review Board (IRB) and does not involve gathering data from human subjects directly. All procedures performed were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
This manuscript was not prepared in collaboration with LOADFS investigators and does not necessarily reflect the opinions or views of LOADFS. This manuscript was not prepared in collaboration with investigators of the FHS and does not necessarily reflect the opinions or views of the FHS, Boston University, or NHLBI. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.