In genetic association studies, population structure must be corrected for to avoid biased inference. Over the past decade, the prevailing corrections have typically adjusted only for differences in global ancestry between sampled individuals. However, population structure may vary across local genomic regions because local ancestries vary as a result of natural selection, migration, or random genetic drift. Adjusting for global ancestry alone may be inadequate when local population structure is an important confounding factor, whereas adjusting for local ancestry can more effectively prevent false positives due to local population structure. To locate disease genes more accurately, we recommend adjusting for local ancestries by interrogating local structure. In practice, locus-specific ancestries are usually unknown and must be inferred. For recently admixed populations with known reference ancestral populations, locus-specific ancestries can be inferred accurately using hidden Markov model-based methods. However, SNP-wise ancestries cannot be accurately inferred when ancestral population information is not available. For such scenarios, we propose employing local principal components (PCs) to represent local ancestries and adjusting for local PCs when testing for gene–phenotype association.
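The idea of adjusting for local PCs can be illustrated with a minimal sketch. It is not the authors' implementation: the window definition (a fixed number of neighbouring SNPs), the exclusion of the test SNP from the window, the number of PCs, the use of ordinary least squares on a quantitative trait, and the function names `local_pcs` and `adjusted_association` are all assumptions made for illustration.

```python
import numpy as np


def local_pcs(genotypes, center, half_window=50, n_pcs=2):
    """Top PCs of the SNPs in a window around column `center`.

    Assumption: the window is defined by SNP count and the test SNP
    itself is left out of the PC computation.
    """
    n_samples, n_snps = genotypes.shape
    lo = max(0, center - half_window)
    hi = min(n_snps, center + half_window + 1)
    cols = [j for j in range(lo, hi) if j != center]

    block = genotypes[:, cols].astype(float)
    block -= block.mean(axis=0)            # center each SNP
    sd = block.std(axis=0)
    keep = sd > 0                          # drop monomorphic SNPs
    block = block[:, keep] / sd[keep]      # scale to unit variance

    # PC scores are U * S from the SVD of the standardized block.
    u, s, _ = np.linalg.svd(block, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]


def adjusted_association(phenotype, genotypes, center, **kwargs):
    """Regress phenotype on the test SNP plus local PCs (OLS).

    Returns the SNP effect estimate and its t statistic.
    """
    pcs = local_pcs(genotypes, center, **kwargs)
    n = len(phenotype)
    design = np.column_stack([np.ones(n), genotypes[:, center], pcs])

    beta, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    resid = phenotype - design @ beta
    dof = n - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[1], beta[1] / np.sqrt(cov[1, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.integers(0, 3, size=(200, 120))          # simulated 0/1/2 genotypes
    y = 2.0 * g[:, 60] + rng.normal(scale=0.1, size=200)
    effect, t_stat = adjusted_association(y, g, center=60)
    print(effect, t_stat)
```

In a genome-wide scan, the same call would be repeated for each SNP with its own window, so that the covariates track the structure of the region being tested rather than a single genome-wide summary.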
Key words: Genome-wide association studies, Migration, Random genetic drift, Natural selection, Admixed populations, Global ancestry, Local ancestries, Local principal components, Hidden Markov algorithms, Fine mapping
This work was funded in part by NHGRI grant HG003054 to X.Z. and by Tulane’s Committee on Research fellowship (600890) and Carol Lavin Bernick Faculty Grant (632119) to H.Q.