Calibrating Population Stratification in Association Analysis

Part of the Methods in Molecular Biology book series (MIMB, volume 1666)


In genetic association studies, population structure must be corrected for to avoid biased inference. Over the past decade, the prevailing corrections have mostly adjusted only for differences in global ancestry between sampled individuals. However, population structure can vary across local genomic regions because local ancestries vary as a result of natural selection, migration, or random genetic drift. Adjusting for global ancestry alone may therefore be inadequate when local population structure is an important confounder, whereas adjusting for local ancestry more effectively prevents false positives arising from local population structure. To locate disease genes more accurately, we recommend interrogating local structure and adjusting for local ancestries. In practice, locus-specific ancestries are usually unknown and must be inferred. For recently admixed populations with known reference ancestral populations, locus-specific ancestries can be inferred accurately with hidden Markov model-based methods. However, SNP-wise ancestries cannot be inferred accurately when ancestral population information is unavailable. For such scenarios, we propose using local principal components (PCs) to represent local ancestries and adjusting for local PCs when testing for gene–phenotype association.
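
The idea of adjusting for local PCs can be illustrated with a minimal sketch. It is not the chapter's implementation: the window size, the number of PCs, the function names (local_pcs, snp_t_statistic), and the toy simulation are all assumptions made for illustration, and the simulation ignores linkage disequilibrium. Local PCs are computed from the SNPs flanking a test SNP and then entered as covariates in a linear regression of the phenotype on the test SNP.

```python
import numpy as np


def local_pcs(genotypes, center, half_window=100, n_pcs=2):
    """Top principal components of the SNPs flanking column `center`.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts.
    The test SNP itself is left out of the window (an assumption).
    """
    lo = max(0, center - half_window)
    hi = min(genotypes.shape[1], center + half_window + 1)
    cols = [j for j in range(lo, hi) if j != center]
    window = genotypes[:, cols].astype(float)
    window -= window.mean(axis=0)
    sd = window.std(axis=0)
    window = window[:, sd > 0] / sd[sd > 0]  # drop monomorphic SNPs, scale the rest
    u, s, _ = np.linalg.svd(window, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]


def snp_t_statistic(phenotype, snp, covariates=None):
    """t statistic for the SNP effect in an ordinary least-squares fit."""
    n = len(phenotype)
    columns = [np.ones(n), snp]
    if covariates is not None:
        columns.append(covariates)
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    resid = phenotype - design @ beta
    sigma2 = resid @ resid / (n - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[1] / np.sqrt(cov[1, 1])


# Toy data (hypothetical): the phenotype depends on local ancestry only,
# and the test SNP is null but strongly differentiated between ancestries.
rng = np.random.default_rng(0)
n, m, test_snp = 1000, 201, 100
ancestry = rng.binomial(2, 0.5, size=n)           # copies of ancestry B at this locus
p0 = rng.uniform(0.1, 0.9, size=m)                # allele frequencies, ancestry A
p1 = rng.uniform(0.1, 0.9, size=m)                # allele frequencies, ancestry B
p0[test_snp], p1[test_snp] = 0.1, 0.9
freq = p0 + (p1 - p0) * ancestry[:, None] / 2.0   # per-individual allele frequency
genotypes = rng.binomial(2, freq)
phenotype = 1.0 * ancestry + rng.normal(size=n)

pcs = local_pcs(genotypes, test_snp)
t_unadjusted = snp_t_statistic(phenotype, genotypes[:, test_snp])
t_adjusted = snp_t_statistic(phenotype, genotypes[:, test_snp], pcs)
print(f"unadjusted t = {t_unadjusted:.2f}, local-PC-adjusted t = {t_adjusted:.2f}")
```

In a setting like this one, the unadjusted statistic would be expected to be strongly inflated by the ancestry effect, while the local-PC-adjusted statistic should be much closer to zero. Some inflation can remain because the PCs only approximate the true local ancestry.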

Key words

Genome-wide association studies; Migration; Random genetic drift; Natural selection; Admixed populations; Global ancestry; Local ancestries; Local principal components; Hidden Markov algorithms; Fine mapping



This work was funded in part by NHGRI grant HG003054 to X.Z. and by Tulane’s Committee on Research fellowship (600890) and Carol Lavin Bernick Faculty Grant (632119) to H.Q.



Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. Department of Global Biostatistics and Data Science, Tulane University School of Public Health and Tropical Medicine, New Orleans, USA
  2. Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, USA
