Genetics and Genomics of Human Population Structure

  • Sohini Ramachandran
  • Hua Tang
  • Ryan N. Gutenkunst
  • Carlos D. Bustamante

Abstract

Recent developments in sequencing technology have created a flood of new data on human genetic variation, and these data have yielded new insights into human population structure. Here we review what both early and more recent studies have taught us about human population structure and history. Early studies showed that most human genetic variation occurs within populations rather than between them, and that genetically related populations often cluster geographically. Recent studies based on much larger data sets have recapitulated these observations, but have also demonstrated that high-density genotyping allows individuals to be reliably assigned to their population of origin. In fact, for admixed individuals, even the ancestry of particular genomic regions can often be reliably inferred. Recent studies have also offered detailed information about the composition of specific populations from around the world, revealing how history has shaped their genetic makeup. We also briefly review quantitative models of human genetic history, including the role natural selection has played in shaping human genetic variation.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Sohini Ramachandran (1)
  • Hua Tang (2)
  • Ryan N. Gutenkunst (3)
  • Carlos D. Bustamante (4)

  1. Society of Fellows, Harvard University, Cambridge, USA
  2. Department of Genetics, Stanford Medical School, Stanford, USA
  3. Theoretical Biology and Biophysics, and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, USA
  4. Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, USA
