Recombination networks as genetic markers in a human variation study of the Old World

  • Original Investigation
  • Published in Human Genetics

Abstract

We have analyzed human genetic diversity in 33 Old World populations, including 23 populations obtained through Genographic Project studies. A set of 1,536 SNPs in five X chromosome regions was genotyped in 1,288 individuals (mostly males). We use a novel analysis that constructs a subARG (a subgraph of the ancestral recombination graph) network from recombining chromosomal segments. A subARG is constructed independently for each of five gene-free regions across the X chromosome, and the results are aggregated across the regions. For PCA, MDS, and ancestry inference with STRUCTURE, the subARG is processed to obtain feature vectors of samples and pairwise distances between samples. The observed population structure, estimated from the five short X chromosomal segments, supports genome-wide frequency-based analyses: African populations show higher genetic diversity, and a broad trend of shared variation runs from Africa through the Middle East, Europe, Central Asia, Southeast Asia, and East Asia. The recombinational analysis was also compared with established methods based on SNPs and haplotypes. For haplotypes, we also employed a fixed-length approach based on information-content optimization. Our recombinational analysis suggests a southern migration route out of Africa, and it also supports a single, rapid human expansion from Africa to East Asia through South Asia.
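
The step from per-region results to feature vectors, pairwise distances, and an MDS embedding can be illustrated with a minimal sketch. It is not the authors' pipeline: the binary feature encoding (one column per hypothetical inferred recombination event), the Hamming distance, the toy random data, and all function names are assumptions made for illustration only.

```python
import numpy as np

def aggregate_features(region_features):
    """Concatenate per-region sample-by-feature matrices column-wise."""
    return np.hstack(region_features)

def pairwise_hamming(features):
    """Count the features that differ for every pair of samples."""
    diff = features[:, None, :] != features[None, :, :]
    return diff.sum(axis=2).astype(float)

def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS from a symmetric distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip small negatives to zero.
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Toy data (assumption): 6 samples, 5 regions, 4 binary features per region.
rng = np.random.default_rng(0)
regions = [rng.integers(0, 2, size=(6, 4)) for _ in range(5)]

features = aggregate_features(regions)   # one feature vector per sample
dist = pairwise_hamming(features)        # sample-by-sample distances
coords = classical_mds(dist)             # 2-D coordinates for plotting
```

Each region contributes its own block of columns, so aggregation here is simple concatenation; a real analysis could weight regions or use a different distance.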




Acknowledgments

This research is part of the Genographic Project, funded by National Geographic and IBM. Additional funding was provided by the Spanish Ministry of Science and Innovation projects BFU2007-63657, BFU2007-63171, and BFU2010-19443; MM was supported by grant AP2006-03268, Generalitat de Catalunya; OB was supported by the Russian Foundation for Basic Research (grants 10-06-00451, 10-04-01603) and by the Presidium RAS Programme “Molecular and Cell Biology”. We are grateful to Mònica Vallés, UPF, for excellent technical support. Genotyping and bioinformatic services were provided by CEGEN (Centro Nacional de Genotipado) and INB (National Bioinformatics Institute), Spain, respectively. HapMap phase III population samples were obtained from the Coriell Cell Repository.

Author information

Corresponding authors

Correspondence to Francesc Calafell or Laxmi Parida.

Additional information

A. Javed and M. Melé are joint first authors.

Members of the Genographic Consortium are provided in the “Appendix”.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplemental material 1 (DOC 581 kb)

Appendix

The Genographic Consortium includes: Syama Adhikarla (Madurai Kamaraj University, Madurai, Tamil Nadu, India), Christina J. Adler (University of Adelaide, South Australia, Australia), Danielle A. Badro (Lebanese American University, Chouran, Beirut, Lebanon), Andrew C. Clarke (University of Otago, Dunedin, New Zealand), Alan Cooper (University of Adelaide, South Australia, Australia), Clio S. I. Der Sarkissian (University of Adelaide, South Australia, Australia), Matthew C. Dulik (University of Pennsylvania, Philadelphia, Pennsylvania, USA), Christoff J. Erasmus (National Health Laboratory Service, Johannesburg, South Africa), Jill B. Gaieski (University of Pennsylvania, Philadelphia, Pennsylvania, USA), Wolfgang Haak (University of Adelaide, South Australia, Australia), Angela Hobbs (National Health Laboratory Service, Johannesburg, South Africa), Matthew E. Kaplan (University of Arizona, Tucson, Arizona, USA), Shilin Li (Fudan University, Shanghai, China), Begoña Martínez-Cruz (Universitat Pompeu Fabra, Barcelona, Spain), Elizabeth A. Matisoo-Smith (University of Otago, Dunedin, New Zealand), Nirav C. Merchant (University of Arizona, Tucson, Arizona, USA), R. John Mitchell (La Trobe University, Melbourne, Victoria, Australia), Amanda C. Owings (University of Pennsylvania, Philadelphia, Pennsylvania, USA), Daniel E. Platt (IBM, Yorktown Heights, NY, USA), Lluis Quintana-Murci (Institut Pasteur, Paris, France), Colin Renfrew (University of Cambridge, Cambridge, UK), Daniela R. Lacerda (Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil), Ajay K. Royyuru (IBM, Yorktown Heights, NY, USA), Fabrício R. Santos (Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil), Theodore G. Schurr (University of Pennsylvania, Philadelphia, Pennsylvania, USA), Himla Soodyall (National Health Laboratory Service, Johannesburg, South Africa), David F. Soria Hernanz (National Geographic Society, Washington, DC, USA), Pandikumar Swamikrishnan (IBM, Somers, NY, USA), Chris Tyler-Smith (The Wellcome Trust Sanger Institute, Hinxton, UK), Kavitha Valampuri John (Madurai Kamaraj University, Madurai, Tamil Nadu, India), Arun Varatharajan Santhakumari (Madurai Kamaraj University, Madurai, Tamil Nadu, India), Pedro Paulo Vieira (Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil), R. Spencer Wells (National Geographic Society, Washington, DC, USA), Janet S. Ziegle (Applied Biosystems, Foster City, CA, USA).

About this article

Cite this article

Javed, A., Melé, M., Pybus, M. et al. Recombination networks as genetic markers in a human variation study of the Old World. Hum Genet 131, 601–613 (2012). https://doi.org/10.1007/s00439-011-1104-8

  • DOI: https://doi.org/10.1007/s00439-011-1104-8
