Multiple-Ancestor Localization for Recently Admixed Individuals

  • Yaron Margalit
  • Yael Baran
  • Eran Halperin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9289)


Inference of ancestry from genetic data is a fundamental problem in computational genetics, with wide applications in human genetics and population genetics. The treatment of ancestry as a continuum instead of a categorical trait has been recently advocated in the literature. In particular, it was shown that a European individual’s geographic coordinates of origin can be determined up to a few hundred kilometers of error using spatial ancestry inference methods. Current methods for the inference of spatial ancestry focus on individuals for whom all ancestors originated from the same geographic location. In this work we develop a spatial ancestry inference method that aims at inferring the geographic coordinates of ancestral origins of recently admixed individuals, i.e., individuals whose recent ancestors originated from multiple locations. Our model is based on multivariate normal distributions integrated into a two-layered Hidden Markov Model, designed to capture both long-range correlations between SNPs due to the recent mixing and short-range correlations due to linkage disequilibrium. We evaluate the method on both simulated and real European data, and demonstrate that it achieves accurate results for up to three generations of admixture. Finally, we discuss the challenges of spatial inference for older admixtures and suggest directions for future work.
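The long-range correlation that recent admixture induces between SNPs can be illustrated with a toy, single-layer Hidden Markov Model. The sketch below is not the two-layered model described above: it ignores linkage disequilibrium, replaces geographic coordinates and multivariate normal distributions with a small set of discrete source populations, and assumes a constant per-SNP switch probability. All names (viterbi_ancestry, freqs, switch_prob) are hypothetical and chosen only for illustration.

```python
import math


def viterbi_ancestry(alleles, freqs, switch_prob):
    """Most likely ancestry label per SNP for one haplotype (toy model).

    alleles:     list of 0/1 alleles along the haplotype.
    freqs:       freqs[k][i] is the frequency of allele 1 at SNP i in
                 hypothetical source population k (strictly between 0 and 1).
    switch_prob: assumed per-SNP probability of changing ancestry,
                 strictly between 0 and 1; at least two sources are required.
    """
    n_states = len(freqs)
    n_snps = len(alleles)

    def log_emit(k, i):
        # Bernoulli emission: probability of the observed allele in source k.
        p = freqs[k][i]
        return math.log(p if alleles[i] == 1 else 1.0 - p)

    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob / (n_states - 1))

    # Uniform prior over sources at the first SNP.
    score = [math.log(1.0 / n_states) + log_emit(k, 0) for k in range(n_states)]
    back = []
    for i in range(1, n_snps):
        new_score, pointers = [], []
        for k in range(n_states):
            # Best predecessor state for ending in state k at SNP i.
            best_prev = max(
                range(n_states),
                key=lambda j: score[j] + (log_stay if j == k else log_switch),
            )
            trans = log_stay if best_prev == k else log_switch
            new_score.append(score[best_prev] + trans + log_emit(k, i))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back the highest-scoring path.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Two hypothetical sources with opposite allele frequencies at 8 SNPs.
example_freqs = [[0.9] * 8, [0.1] * 8]
example_path = viterbi_ancestry([1, 1, 1, 1, 0, 0, 0, 0], example_freqs, 0.05)
print(example_path)
```

Because switching is penalized, a single discordant allele does not by itself change the inferred ancestry, whereas a sustained run of discordant alleles does. This is the sense in which the chain captures long-range structure.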


Admixture · Ancestry inference · Spatial model · Hidden Markov Model · Multivariate-normal distribution



E.H. is a faculty fellow of the Edmond J. Safra Center at Tel Aviv University. Y.B. was supported in part by a fellowship from the Edmond J. Safra Center for Bioinformatics at Tel-Aviv University. E.H. and Y.B. were also supported in part by the United States-Israel Binational Science Foundation (grant 2012304), the National Science Foundation (grant III-1217615), and the Israel Science Foundation (grant 989/08). E.H., Y.B., and Y.M. were partially supported by the German-Israeli Foundation (grant 1094-33.2/2010). E.H. was also supported by the Israel Science Foundation (grant 1425/13).



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
  2. Department of Molecular Microbiology and Biotechnology, Tel-Aviv University, Tel Aviv, Israel
  3. International Computer Science Institute, Berkeley, USA
