Volume 71, Issue 10, pp 589–604

Single haplotype admixture models using large scale HLA genotype frequencies to reproduce human admixture

  • Alexandra Litinsky Simanovsky
  • Abeer Madbouly
  • Michael Halagan
  • Martin Maiers
  • Yoram Louzoun (corresponding author)
Original Article


The human leukocyte antigen (HLA) region is the most polymorphic region in the human genome. Anthropologists use HLA to trace the migration and evolution of populations. However, recent admixture between populations can mask the ancestral haplotype frequency distribution. We present a statistical method that resolves population admixture from high-resolution HLA haplotype frequencies using a non-negative matrix factorization (NMF) formalism, and we validate it on haplotype frequencies from 56 world populations. The result is a minimal set of source components (SCs) that explains roughly 90% of the total variance in the studied admixtures. These SCs agree with the geographical distribution, phylogenies, and recent admixture events of the studied groups. With the growing population of multi-ethnic individuals, and of individuals who do not report race or ethnicity, HLA matching for stem-cell and solid organ transplants is becoming more challenging. The presented algorithm provides a framework for breaking down highly admixed populations into SCs, which can be used to better match the rapidly growing population of multi-ethnic individuals worldwide.
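To make the factorization idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a matrix V whose rows are populations and whose columns are haplotype frequencies, and factors it as V ≈ W·H with non-negative entries using the standard Lee-Seung multiplicative updates for the squared-error objective. The toy source components, the mixing proportions, the function name `nmf_multiplicative`, and the particular definition of explained variance are all illustrative assumptions.

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=2000, seed=0, eps=1e-12):
    """Factor V (populations x haplotypes) as W @ H with W, H >= 0.

    Uses Lee-Seung multiplicative updates for the Frobenius objective.
    W holds per-population weights, H holds the k source components.
    """
    rng = np.random.default_rng(seed)
    n_pop, n_hap = V.shape
    W = rng.random((n_pop, k)) + eps
    H = rng.random((k, n_hap)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: two source components over six haplotypes.
sources = np.array([
    [0.5, 0.3, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.2, 0.3, 0.4],
])
# Hypothetical mixing proportions for five populations (rows sum to 1).
mixing = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [0.8, 0.2],
    [0.3, 0.7],
])
V = mixing @ sources  # observed haplotype frequencies per population

W, H = nmf_multiplicative(V, k=2)

# Rescale so each source component is a frequency distribution (sums to 1);
# the product W_mix @ H_freq is unchanged by this rescaling.
scale = H.sum(axis=1)
H_freq = H / scale[:, None]
W_mix = W * scale[None, :]

# One possible definition of explained variance (an assumption here):
# 1 - residual sum of squares / total sum of squares around the global mean.
residual = V - W_mix @ H_freq
explained = 1.0 - (residual ** 2).sum() / ((V - V.mean()) ** 2).sum()
print(f"explained variance: {explained:.3f}")
print(np.round(W_mix, 2))
```

In this sketch the rows of `W_mix` play the role of admixture proportions and the rows of `H_freq` play the role of source components. NMF solutions are generally not unique, so the recovered components may come back in a different order than the ones used to build the toy data.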


Keywords: HLA, Genetic admixture, Non-negative matrix factorization, Stem-cell donor registry, Unsupervised learning



We would like to thank the following registries for allowing their data to be used for this study: National Marrow Donor Program/Be The Match, USA; Ezer Mizion Bone Marrow Donor Registry, Israel; OneMatch Stem Cell and Marrow Network, Canada; Australian Bone Marrow Donor Registry, Australia; Matchis: the Dutch Centre for Stem Cell Donors, The Netherlands; Norwegian Bone Marrow Donor Registry, Norway; New Zealand Bone Marrow Donor Registry, New Zealand; Tobias Registry of Swedish Bone Marrow Donors, Sweden; Thai National Stem Cell Donor Registry, Thailand; Welsh Bone Marrow Donor Registry, Wales, UK.

Supplementary material

ESM 1: 251_2019_1144_MOESM1_ESM.docx (DOCX, 5238 kB)



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Mathematics and Gonda Brain Research Institute, Bar-Ilan University, Ramat-Gan, Israel
  2. Bioinformatics Research, Center for International Blood and Marrow Transplant Research, Minneapolis, USA
