Research in Computational Molecular Biology

Volume 4955 of the series Lecture Notes in Computer Science pp 424-433

On the Inference of Ancestries in Admixed Populations

  • Sriram Sankararaman (Computer Science Division, University of California)
  • Gad Kimmel (Computer Science Division, University of California; International Computer Science Institute)
  • Eran Halperin (International Computer Science Institute)
  • Michael I. Jordan (Computer Science Division, University of California; Department of Statistics, University of California)


Inference of ancestral information in recently admixed populations, in which every individual has mixed ancestry (e.g., African Americans in the US), is a challenging problem. Several previous model-based approaches have used hidden Markov models (HMMs) to model the problem; however, the Markov chain Monte Carlo (MCMC) algorithms underlying these models converge slowly on realistic datasets. While retaining the HMM as a model, we show that combining an accurate, fast initialization with a local hill-climb in likelihood yields significantly improved estimates of ancestry. We studied this approach in two scenarios: the inference of locus-specific ancestries in a population that is assumed to originate from two unknown ancestral populations, and the inference of allele frequencies in one ancestral population given those in another.
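
To make the HMM formulation concrete, the following is a minimal sketch of how locus-specific ancestry posteriors could be computed with the standard forward-backward algorithm. It is not the method of the paper: it does not show the initialization or the hill-climbing step, and it assumes a single haploid sequence, two ancestral populations with known allele frequencies, and one constant switch probability between adjacent SNPs. The function name `ancestry_posteriors` and all of its parameters are hypothetical and chosen for illustration only.

```python
def ancestry_posteriors(alleles, freqs, switch_prob, prior=(0.5, 0.5)):
    """Posterior probability of each ancestry at each SNP (illustrative sketch).

    alleles:     list of observed alleles (0 or 1), one per SNP.
    freqs:       list of (p0, p1) pairs; p_k is the assumed frequency of
                 allele 1 in ancestral population k, strictly between 0 and 1.
    switch_prob: assumed probability that ancestry is redrawn from `prior`
                 between two adjacent SNPs.
    prior:       assumed genome-wide ancestry proportions.
    """
    n, K = len(alleles), 2

    def emit(i, k):
        # Probability of the observed allele at SNP i under ancestry k.
        p = freqs[i][k]
        return p if alleles[i] == 1 else 1.0 - p

    def trans(j, k):
        # Stay in the same ancestry, or switch and redraw from the prior.
        stay = (1.0 - switch_prob) if j == k else 0.0
        return stay + switch_prob * prior[k]

    def normalize(row):
        total = sum(row)
        return [x / total for x in row]

    # Forward pass, rescaled at every SNP to avoid numerical underflow.
    fwd = [normalize([prior[k] * emit(0, k) for k in range(K)])]
    for i in range(1, n):
        prev = fwd[-1]
        fwd.append(normalize([
            emit(i, k) * sum(prev[j] * trans(j, k) for j in range(K))
            for k in range(K)
        ]))

    # Backward pass, rescaled in the same way.
    bwd = [[1.0] * K for _ in range(n)]
    for i in range(n - 2, -1, -1):
        bwd[i] = normalize([
            sum(trans(j, k) * emit(i + 1, k) * bwd[i + 1][k] for k in range(K))
            for j in range(K)
        ])

    # Combine both passes into per-SNP posteriors.
    return [normalize([fwd[i][k] * bwd[i][k] for k in range(K)])
            for i in range(n)]


# Toy input: allele 1 is common in population 0 and rare in population 1.
toy_alleles = [1, 1, 1, 0, 0, 0]
toy_freqs = [(0.9, 0.1)] * 6
toy_post = ancestry_posteriors(toy_alleles, toy_freqs, switch_prob=0.05)
```

On this toy input the posterior favours population 0 over the first SNPs and population 1 over the last ones. In the setting the abstract describes, the ancestral allele frequencies are themselves unknown, so a computation of this kind would only be one building block inside a larger estimation procedure.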