On the Inference of Ancestries in Admixed Populations

  • Sriram Sankararaman
  • Gad Kimmel
  • Eran Halperin
  • Michael I. Jordan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4955)

Abstract

Inferring ancestry in recently admixed populations, in which every individual has mixed ancestry (e.g., African Americans in the US), is a challenging problem. Several previous model-based approaches have used hidden Markov models (HMMs) to model the problem; however, the Markov chain Monte Carlo (MCMC) algorithms underlying these models converge slowly on realistic datasets. While retaining the HMM as a model, we show that combining an accurate, fast initialization with a local hill-climb in likelihood yields significantly improved estimates of ancestry. We studied this approach in two scenarios: the inference of locus-specific ancestries in a population assumed to originate from two unknown ancestral populations, and the inference of allele frequencies in one ancestral population given those in another.
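
The idea of starting from an initial estimate and then hill-climbing in HMM likelihood can be illustrated with a small sketch. This is not the authors' algorithm: the two-state haploid model, the function names (`emit`, `log_likelihood`, `hill_climb`), and the parameters `alpha`, `switch`, `step`, and `sweeps` are all assumptions made for illustration. The hidden state at each SNP is one of two ancestries, the likelihood is computed with the standard scaled forward recursion, and the climb nudges one allele frequency at a time, keeping a change only when the likelihood increases.

```python
import math


def emit(p, allele):
    # Probability of observing a 0/1 allele when allele 1 has frequency p.
    return p if allele == 1 else 1.0 - p


def log_likelihood(haplotypes, freqs, alpha=0.5, switch=0.1):
    # freqs[k][j]: assumed frequency of allele 1 at SNP j in ancestral population k.
    # alpha: assumed admixture proportion of population 0 (hypothetical parameter).
    # switch: assumed per-step probability that ancestry is redrawn (hypothetical).
    prior = [alpha, 1.0 - alpha]
    total = 0.0
    for hap in haplotypes:
        # Scaled forward recursion: normalise at every SNP and sum the log scales.
        fwd = [prior[k] * emit(freqs[k][0], hap[0]) for k in range(2)]
        scale = sum(fwd)
        total += math.log(scale)
        fwd = [f / scale for f in fwd]
        for j in range(1, len(hap)):
            new = []
            for k in range(2):
                # Stay in the same ancestry with probability 1 - switch;
                # otherwise redraw the ancestry from the admixture prior.
                trans_in = sum(
                    fwd[i] * ((1.0 - switch) * (i == k) + switch * prior[k])
                    for i in range(2)
                )
                new.append(trans_in * emit(freqs[k][j], hap[j]))
            scale = sum(new)
            total += math.log(scale)
            fwd = [f / scale for f in new]
    return total


def hill_climb(haplotypes, init_freqs, step=0.05, sweeps=20):
    # Coordinate-wise local search starting from init_freqs (left unmodified).
    freqs = [row[:] for row in init_freqs]
    best = log_likelihood(haplotypes, freqs)
    for _ in range(sweeps):
        improved = False
        for k in range(2):
            for j in range(len(freqs[k])):
                for delta in (step, -step):
                    old = freqs[k][j]
                    # Keep frequencies away from 0 and 1 so no log(0) occurs.
                    freqs[k][j] = min(0.99, max(0.01, old + delta))
                    ll = log_likelihood(haplotypes, freqs)
                    if ll > best + 1e-12:
                        best = ll
                        improved = True
                    else:
                        freqs[k][j] = old  # reject moves that do not improve
        if not improved:
            break  # local optimum for this step size
    return freqs, best
```

Because every accepted move strictly increases the likelihood, the result is only a local optimum, so the quality of the starting point matters. That is consistent with the abstract's emphasis on an accurate initialization.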


Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Sriram Sankararaman (1)
  • Gad Kimmel (1, 2)
  • Eran Halperin (2)
  • Michael I. Jordan (1, 3)

  1. Computer Science Division, University of California, Berkeley, USA
  2. International Computer Science Institute, Berkeley, USA
  3. Department of Statistics, University of California, Berkeley, USA
