Rapidly Registering Identity-by-Descent Across Ancestral Recombination Graphs

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9029)

Abstract

The genomes of remotely related individuals occasionally contain long segments that are Identical By Descent (IBD). Sharing of IBD segments has many applications in population and medical genetics, and it is thus desirable to study their properties in simulations. However, no current method provides a direct, efficient means to extract IBD segments from simulated genealogies. Here, we introduce computationally efficient approaches to extract ground-truth IBD segments from a sequence of genealogies, or equivalently, an ancestral recombination graph. Specifically, we use a two-step scheme, where we first identify putative shared segments by comparing the common ancestors of all pairs of individuals at some distance apart. This reduces the search space considerably, and we then proceed by determining the true IBD status of the candidate segments. Under some assumptions and when allowing a limited resolution of segment lengths, our run-time complexity is reduced from \(O(n^3\log n)\) for the naïve algorithm to \(O(n\log n)\), where \(n\) is the number of individuals in the sample.
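
The two-step scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reach the stated \(O(n\log n)\) bound; it only shows the filter-then-verify idea. Several things here are assumptions made for illustration: each genealogy is a child-to-parent map with node identifiers shared across genealogies, an IBD segment for a pair is a maximal interval over which the pair's most recent common ancestor is unchanged, and checkpoints are spaced at half the minimum segment length, so that any segment of at least that length covers two consecutive checkpoints. The function names (`mrca`, `naive_ibd`, `two_step_ibd`) and the toy data are hypothetical.

```python
import math
from bisect import bisect_right
from itertools import combinations


def mrca(parent, a, b):
    """Most recent common ancestor of a and b in a child -> parent map."""
    ancestors = set()
    while a is not None:
        ancestors.add(a)
        a = parent.get(a)
    while b is not None and b not in ancestors:
        b = parent.get(b)
    return b


def naive_ibd(trees, seq_len, a, b):
    """Scan every genealogy; return maximal (start, end, ancestor) intervals."""
    segments = []
    start, current = trees[0][0], mrca(trees[0][1], a, b)
    for left, parent in trees[1:]:
        anc = mrca(parent, a, b)
        if anc != current:
            segments.append((start, left, current))
            start, current = left, anc
    segments.append((start, seq_len, current))
    return segments


def two_step_ibd(trees, seq_len, samples, min_len):
    """Report segments of length >= min_len for every pair of samples."""
    lefts = [left for left, _ in trees]
    step = min_len / 2  # assumed checkpoint spacing
    n_checkpoints = math.ceil(seq_len / step)
    # Genealogy covering each checkpoint position (assumes trees[0] starts at 0).
    checkpoint_trees = [
        trees[bisect_right(lefts, k * step) - 1][1] for k in range(n_checkpoints)
    ]
    result = {}
    for a, b in combinations(samples, 2):
        ancs = [mrca(p, a, b) for p in checkpoint_trees]
        # Step 1: a pair is a candidate only if two consecutive checkpoints
        # share the same common ancestor.
        if not any(x == y for x, y in zip(ancs, ancs[1:])):
            continue
        # Step 2: determine the true segments for candidate pairs only.
        segs = [s for s in naive_ibd(trees, seq_len, a, b) if s[1] - s[0] >= min_len]
        if segs:
            result[(a, b)] = segs
    return result


# Toy data (hypothetical): three samples 0, 1, 2 and three genealogies,
# each given as (left coordinate, child -> parent map).
trees = [
    (0, {0: 3, 1: 3, 3: 4, 2: 4}),
    (40, {0: 3, 1: 3, 3: 5, 2: 5}),
    (70, {0: 5, 1: 3, 2: 3, 3: 5}),
]
result = two_step_ibd(trees, seq_len=100, samples=[0, 1, 2], min_len=50)
print(result)
```

In this toy example the pair (1, 2) passes the checkpoint filter but has no sufficiently long segment, which is why the second, verifying step is still needed.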

Keywords

Identity by Descent; Ancestral Recombination Graphs; Population Genetics; Simulation

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computer Science, Columbia University, New York, USA
  2. Department of Systems Biology, Columbia University, New York, USA