Rapidly Registering Identity-by-Descent Across Ancestral Recombination Graphs
Abstract
The genomes of remotely related individuals occasionally contain long segments that are Identical By Descent (IBD). Sharing of IBD segments has many applications in population and medical genetics, and it is thus desirable to study their properties in simulations. However, no current method provides a direct, efficient means to extract IBD segments from simulated genealogies. Here, we introduce computationally efficient approaches to extract ground-truth IBD segments from a sequence of genealogies, or equivalently, an ancestral recombination graph. Specifically, we use a two-step scheme. We first identify putative shared segments by comparing, for all pairs of individuals, their common ancestors at genomic positions some distance apart. This reduces the search space considerably. We then determine the true IBD status of the candidate segments. Under some assumptions and when allowing a limited resolution of segment lengths, our run-time complexity is reduced from \(O(n^3\log n)\) for the naïve algorithm to \(O(n\log n)\), where \(n\) is the number of individuals in the sample.
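The two-step idea can be illustrated with a minimal sketch. It is not the authors' implementation and does not reach the complexity stated above. It assumes, for illustration only, that each genealogy is a `(start, end, parent)` tuple over a half-open interval, that `parent` is a child-to-parent dictionary with node identifiers shared across trees, and that an IBD segment is a maximal run of positions over which a pair keeps the same most recent common ancestor. The names `mrca`, `naive_ibd`, and `two_step_ibd` are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations


def mrca(parent, a, b):
    """Most recent common ancestor of a and b in a child->parent dict."""
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(node)
        node = parent.get(node)
    node = b
    while node is not None and node not in ancestors:
        node = parent.get(node)
    return node


def naive_ibd(trees, samples, min_len=0.0):
    """Exact scan: for every pair, walk all trees and cut a segment
    whenever the pair's MRCA changes (assumed IBD definition)."""
    segments = []
    genome_end = trees[-1][1]
    for a, b in combinations(samples, 2):
        seg_start, current = None, None
        for start, _end, parent in trees:
            anc = mrca(parent, a, b)
            if anc != current:
                if current is not None and start - seg_start >= min_len:
                    segments.append((a, b, seg_start, start, current))
                seg_start, current = start, anc
        if current is not None and genome_end - seg_start >= min_len:
            segments.append((a, b, seg_start, genome_end, current))
    return segments


def two_step_ibd(trees, samples, min_len):
    """Step 1: probe MRCAs on a grid spaced min_len / 2 apart; any segment
    of length >= min_len covers two consecutive probes, so a pair whose
    MRCA agrees at two consecutive probes is a candidate.
    Step 2: verify candidates with the exact scan, because the MRCA may
    change and revert between two probes."""
    starts = [t[0] for t in trees]
    genome_start, genome_end = trees[0][0], trees[-1][1]
    step = min_len / 2  # requires min_len > 0
    probes = []
    k = 0
    while genome_start + k * step < genome_end:
        pos = genome_start + k * step
        probes.append(trees[bisect_right(starts, pos) - 1][2])
        k += 1
    segments = []
    for a, b in combinations(samples, 2):
        at_probe = [mrca(p, a, b) for p in probes]
        if any(x == y for x, y in zip(at_probe, at_probe[1:])):
            segments.extend(naive_ibd(trees, [a, b], min_len))
    return segments


# Toy input: three samples (0, 1, 2) and three marginal trees.
toy_trees = [
    (0, 10, {0: 3, 1: 3, 3: 4, 2: 4}),
    (10, 20, {0: 3, 1: 3, 3: 5, 2: 5}),
    (20, 30, {1: 6, 2: 6, 6: 4, 0: 4}),
]
print(two_step_ibd(toy_trees, [0, 1, 2], min_len=15))
```

In this toy input, samples 0 and 1 keep ancestor 3 across the first two trees, so the only segment of length at least 15 is the one spanning positions 0 to 20. The probing step here still visits every pair; it only shows how coarse comparisons can restrict the exact check to candidates.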
Keywords
Identity by Descent, Ancestral Recombination Graphs, Population Genetics, Simulation