# Genealogical Properties of Subsamples in Highly Fecund Populations

## Abstract

We consider some genealogical properties of nested samples. The complete sample is assumed to have been drawn from a natural population characterised by high fecundity and sweepstakes reproduction (abbreviated HFSR). The random gene genealogies of the samples are—due to our assumption of HFSR—modelled by coalescent processes which admit multiple mergers of ancestral lineages looking back in time. Among the genealogical properties we consider are the probability that the most recent common ancestor is shared between the complete sample and the subsample nested within the complete sample; we also compare the lengths of ‘internal’ branches of nested genealogies between different coalescent processes. The results indicate how ‘informative’ a subsample is about the properties of the larger complete sample, how much information is gained by increasing the sample size, and how the ‘informativeness’ of the subsample varies between different coalescent processes.
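As a rough illustration of the first quantity mentioned above, the following sketch estimates by simulation the probability that a subsample of size `m` shares its most recent common ancestor with the complete sample of size `n`. It is not the authors' method. The function names, the choice of the Beta(2 − α, α)-coalescent as an example of a multiple-merger process, and the parameter values are all assumptions made for illustration. Only the jump chain is simulated, since merger times do not affect whether the ancestor is shared. Under the Kingman coalescent the estimate can be compared with the classical value (m − 1)(n + 1) / ((m + 1)(n − 1)), usually attributed to Saunders, Tavaré and Watterson (1984).

```python
import math
import random


def kingman_weights(b):
    """Relative rates of k-mergers (k = 2..b) among b blocks: binary mergers only."""
    return [1.0] + [0.0] * (b - 2)


def beta_weights(alpha):
    """Relative k-merger rates for a Beta(2 - alpha, alpha)-coalescent (assumed 0 < alpha < 2)."""
    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

    def weights(b):
        # Total rate of k-mergers: C(b, k) * B(k - alpha, b - k + alpha) / B(2 - alpha, alpha)
        return [
            math.comb(b, k)
            * math.exp(log_beta(k - alpha, b - k + alpha) - log_beta(2 - alpha, alpha))
            for k in range(2, b + 1)
        ]

    return weights


def shares_mrca(n, m, weights, rng):
    """Simulate one genealogy; True if subsample and complete sample share their MRCA."""
    b, s = n, m  # b blocks in total; blocks 0..s-1 are ancestral to the subsample
    while True:
        # Pick the merger size k, then which k of the b blocks merge.
        k = rng.choices(range(2, b + 1), weights=weights(b))[0]
        merging = rng.sample(range(b), k)
        j = sum(1 for i in merging if i < s)  # subsample-ancestral blocks in the merger
        b -= k - 1
        if j >= 1:
            s -= j - 1
        if b == 1:
            return True   # the subsample was still split just before the final merger
        if s == 1:
            return False  # the subsample found its MRCA before the complete sample did


def estimate(n, m, weights, reps=20000, seed=1):
    """Monte Carlo estimate of the probability of a shared MRCA."""
    rng = random.Random(seed)
    return sum(shares_mrca(n, m, weights, rng) for _ in range(reps)) / reps


if __name__ == "__main__":
    n, m = 10, 2
    print("Kingman, simulated:", estimate(n, m, kingman_weights))
    print("Kingman, closed form:", (m - 1) * (n + 1) / ((m + 1) * (n - 1)))
    print("Beta(0.5, 1.5), simulated:", estimate(n, m, beta_weights(1.5)))
```

Passing a different `weights` function is enough to swap in another Λ-coalescent, which makes it easy to compare how the shared-ancestor probability changes between processes for the same `n` and `m`.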

## Keywords

Coalescent, High fecundity, Nested samples, Multiple mergers, Time to most recent common ancestor

## Mathematics Subject Classification

92D15, 60J28

## Notes

### Acknowledgements

We thank Alison Etheridge for many and very valuable comments and suggestions, especially regarding Theorem 1. BE was funded by DFG grant STE 325/17-1 to Wolfgang Stephan through Priority Programme SPP1819: Rapid Evolutionary Adaptation. FF was funded by DFG grant FR 3633/2-1 through Priority Program 1590: Probabilistic Structures in Evolution.
