Coalescence times for three genes provide sufficient information to distinguish population structure from population size changes

  • Simona Grusea
  • Willy Rodríguez
  • Didier Pinchon
  • Lounès Chikhi
  • Simon Boitard
  • Olivier Mazet
Article

Abstract

The increasing amount of genomic data currently available is expanding the horizons of population genetics inference. A wide range of methods have been published that detect and date major changes in population size during the history of species. At the same time, there has been an increasing recognition that population structure can generate genetic data similar to those generated under models of population size change. Recently, Mazet et al. (Heredity 116(4):362–371, 2016) introduced the idea that, for any model of population structure, it is always possible to find a panmictic model with a particular function of population size-change having an identical distribution of \(T_{2}\) (the time of the first coalescence for a sample of size two). This implies that there is an identifiability problem between a panmictic and a structured model when we base our analysis only on \(T_2\). In this paper, based on an analytical study of the rate matrix of the ancestral lineage process, we obtain new theoretical results about the joint distribution of the coalescence times \((T_3,T_2)\) for a sample of three haploid genes in an n-island model with constant size. Even if, for any \(k \ge 2\), it is always possible to find a size-change scenario for a panmictic population such that the marginal distribution of \(T_k\) is exactly the same as in an n-island model with constant population size, we show that the joint distribution of the coalescence times \((T_3,T_2)\) for a sample of three genes contains enough information to distinguish between a panmictic population and an n-island model of constant size.
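
The identifiability problem for \(T_2\) can be illustrated numerically. The sketch below is not taken from the article: it assumes a common parametrization of the symmetric n-island model in which time is scaled so that two lineages in the same deme coalesce at rate 1, each lineage migrates at rate M/2, and both genes are sampled in the same deme. The function name `iicr_n_island` and its arguments are hypothetical. Under these assumptions it evaluates the inverse instantaneous coalescence rate (IICR), that is \(P(T_2 > t)/f_{T_2}(t)\), which is the size-change function of a panmictic population having the same \(T_2\) distribution.

```python
import math

def iicr_n_island(t, n, M):
    """IICR at scaled time t for two genes sampled in the same deme.

    Assumed parametrization (not from the source): n demes, coalescence
    rate 1 within a deme, migration rate M/2 per lineage.
    """
    # Transient block of the two-lineage rate matrix.
    # State 1: both lineages in the same deme; state 2: different demes.
    a11, a12 = -(M + 1.0), M
    a21, a22 = M / (n - 1.0), -M / (n - 1.0)

    # Eigenvalues of the 2x2 block (real and distinct because a12 * a21 > 0).
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = math.sqrt(trace * trace - 4.0 * det)
    lam1, lam2 = (trace + root) / 2.0, (trace - root) / 2.0

    # Closed-form entries of exp(t * A) in the first row.
    e1, e2 = math.exp(lam1 * t), math.exp(lam2 * t)
    p_same = (e1 * (a11 - lam2) - e2 * (a11 - lam1)) / (lam1 - lam2)
    p_diff = a12 * (e1 - e2) / (lam1 - lam2)

    survival = p_same + p_diff   # P(T2 > t)
    density = 1.0 * p_same       # f_T2(t): coalescence rate 1 while in the same deme
    return survival / density

if __name__ == "__main__":
    for t in (0.0, 0.5, 5.0):
        print(t, iicr_n_island(t, n=10, M=1.0))
```

With these assumptions the curve starts at 1 (the size of a single deme) and rises to a plateau. A panmictic population whose size follows this curve backward in time would give the same \(T_2\) distribution, which is why \(T_2\) alone cannot separate the two models.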

Keywords

Inverse instantaneous coalescence rate (IICR); Population structure; Population size change; Demographic history; Rate matrix; Structured coalescent

Mathematics Subject Classification

92D15 60J27 60J35 60E05 15A18 

Notes

Acknowledgements

The authors wish to thank Josué M. Corujo Rodríguez for very interesting discussions in preparing this article. The authors also thank the anonymous reviewers for their reading and for valuable suggestions. This research was funded through the 2015–2016 BiodivERsA COFUND call for research proposals, with the national funders ANR (ANR-16-EBI3-0014), FCT (Biodiversa/0003/2015) and PT-DLR (01LC1617A), under the INFRAGECO (Inference, Fragmentation, Genomics, and Conservation) Project (https://infrageco-biodiversa.org/). The research was also supported by the LABEX entitled TULIP (ANR-10-LABX-41), as well as the Pôle de Recherche et d’Enseignement Supérieur (PRES) and the Région Midi-Pyrénées, France. We finally thank the LIA BEEG-B (Laboratoire International Associé - Bioinformatics, Ecology, Evolution, Genomics and Behaviour) (CNRS) and the PESSOA program for facilitating travel and collaboration between EDB, IMT and INSA in Toulouse and the IGC, in Portugal.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

References

  1. Beaumont MA (1999) Detecting population expansion and decline using microsatellites. Genetics 153(4):2013–2029
  2. Beaumont MA (2004) Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92(5):365–379
  3. Bhaskar A, Song YS (2014) Descartes’ rule of signs and the identifiability of population demographic models from genomic variation data. Ann Stat 42(6):2469–2493
  4. Chikhi L, Sousa VC, Luisi P, Goossens B, Beaumont MA (2010) The confounding effects of population structure, genetic diversity and the sampling scheme on the detection and quantification of population size changes. Genetics 186(3):983–995
  5. Chikhi L, Rodríguez W, Grusea S, Santos P, Boitard S, Mazet O (2018) The IICR (inverse instantaneous coalescence rate) as a summary of genomic diversity: insights into demographic inference and model choice. Heredity 120:13–24
  6. Goossens B, Chikhi L, Ancrenaz M, Lackman-Ancrenaz I, Andau P, Bruford MW et al (2006) Genetic signature of anthropogenic population collapse in orang-utans. PLoS Biol 4(2):285
  7. Griffiths RC, Tavaré S (1994) Sampling theory for neutral alleles in a varying environment. Philos Trans R Soc B Biol Sci 344(1310):403–410
  8. Griffiths RC, Tavaré S (1998) The age of a mutation in a general coalescent tree. Stoch Models 14(1–2):273–295
  9. Heller R, Chikhi L, Siegismund HR (2013) The confounding effect of population structure on Bayesian skyline plot inferences of demographic history. PLoS ONE 8(5):e62992
  10. Herbots HM (1994) Stochastic models in population genetics: genealogy and genetic differentiation in structured populations. PhD thesis, University of London. https://qmro.qmul.ac.uk/xmlui/handle/123456789/1482?show=full
  11. Hudson RR (1983) Properties of a neutral allele model with intragenic recombination. Theor Popul Biol 23(2):183–201
  12. Hudson RR (2002) Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics 18(2):337–338
  13. Hudson RR et al (1990) Gene genealogies and the coalescent process. Oxf Surv Evol Biol 7(1):44
  14. Kim J, Mossel E, Rácz MZ, Ross N (2015) Can one hear the shape of a population history? Theor Popul Biol 100:26–38 ISSN 0040-5809
  15. Kingman JFC (1982) The coalescent. Stoch Process Appl 13(3):235–248
  16. Lang S (2002) Algebra, rev. 3 edn. Springer, Berlin
  17. Lapierre M, Lambert A, Achaz G (2017) Accuracy of demographic inferences from the site frequency spectrum: the case of the Yoruba population. Genetics 206(1):439–449
  18. Mazet O, Rodríguez W, Chikhi L (2015) Demographic inference using genetic data from a single individual: separating population size variation from population structure. Theor Popul Biol 104:46–58
  19. Mazet O, Rodríguez W, Grusea S, Boitard S, Chikhi L (2016) On the importance of being structured: instantaneous coalescence rates and human evolution-lessons for ancestral population size inference? Heredity 116(4):362–371
  20. Myers S, Fefferman C, Patterson N (2008) Can one learn history from the allelic spectrum? Theor Popul Biol 73(3):342–348
  21. Nei M, Takahata N (1993) Effective population size, genetic diversity, and coalescence time in subdivided populations. J Mol Evol 37(3):240–244
  22. Nickalls RWD (1993) A new approach to solving the cubic: Cardan’s solution revealed. Math Gazette 77:354–359. http://www.nickalls.org/dick/papers/maths/cubic1993.pdf
  23. Nielsen R, Wakeley J (2001) Distinguishing migration from isolation: a Markov Chain Monte Carlo approach. Genetics 158(2):885–896
  24. Norris JR (1998) Markov Chains. Cambridge series in statistical and probabilistic mathematics. Cambridge University Press, Cambridge
  25. Notohara M (1990) The coalescent and the genealogical process in geographically structured population. J Math Biol 29(1):59–75
  26. Peter BM, Wegmann D, Excoffier L (2010) Distinguishing between population bottleneck and population subdivision by a Bayesian model choice procedure. Mol Ecol 19(21):4648–4660
  27. Quéméré E, Amelot X, Pierson J, Crouau-Roy B, Chikhi L (2012) Genetic data suggest a natural prehuman origin of open habitats in northern Madagascar and question the deforestation narrative in this region. Proc Natl Acad Sci USA 109(32):13028–13033
  28. Rodríguez W (2016) Estimation de l’histoire démographique des populations à partir de génomes entièrement séquencés. PhD thesis, University Paul Sabatier, Toulouse, June 2016
  29. Rogers AR, Harpending H (1992) Population growth makes waves in the distribution of pairwise genetic differences. Mol Biol Evol 9(3):552–569
  30. Schiffels S, Durbin R (2014) Inferring human population size and separation history from multiple genome sequences. Nat Genet 46(8):919–925
  31. Slatkin M (1991) Inbreeding coefficients and coalescence times. Genet Res 58(02):167–175
  32. Storz JF, Beaumont MA (2002) Testing for genetic evidence of population expansion and contraction: an empirical analysis of microsatellite DNA variation using a hierarchical Bayesian model. Evolution 56(1):154–166
  33. Tajima F (1983) Evolutionary relationship of DNA sequences in finite populations. Genetics 105:437–460
  34. Terhorst J, Song YS (2015) Fundamental limits on the accuracy of demographic inference based on the sample frequency spectrum. Proc Natl Acad Sci USA 112(25):7677–7682
  35. Vallander SS (1973) Calculation of the Wasserstein distance between probability distributions on the line. Theory Probab Appl 18:784–786
  36. Wakeley J (1999) Nonequilibrium migration in human history. Genetics 153(4):1863–1871
  37. Weissman DB, Hallatschek O (2017) Minimal-assumption inference from population-genomic data. eLife 6:e24836
  38. Wilkinson-Herbots HM (1998) Genealogy and subpopulation differentiation under various models of population structure. J Math Biol 37(6):535–585
  39. Wright S (1931) Evolution in Mendelian populations. Genetics 16(2):97

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Institut de Mathématiques de Toulouse, Université de Toulouse, Institut National des Sciences Appliquées, Toulouse, France
  2. Laboratoire Évolution et Diversité Biologique (EDB UMR 5174), Université de Toulouse Midi-Pyrénées, CNRS, IRD, UPS, Toulouse Cedex 9, France
  3. Instituto Gulbenkian de Ciência, Oeiras, Portugal
  4. GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet Tolosan, France
