Coalescent Models

  • Chapter
Human Population Genomics

Abstract

The standard neutral coalescent model and its extensions to include changes in population size over time and population structure are reviewed. Gene genealogies are shown to provide the hidden structure behind patterns of genetic variation. Expressions for expected levels of genetic variation are presented and explained, and tests of the standard neutral model based on the frequencies of mutations at single-nucleotide sites (also known as “site frequencies”) are outlined. Several examples of deviations from the standard model are discussed, and their effects on expected site frequencies are illustrated. Some attention is given to the fact that coalescent theory has not fully grappled with the existence of underlying population pedigrees.
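
As a minimal sketch of the kind of expectation the abstract refers to, the snippet below evaluates three textbook results for the standard neutral coalescent under the infinite-sites mutation model: the expected site frequencies (theta / i for a mutation carried by i of n sampled sequences), the expected number of segregating sites (theta times the harmonic sum up to n - 1), and the expected number of pairwise differences (theta). The function names and the parameter `theta` (the population-scaled mutation rate for the locus) are illustrative choices made here, not notation taken from the chapter.

```python
def expected_sfs(n, theta):
    """Expected count of sites whose derived allele appears i times, i = 1..n-1.

    Assumes the standard neutral coalescent with infinite-sites mutation,
    where the expectation for frequency class i is theta / i.
    """
    return [theta / i for i in range(1, n)]


def expected_segregating_sites(n, theta):
    """Expected number of segregating sites: theta * sum_{i=1}^{n-1} 1/i."""
    return sum(expected_sfs(n, theta))


def expected_pairwise_differences(n, theta):
    """Expected average number of pairwise differences, computed from the SFS.

    A site whose derived allele is carried by i of n sequences differs in
    i * (n - i) of the n * (n - 1) / 2 possible pairs.
    """
    pairs = n * (n - 1) / 2
    sfs = expected_sfs(n, theta)
    return sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / pairs


if __name__ == "__main__":
    n, theta = 10, 2.0  # hypothetical sample size and scaled mutation rate
    print(expected_sfs(n, theta))
    print(expected_segregating_sites(n, theta))
    print(expected_pairwise_differences(n, theta))
```

Under these assumptions the pairwise-difference expectation equals theta for any sample size, while the expected site frequencies decay as 1/i. Site-frequency-based tests of the standard neutral model compare estimates of theta that weight the frequency classes differently, so demographic or selective deviations that distort the 1/i shape pull those estimates apart.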

Author information

Corresponding author

Correspondence to John Wakeley.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Wakeley, J. (2021). Coalescent Models. In: Lohmueller, K.E., Nielsen, R. (eds) Human Population Genomics. Springer, Cham. https://doi.org/10.1007/978-3-030-61646-5_1
