Abstract
The coalescent has become perhaps the most widely used model in population genetics. By modeling the ancestry of a sample, rather than the evolution of the entire population from which the sample is drawn, it provides a computationally efficient framework for simulating data. From a theoretical perspective, it also provides the underpinnings for many useful analysis techniques. In this chapter, we introduce the coalescent, describe some of the problems it has been used to address, discuss practical implications that follow from the insight it provides, and summarize some of the available software.
Keywords
- Pairwise Difference
- Approximate Bayesian Computation
- Most Recent Common Ancestor
- Ancestral Recombination Graph
- Coalescent Process
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
© 2010 Springer US
About this chapter
Cite this chapter
Marjoram, P., Joyce, P. (2010). Practical Implications of Coalescent Theory. In: Heath, L., Ramakrishnan, N. (eds) Problem Solving Handbook in Computational Biology and Bioinformatics. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09760-2_4
DOI: https://doi.org/10.1007/978-0-387-09760-2_4
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-09759-6
Online ISBN: 978-0-387-09760-2
eBook Packages: Computer Science (R0)