Probability Distribution of Tree Age for the Simple Birth–Death Process, with Applications to Distributions of Number of Ancestral Lineages and Divergence Times for Pairs of Taxa in a Yule Tree

Mulder, Willem H.

doi:10.1007/s11538-023-01196-7

Original Article
Published: 01 September 2023

Volume 85, article number 94 (2023)

Bulletin of Mathematical Biology

Willem H. Mulder¹

Abstract

In this contribution, a general expression is derived for the probability density of the time to the most recent common ancestor (TMRCA) of a simple birth–death tree, a widely used stochastic null-model of biological speciation and extinction, conditioned on the constant birth and death rates and number of extant lineages. This density is contrasted with a previous result which was obtained using a uniform prior for the time of origin. The new distribution is applied to two problems of phylogenetic interest. First, that of the probability of the number of taxa existing at any time in the past in a tree of a known number of extant species, and given birth and death rates, and second, that of determining the TMRCA of two randomly selected taxa in an unobserved tree that is produced by a simple birth-only, or Yule, process. In the latter case, it is assumed that only the rate of bifurcation (speciation) and the size, or number of tips, are known. This is shown to lead to a closed-form analytical expression for the probability distribution of this parameter, which is arrived at based on the known mathematical form of the age distribution of Yule trees of a given size and branching rate, which is derived here de novo, and a similar distribution which additionally is conditioned on tree age. The new distribution is the exact Yule prior for divergence times of pairs of taxa under the stated conditions and is potentially useful in statistical (Bayesian) inference studies of phylogenies.

References

Bailey NTJ (1964) The elements of stochastic processes with applications to the natural sciences. Wiley, New York
Bartoszek K, Sagitov S (2015) A consistent estimator of the evolutionary rate. J Theor Biol 371:69–78
Crawford FW, Suchard M (2013) Diversity, disparity, and evolutionary rate estimation for unresolved Yule trees. Syst Biol 62:439–455
Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7(1):214–221
Felsenstein J (2004) Inferring phylogenies. Sinauer Associates, Sunderland, MA
Gernhard T (2008) The conditioned reconstructed process. J Theor Biol 253(4):769–778
Gernhard T, Hartmann K, Steel M (2008) Stochastic properties of generalised Yule models, with biodiversity applications. J Math Biol 57(5):713–735
Heled J, Drummond AJ (2011) Calibrated tree priors for relaxed phylogenetics and divergence time estimation. Syst Biol 61(1):138–149
Ignatieva A, Hein J, Jenkins PA (2020) A characterisation of the reconstructed birth-death process through time-scaling. Theor Popul Biol 134:61–76
Kendall DG (1948) On the generalized “birth-and-death” process. Ann Math Stat 19:1–15
Mulder WH (2011) Probability distributions of ancestries and genealogical distances on stochastically generated rooted binary trees. J Theor Biol 280(1):139–145 (Addendum: J Theor Biol 314 (2012): 216–217)
Mulder WH, Crawford FW (2015) On the distribution of interspecies correlation for Markov models of character evolution on Yule trees. J Theor Biol 364:275–283
Nee S (2006) Birth-death models in macroevolution. Annu Rev Ecol Evol Syst 37:1–17
Nee S, May RM, Harvey PH (1994) The reconstructed evolutionary process. Philos Trans R Soc Ser B Biol Sci 344(1309):305–311
Rannala B, Yang Z (1996) Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J Mol Evol 43(3):304–311
Rosenberg NA (2006) The mean and variance of the numbers of r-pronged nodes and r-caterpillars in Yule-generated genealogical trees. Ann Combin 10(1):129–146
Rosenberg NA, Feldman MW (2002) The relationship between coalescence times and species divergence times. In: Slatkin M, Veuille M (eds) Modern developments in theoretical population genetics, vol 9. Oxford University Press, Oxford, pp 130–164
Sheinman M, Massip F, Arndt PF (2015) Statistical properties of pairwise distances between leaves on a random Yule tree. PLoS ONE 10(3):e0120206
Stadler T (2009) On incomplete sampling under birth and death models and connections to the sampling-based coalescent. J Theor Biol 261(1):58–66
Stadler T (2010) Sampling-through-time in birth-death trees. J Theor Biol 267:396–404
Stadler T, Steel M (2012) Distribution of branch lengths and phylogenetic diversity under homogeneous speciation models. J Theor Biol 297:33–40
Steel M, McKenzie A (2001) Properties of phylogenetic trees generated by Yule-type speciation models. Math Biosci 170:91–112
Steel M, Mooers A (2010) The expected length of pendant and interior edges of a Yule tree. Appl Math Lett 23(11):1315–1319
Yule GU (1924) A mathematical theory of evolution, based on the conclusions of Dr. J.C. Willis, FRS. Philos Trans R Soc Lond B 213:21–87

Acknowledgements

I thank the two anonymous reviewers for their helpful comments and suggestions.

Author information

Authors and Affiliations

Department of Chemistry, The University of the West Indies, Mona Campus, Kingston 7, Jamaica
Willem H. Mulder

Corresponding author

Correspondence to Willem H. Mulder.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 374 kb)

Appendix: Comparison with Age Distribution Based on an Improper Uniform Prior for the Time of Origin

In a previous study of the density of speciation times, Gernhard (2008) considers the distribution of the time of origin of a BD tree with a known number of descendants at the present time. That work takes a somewhat different approach: the moment at which a tree of unknown size and unknown birth and death rates emerged is assumed to be entirely unknown and equally likely to have occurred at any time in the past. This amounts to assuming that the age τ of any tree follows an improper uniform distribution on [0, ∞), which is taken as the prior (for a more recent application of this model assumption, see Ignatieva et al. 2020). If the parameters n, λ and μ are known, application of Bayes’ theorem yields the posterior density of tree age conditioned on n, λ and μ which, in the notation used by Gernhard (2008, Theorem 3.2), is found to be

$$ q_{or} (\tau |n,\lambda ,\mu ) = n\lambda \left( {1 - \frac{\mu }{\lambda }} \right)^{2} e^{ - (\lambda - \mu )\tau } \frac{{\left( {1 - e^{ - (\lambda - \mu )\tau } } \right)^{n - 1} }}{{\left( {1 - \frac{\mu }{\lambda }e^{ - (\lambda - \mu )\tau } } \right)^{n + 1} }}. $$

(A.1)

In this case, the tree starts with a single lineage which may split after some time.
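As a numerical sanity check on Eq. (A.1), the following minimal Python sketch evaluates the density and confirms that it integrates to one. The function names `q_or`, `q_or_cdf` and `simpson`, the parameter values, and the truncation of the integral at a finite upper limit are illustrative assumptions and not part of the original derivation. The closed-form distribution function used for comparison, $[(1 - e^{-(\lambda-\mu)\tau})/(1 - \frac{\mu}{\lambda}e^{-(\lambda-\mu)\tau})]^{n}$, is obtained here by noting that its derivative with respect to τ reproduces Eq. (A.1).

```python
import math


def q_or(tau, n, lam, mu):
    """Density of Eq. (A.1); assumes lam > mu >= 0 and n >= 1."""
    r = lam - mu                  # net diversification rate
    a = mu / lam                  # ratio of death rate to birth rate
    x = math.exp(-r * tau)
    return n * lam * (1 - a) ** 2 * x * (1 - x) ** (n - 1) / (1 - a * x) ** (n + 1)


def q_or_cdf(tau, n, lam, mu):
    """Distribution function whose tau-derivative is Eq. (A.1)."""
    x = math.exp(-(lam - mu) * tau)
    return ((1 - x) / (1 - (mu / lam) * x)) ** n


def simpson(f, lo, hi, steps=20000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


# Illustrative parameter values (assumptions, not taken from the article).
n, lam, mu = 5, 1.0, 0.4

# The upper limit 60 stands in for infinity; the neglected tail is of order exp(-36).
total_mass = simpson(lambda t: q_or(t, n, lam, mu), 0.0, 60.0)
print("integral of q_or over [0, 60]:", total_mass)
print("CDF at tau = 2:", q_or_cdf(2.0, n, lam, mu))
```

The quadrature is deliberately written with the standard library only, so the check does not depend on any particular numerical package.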

Equation (A.1) cannot be compared directly with Eq. (8) of the present study, which defines the time of origin of a BD tree as that of the first bifurcation. Making a comparison between the two approaches requires a slight modification of the argument presented in Subsect. 2.1: the MRCA of the subtree highlighted in Fig. 1 is now interpreted as marking the “birth” of either one of its daughter trees.

Thus, the probability that a tree that starts with a single lineage at time T − τ will have grown to size n at time T is proportional to

$$ 2p_{n}^{2} (\tau ) + 2p_{n} (\tau )(1 - p_{n} (\tau )) = 2p_{n} (\tau ). $$

(A.2)

After normalising with $2\sum\nolimits_{n = 0}^{\infty } {p_{n} (\tau )} = 2$, this probability is found to be simply equal to $p_{n} (\tau )$, which should then replace $\hat{p}_{n} (\tau )$ in Eq. (3). Otherwise, the procedure for calculating the new distribution, which will be denoted as $\tilde{P}(\tau |n,\lambda ,\mu )$, is exactly the same as that followed in deriving Eq. (8).
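The normalisation step can be illustrated with a short Python sketch. It assumes that $p_{n}(\tau)$ is the standard transition probability of the linear birth–death process started from a single lineage (Kendall 1948); the explicit form coded below is that textbook result and is not reproduced in this appendix, so it should be read as an assumption. The function name `p_n`, the parameter values and the truncation of the sum at 400 terms are likewise illustrative.

```python
import math


def p_n(n, tau, lam, mu):
    """Probability of n lineages at time tau, starting from one lineage.

    Assumed form (Kendall 1948), valid for lam != mu:
        p_0 = alpha,  p_n = (1 - alpha)(1 - beta) beta^(n-1) for n >= 1.
    """
    growth = math.exp((lam - mu) * tau)
    alpha = mu * (growth - 1) / (lam * growth - mu)
    beta = lam * (growth - 1) / (lam * growth - mu)
    if n == 0:
        return alpha
    return (1 - alpha) * (1 - beta) * beta ** (n - 1)


# Illustrative parameter values (assumptions, not taken from the article).
lam, mu, tau = 1.0, 0.4, 1.5

# Sum over n = 0, 1, 2, ...; the geometric tail beyond 400 terms is negligible here.
total = sum(p_n(k, tau, lam, mu) for k in range(400))

# Left- and right-hand sides of Eq. (A.2) for one value of n.
p = p_n(5, tau, lam, mu)
lhs = 2 * p ** 2 + 2 * p * (1 - p)
rhs = 2 * p

print("sum of p_n:", total)
print("Eq. (A.2):", lhs, "=", rhs)
```

Dividing the right-hand side of Eq. (A.2) by twice this sum returns $p_{n}(\tau)$ itself, which is the normalised probability used in the text.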

The analogue of Eq. (5) is

$$ \tilde{P}(\tau |T,n,\lambda ,\mu )d\tau = \tilde{C}_{n} (T,\lambda ,\mu )e^{ - 2(\lambda - \mu )\tau } \frac{{\left( {1 - e^{ - (\lambda - \mu )\tau } } \right)^{n - 1} }}{{\left( {1 - \frac{\mu }{\lambda }e^{ - (\lambda - \mu )\tau } } \right)^{n + 1} }}d\tau , $$

(A.3)

where $\tilde{C}_{n} (T,\lambda ,\mu )$ is the normalisation factor which, in the limit T → ∞, follows from

$$ \begin{aligned} \mathop {\lim }\limits_{T \to \infty } \tilde{C}_{n} (T,\lambda ,\mu )^{ - 1} & = \int_{0}^{\infty } {d\tau e^{ - 2(\lambda - \mu )\tau } } \frac{{\left( {1 - e^{ - (\lambda - \mu )\tau } } \right)^{n - 1} }}{{\left( {1 - \frac{\mu }{\lambda }e^{ - (\lambda - \mu )\tau } } \right)^{n + 1} }} \\ & = \frac{1}{\lambda - \mu }\left( {\frac{1}{n(1 - \mu /\lambda )} - \sum\limits_{k = 0}^{\infty } {\frac{{(\mu /\lambda )^{k} }}{n + k + 1}} } \right). \\ \end{aligned} $$

(A.4)
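The closed form in Eq. (A.4) can be checked against direct quadrature. The sketch below uses the substitution $x = e^{-(\lambda-\mu)\tau}$, which turns the integral into $\frac{1}{\lambda-\mu}\int_{0}^{1} x(1-x)^{n-1}(1-\frac{\mu}{\lambda}x)^{-(n+1)}dx$; this substitution, the function names and the parameter values are choices made for illustration only.

```python
def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def a4_closed_form(n, lam, mu, tol=1e-16):
    """Right-hand side of Eq. (A.4), with the series truncated below tol."""
    a = mu / lam
    series, power, k = 0.0, 1.0, 0
    while power / (n + k + 1) > tol:
        series += power / (n + k + 1)
        power *= a
        k += 1
    return (1.0 / (n * (1.0 - a)) - series) / (lam - mu)


def a4_numeric(n, lam, mu):
    """Left-hand side of Eq. (A.4) after substituting x = exp(-(lam - mu) tau)."""
    a = mu / lam

    def integrand(x):
        return x * (1 - x) ** (n - 1) / (1 - a * x) ** (n + 1)

    return simpson(integrand, 0.0, 1.0) / (lam - mu)


# Illustrative parameter values (assumptions, not taken from the article).
n, lam, mu = 5, 1.0, 0.4
closed = a4_closed_form(n, lam, mu)
numeric = a4_numeric(n, lam, mu)
print("closed form:", closed)
print("quadrature :", numeric)
```

In the Yule limit μ = 0 only the k = 0 term of the series survives, and the right-hand side of Eq. (A.4) reduces to $1/[\lambda n(n+1)]$.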

With tree age τ thus redefined, its distribution now becomes

$$ \tilde{P}(\tau |n,\lambda ,\mu ) = \frac{{\lambda (1 - \mu /\lambda )e^{ - 2(\lambda - \mu )\tau } \left( {1 - e^{ - (\lambda - \mu )\tau } } \right)^{n - 1} }}{{\left( {\frac{1}{n(1 - \mu /\lambda )} - \sum\limits_{k = 0}^{\infty } {\frac{{(\mu /\lambda )^{k} }}{n + k + 1}} } \right)\left( {1 - \frac{\mu }{\lambda }e^{ - (\lambda - \mu )\tau } } \right)^{n + 1} }}, $$

(A.5)

which differs from Eq. (A.1) by an extra factor $e^{ - (\lambda - \mu )\tau }$ (and hence, the normalisation constant will also be different). It should be noted that this is precisely the factor by which the average total population from which the clade is sampled would have shrunk when returning to the point where the MRCA emerged.

That the results are not identical is therefore not surprising given the different perspectives, viz. a uniform prior on the time of origin versus a picture based on a subtree embedded in an exponentially increasing population (assuming λ and μ known).
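The relation between Eqs. (A.1) and (A.5) can also be verified numerically. The following sketch, in which all function names and parameter values are hypothetical choices made for illustration, evaluates both densities and confirms that their ratio, multiplied by $e^{(\lambda-\mu)\tau}$, does not depend on τ.

```python
import math


def series_bracket(n, a, tol=1e-16):
    """Bracketed term of Eq. (A.5): 1/(n(1-a)) - sum_k a^k/(n+k+1), a = mu/lam."""
    total, power, k = 0.0, 1.0, 0
    while power / (n + k + 1) > tol:
        total += power / (n + k + 1)
        power *= a
        k += 1
    return 1.0 / (n * (1.0 - a)) - total


def q_or(tau, n, lam, mu):
    """Density of Eq. (A.1)."""
    a = mu / lam
    x = math.exp(-(lam - mu) * tau)
    return n * lam * (1 - a) ** 2 * x * (1 - x) ** (n - 1) / (1 - a * x) ** (n + 1)


def p_tilde(tau, n, lam, mu):
    """Density of Eq. (A.5)."""
    a = mu / lam
    x = math.exp(-(lam - mu) * tau)
    numerator = lam * (1 - a) * x ** 2 * (1 - x) ** (n - 1)
    return numerator / (series_bracket(n, a) * (1 - a * x) ** (n + 1))


# Illustrative parameter values (assumptions, not taken from the article).
n, lam, mu = 5, 1.0, 0.4
r = lam - mu

# If the two densities differ only by exp(-r tau) and a constant,
# every entry of this list is the same number.
ratios = [
    p_tilde(t, n, lam, mu) / q_or(t, n, lam, mu) * math.exp(r * t)
    for t in (0.5, 1.0, 2.0, 4.0)
]
print(ratios)
```

The common value of the ratio is the quotient of the two normalisation constants, $1/[n(1-\mu/\lambda)B]$, where B denotes the bracketed term in the denominator of Eq. (A.5).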

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mulder, W.H. Probability Distribution of Tree Age for the Simple Birth–Death Process, with Applications to Distributions of Number of Ancestral Lineages and Divergence Times for Pairs of Taxa in a Yule Tree. Bull Math Biol 85, 94 (2023). https://doi.org/10.1007/s11538-023-01196-7

Received: 22 February 2023
Accepted: 11 August 2023
Published: 01 September 2023
DOI: https://doi.org/10.1007/s11538-023-01196-7
