Probability Distribution of Tree Age for the Simple Birth–Death Process, with Applications to Distributions of Number of Ancestral Lineages and Divergence Times for Pairs of Taxa in a Yule Tree

  • Original Article
Bulletin of Mathematical Biology

Abstract

In this contribution, a general expression is derived for the probability density of the time to the most recent common ancestor (TMRCA) of a simple birth–death tree, a widely used stochastic null model of biological speciation and extinction, conditioned on the constant birth and death rates and the number of extant lineages. This density is contrasted with a previous result that was obtained using a uniform prior for the time of origin. The new distribution is applied to two problems of phylogenetic interest: first, the probability distribution of the number of taxa existing at any time in the past in a tree with a known number of extant species and given birth and death rates, and second, the TMRCA of two randomly selected taxa in an unobserved tree produced by a simple birth-only, or Yule, process. In the latter case, it is assumed that only the rate of bifurcation (speciation) and the size, or number of tips, are known. This is shown to lead to a closed-form analytical expression for the probability distribution of this parameter. The expression is based on the known mathematical form of the age distribution of Yule trees of a given size and branching rate, which is derived here de novo, and on a similar distribution that is additionally conditioned on tree age. The new distribution is the exact Yule prior for divergence times of pairs of taxa under the stated conditions and is potentially useful in statistical (Bayesian) inference studies of phylogenies.

References

  • Bailey NTJ (1964) The elements of stochastic processes with applications to the natural sciences. Wiley, New York

  • Bartoszek K, Sagitov S (2015) A consistent estimator of the evolutionary rate. J Theor Biol 371:69–78

  • Crawford FW, Suchard M (2013) Diversity, disparity, and evolutionary rate estimation for unresolved Yule trees. Syst Biol 62:439–455

  • Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7(1):214–221

  • Felsenstein J (2004) Inferring phylogenies. Sinauer Associates, Sunderland (Mass.)

  • Gernhard T (2008) The conditioned reconstructed process. J Theor Biol 253(4):769–778

  • Gernhard T, Hartmann K, Steel M (2008) Stochastic properties of generalised Yule models, with biodiversity applications. J Math Biol 57(5):713–735

  • Heled J, Drummond AJ (2011) Calibrated tree priors for relaxed phylogenetics and divergence time estimation. Syst Biol 61(1):138–149

  • Ignatieva A, Hein J, Jenkins PA (2020) A characterisation of the reconstructed birth-death process through time-scaling. Theor Popul Biol 134:61–76

  • Kendall DG (1948) On the generalized “birth-and-death” process. Ann Math Stat 19:1–15

  • Mulder WH (2011) Probability distributions of ancestries and genealogical distances on stochastically generated rooted binary trees. J Theor Biol 280(1):139–145 (Addendum: J Theor Biol 314 (2012):216–217)

  • Mulder WH, Crawford FW (2015) On the distribution of interspecies correlation for Markov models of character evolution on Yule trees. J Theor Biol 364:275–283

  • Nee S (2006) Birth-death models in macroevolution. Ann Rev Ecol Evol Syst 37:1–17

  • Nee S, May RM, Harvey PH (1994) The reconstructed evolutionary process. Philos Trans R Soc Ser B Biol Sci 344(1309):305–311

  • Rannala B, Yang Z (1996) Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J Mol Evol 43(3):304–311

  • Rosenberg NA (2006) The mean and variance of the numbers of r-pronged nodes and r-caterpillars in Yule-generated genealogical trees. Ann Combin 10(1):129–146

  • Rosenberg NA, Feldman MW (2002) The relationship between coalescence times and species divergence times. In: Slatkin M, Veuille M (eds) Modern developments in theoretical population genetics, vol 9. Oxford University Press, Oxford, pp 130–164

  • Sheinman M, Massip F, Arndt PF (2015) Statistical properties of pairwise distances between leaves on a random Yule tree. PLoS ONE 10(3):e0120206

  • Stadler T (2009) On incomplete sampling under birth and death models and connections to the sampling-based coalescent. J Theor Biol 261(1):58–66

  • Stadler T (2010) Sampling-through-time in birth-death trees. J Theor Biol 267:396–404

  • Stadler T, Steel M (2012) Distribution of branch lengths and phylogenetic diversity under homogeneous speciation models. J Theor Biol 297:33–40

  • Steel M, McKenzie A (2001) Properties of phylogenetic trees generated by Yule-type speciation models. Math Biosci 170:91–112

  • Steel M, Mooers A (2010) The expected length of pendant and interior edges of a Yule tree. Appl Math Lett 23(11):1315–1319

  • Yule GU (1924) A mathematical theory of evolution, based on the conclusions of Dr. J.C. Willis, FRS. Philos Trans R Soc Lond B 213:21–87

Acknowledgements

I thank the two anonymous reviewers for their helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Willem H. Mulder.

Appendix: Comparison with Age Distribution Based on an Improper Uniform Prior for the Time of Origin

In a previous study on the density of the times of speciation events, Gernhard (2008) considers the distribution of the time of origin of a BD tree having a known number of descendants at the present time. This earlier work takes a somewhat different approach, in which the moment at which a tree of unknown size and unknown birth and death rates emerged is assumed to be entirely unknown and equally likely to have occurred at any time in the past. This amounts to assuming that the age τ of any tree follows an improper uniform distribution on [0, ∞), which is thus taken to be the prior (for a more recent application of this model assumption, see Ignatieva et al. 2020). If the parameters n, λ and μ are known, application of Bayes’ theorem gives the posterior density of tree age conditioned on n, λ and μ which, in the notation used by Gernhard (2008, Theorem 3.2), is found to be

$$ q_{or} (\tau |n,\lambda ,\mu ) = n\lambda \left( {1 - \frac{\mu }{\lambda }} \right)^{2} e^{ - (\lambda - \mu )\tau } \frac{{\left( {1 - e^{ - (\lambda - \mu )\tau } } \right)^{n - 1} }}{{\left( {1 - \frac{\mu }{\lambda }e^{ - (\lambda - \mu )\tau } } \right)^{n + 1} }}. $$
(A.1)

In this case, the tree starts with a single lineage which may split after some time.
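As a quick numerical sanity check on Eq. (A.1), the following Python sketch integrates the density over τ and confirms that it carries unit mass. The parameter values (n = 5, λ = 1, μ = 0.5), the helper names `q_or` and `simpson`, and the truncation of the integral at 40/(λ − μ) are illustrative assumptions that do not come from the article; the sketch also assumes λ > μ.

```python
import math

def q_or(tau, n, lam, mu):
    """Eq. (A.1): density of the time of origin under the improper uniform prior."""
    a = mu / lam
    x = math.exp(-(lam - mu) * tau)  # x = e^{-(lambda - mu) tau}
    return n * lam * (1 - a) ** 2 * x * (1 - x) ** (n - 1) / (1 - a * x) ** (n + 1)

def simpson(f, lo, hi, steps=20000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

# Illustrative (assumed) parameter values with lam > mu.
n, lam, mu = 5, 1.0, 0.5
tau_max = 40.0 / (lam - mu)  # the tail beyond this point is of order e^{-40}
total_mass = simpson(lambda t: q_or(t, n, lam, mu), 0.0, tau_max)
print(total_mass)
```

The printed value should agree with 1 up to quadrature error, consistent with the substitution x = e^{−(λ−μ)τ}, under which the integral of the τ-dependent part of Eq. (A.1) equals 1/[n(λ − μ)(1 − μ/λ)].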

This result cannot be compared directly with Eq. (8) of the present study, which defines the time of origin of a BD tree as that of the first bifurcation. A comparison between the two approaches requires a slight modification of the argument presented in Subsect. 2.1, where we now interpret the MRCA of the subtree highlighted in Fig. 1 as marking the “birth” of either one of its daughter trees.

Thus, the probability that a tree that starts with a single lineage at time T − τ will have grown to size n at time T is proportional to

$$ 2p_{n}^{2} (\tau ) + 2p_{n} (\tau )(1 - p_{n} (\tau )) = 2p_{n} (\tau ). $$
(A.2)

After normalising with \(2\sum\nolimits_{n = 0}^{\infty } {p_{n} (\tau )} = 2\), this probability is found to be simply equal to \(p_{n} (\tau )\), which should then replace \(\hat{p}_{n} (\tau )\) in Eq. (3). Otherwise, the procedure for calculating the new distribution, which will be denoted as \(\tilde{P}(\tau |n,\lambda ,\mu )\), is exactly the same as that followed in deriving Eq. (8).

The analogue of Eq. (5) is

$$ \tilde{P}(\tau |T,n,\lambda ,\mu )d\tau = \tilde{C}_{n} (T,\lambda ,\mu )e^{ - 2(\lambda - \mu )\tau } \frac{{\left( {1 - e^{ - (\lambda - \mu )\tau } } \right)^{n - 1} }}{{\left( {1 - \frac{\mu }{\lambda }e^{ - (\lambda - \mu )\tau } } \right)^{n + 1} }}d\tau , $$
(A.3)

where \(\tilde{C}_{n} (T,\lambda ,\mu )\) is the normalisation factor which, in the limit T → ∞, follows from

$$ \begin{aligned} \mathop {\lim }\limits_{T \to \infty } \tilde{C}_{n} (T,\lambda ,\mu )^{ - 1} & = \int_{0}^{\infty } {d\tau e^{ - 2(\lambda - \mu )\tau } } \frac{{\left( {1 - e^{ - (\lambda - \mu )\tau } } \right)^{n - 1} }}{{\left( {1 - \frac{\mu }{\lambda }e^{ - (\lambda - \mu )\tau } } \right)^{n + 1} }} \\ & = \frac{1}{\lambda - \mu }\left( {\frac{1}{n(1 - \mu /\lambda )} - \sum\limits_{k = 0}^{\infty } {\frac{{(\mu /\lambda )^{k} }}{n + k + 1}} } \right). \\ \end{aligned} $$
(A.4)
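The closed form in Eq. (A.4) can be checked against direct numerical integration. The sketch below is only an illustration under assumed values (n = 5, λ = 1, μ = 0.5, with λ > μ); the function names, the series truncation tolerance and the upper integration limit 40/(λ − μ) are hypothetical choices made here and are not part of the original derivation.

```python
import math

def inv_norm_closed(n, lam, mu, tol=1e-16):
    """Right-hand side of Eq. (A.4); the series is truncated once terms drop below tol."""
    a = mu / lam
    series, k, term = 0.0, 0, 1.0 / (n + 1)
    while term > tol:
        series += term
        k += 1
        term = a ** k / (n + k + 1)
    return (1.0 / (n * (1.0 - a)) - series) / (lam - mu)

def integrand(tau, n, lam, mu):
    """Integrand on the first line of Eq. (A.4)."""
    x = math.exp(-(lam - mu) * tau)
    return x * x * (1 - x) ** (n - 1) / (1 - (mu / lam) * x) ** (n + 1)

def simpson(f, lo, hi, steps=20000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

# Illustrative (assumed) parameter values.
n, lam, mu = 5, 1.0, 0.5
closed = inv_norm_closed(n, lam, mu)
numeric = simpson(lambda t: integrand(t, n, lam, mu), 0.0, 40.0 / (lam - mu))
print(closed, numeric)
```

The two printed numbers should agree to within quadrature error. In the pure-birth limit μ = 0 the series reduces to its k = 0 term, so the bracket becomes 1/n − 1/(n + 1) = 1/[n(n + 1)].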

With tree age τ thus redefined, its distribution now becomes

$$ \tilde{P}(\tau |n,\lambda ,\mu ) = \frac{{\lambda (1 - \mu /\lambda )e^{ - 2(\lambda - \mu )\tau } \left( {1 - e^{ - (\lambda - \mu )\tau } } \right)^{n - 1} }}{{\left( {\frac{1}{n(1 - \mu /\lambda )} - \sum\limits_{k = 0}^{\infty } {\frac{{(\mu /\lambda )^{k} }}{n + k + 1}} } \right)\left( {1 - \frac{\mu }{\lambda }e^{ - (\lambda - \mu )\tau } } \right)^{n + 1} }}, $$
(A.5)

which differs from Eq. (A.1) by an extra factor \(e^{ - (\lambda - \mu )\tau }\) (and, consequently, in its normalisation constant). Note that this is precisely the factor by which the average total population from which the clade is sampled would have shrunk on going back in time to the point where the MRCA emerged.

That the two results are not identical is therefore not surprising, given the different perspectives: a uniform prior on the time of origin versus a picture based on a subtree embedded in an exponentially increasing population (assuming λ and μ are known).
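To make the difference between Eqs. (A.1) and (A.5) concrete, the following Python sketch compares the two densities numerically. It is a minimal illustration, assuming n = 5, λ = 1 and μ = 0.5; the names `shape`, `mean_age` and `density_ratio` are hypothetical, and both densities are normalised by quadrature rather than through the closed-form constants.

```python
import math

def shape(tau, n, lam, mu, power):
    """Unnormalised tau-dependence: power=1 for Eq. (A.1), power=2 for Eq. (A.5)."""
    x = math.exp(-(lam - mu) * tau)
    return x ** power * (1 - x) ** (n - 1) / (1 - (mu / lam) * x) ** (n + 1)

def simpson(f, lo, hi, steps=20000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def mean_age(n, lam, mu, power):
    """Mean tree age under the numerically normalised density."""
    hi = 40.0 / (lam - mu)  # truncation point; the tail is of order e^{-40}
    mass = simpson(lambda t: shape(t, n, lam, mu, power), 0.0, hi)
    first = simpson(lambda t: t * shape(t, n, lam, mu, power), 0.0, hi)
    return first / mass

def density_ratio(tau, n, lam, mu):
    """Ratio of the (A.5) shape to the (A.1) shape, equal to e^{-(lam - mu) tau}."""
    return shape(tau, n, lam, mu, 2) / shape(tau, n, lam, mu, 1)

# Illustrative (assumed) parameter values.
n, lam, mu = 5, 1.0, 0.5
mean_or = mean_age(n, lam, mu, power=1)     # uniform prior on time of origin
mean_tilde = mean_age(n, lam, mu, power=2)  # redefined tree age, Eq. (A.5)
print(mean_or, mean_tilde)
print(density_ratio(1.0, n, lam, mu), math.exp(-(lam - mu) * 1.0))
```

Because the ratio of the two densities decreases monotonically in τ, Eq. (A.5) places more weight on younger trees, so the second printed mean should be smaller than the first.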

Cite this article

Mulder, W.H. Probability Distribution of Tree Age for the Simple Birth–Death Process, with Applications to Distributions of Number of Ancestral Lineages and Divergence Times for Pairs of Taxa in a Yule Tree. Bull Math Biol 85, 94 (2023). https://doi.org/10.1007/s11538-023-01196-7
