Abstract
Maximum likelihood estimators are used extensively to estimate unknown parameters of stochastic trait evolution models on phylogenetic trees. Although the MLE has been proven to converge to the true value in the independent-sample case, we cannot appeal to this result because trait values of different species are correlated due to shared evolutionary history. In this paper, we consider a 2-state symmetric model for a single binary trait and investigate the theoretical properties of the MLE for the transition rate in the large-tree limit. Here, the large-tree limit is a theoretical scenario where the number of taxa increases to infinity and we can observe the trait values for all species. Specifically, we prove that the MLE converges to the true value under some regularity conditions. These conditions ensure that the tree shape is not too irregular, and holds for many practical scenarios such as trees with bounded edges, trees generated from the Yule (pure birth) process, and trees generated from the coalescent point process. Our result also provides an upper bound for the distance between the MLE and the true value.
Similar content being viewed by others
References
Ané C (2008) Analysis of comparative data with hierarchical autocorrelation. Ann Appl Stat 2(3):1078–1102
Ané C, Ho LST, Roch S (2017) Phase transition on the convergence rate of parameter estimation under an Ornstein–Uhlenbeck diffusion on a tree. J Math Biol 74(1–2):355–385
Felsenstein J (1981) Evolutionary trees from gene frequencies and quantitative characters: finding maximum likelihood estimates. Evolution 35(6):1229–1242
Felsenstein J (1985) Phylogenies and the comparative method. Am Nat 125(1):1–15
Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W (2007) GEIGER: investigating evolutionary radiations. Bioinformatics 24(1):129–131
Ho LST, Ané C (2013) Asymptotic theory with hierarchical autocorrelation: Ornstein–Uhlenbeck tree models. Ann Stat 41(2):957–981
Ho LST, Ané C (2014) Intrinsic inference difficulties for trait evolution with Ornstein–Uhlenbeck models. Methods Ecol Evol 5(11):1133–1146
Jammalamadaka SR, Janson S (1986) Limit theorems for a triangular scheme of U-statistics with applications to inter-point distances. Ann Probab 14(4):1347–1358
Lambert A, Stadler T (2013) Birth-death models and coalescent point processes: the shape and probability of reconstructed phylogenies. Theor Popul Biol 90:113–128
Li G, Steel M, Zhang L (2008) More taxa are not necessarily better for the reconstruction of ancestral character states. Syst Biol 57(4):647–653
Lipton RJ, Tarjan RE (1979) A separator theorem for planar graphs. SIAM J Appl Math 36(2):177–189
Mooers A, Schluter D (1999) Reconstructing ancestor states with maximum likelihood: support for one- and two-rate models. Syst Biol 48(3):623–633
Mossel E, Steel M (2014) Majority rule has transition ratio 4 on Yule trees under a 2-state symmetric model. J Theor Biol 360:315–318
Pennell MW, Eastman JM, Slater GJ, Brown JW, Uyeda JC, FitzJohn RG, Alfaro ME, Harmon LJ (2014) geiger v2.0: an expanded suite of methods for fitting macroevolutionary models to phylogenetic trees. Bioinformatics 30(15):2216–2218
Sagitov S, Bartoszek K (2012) Interspecies correlation for neutrally evolving traits. J Theor Biol 309:11–19
Tuffley C, Steel M (1997) Links between maximum likelihood and maximum parsimony under a simple model of site substitution. Bull Math Biol 59(3):581–607
Van Erven T, Harremoës P (2014) Rényi divergence and Kullback–Leibler divergence. IEEE Trans Inf Theory 60(7):3797–3820
Yule GU (1925) A mathematical theory of evolution, based on the conclusions of Dr. JC Willis, FRS. Philoso Trans R Soc Lond Ser B 213:21–87
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
LSTH was supported by startup funds from Dalhousie University, the Canada Research Chairs program, and the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant RGPIN-2018-05447. FAM was supported by CISE-1564137, and in part by a Faculty Scholar grant from the Howard Hughes Medical Institute and the Simons Foundation. MAS was supported by National Science Foundation Grant DMS1264153 and National Institutes of Health grant R01 AI107034.
A Proofs
A Proofs
1.1 A.1 Proof of Lemma 2
Note that by symmetry, we have \({\mathbb {P}}({\mathbf {Y}} = {\mathbf {y}}~|~\rho = 0) = {\mathbb {P}}(1 - {\mathbf {Y}} = {\mathbf {y}}~|~ \rho = 1)\). We deduce that
which completes the proof.
1.2 A.2 Proof of Lemma 3
Denote \(P^{(u)}_v = {\mathbb {P}}({\mathbf {Y_u}} ~|~ {\mathbb {T}}_u, \mu , \rho _u = v)\) for \(u \in \{ 0, 1\}\), \(v \in \{ 0,1 \}\). We have
Moreover
Therefore
1.3 A.3 Proof of Lemma 4
Without loss of generality, we assume that \(\mu _1 < \mu _2\). By the mean value theorem, there exists \({{\tilde{\mu }}}_{uv} \in (\mu _1, \mu _2)\) for any \(u, v \in \{ 0, 1\}\) such that
We observe that there exists a \(C_{{\underline{\mu }}, {\overline{\mu }}}>0\) such that
Therefore,
This implies that
Note that
By applying (5) for all \(2n-3\) edges on the tree, we deduce that
Hence,
which validates the lemma.
1.4 A.4 Proof of Lemma 7
For all x, y, we have \(|u(x) - u(y)| \le |v(x)-v(y)| + 2c\). Let Y be an independent and identically distributed copy of X, we have
Note that for all \(z, c \in {\mathbb {R}}\) and \(\omega >1\),
Therefore,
1.5 A.5 Proof of Eq. (4)
In order to establish Eq. (4), we use the following Lemma.
Lemma 11
(Remark 3.4 in Jammalamadaka and Janson (1986)) Let \(X_1, X_2, \ldots , X_n\) be an i.i.d. sequence of random variables and \(f_n(x, y)\) be an indicator function on \({\mathbb {R}}^2\) such that
for some constant \(\lambda >0\). Define \(U_n =\sum _{1 \le i< j \le n}{f_n(X_i, X_j)}\).
Then \(U_n \rightarrow _d \text {Poisson}(\lambda )\).
We apply this Lemma with \(f_n(x, y) = I\{|x-y| < r_n\}\) where \(r_n = \epsilon /n^2\) for the sequence \(t_1, t_2, \ldots , t_n\) of the coalescent point process. Note that by Equation (4.3) in Jammalamadaka and Janson (1986),
for some constant \(c>0\). On the other hand, we have
Therefore, \(n^3 E[f_n(t_1, t_2) f_n(t_1, t_3)] \rightarrow 0\). Hence,
Rights and permissions
About this article
Cite this article
Ho, L.S.T., Dinh, V., Matsen, F.A. et al. On the convergence of the maximum likelihood estimator for the transition rate under a 2-state symmetric model. J. Math. Biol. 80, 1119–1138 (2020). https://doi.org/10.1007/s00285-019-01453-1
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00285-019-01453-1
Keywords
- Maximum likelihood estimator
- 2-state symmetric model
- Trait evolution
- Phylogenetics
- Yule process
- Coalescent point process