On the convergence of the maximum likelihood estimator for the transition rate under a 2-state symmetric model

Ho, Lam Si Tung; Dinh, Vu; Matsen, Frederick A.; Suchard, Marc A.

doi:10.1007/s00285-019-01453-1

Published: 21 November 2019

Journal of Mathematical Biology, volume 80, pages 1119–1138 (2020)

Lam Si Tung Ho¹, Vu Dinh², Frederick A. Matsen IV³ & Marc A. Suchard⁴

Abstract

Maximum likelihood estimators are used extensively to estimate unknown parameters of stochastic trait evolution models on phylogenetic trees. Although the MLE has been proven to converge to the true value in the independent-sample case, we cannot appeal to this result because trait values of different species are correlated due to shared evolutionary history. In this paper, we consider a 2-state symmetric model for a single binary trait and investigate the theoretical properties of the MLE for the transition rate in the large-tree limit. Here, the large-tree limit is a theoretical scenario where the number of taxa increases to infinity and we can observe the trait values for all species. Specifically, we prove that the MLE converges to the true value under some regularity conditions. These conditions ensure that the tree shape is not too irregular, and they hold for many practical scenarios such as trees with bounded edges, trees generated from the Yule (pure birth) process, and trees generated from the coalescent point process. Our result also provides an upper bound for the distance between the MLE and the true value.

References

Ané C (2008) Analysis of comparative data with hierarchical autocorrelation. Ann Appl Stat 2(3):1078–1102
Ané C, Ho LST, Roch S (2017) Phase transition on the convergence rate of parameter estimation under an Ornstein–Uhlenbeck diffusion on a tree. J Math Biol 74(1–2):355–385
Felsenstein J (1981) Evolutionary trees from gene frequencies and quantitative characters: finding maximum likelihood estimates. Evolution 35(6):1229–1242
Felsenstein J (1985) Phylogenies and the comparative method. Am Nat 125(1):1–15
Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W (2007) GEIGER: investigating evolutionary radiations. Bioinformatics 24(1):129–131
Ho LST, Ané C (2013) Asymptotic theory with hierarchical autocorrelation: Ornstein–Uhlenbeck tree models. Ann Stat 41(2):957–981
Ho LST, Ané C (2014) Intrinsic inference difficulties for trait evolution with Ornstein–Uhlenbeck models. Methods Ecol Evol 5(11):1133–1146
Jammalamadaka SR, Janson S (1986) Limit theorems for a triangular scheme of U-statistics with applications to inter-point distances. Ann Probab 14(4):1347–1358
Lambert A, Stadler T (2013) Birth-death models and coalescent point processes: the shape and probability of reconstructed phylogenies. Theor Popul Biol 90:113–128
Li G, Steel M, Zhang L (2008) More taxa are not necessarily better for the reconstruction of ancestral character states. Syst Biol 57(4):647–653
Lipton RJ, Tarjan RE (1979) A separator theorem for planar graphs. SIAM J Appl Math 36(2):177–189
Mooers A, Schluter D (1999) Reconstructing ancestor states with maximum likelihood: support for one- and two-rate models. Syst Biol 48(3):623–633
Mossel E, Steel M (2014) Majority rule has transition ratio 4 on Yule trees under a 2-state symmetric model. J Theor Biol 360:315–318
Pennell MW, Eastman JM, Slater GJ, Brown JW, Uyeda JC, FitzJohn RG, Alfaro ME, Harmon LJ (2014) geiger v2.0: an expanded suite of methods for fitting macroevolutionary models to phylogenetic trees. Bioinformatics 30(15):2216–2218
Sagitov S, Bartoszek K (2012) Interspecies correlation for neutrally evolving traits. J Theor Biol 309:11–19
Tuffley C, Steel M (1997) Links between maximum likelihood and maximum parsimony under a simple model of site substitution. Bull Math Biol 59(3):581–607
Van Erven T, Harremoës P (2014) Rényi divergence and Kullback–Leibler divergence. IEEE Trans Inf Theory 60(7):3797–3820
Yule GU (1925) A mathematical theory of evolution, based on the conclusions of Dr. JC Willis, FRS. Philos Trans R Soc Lond Ser B 213:21–87

Author information

Lam Si Tung Ho and Vu Dinh have equally contributed to this work.

Authors and Affiliations

Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
Lam Si Tung Ho
Department of Mathematical Sciences, University of Delaware, Newark, USA
Vu Dinh
Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, USA
Frederick A. Matsen IV
Departments of Biomathematics, Biostatistics and Human Genetics, University of California, Los Angeles, USA
Marc A. Suchard

Corresponding author

Correspondence to Lam Si Tung Ho.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

LSTH was supported by startup funds from Dalhousie University, the Canada Research Chairs program, and the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant RGPIN-2018-05447. FAM was supported by CISE-1564137, and in part by a Faculty Scholar grant from the Howard Hughes Medical Institute and the Simons Foundation. MAS was supported by National Science Foundation Grant DMS1264153 and National Institutes of Health grant R01 AI107034.

A Proofs

A.1 Proof of Lemma 2

Note that by symmetry, we have ${\mathbb {P}}({\mathbf {Y}} = {\mathbf {y}}~|~\rho = 0) = {\mathbb {P}}({\mathbf {1}} - {\mathbf {Y}} = {\mathbf {y}}~|~ \rho = 1)$. We deduce that

$$\begin{aligned} {\mathbb {P}}(h({\mathbf {Y}}) = x~|~\rho = 0)&= {\mathbb {P}}({\mathbf {Y}} \in h^{-1}(x)~|~\rho = 0) \\&= {\mathbb {P}}({\mathbf {1}} - {\mathbf {Y}} \in h^{-1}(x)~|~\rho = 1) \\&= {\mathbb {P}}(h({\mathbf {1}} - {\mathbf {Y}}) = x~|~\rho = 1) \\&= {\mathbb {P}}(h({\mathbf {Y}}) = x~|~\rho = 1) \end{aligned}$$

which completes the proof.
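The symmetry argument can be checked by brute force on a small example. The sketch below is an illustration only: it assumes a star tree (every leaf attached directly to the root), arbitrary branch lengths, and a statistic `h` that satisfies $h({\mathbf {1}} - {\mathbf {y}}) = h({\mathbf {y}})$, as the last equality of the proof requires. The function names and the choice of `h` (size of the minority class) are hypothetical and do not come from the paper.

```python
import itertools
import math
from collections import defaultdict


def transition(mu, t, a, b):
    """2-state symmetric transition probability from state a to state b over time t."""
    decay = math.exp(-2.0 * mu * t)
    return 0.5 * (1.0 + decay) if a == b else 0.5 * (1.0 - decay)


def leaf_distribution(root_state, mu, branch_lengths):
    """Joint law of the leaf states on a star tree, conditional on the root state."""
    dist = {}
    for y in itertools.product((0, 1), repeat=len(branch_lengths)):
        p = 1.0
        for state, t in zip(y, branch_lengths):
            p *= transition(mu, t, root_state, state)
        dist[y] = p
    return dist


def pushforward(dist, h):
    """Law of h(Y) induced by the law of Y."""
    out = defaultdict(float)
    for y, p in dist.items():
        out[h(y)] += p
    return dict(out)


def h(y):
    # Flip-invariant statistic (assumed for illustration): size of the minority class.
    ones = sum(y)
    return min(ones, len(y) - ones)


mu = 0.7                               # arbitrary transition rate
branch_lengths = [0.3, 1.1, 0.5, 2.0]  # arbitrary branch lengths
law0 = pushforward(leaf_distribution(0, mu, branch_lengths), h)
law1 = pushforward(leaf_distribution(1, mu, branch_lengths), h)
```

Comparing `law0` and `law1` entry by entry shows that the law of `h(Y)` does not depend on the root state, which is what the lemma asserts for this toy configuration.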

A.2 Proof of Lemma 3

Denote $P^{(u)}_v = {\mathbb {P}}({\mathbf {Y_u}} ~|~ {\mathbb {T}}_u, \mu , \rho _u = v)$ for $u \in \{ 1, 2\}$, $v \in \{ 0,1 \}$. We have

$$\begin{aligned} P_{{\mathbb {T}}_1,\mu }({\mathbf {Y_1}}) P_{{\mathbb {T}}_2,\mu }({\mathbf {Y_2}}) = \frac{1}{4} \sum _{u,v \in \{ 0, 1\}}{P^{(1)}_u P^{(2)}_v}. \end{aligned}$$

Moreover

$$\begin{aligned} P_{{\mathbb {T}},\mu }({\mathbf {Y}}) = \frac{1 + e^{-2 \mu d}}{4} \sum _{u \in \{ 0, 1 \}}{P^{(1)}_u P^{(2)}_u} + \frac{1 - e^{-2 \mu d}}{4} \sum _{u \in \{ 0, 1 \}}{P^{(1)}_u P^{(2)}_{1-u}}. \end{aligned}$$

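In $P_{{\mathbb {T}},\mu }({\mathbf {Y}})$, each of the four products $P^{(1)}_u P^{(2)}_v$ carries a weight between $(1 - e^{-2 \mu d})/4$ and $(1 + e^{-2 \mu d})/4 \le 1/2$, whereas in $P_{{\mathbb {T}}_1,\mu }({\mathbf {Y_1}}) P_{{\mathbb {T}}_2,\mu }({\mathbf {Y_2}})$ each carries weight $1/4$. Therefore

$$\begin{aligned} \left( 1 - e^{-2 \mu d}\right) P_{{\mathbb {T}}_1,\mu }({\mathbf {Y_1}}) P_{{\mathbb {T}}_2,\mu }({\mathbf {Y_2}}) \le P_{{\mathbb {T}},\mu }({\mathbf {Y}}) \le 2 P_{{\mathbb {T}}_1,\mu }({\mathbf {Y_1}}) P_{{\mathbb {T}}_2,\mu }({\mathbf {Y_2}}). \end{aligned}$$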
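The comparison between the joined likelihood and the product of subtree likelihoods can be probed numerically. The following sketch draws arbitrary values in $[0, 1]$ as stand-ins for the conditional likelihoods $P^{(1)}_v$ and $P^{(2)}_v$ (an assumption made purely for illustration), evaluates the two displayed mixtures, and checks that their ratio lies between $1 - e^{-2 \mu d}$ and $1 + e^{-2 \mu d} \le 2$. All function names are hypothetical.

```python
import math
import random


def joined_likelihood(p1, p2, mu, d):
    """Likelihood of the joined tree, written as the two-term mixture in the proof."""
    decay = math.exp(-2.0 * mu * d)
    same = p1[0] * p2[0] + p1[1] * p2[1]
    diff = p1[0] * p2[1] + p1[1] * p2[0]
    return 0.25 * (1.0 + decay) * same + 0.25 * (1.0 - decay) * diff


def product_likelihood(p1, p2):
    """Product of the two subtree likelihoods, each with a uniform root state."""
    return 0.25 * (p1[0] + p1[1]) * (p2[0] + p2[1])


def ratio_bounds_hold(mu, d, trials=10000, seed=0):
    """Check (1 - e^{-2 mu d}) * product <= joined <= (1 + e^{-2 mu d}) * product."""
    rng = random.Random(seed)
    decay = math.exp(-2.0 * mu * d)
    tol = 1e-15  # slack for floating-point rounding
    for _ in range(trials):
        p1 = (rng.random(), rng.random())  # stand-ins for P^{(1)}_0, P^{(1)}_1
        p2 = (rng.random(), rng.random())  # stand-ins for P^{(2)}_0, P^{(2)}_1
        joined = joined_likelihood(p1, p2, mu, d)
        prod = product_likelihood(p1, p2)
        if joined < (1.0 - decay) * prod - tol:
            return False
        if joined > (1.0 + decay) * prod + tol:
            return False
    return True


ok_short_edge = ratio_bounds_hold(mu=0.5, d=0.05)
ok_long_edge = ratio_bounds_hold(mu=0.5, d=3.0)
```

For a long connecting edge the factor $e^{-2 \mu d}$ is close to zero and the two quantities nearly coincide; for a short edge the lower factor degenerates towards zero, so the lower bound is informative only when $d$ is bounded away from zero.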

A.3 Proof of Lemma 4

Without loss of generality, we assume that $\mu _1 < \mu _2$. By the mean value theorem, there exists ${{\tilde{\mu }}}_{uv} \in (\mu _1, \mu _2)$ for any $u, v \in \{ 0, 1\}$ such that

$$\begin{aligned} \left| \log [{\mathbf {P}}_{\mu _1}(t)]_{uv} - \log [{\mathbf {P}}_{\mu _2}(t)]_{uv} \right| = \frac{ t e^{- 2 {{\tilde{\mu }}}_{uv} t}}{[{\mathbf {P}}_{{{\tilde{\mu }}}_{uv}}(t)]_{uv}} |\mu _1 - \mu _2| \le \frac{ 2 t e^{- 2 {{\tilde{\mu }}}_{uv} t}}{1 - e^{- 2 {{\tilde{\mu }}}_{uv} t}} |\mu _1 - \mu _2|, \end{aligned}$$

where the inequality uses $[{\mathbf {P}}_{\mu }(t)]_{uv} \ge (1 - e^{-2 \mu t})/2$. We observe that there exists a $C_{{\underline{\mu }}, {\overline{\mu }}}>0$ such that

$$\begin{aligned} \sup _{t \ge 0; {{\tilde{\mu }}}_{uv} \in ({\underline{\mu }}, {\overline{\mu }})} {\frac{ 2 t e^{- 2 {{\tilde{\mu }}}_{uv} t}}{1 - e^{- 2 {{\tilde{\mu }}}_{uv} t}}} \le C_{{\underline{\mu }}, {\overline{\mu }}}. \end{aligned}$$

Therefore,

$$\begin{aligned} | \log [{\mathbf {P}}_{\mu _1}(t)]_{uv} - \log [{\mathbf {P}}_{\mu _2}(t)]_{uv} | \le C_{{\underline{\mu }}, {\overline{\mu }}} |\mu _1 - \mu _2|. \end{aligned}$$

This implies that

$$\begin{aligned}{}[{\mathbf {P}}_{\mu _1}(t)]_{uv} \le e^{C_{{\underline{\mu }}, {\overline{\mu }}} |\mu _1 - \mu _2|} [{\mathbf {P}}_{\mu _2}(t)]_{uv}. \end{aligned}$$

(5)

Note that

$$\begin{aligned} P_{{\mathbb {T}},\mu }({\mathbf {Y}}) = \frac{1}{2} \sum _{y}{\left( \prod _{(u,v)\in E}{[{\mathbf {P}}_\mu ( d_{uv})}]_{y_u y_v} \right) }. \end{aligned}$$

By applying (5) for all $2n-3$ edges on the tree, we deduce that

$$\begin{aligned} P_{{\mathbb {T}},\mu _1}({\mathbf {Y}}) \le e^{(2n-3) C_{{\underline{\mu }}, {\overline{\mu }}} |\mu _1 - \mu _2| } P_{{\mathbb {T}},\mu _2}({\mathbf {Y}}) . \end{aligned}$$

Hence,

$$\begin{aligned} |\ell _{{\mathbb {T}},\mu _1}({\mathbf {Y}}) - \ell _{{\mathbb {T}},\mu _2}({\mathbf {Y}})| \le (2n-3) C_{{\underline{\mu }}, {\overline{\mu }}} |\mu _1 - \mu _2|, \end{aligned}$$

which validates the lemma.
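The per-edge Lipschitz bound at the heart of this proof can be checked on a grid. The sketch below assumes the standard 2-state symmetric transition probabilities $(1 \pm e^{-2 \mu t})/2$ and tests the candidate constant $C = 1/{\underline{\mu }}$. That constant is this sketch's own choice, justified by $x/(e^x - 1) \le 1$ for $x > 0$, and is not necessarily the constant the authors have in mind. The grid of rates and times, and all function names, are hypothetical.

```python
import math


def log_transition(mu, t, same):
    """Log of the 2-state symmetric transition probability over a time t > 0."""
    if same:
        return math.log(0.5 * (1.0 + math.exp(-2.0 * mu * t)))
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is tiny.
    return math.log(-0.5 * math.expm1(-2.0 * mu * t))


def worst_slope(mu_low, mu_high, times, steps=25):
    """Largest observed |log P_mu1 - log P_mu2| / |mu1 - mu2| over the grid."""
    mus = [mu_low + (mu_high - mu_low) * k / (steps - 1) for k in range(steps)]
    worst = 0.0
    for t in times:
        for same in (True, False):
            for i, m1 in enumerate(mus):
                for m2 in mus[i + 1:]:
                    gap = abs(log_transition(m1, t, same) - log_transition(m2, t, same))
                    worst = max(worst, gap / (m2 - m1))
    return worst


mu_low, mu_high = 0.2, 2.0                 # illustrative bounds on the rate
times = [1e-6, 1e-3, 0.1, 1.0, 10.0, 50.0]  # illustrative edge lengths
slope = worst_slope(mu_low, mu_high, times)
candidate_C = 1.0 / mu_low
```

On this grid the largest slope stays below `candidate_C`, with the worst case attained for very short edges and state changes, where the log transition probability behaves like $\log (\mu t)$.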

A.4 Proof of Lemma 7

For all $x, y$, we have $|u(x) - u(y)| \le |v(x)-v(y)| + 2c$. Let $Y$ be an independent copy of $X$ with the same distribution. Then

$$\begin{aligned} 2\mathrm {Var}[u(X)]&= {\mathbb {E}}_{X}[u(X)^2] + {\mathbb {E}}_{Y}[u(Y)^2] - 2 {\mathbb {E}}_{X}[u(X)] {\mathbb {E}}_{Y}[u(Y)]\\&= {\mathbb {E}}_{X, Y}\left( u(X)^2 + u(Y)^2 - 2 u(X) u(Y)\right) \\&= {\mathbb {E}}_{X, Y}\left( [u(X)-u(Y)]^2\right) \\&\le {\mathbb {E}}_{X, Y}\left( [|v(X)-v(Y)| + 2c]^2 \right) . \end{aligned}$$

Note that for all $z, c \in {\mathbb {R}}$ and $\omega >1$, the elementary bound $4zc \le (\omega -1) z^2 + \frac{4}{\omega -1} c^2$ gives

$$\begin{aligned} (z + 2c)^2 \le \omega z^2 + \frac{4\omega }{\omega -1} c^2. \end{aligned}$$

Therefore,

$$\begin{aligned} 2\mathrm {Var}[u(X)]&\le \omega {\mathbb {E}}_{X, Y}\left( [v(X)-v(Y)]^2 \right) +\frac{4\omega }{\omega -1}c^2\\&= 2 \omega \mathrm {Var}[v(X)] +\frac{4\omega }{\omega -1}c^2. \end{aligned}$$
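Dividing the last display by 2 gives $\mathrm {Var}[u(X)] \le \omega \mathrm {Var}[v(X)] + \frac{2\omega }{\omega -1} c^2$, which can be checked on an empirical distribution. In the sketch below, $X$ is uniform over a finite index set, $v$ takes Gaussian values, and $u$ differs from $v$ by at most $c$ pointwise, so that $|u(x) - u(y)| \le |v(x) - v(y)| + 2c$ holds. These distributional choices and the function name are assumptions made for illustration.

```python
import random
import statistics


def variance_gap(omega, c, n=5000, seed=1):
    """Return omega*Var[v] + 2*omega/(omega-1)*c^2 - Var[u] on an empirical law."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # u stays within c of v pointwise, so |u(x)-u(y)| <= |v(x)-v(y)| + 2c.
    u = [vi + rng.uniform(-c, c) for vi in v]
    lhs = statistics.pvariance(u)  # population variance under the empirical law
    rhs = omega * statistics.pvariance(v) + 2.0 * omega / (omega - 1.0) * c ** 2
    return rhs - lhs


gaps = {omega: variance_gap(omega, c=0.5) for omega in (1.01, 1.5, 2.0, 10.0)}
```

Every entry of `gaps` is non-negative. The parameter $\omega$ trades off the two terms: values close to 1 keep the multiplier on $\mathrm {Var}[v(X)]$ tight at the cost of a large coefficient on $c^2$.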

A.5 Proof of Eq. (4)

To establish Eq. (4), we use the following lemma.

Lemma 11

(Remark 3.4 in Jammalamadaka and Janson (1986)) Let $X_1, X_2, \ldots , X_n$ be an i.i.d. sequence of random variables and $f_n(x, y)$ be an indicator function on ${\mathbb {R}}^2$ such that

$$\begin{aligned} n^3 E[f_n(X_1, X_2) f_n(X_1, X_3)] \rightarrow 0 \quad \mathrm{and}\quad \frac{1}{2}n^2E[f_n(X_1, X_2)] \rightarrow \lambda \end{aligned}$$

for some constant $\lambda >0$. Define $U_n =\sum _{1 \le i< j \le n}{f_n(X_i, X_j)}$.

Then $U_n \rightarrow _d \text {Poisson}(\lambda )$.

We apply this lemma with $f_n(x, y) = I\{|x-y| < r_n\}$, where $r_n = \epsilon /n^2$, to the sequence $t_1, t_2, \ldots , t_n$ of the coalescent point process. Note that by Equation (4.3) in Jammalamadaka and Janson (1986),

$$\begin{aligned} \frac{1}{2}n^2E[f_n(t_1, t_2)] \rightarrow c \epsilon \int _{0}^T{\phi (x)^2 dx} \end{aligned}$$

for some constant $c>0$. On the other hand, we have

$$\begin{aligned} E[f_n(t_1, t_2) f_n(t_1, t_3)]&= E[(E[f_n(t_1, t_2) \mid t_1] )^2]\\&= \int _{0}^T{\left( \int _{t - r_n}^{t+r_n}{\phi (\tau )d\tau }\right) ^2\phi (t) dt}\le \frac{4 \Vert \phi \Vert _\infty ^2 \epsilon ^2}{n^4}. \end{aligned}$$

Therefore, $n^3 E[f_n(t_1, t_2) f_n(t_1, t_3)] \rightarrow 0$. Hence,

$$\begin{aligned} {\mathbb {P}}\left( n^2 \min _{1 \le i < j \le n-1}{|t_i - t_j|} \le \epsilon \right) = {\mathbb {P}}(U_n \ge 1) \rightarrow 1 - \exp \left( -c \epsilon \int _0^T{\phi ^2} \right) . \end{aligned}$$
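The $n^{-2}$ scaling of the smallest gap can be illustrated by simulation. The sketch below replaces the coalescent point process by i.i.d. Uniform(0, 1) draws, which is an assumption made only to keep the example self-contained. For that density $\int \phi ^2 = 1$, and a direct calculation gives the limit $1 - e^{-\epsilon }$, which corresponds to $c = 1$ in the display. The function names, sample size, and number of replicates are hypothetical.

```python
import math
import random


def min_gap(sample):
    """Smallest distance between any two points, via adjacent sorted values."""
    s = sorted(sample)
    return min(b - a for a, b in zip(s, s[1:]))


def close_pair_frequency(n, eps, reps=2000, seed=0):
    """Monte Carlo estimate of P(n^2 * min_{i<j} |t_i - t_j| <= eps)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]  # i.i.d. Uniform(0, 1) stand-in
        if n * n * min_gap(sample) <= eps:
            hits += 1
    return hits / reps


eps = 1.0
estimate = close_pair_frequency(n=300, eps=eps)
limit = 1.0 - math.exp(-eps)  # Poisson limit for the uniform density
```

With these settings `estimate` should agree with `limit` up to Monte Carlo error, which is of order $10^{-2}$ for 2000 replicates.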

About this article

Cite this article

Ho, L.S.T., Dinh, V., Matsen, F.A. et al. On the convergence of the maximum likelihood estimator for the transition rate under a 2-state symmetric model. J. Math. Biol. 80, 1119–1138 (2020). https://doi.org/10.1007/s00285-019-01453-1

Received: 09 March 2019
Revised: 04 November 2019
Published: 21 November 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s00285-019-01453-1
