Abstract
Diffusion processes on trees are commonly used in evolutionary biology to model the joint distribution of continuous traits, such as body mass, across species. Estimating the parameters of such processes from tip values is challenging because the shared evolutionary history induces intrinsic correlation among the observations, which violates the standard independence assumption of large-sample theory. For instance, Ho and Ané (Ann Stat 41:957–981, 2013) recently proved that the mean (also known in this context as the selection optimum) of an Ornstein–Uhlenbeck process on a tree cannot be estimated consistently from an increasing number of tip observations if the tree height is bounded. Here, using a fruitful connection to the so-called reconstruction problem in probability theory, we study the convergence rate of parameter estimation in the unbounded height case. For the mean of the process, we provide a necessary and sufficient condition for the consistency of the maximum likelihood estimator (MLE) and establish a phase transition on its convergence rate in terms of the growth of the tree. In particular, we show that a loss of \(\sqrt{n}\)-consistency (i.e., the variance of the MLE becomes \(\varOmega (n^{-1})\), where \(n\) is the number of tips) occurs when the tree growth is larger than a threshold related to the phase transition of the reconstruction problem. For the covariance parameters, we give a novel, efficient estimation method which achieves \(\sqrt{n}\)-consistency under natural assumptions on the tree. Our theoretical results provide practical suggestions for the design of comparative data collection.
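As a rough numerical illustration of the bounded-height setting mentioned in the abstract, the sketch below computes the variance of the MLE of the mean on a balanced binary tree. It is not the paper's method. It rests on several assumptions that the abstract does not state: the root value is drawn from the stationary distribution, the selection strength `alpha` and diffusion variance `sigma2` are known, and the tree is balanced with equal edge lengths. Under these assumptions the covariance between tips \(i\) and \(j\) is \(\frac{\sigma^2}{2\alpha} e^{-\alpha d_{ij}}\), with \(d_{ij}\) the path distance between the tips, and the MLE of the mean is the generalized least squares estimator with variance \(1/(\mathbf{1}^\top V^{-1}\mathbf{1})\). The function names are hypothetical.

```python
import numpy as np


def balanced_tip_distances(depth, edge_length):
    """Pairwise path distances between the 2**depth tips of a balanced
    binary tree whose edges all have length `edge_length`."""
    n = 2 ** depth
    idx = np.arange(n)
    xor = idx[:, None] ^ idx[None, :]
    # The bit length of i XOR j is the number of levels between the tips
    # and their most recent common ancestor.
    levels = np.zeros_like(xor)
    tmp = xor.copy()
    while tmp.any():
        levels += (tmp > 0)
        tmp >>= 1
    return 2.0 * edge_length * levels


def ou_mean_mle_variance(dist, alpha, sigma2):
    """Variance of the GLS estimator (the MLE when alpha and sigma2 are
    known) of the OU mean, assuming a stationary root distribution."""
    gamma = sigma2 / (2.0 * alpha)        # stationary variance
    V = gamma * np.exp(-alpha * dist)     # tip covariance matrix
    ones = np.ones(V.shape[0])
    return 1.0 / (ones @ np.linalg.solve(V, ones))


# Hold the tree height fixed at 1 while the number of tips grows.
height, alpha, sigma2 = 1.0, 1.0, 2.0
for depth in range(1, 7):
    dist = balanced_tip_distances(depth, height / depth)
    var = ou_mean_mle_variance(dist, alpha, sigma2)
    print(f"n = {2 ** depth:3d}  Var(mu_hat) = {var:.4f}")
```

Replacing `height / depth` with a fixed edge length makes the height grow with the number of tips, which is the unbounded-height regime the paper studies.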
References
Adamczak R, Miłoś P (2014) U-statistics of Ornstein–Uhlenbeck branching particle system. J Theor Probab 27(4):1071–1111
Adamczak R, Miłoś P (2015) CLT for Ornstein–Uhlenbeck branching particle system. Electron J Probab 20(42):1–35
Anderson TW (1984) An introduction to multivariate statistical analysis, 2nd edn. Wiley, Chichester
Athreya K, Ney P (2004) Branching processes, Dover Books on Mathematics Series. Dover, New York
Bartoszek K, Pienaar J, Mostad P, Andersson S, Hansen TF (2012) A phylogenetic comparative method for studying multivariate adaptation. J Theor Biol 314:204–215
Bartoszek K, Sagitov S (2015) Phylogenetic confidence intervals for the optimal trait value. J Appl Probab 52(4):1115–1132
Bininda-Emonds O, Cardillo M, Jones KE, MacPhee RDE, Beck RMD, Grenyer R, Price SA, Vos RA, Gittleman JL, Purvis A (2007) The delayed rise of present-day mammals. Nature 446(7135):507–512
Brawand D, Soumillon M, Necsulea A, Julien P, Csardi G, Harrigan P, Weier M, Liechti A, Aximu-Petri A, Kircher M, Albert FW, Zeller U, Khaitovich P, Grutzner F, Bergmann S, Nielsen R, Pääbo S, Kaessmann H (2011) The evolution of gene expression levels in mammalian organs. Nature 478(7369):343–348
Butler MA, King AA (2004) Phylogenetic comparative analysis: a modeling approach for adaptive evolution. Am Nat 164(6):683–695
Cooper N, Purvis A (2010) Body size evolution in mammals: complexity in tempo and mode. Am Nat 175(6):727–738
Crawford FW, Suchard MA (2013) Diversity, disparity, and evolutionary rate estimation for unresolved Yule trees. Syst Biol 62(3):439–455
Evans WS, Kenyon C, Peres Y, Schulman LJ (2000) Broadcasting on trees and the Ising model. Ann Appl Probab 10(2):410–433
Felsenstein J (2004) Inferring phylogenies. Sinauer Associates
Felsenstein J (1985) Phylogenies and the comparative method. Am Nat 125(1):1–15
Hansen TF (1997) Stabilizing selection and the comparative analysis of adaptation. Evolution 51(5):1341–1351
Harmon L, Weir J, Brock C, Glor R, Challenger W (2008) GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131
Harmon LJ, Losos JB, Davies TJ, Gillespie RG, Gittleman JL, Jennings WB, Kozak KH, McPeek MA, Moreno-Roark F, Near TJ, Purvis A, Ricklefs RE, Schluter D, Schulte JA II, Seehausen O, Sidlauskas BL, Torres-Carvajal O, Weir JT, Mooers AØ (2010) Early bursts of body size and shape evolution are rare in comparative data. Evolution 64(8):2385–2396
Ho LST, Ané C (2013) Asymptotic theory with hierarchical autocorrelation: Ornstein–Uhlenbeck tree models. Ann Stat 41:957–981
Ho LST, Ané C (2014) Intrinsic inference difficulties for trait evolution with Ornstein–Uhlenbeck models. Methods Ecol Evol 5(11):1133–1146
Ho LST, Ané C (2014) A linear-time algorithm for Gaussian and non-Gaussian trait evolution models. Syst Biol 63(3):397–408
Jetz W, Thomas G, Joy J, Hartmann K, Mooers A (2012) The global diversity of birds in space and time. Nature 491(7424):444–448
Lawler E (1976) Combinatorial optimization: networks and matroids. Holt, Rinehart and Winston
Mossel E, Roch S, Sly A (2013) Robust estimation of latent tree graphical models: inferring hidden states with inexact parameters. IEEE Trans Inf Theory 59(7):4357–4373
Mossel E, Steel M (2014) Majority rule has transition ratio 4 on Yule trees under a 2-state symmetric model. J Theor Biol 360(7):315–318
Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290
Peres Y (1999) Probability on trees: an introductory climb. In: Bernard P (ed) Lectures on probability theory and statistics, Lecture Notes in Mathematics, vol. 1717. Springer, Berlin, pp 193–280
Rohlfs RV, Harrigan P, Nielsen R (2014) Modeling gene expression evolution with an extended Ornstein–Uhlenbeck process accounting for within-species variation. Mol Biol Evol 31(1):201–211
Semple C, Steel M (2003) Phylogenetics. Oxford lecture series in mathematics and its applications. Oxford University Press, Oxford
Shao J (2003) Mathematical statistics. Springer, Berlin
Venditti C, Meade A, Pagel M (2011) Multiple routes to mammalian diversity. Nature 479(7373):393–396
Yule GU (1925) A mathematical theory of evolution, based on the conclusions of Dr. JC Willis, FRS. Philos Trans R Soc Lond Ser B 213:21–87
Additional information
This work was funded in part by NSF grants DMS-1007144 and DMS-1149312 (CAREER), and an Alfred P. Sloan Research Fellowship to S. Roch, and by NSF Grant DMS-1106483 to C. Ané.
Cite this article
Ané, C., Ho, L.S.T. & Roch, S. Phase transition on the convergence rate of parameter estimation under an Ornstein–Uhlenbeck diffusion on a tree. J. Math. Biol. 74, 355–385 (2017). https://doi.org/10.1007/s00285-016-1029-x