
Journal of Mathematical Biology, Volume 74, Issue 1–2, pp 355–385

Phase transition on the convergence rate of parameter estimation under an Ornstein–Uhlenbeck diffusion on a tree

  • Cécile Ané
  • Lam Si Tung Ho
  • Sebastien Roch

Abstract

Diffusion processes on trees are commonly used in evolutionary biology to model the joint distribution of continuous traits, such as body mass, across species. Estimating the parameters of such processes from tip values is challenging because the shared evolutionary history induces intrinsic correlation between the observations, violating the standard independence assumption of large-sample theory. For instance, Ho and Ané (Ann Stat 41:957–981, 2013) recently proved that the mean (also known in this context as the selection optimum) of an Ornstein–Uhlenbeck process on a tree cannot be estimated consistently from an increasing number of tip observations if the tree height is bounded. Here, using a fruitful connection to the so-called reconstruction problem in probability theory, we study the convergence rate of parameter estimation in the unbounded height case. For the mean of the process, we provide a necessary and sufficient condition for the consistency of the maximum likelihood estimator (MLE) and establish a phase transition on its convergence rate in terms of the growth of the tree. In particular, we show that a loss of \(\sqrt{n}\)-consistency (i.e., the variance of the MLE becomes \(\varOmega (n^{-1})\), where \(n\) is the number of tips) occurs when the tree growth is larger than a threshold related to the phase transition of the reconstruction problem. For the covariance parameters, we give a novel, efficient estimation method which achieves \(\sqrt{n}\)-consistency under natural assumptions on the tree. Our theoretical results provide practical suggestions for the design of comparative data collection.
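As an illustration of the setting described in the abstract, the following minimal sketch computes the variance of the estimator of the mean on a toy tree. It is not the authors' code or method. It rests on assumptions that are not stated in the abstract: a balanced binary tree with equal branch lengths, a stationary root, known covariance parameters `alpha` and `sigma2`, and the standard stationary Ornstein–Uhlenbeck tip covariance \(\frac{\sigma^2}{2\alpha} e^{-\alpha d_{ij}}\), where \(d_{ij}\) is the tree distance between tips \(i\) and \(j\). Under these assumptions the MLE of the mean is the generalized least squares estimator \(\hat{\mu} = (\mathbf{1}^\top V^{-1} y)/(\mathbf{1}^\top V^{-1} \mathbf{1})\), with variance \(1/(\mathbf{1}^\top V^{-1} \mathbf{1})\). All function names are hypothetical.

```python
import math


def ou_covariance(height, branch_len, alpha, sigma2):
    """Tip covariance matrix under the ASSUMED toy model: balanced binary
    tree with `height` levels, equal branch lengths, stationary root."""
    n = 2 ** height
    gamma = sigma2 / (2.0 * alpha)  # stationary variance of the process
    # For tips i and j, (i ^ j).bit_length() is the number of levels up to
    # their most recent common ancestor, so the tree distance is twice that
    # many branches.
    return [
        [gamma * math.exp(-alpha * 2.0 * branch_len * (i ^ j).bit_length())
         for j in range(n)]
        for i in range(n)
    ]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with
    partial pivoting (plain Python, no external libraries)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def mle_mean(y, v):
    """GLS estimate of the mean given tip values y and known covariance v."""
    w = solve(v, [1.0] * len(y))  # w = V^{-1} 1
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def mle_mean_variance(height, branch_len, alpha, sigma2=1.0):
    """Variance of the GLS estimate of the mean: 1 / (1' V^{-1} 1)."""
    v = ou_covariance(height, branch_len, alpha, sigma2)
    w = solve(v, [1.0] * len(v))
    return 1.0 / sum(w)


if __name__ == "__main__":
    # Compare the estimator variance with 1/n as the toy tree grows taller.
    for h in range(1, 7):
        n = 2 ** h
        var = mle_mean_variance(h, branch_len=0.5, alpha=1.0)
        print(f"n={n:3d}  var={var:.5f}  n*var={n * var:.5f}")
```

Varying `alpha` and `branch_len` in this sketch changes how quickly the variance shrinks relative to \(1/n\) as tips are added, which is the kind of behavior the abstract's phase transition concerns. The toy tree is far more restrictive than the trees treated in the paper.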

Keywords

Ornstein–Uhlenbeck; Phase transition; Evolution; Phylogenetic; Consistency; Maximum likelihood estimator

Mathematics Subject Classification

Primary 62F12; Secondary 92D15, 92B10

References

  1. Adamczak R, Miłoś P (2014) U-statistics of Ornstein–Uhlenbeck branching particle system. J Theor Probab 27(4):1071–1111
  2. Adamczak R, Miłoś P (2015) CLT for Ornstein–Uhlenbeck branching particle system. Electron J Probab 20(42):1–35
  3. Anderson TW (1984) An introduction to multivariate statistical analysis, 2nd edn. Wiley, Chichester
  4. Athreya K, Ney P (2004) Branching processes, Dover Books on Mathematics Series. Dover, New York
  5. Bartoszek K, Pienaar J, Mostad P, Andersson S, Hansen TF (2012) A phylogenetic comparative method for studying multivariate adaptation. J Theor Biol 314:204–215
  6. Bartoszek K, Sagitov S (2015) Phylogenetic confidence intervals for the optimal trait value. J Appl Probab 52(4):1115–1132
  7. Bininda-Emonds O, Cardillo M, Jones KE, MacPhee RDE, Beck RMD, Grenyer R, Price SA, Vos RA, Gittleman JL, Purvis A (2007) The delayed rise of present-day mammals. Nature 446(7135):507–512
  8. Brawand D, Soumillon M, Necsulea A, Julien P, Csardi G, Harrigan P, Weier M, Liechti A, Aximu-Petri A, Kircher M, Albert FW, Zeller U, Khaitovich P, Grutzner F, Bergmann S, Nielsen R, Pääbo S, Kaessmann H (2011) The evolution of gene expression levels in mammalian organs. Nature 478(7369):343–348
  9. Butler MA, King AA (2004) Phylogenetic comparative analysis: a modeling approach for adaptive evolution. Am Nat 164(6):683–695
  10. Cooper N, Purvis A (2010) Body size evolution in mammals: complexity in tempo and mode. Am Nat 175(6):727–738
  11. Crawford FW, Suchard MA (2013) Diversity, disparity, and evolutionary rate estimation for unresolved Yule trees. Syst Biol 62(3):439–455
  12. Evans WS, Kenyon C, Peres Y, Schulman LJ (2000) Broadcasting on trees and the Ising model. Ann Appl Probab 10(2):410–433
  13. Felsenstein J (2004) Inferring phylogenies. Sinauer Associates
  14. Felsenstein J (1985) Phylogenies and the comparative method. Am Nat 125(1):1–15
  15. Hansen TF (1997) Stabilizing selection and the comparative analysis of adaptation. Evolution 51(5):1341–1351
  16. Harmon L, Weir J, Brock C, Glor R, Challenger W (2008) GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131
  17. Harmon LJ, Losos JB, Davies TJ, Gillespie RG, Gittleman JL, Jennings WB, Kozak KH, McPeek MA, Moreno-Roark F, Near TJ, Purvis A, Ricklefs RE, Schluter D, Schulte JA II, Seehausen O, Sidlauskas BL, Torres-Carvajal O, Weir JT, Mooers AØ (2010) Early bursts of body size and shape evolution are rare in comparative data. Evolution 64(8):2385–2396
  18. Ho LST, Ané C (2013) Asymptotic theory with hierarchical autocorrelation: Ornstein–Uhlenbeck tree models. Ann Stat 41:957–981
  19. Ho LST, Ané C (2014) Intrinsic inference difficulties for trait evolution with Ornstein–Uhlenbeck models. Methods Ecol Evol 5(11):1133–1146
  20. Ho LST, Ané C (2014) A linear-time algorithm for Gaussian and non-Gaussian trait evolution models. Syst Biol 63(3):397–408
  21. Jetz W, Thomas G, Joy J, Hartmann K, Mooers A (2012) The global diversity of birds in space and time. Nature 491(7424):444–448
  22. Lawler E (1976) Combinatorial optimization: networks and matroids. Holt, Rinehart and Winston
  23. Mossel E, Roch S, Sly A (2013) Robust estimation of latent tree graphical models: inferring hidden states with inexact parameters. IEEE Trans Inf Theory 59(7):4357–4373
  24. Mossel E, Steel M (2014) Majority rule has transition ratio 4 on Yule trees under a 2-state symmetric model. J Theor Biol 360(7):315–318
  25. Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290
  26. Peres Y (1999) Probability on trees: an introductory climb. In: Bernard P (ed) Lectures on probability theory and statistics, Lecture Notes in Mathematics, vol 1717. Springer, Berlin, pp 193–280
  27. Rohlfs RV, Harrigan P, Nielsen R (2014) Modeling gene expression evolution with an extended Ornstein–Uhlenbeck process accounting for within-species variation. Mol Biol Evol 31(1):201–211
  28. Semple C, Steel M (2003) Phylogenetics. Oxford lecture series in mathematics and its applications. Oxford University Press, Oxford
  29. Shao J (2003) Mathematical statistics. Springer, Berlin
  30. Venditti C, Meade A, Pagel M (2011) Multiple routes to mammalian diversity. Nature 479(7373):393–396
  31. Yule GU (1925) A mathematical theory of evolution, based on the conclusions of Dr. JC Willis, FRS. Philos Trans R Soc Lond Ser B 213:21–87

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Cécile Ané (1)
  • Lam Si Tung Ho (2, corresponding author)
  • Sebastien Roch (3)

  1. Departments of Statistics and of Botany, University of Wisconsin-Madison, Madison, USA
  2. Department of Statistics, University of Wisconsin-Madison, Madison, USA
  3. Departments of Mathematics and Statistics (by courtesy), University of Wisconsin-Madison, Madison, USA
