Limiting behaviour of Fréchet means in the space of phylogenetic trees

Abstract

As demonstrated in our previous work on \({\varvec{T}}_{\!4}\), the space of phylogenetic trees with four leaves, the topological structure of the space plays an important role in the non-classical limiting behaviour of the sample Fréchet means in \({\varvec{T}}_{\!4}\). Nevertheless, the techniques used in that paper cannot be adapted to analyse Fréchet means in the space \({\varvec{T}}_{\!m}\) of phylogenetic trees with \(m(\geqslant \!5)\) leaves. To investigate the latter, this paper first studies the log map of \({\varvec{T}}_{\!m}\). Then, in terms of a modified version of this map, we characterise Fréchet means in \({\varvec{T}}_{\!m}\) that lie in top-dimensional or co-dimension one strata. We derive the limiting distributions for the corresponding sample Fréchet means, generalising our previous results. In particular, the results show that, although they are related to the Gaussian distribution, the forms taken by the limiting distributions depend on the co-dimensions of the strata in which the Fréchet means lie.
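
To make the notion concrete, the following is a minimal sketch, not taken from the paper, of a sample Fréchet mean (a minimiser of the sum of squared distances to the sample points) on a toy stratified space: the 3-spider, three half-lines glued at a common origin. It assumes points are encoded as (leg, distance from origin) pairs, and the names `spider_distance` and `spider_frechet_mean` are hypothetical. The space \({\varvec{T}}_{\!m}\) is far richer than this toy case, so the sketch only illustrates how a Fréchet mean can lie either in a top-dimensional stratum (the interior of a leg) or in a lower-dimensional one (the origin).

```python
from typing import List, Tuple

# Assumed encoding: (leg index, distance from the origin along that leg).
Point = Tuple[int, float]


def spider_distance(p: Point, q: Point) -> float:
    """Geodesic distance on the spider: paths between legs pass through the origin."""
    (i, x), (j, y) = p, q
    return abs(x - y) if i == j else x + y


def spider_frechet_mean(sample: List[Point], legs: int = 3) -> Point:
    """Sample Fréchet mean on a spider with `legs` half-lines.

    For a candidate at distance t along leg k, the sum of squared distances is
        sum_{i on leg k} (x_i - t)^2 + sum_{i on other legs} (x_i + t)^2,
    which is minimised over t >= 0 at max(0, (S_k - S_other) / n), where S_k is
    the total distance of sample points on leg k and S_other that of the rest.
    At most one leg can give a positive value; if none does, the mean is the origin.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    totals = [0.0] * legs
    for leg, x in sample:
        totals[leg] += x
    grand_total = sum(totals)
    for leg in range(legs):
        # (S_k - S_other) / n with S_other = grand_total - S_k
        t = (2.0 * totals[leg] - grand_total) / n
        if t > 0:
            return (leg, t)
    # The origin belongs to every leg; it is reported on leg 0 by convention.
    return (0, 0.0)


if __name__ == "__main__":
    # Mass concentrated on leg 0: the mean lies in the interior of that leg.
    print(spider_frechet_mean([(0, 3.0), (1, 1.0), (2, 1.0)]))
    # Balanced mass: the mean sits at the origin.
    print(spider_frechet_mean([(0, 1.0), (1, 1.0), (2, 1.0)]))
```

In the balanced sample, a small perturbation of any one point leaves the mean at the origin, which gives some intuition for why the limiting behaviour of sample Fréchet means can depend on the stratum in which the mean lies.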

References

  1. Bačák, M. (2014). Computing medians and means in Hadamard spaces. SIAM Journal on Optimization, 24, 1542–1566.

  2. Barden, D., Le, H., Owen, M. (2013). Central limit theorems for Fréchet means in the space of phylogenetic trees. Electronic Journal of Probability, 18(25).

  3. Basrak, B. (2010). Limit theorems for the inductive mean on metric trees. Journal of Applied Probability, 47, 1136–1149.

  4. Bhattacharya, R., Patrangenaru, V. (2005). Large sample theory of intrinsic and extrinsic sample means on manifolds-II. Annals of Statistics, 33, 1225–1259.

  5. Bhattacharya, R., Patrangenaru, V. (2014). Statistics on manifolds and landmarks based image analysis: A nonparametric theory with applications. Journal of Statistical Planning and Inference, 145, 1–22.

  6. Billera, L., Holmes, S., Vogtmann, K. (2001). Geometry of the space of phylogenetic trees. Advances in Applied Mathematics, 27, 733–767.

  7. Bridson, M., Haefliger, A. (1999). Metric Spaces of Non-positive Curvature. Berlin: Springer.

  8. Dryden, I., Mardia, K. (1998). Statistical Shape Analysis. Chichester: Wiley.

  9. Dryden, I., Le, H., Preston, S., Wood, A. (2014). Mean shapes, projections and intrinsic limiting distributions. Journal of Statistical Planning and Inference, 145, 25–32.

  10. Feragen, A., Owen, M., Petersen, J., Wille, M., Thomsen, L., Dirksen, A., de Bruijne, M. (2013). Tree-space statistics and approximations for large-scale analysis of anatomical trees. In Information Processing in Medical Imaging, 23rd International Conference, IPMI (pp. 74–85).

  11. Holmes, S. (2003). Statistics for phylogenetic trees. Theoretical Population Biology, 63, 17–32.

  12. Hotz, T., Huckemann, S., Le, H., Marron, J., Mattingly, J., Miller, E., et al. (2013). Sticky central limit theorems on open books. Annals of Applied Probability, 23, 2238–2258.

  13. Kendall, W., Le, H. (2011). Limit theorems for empirical Fréchet means of independent and non-identically distributed manifold-valued random variables. Brazilian Journal of Probability and Statistics, 25, 323–352.

  14. Miller, E., Owen, M., Provan, S. (2015). Polyhedral computational geometry for averaging metric phylogenetic trees. Advances in Applied Mathematics, 68, 51–91.

  15. Nye, T. (2011). Principal components analysis in the space of phylogenetic trees. Annals of Statistics, 39, 2716–2739.

  16. Nye, T. (2014). An algorithm for constructing principal geodesics in phylogenetic treespace. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 11, 304–315.

  17. Owen, M. (2011). Computing geodesic distances in tree space. SIAM Journal on Discrete Mathematics, 25, 1506–1529.

  18. Owen, M., Provan, J. (2011). A fast algorithm for computing geodesic distances in tree space. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8, 2–13.

  19. Schröder, E. (1870). Vier combinatorische Probleme. Zeitschrift für Mathematik und Physik, 15, 361–376.

  20. Vogtmann, K. (2007). Geodesics in the space of trees. www.math.cornell.edu/~vogtmann/papers/TreeGeodesicss/index.html.

  21. Willis, A. (2016). Confidence sets for phylogenetic trees. arXiv:1607.08288v1 [stat.ME].

  22. Ziezold, H. (1977). On expected figures and a strong law of large numbers for random elements in quasi-metric spaces. In Transactions of the seventh Prague Conference on Information Theory, Statistical Decision Functions and Random Processes A (pp. 591–602).

Acknowledgments

H. Le acknowledges funding from the Engineering and Physical Sciences Research Council. M. Owen acknowledges the support of the Fields Institute.

Corresponding author

Correspondence to H. Le.

About this article

Cite this article

Barden, D., Le, H. & Owen, M. Limiting behaviour of Fréchet means in the space of phylogenetic trees. Ann Inst Stat Math 70, 99–129 (2018). https://doi.org/10.1007/s10463-016-0582-9

Keywords

  • Central limit theorem
  • Fréchet mean
  • Log map
  • Phylogenetic trees
  • Stratified manifold