New Algorithms for Computing Phylogenetic Biodiversity

  • Constantinos Tsirogiannis
  • Brody Sandel
  • Adrija Kalvisa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8701)


A common problem that appears in many case studies in ecology is the following: given a rooted phylogenetic tree \(\mathcal{T}\) and a subset R of its leaf nodes, we want to compute the distance between the elements in R. A very popular distance measure that can be used for this purpose is the Phylogenetic Diversity (PD), which is defined as the cost of the minimum-weight Steiner tree in \(\mathcal{T}\) that spans the nodes in R. To analyse the value of the PD for a given set R, it is also important to calculate the variance of this measure. However, the best algorithm known so far for computing the variance of the PD is inefficient: for any input tree \(\mathcal{T}\) that consists of n nodes, this algorithm has \(\Theta(n^2)\) running time. Moreover, efficiently computing the variance and higher-order statistical moments is a major open problem for several other phylogenetic measures. We provide the following results:

  • We describe a new algorithm that computes the variance of the PD efficiently in practice. This algorithm has \(O(\mathrm{SI}(\mathcal{T}) + \mathrm{DSSI}^2(\mathcal{T}))\) running time; here \(\mathrm{SI}(\mathcal{T})\) denotes the Sackin's Index of \(\mathcal{T}\), and \(\mathrm{DSSI}(\mathcal{T})\) is a new index whose value depends on how balanced \(\mathcal{T}\) is.

  • We provide for the first time exact formulas for computing the mean and the variance of another popular biodiversity measure, the Mean Nearest Taxon Distance (MNTD). These formulas apply specifically to ultrametric trees. For an ultrametric tree \(\mathcal{T}\) of n nodes, we show how we can compute the mean of the MNTD in \(O(n)\) time, and its variance in \(O(\mathrm{SI}(\mathcal{T}) + \mathrm{DSSI}^2(\mathcal{T}))\) time.

  • We introduce a new measure which we call the Core Ancestor Cost (CAC). A major advantage of this measure is that, for any integer k > 0, we can compute the first k statistical moments of the CAC in \(O(\mathrm{SI}(\mathcal{T}) + nk + k^2)\) time in total, using \(O(n + k)\) space.
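
To make two of the quantities named above concrete, the following is a minimal Python sketch, not the algorithms of the paper. It assumes the standard definitions: the Sackin's Index as the sum, over all leaves, of the number of edges on the root-to-leaf path, and the MNTD of a leaf sample as the average distance from each sampled leaf to its nearest other sampled leaf. The dictionary-based tree encoding, the toy tree `TREE`, and all function names are hypothetical choices made for illustration.

```python
# Hypothetical toy tree: node -> list of (child, edge_weight).
# It is ultrametric: every leaf lies at distance 3 from the root.
TREE = {
    "root": [("x", 1.0), ("y", 2.0)],
    "x": [("a", 2.0), ("b", 2.0)],
    "y": [("c", 1.0), ("d", 1.0)],
}


def sackin_index(tree, root):
    """Sum, over all leaves, of the number of edges on the root-to-leaf path."""
    total, stack = 0, [(root, 0)]
    while stack:
        node, depth = stack.pop()
        children = tree.get(node, [])
        if not children:  # a leaf contributes its depth
            total += depth
        for child, _ in children:
            stack.append((child, depth + 1))
    return total


def root_paths(tree):
    """Map each leaf to {ancestor: distance from the leaf to that ancestor}."""
    parent = {child: (node, w) for node, kids in tree.items() for child, w in kids}
    paths = {}
    for leaf in (n for n in parent if n not in tree):
        dist, node, acc = {leaf: 0.0}, leaf, 0.0
        while node in parent:  # climb until the root is reached
            node, w = parent[node]
            acc += w
            dist[node] = acc
        paths[leaf] = dist
    return paths


def leaf_distance(paths, u, v):
    """Path cost between leaves u and v, via their lowest common ancestor."""
    common = paths[u].keys() & paths[v].keys()
    return min(paths[u][a] + paths[v][a] for a in common)


def mntd(tree, sample):
    """Mean distance from each sampled leaf to its nearest other sampled leaf."""
    sample = list(sample)
    if len(sample) < 2:
        raise ValueError("MNTD needs at least two leaves")
    paths = root_paths(tree)
    nearest = [
        min(leaf_distance(paths, u, v) for v in sample if v != u) for u in sample
    ]
    return sum(nearest) / len(nearest)
```

This direct evaluation takes time quadratic in the sample size and is meant only to pin down the definitions on small inputs.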

We have implemented the new algorithms for computing the variance of the PD and of the MNTD, and the statistical moments of the CAC. We conducted experiments on large phylogenetic datasets, and we show that our algorithms perform efficiently in practice.
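
For checking implementations like these on small inputs, a brute-force baseline can be useful. The sketch below is an illustration only and is not the method of the paper: it evaluates the PD of a leaf sample as the total weight of the edges that separate sampled leaves (which, in a tree, is the cost of the minimum-weight Steiner tree spanning the sample), and then enumerates every sample of a fixed size r to obtain the mean and variance. Treating all samples of size r as equally likely is an assumption made here, and the tree encoding, the toy tree `TREE`, and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pvariance

# Hypothetical toy tree: node -> list of (child, edge_weight).
TREE = {
    "root": [("x", 1.0), ("y", 2.0)],
    "x": [("a", 2.0), ("b", 2.0)],
    "y": [("c", 1.0), ("d", 1.0)],
}


def phylogenetic_diversity(tree, root, sample):
    """Cost of the minimal subtree of `tree` connecting the leaves in `sample`."""
    sample = set(sample)
    total = 0.0

    def count_below(node):
        """Return how many sampled leaves lie in the subtree of `node`."""
        nonlocal total
        children = tree.get(node, [])
        if not children:
            return 1 if node in sample else 0
        count = 0
        for child, weight in children:
            below = count_below(child)
            # An edge belongs to the Steiner tree exactly when sampled
            # leaves lie on both sides of it.
            if 0 < below < len(sample):
                total += weight
            count += below
        return count

    count_below(root)
    return total


def pd_moments_brute_force(tree, root, r):
    """Mean and variance of the PD over all leaf samples of size r.

    Assumes each size-r sample is equally likely. Exponential in general,
    so suitable only as a reference on small trees.
    """
    leaves = sorted(
        child for kids in tree.values() for child, _ in kids if child not in tree
    )
    values = [
        phylogenetic_diversity(tree, root, s) for s in combinations(leaves, r)
    ]
    return mean(values), pvariance(values)
```

The recursion in `phylogenetic_diversity` is kept for brevity; on deep trees an explicit stack would be needed to stay within Python's recursion limit.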





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Constantinos Tsirogiannis (1)
  • Brody Sandel (1)
  • Adrija Kalvisa (2)

  1. MADALGO and Department of Bioscience, Aarhus University, Denmark
  2. Faculty of Biology, University of Latvia, Latvia
