Abstract
Conservation biologists need robust, intuitive mathematical tools to quantify and assess patterns and changes in biodiversity. Here we review some commonly used abundance-based species diversity measures and their phylogenetic generalizations. Most of the previous abundance-sensitive measures and their phylogenetic generalizations lack an essential property, the replication principle or doubling property. This often leads to inconsistent or counterintuitive interpretations, especially in conservation applications. Hill numbers, or the “effective number of species”, obey the replication principle and thus resolve many of the interpretational problems. Hill numbers were recently extended to incorporate phylogeny; the resulting measures take into account phylogenetic differences between species while still satisfying the replication principle. We review the framework of phylogenetic diversity measures based on Hill numbers and their decomposition into independent alpha and beta components. Both additive and multiplicative decompositions lead to the same classes of normalized phylogenetic similarity or differentiation measures. These classes include multiple-assemblage phylogenetic generalizations of the Jaccard, Sørensen, Horn and Morisita-Horn measures. For two assemblages, these classes also include the commonly used UniFrac and PhyloSør indices as special cases. Our approach provides a mathematically rigorous, self-consistent, ecologically meaningful set of tools for conservationists who must assess the phylogenetic diversity and complementarity of potential protected areas. Our framework is applied to a real dataset to illustrate (i) how to use phylogenetic diversity profiles to completely convey species abundances and phylogenetic information among species in an assemblage; and (ii) how to use phylogenetic similarity (or differentiation) profiles to assess phylogenetic resemblance or difference among multiple assemblages.
Keywords
 Diversity
 Diversity decomposition
 Hill numbers
 Phylogenetic diversity
 Replication principle
 Species diversity
Introduction
Many of the most pressing and fundamental questions in biodiversity conservation require robust and sensible measures for quantifying and assessing changes in biodiversity. Many environmental and monitoring projects also require objective and meaningful similarity (or differentiation) measures to compare the diversities of multiple assemblages and their degree of complementarity in order to best conserve genetic, species, and ecosystem diversity. An enormous number of diversity measures and related similarity (or differentiation) indices have been proposed, not only in ecology but also in genetics, economics, information science, linguistics, physics, and social sciences, among others. See Magurran (2004) and Magurran and McGill (2011) for overviews.
In traditional species diversity measures, all species are considered to be equally different from each other; only species richness and abundances are involved. There are two general approaches: parametric and nonparametric (Magurran 2004). Parametric approaches assume a particular species abundance distribution (such as the lognormal or gamma) or a species rank abundance distribution (such as the negative binomial or log-series), and then use the parameters (e.g., Fisher’s alpha) of the distribution to quantify diversity. However, these methods often do not perform well and the results are uninterpretable unless the “true” species abundance distribution is known (Colwell and Coddington 1994; Chao 2005). The parametric approach also does not permit meaningful comparison of assemblages with different abundance distributions. For example, an assemblage described by a lognormal abundance model cannot be compared to an assemblage whose abundance distribution follows a gamma distribution. Nonparametric methods make no assumptions about the distributional form of the underlying species abundance distribution. The most widely used abundance-sensitive nonparametric measures have been the Shannon entropy and the Gini-Simpson index. These two measures, along with species richness, were integrated into a class of measures called generalized entropies (Havrdra and Charvat 1967; Daróczy 1970; Patil and Taillie 1979; Tsallis 1988; Keylock 2005), which will be briefly reviewed in this chapter.
How to quantify abundance-based species diversity in an assemblage has been one of the most controversial issues in community ecology (e.g. Hurlbert 1971; Routledge 1979; Patil and Taillie 1982; Purvis and Hector 2000; Jost 2006, 2007; Jost et al. 2010). There have also been intense debates on the choice of diversity partitioning schemes; see Ellison (2010) and the Forum that follows it. Surprisingly, all authors in that forum achieved a consensus on the use of Hill numbers, also called “effective number of species”, as the best choice to quantify abundance-based species diversity. Hill numbers are a mathematically unified family of diversity indices (differing among themselves only by a parameter q) that incorporate species richness and species relative abundances. They were first used in ecology by MacArthur (1965, 1972), developed by Hill (1973), and recently reintroduced to ecologists by Jost (2006, 2007).
Hill numbers obey the replication principle or doubling property, an essential mathematical property that captures biologists’ notion of diversity (MacArthur 1965; Hill 1973). This property requires that if we have N equally diverse, equally large assemblages with no species in common, the diversity of the pooled assemblage must be N times the diversity of a single assemblage. In other words, Hill numbers are linear with respect to the addition of equally common species. We will review different versions of this property later. Classical diversity measures, such as Shannon entropy and the Gini-Simpson index, do not obey this principle and can lead to inconsistent or counterintuitive interpretations, especially in conservation applications (Jost 2006, 2007). Hill numbers resolve many of the interpretational problems caused by classical diversity indices. Diversity measures that obey the replication principle yield self-consistent assessments in conservation applications, have intuitively interpretable magnitudes, and can be meaningfully decomposed. In this chapter, Hill numbers are adopted as a general framework for quantifying and partitioning diversities.
Pielou (1975, p. 17) was the first to notice that traditional abundance-based species diversity measures could be broadened to include phylogenetic, functional, or other differences between species. We here concentrate on phylogenetic differences, though our framework can also be extended to functional traits (Tilman 2001; Petchey and Gaston 2002; Weiher 2011). For conservation purposes, an assemblage of phylogenetically divergent species is more diverse than an assemblage consisting of closely related species, all else being equal. Phylogenetic differences among species can be based directly on their evolutionary histories, either in the form of taxonomic classification or well-supported phylogenetic trees (Faith 1992; Warwick and Clarke 1995; McPeek and Miller 1996; Crozier 1997; Helmus et al. 2007; Webb 2000; Webb et al. 2002; Pavoine et al. 2010; Ives and Helmus 2010, 2011; Vellend et al. 2011; Cavender-Bares et al. 2009, 2012 among others). Three special issues in Ecology were devoted to integrating ecology and phylogenetics; see McPeek and Miller (1996), Webb et al. (2006), and Cavender-Bares et al. (2012) and papers in each issue. Phylogenetic diversity measures are especially relevant for conservation applications, since they quantify the amount of evolutionary history preserved by the assemblage; see Lean and MacLaurin (chapter “The Value of Phylogenetic Diversity”).
The most widely used phylogenetic metric is Faith’s phylogenetic diversity (PD) (Faith 1992), which is defined as the sum of the branch lengths of a phylogenetic tree connecting all species in the target assemblage. As shown in Chao et al. (2010), Faith’s PD can be regarded as a phylogenetic generalization of species richness. The rarefaction formula for Faith’s PD was developed by Nipperess and Matsen (2013) and Nipperess (chapter “The Rarefaction of Phylogenetic Diversity: Formulation, Extension and Application”). Recently, Chao et al. (2015) derived an integrated sampling, rarefaction, and extrapolation methodology to compare Faith’s PD of a set of assemblages. Like species richness, Faith’s PD does not consider species abundances. For some conservation applications, the mere presence or absence of a species is all that matters, or all that can be determined from the available data. In those cases, Faith’s PD is a good measure of phylogenetic diversity. However, there are important advantages to incorporating abundance information into phylogenetic diversity measures for conservation. For example, some human impacts can result in the phylogenetic simplification of an ecosystem, reducing the population shares of phylogenetically distinct species relative to typical species. An abundance-based measure can catch this effect before it leads to actual extinctions.
Ecosystem simplification may be worthy of conservation concern even if it does not lead to extinctions of focal organisms. Often, the focal organisms for conservation represent a tiny fraction of the ecosystem’s biomass or richness. Each focal species will be tied to a web of non-focal species whose abundances are not usually monitored (e.g., insects). All else being equal, a more equitable distribution of the abundances of focal organisms will be able to support a more diverse, robust and stable set of non-focal species. Faith (chapter “Using Phylogenetic Dissimilarities Among Sites for Biodiversity Assessments and Conservation”) rightly argues that phylogenetic diversity is a good proxy for functional diversity. Therefore an ecosystem with a more equitable distribution of abundance across phylogenetic lineages should also exhibit greater functional complexity (per interaction between individuals) than an ecosystem whose phylogenetically unusual elements are rare. If we have to prioritize such ecosystems, the more phylogenetically equitable one, which thoroughly integrates diverse lineages, should be preferred. In addition to being more resistant to lineage extinctions, a complex, well-integrated ecosystem may be worth preserving in and of itself, above and beyond its component species; conservation is not just about species. Evolution may take a different course in ecosystems whose members are constantly surprised by their interactions compared with an ecosystem whose interactors are highly predictable. These conservation goals (robustness against extinction of distinctive lineages, and preservation of well-integrated ecosystems with unique future option values) require phylogenetic diversity measures that incorporate species importance values.
Rao’s quadratic entropy Q (Rao 1982), a generalization of the Gini-Simpson index, was the first diversity measure that accounts for both phylogeny and species abundances. The phylogenetic entropy \( H_P \) (Allen et al. 2009) extends Shannon entropy to incorporate phylogenetic distances among species. Since Shannon entropy and the Gini-Simpson index do not obey the replication principle, neither do their phylogenetic generalizations. These generalizations will therefore have the same interpretational problems as their parent measures; see Chao et al. (2010, their Supplementary Material) for examples.
Chao et al. (2010) extended Hill numbers and related similarity measures to incorporate phylogeny. The new phylogenetic Hill numbers obey a generalized replication principle. Their measures were subsequently extended by Faith and Richards (2012) and Faith (2013). Both the original Hill numbers and their phylogenetic generalizations facilitate diversity decomposition (Jost 2007; Chiu et al. 2014). As with the original Hill numbers, both additive and multiplicative decompositions of phylogenetic Hill numbers lead to the same classes of similarity (or differentiation) measures. Hill numbers therefore provide a unified framework to quantify both abundance-based and phylogenetic diversity.
In this chapter, we first briefly review the classic abundance-based species diversity measures (section “Generalized Entropies”) and their phylogenetic generalizations (section “Phylogenetic generalized entropies”) for an assemblage. Then we focus on the framework of Hill numbers (section “Hill numbers and the replication principle”), phylogenetic Hill numbers (section “Phylogenetic Hill numbers and related measures”) and related phylogenetic diversity measures. We also discuss the replication principle and its phylogenetic generalization (section “Replication principle for phylogenetic diversity measures”). For multiple assemblages, we review the diversity decomposition based on phylogenetic diversity measures (section “Decomposition of phylogenetic diversity measures”). The associated phylogenetic similarity and differentiation measures are then presented (section “Normalized phylogenetic similarity measures”). We use a real example for illustration (section “An example”). Our practical recommendations are provided in section “Conclusion”.
Classic Measures and Their Phylogenetic Generalizations
Generalized Entropies
The species richness of an assemblage is a simple count of the number of species present. It is the most intuitive and frequently used measure of biodiversity, and is a key metric in conservation biology (MacArthur and Wilson 1967; Hubbell 2001; Magurran 2004). However, it does not incorporate any information about the abundances of species, and it is a very hard number to estimate accurately from small samples (Colwell and Coddington 1994; Chao 2005; Gotelli and Colwell 2011).
Shannon entropy is a popular classical abundance-based diversity index and has been used in many disciplines. Shannon entropy is

\[ H_{Sh} = -\sum_{i=1}^{S} p_i \log p_i, \qquad \text{(1a)} \]

where S is the number of species in the assemblage, and the ith species has relative abundance \( p_i \). Shannon entropy gives the uncertainty in the species identity of a randomly chosen individual in the assemblage. Another popular measure is the Gini-Simpson index,

\[ H_{GS} = 1 - \sum_{i=1}^{S} p_i^2, \qquad \text{(1b)} \]

which gives the probability that two randomly chosen individuals belong to different species. These two abundance-sensitive measures, along with species richness, can be united into a single family of generalized entropies:

\[ {}^{q}H = \frac{1 - \sum_{i=1}^{S} p_i^{q}}{q - 1}. \qquad \text{(1c)} \]

The parameter q determines the sensitivity of the measure to the relative frequencies of the species. When q = 0, \( {}^{q}H \) becomes S − 1. When q tends to 1, \( {}^{q}H \) tends to Shannon entropy. When q = 2, \( {}^{q}H \) reduces to the Gini-Simpson index. This family was found many times in different disciplines (Havrdra and Charvat 1967; Daróczy 1970; Patil and Taillie 1979; Tsallis 1988; Keylock 2005). There are many other families of generalized entropies, notably the Rényi entropies (Rényi 1961).
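The special cases listed above can be checked numerically. The following is a minimal sketch, assuming the standard form of the family, \( {}^{q}H = (1 - \sum_i p_i^q)/(q-1) \), with Shannon entropy as the limit at q = 1; the function name and the abundance vector are illustrative and not part of the original text.

```python
import math

def generalized_entropy(p, q):
    """Generalized entropy of order q for relative abundances p (assumed to sum to 1)."""
    p = [x for x in p if x > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit as q -> 1: Shannon entropy (natural logarithm).
        return -sum(x * math.log(x) for x in p)
    return (1.0 - sum(x ** q for x in p)) / (q - 1.0)

p = [0.5, 0.3, 0.2]                          # hypothetical assemblage with S = 3 species
richness_minus_one = generalized_entropy(p, 0)   # equals S - 1
shannon = generalized_entropy(p, 1)              # Shannon entropy
gini_simpson = generalized_entropy(p, 2)         # equals 1 - sum of p_i^2
near_one = generalized_entropy(p, 1.0001)        # approaches Shannon entropy as q -> 1
```

Treating q = 1 as a separate branch avoids the 0/0 form of the general expression; evaluating at q slightly different from 1 shows that the limit is approached smoothly.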
Although the traditional abundance-sensitive generalized entropies and their special cases have been useful in many disciplines (e.g., see Magurran 2004), they do not behave in the same intuitive linear way as species richness. In ecosystems with high diversity, mass extinctions hardly affect their values (Jost 2010). They also lead to logical contradictions in conservation biology, because they do not measure a conserved quantity (e.g., under a given conservation plan, the proportion of “diversity” lost and the proportion preserved can both be 90% or more); see Jost (2006, 2007) and Jost et al. (2010). Thus, changes in their magnitude cannot be properly compared or interpreted. Also, the main measure of similarity in the additive approach for traditional measures, the within-group or “alpha” diversity divided by the total or “gamma” diversity, does not actually quantify the compositional similarity of the assemblages under study. This ratio can be arbitrarily close to unity (supposedly indicating high similarity) even when the assemblages being compared have no species in common. Finally, these measures each use different units (e.g., the Gini-Simpson index is a probability whereas Shannon entropy is in units of information), so they cannot be compared with each other. All these problems are consequences of their failure to satisfy the replication principle. Hill numbers obey the replication principle and resolve all these problems; see section “Hill numbers and the replication principle”.
Phylogenetic Generalized Entropies
The classic measures reviewed in section “Generalized Entropies” were extended to incorporate phylogenetic distance between species. As mentioned in the Introduction and shown in section “Phylogenetic Hill numbers and related measures”, Faith’s PD can be regarded as a phylogenetic generalization of species richness.
Rao’s quadratic entropy takes account of both phylogeny and species abundances (Rao 1982):

\[ Q = \sum_{i,j} d_{ij}\, p_i\, p_j, \qquad \text{(2a)} \]

where \( d_{ij} \) denotes the phylogenetic distance (in years since divergence, number of DNA base changes, or other metric) between species i and j, and \( p_i \) and \( p_j \) denote the relative abundances of species i and j. This index measures the average phylogenetic distance between any two individuals randomly selected from the assemblage. Rao’s Q represents a phylogenetic generalization of the Gini-Simpson index because in the special case of no phylogenetic structure (all species are equally related to one another), \( d_{ii} = 0 \) and \( d_{ij} = 1 \) (i ≠ j), it reduces to the Gini-Simpson index.
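As an illustration of the reduction just described, the sketch below computes Rao’s Q as the double sum of \( d_{ij} p_i p_j \) over all ordered pairs of species. The three-species abundances and the distance matrices are hypothetical values chosen for the example; distances are taken as years since divergence.

```python
def rao_q(p, d):
    """Rao's quadratic entropy: sum over all ordered pairs of d[i][j] * p[i] * p[j]."""
    n = len(p)
    return sum(d[i][j] * p[i] * p[j] for i in range(n) for j in range(n))

p = [0.5, 0.3, 0.2]  # hypothetical relative abundances of species A, B, C

# No phylogenetic structure: d_ii = 0 and d_ij = 1 for i != j.
d_flat = [[0 if i == j else 1 for j in range(3)] for i in range(3)]
q_flat = rao_q(p, d_flat)          # reduces to Gini-Simpson, 1 - sum of p_i^2

# Hypothetical divergence times: A and B split 1 time unit ago, C split 5 units ago.
d_tree = [[0, 1, 5],
          [1, 0, 5],
          [5, 5, 0]]
q_tree = rao_q(p, d_tree)
gini_simpson = 1.0 - sum(x * x for x in p)
```

With the flat matrix the result coincides with the Gini-Simpson index; with the tree-based matrix the closely related pair A and B contributes little, so Q is driven mainly by the distance to the divergent species C.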
The phylogenetic entropy \( H_P \) is a generalization of Shannon’s entropy to incorporate phylogenetic distances among species (Allen et al. 2009):

\[ H_P = -\sum_{i} L_i\, a_i \log a_i, \qquad \text{(2b)} \]

where the summation is over all branches of a rooted phylogenetic tree, \( L_i \) is the length of branch i, and \( a_i \) denotes the summed relative abundance of all species descended from branch i.
For ultrametric trees, Faith’s PD, Allen et al.’s \( H_P \), and Rao’s Q can be united into a single parametric family of phylogenetic generalized entropies (Pavoine et al. 2009):

\[ {}^{q}I = \frac{T - \sum_{i} L_i\, a_i^{q}}{q - 1}. \qquad \text{(2c)} \]

Here, \( L_i \) and \( a_i \) are defined in Eq. (2b) and T is the age of the root node of the tree. Then \( {}^{0}I \) = Faith’s PD minus T; \( {}^{1}I \) is identical to Allen et al.’s entropy \( H_P \) given in Eq. (2b); and \( {}^{2}I \) is identical to Rao’s quadratic entropy Q given in Eq. (2a). In the special case that T = 1 (the tree height is normalized to unit length) and all branches have unit length, the phylogenetic generalized entropy reduces to the classical generalized entropy defined in Eq. (1c), with species relative abundances \( \{p_1, p_2, \ldots, p_S\} \) as the tip-node abundances.
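The three identities stated above can be verified on a small tree. This is a sketch under stated assumptions: it takes the family to have the form \( {}^{q}I = (T - \sum_i L_i a_i^q)/(q-1) \), with the q = 1 case given by \( -\sum_i L_i a_i \log a_i \), and it uses a hypothetical three-species ultrametric tree in which A and B diverged 1 time unit ago and the root is 5 time units old. The branch list and the function name are illustrative.

```python
import math

# Each branch is (L_i, a_i): branch length and summed relative abundance of its descendants.
# Hypothetical ultrametric tree ((A:1, B:1):4, C:5) with abundances A=0.5, B=0.3, C=0.2.
branches = [(1.0, 0.5), (1.0, 0.3), (4.0, 0.8), (5.0, 0.2)]
T = 5.0  # age of the root node

def phylo_generalized_entropy(branches, T, q):
    """Phylogenetic generalized entropy of order q for an ultrametric tree."""
    if abs(q - 1.0) < 1e-12:
        # Limit as q -> 1: phylogenetic entropy H_P.
        return -sum(L * a * math.log(a) for L, a in branches)
    return (T - sum(L * a ** q for L, a in branches)) / (q - 1.0)

faith_pd = sum(L for L, _ in branches)             # total branch length
i0 = phylo_generalized_entropy(branches, T, 0)     # expected: Faith's PD minus T
i1 = phylo_generalized_entropy(branches, T, 1)     # expected: H_P
i2 = phylo_generalized_entropy(branches, T, 2)     # expected: Rao's Q

# Rao's Q computed directly from pairwise divergence times for the same tree.
p = [0.5, 0.3, 0.2]
d = [[0, 1, 5], [1, 0, 5], [5, 5, 0]]
rao_q = sum(d[i][j] * p[i] * p[j] for i in range(3) for j in range(3))
```

Note that the agreement between the order-2 value and Q relies on measuring \( d_{ij} \) as the time since divergence of species i and j on the same ultrametric tree.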
The abundance-sensitive (q > 0) phylogenetic generalized entropies provide useful information, but they do not obey the replication principle and thus have the same interpretational problems as their parent measures. This motivated Chao et al. (2010) to extend Hill numbers to phylogenetic Hill numbers, which obey the replication principle; see section “Phylogenetic Hill numbers and related measures”.
Hill Numbers and Their Phylogenetic Generalizations
Hill Numbers and the Replication Principle
Pioneering work by Kimura and Crow (1964) in genetics and MacArthur (1965) in ecology showed that the Shannon and Gini-Simpson measures can be easily converted to an “effective number of species” (i.e., the number of equally abundant species that are needed to give the same value of the diversity measure), which uses the same units as species richness. Shannon entropy can be converted by taking its exponential, and the Gini-Simpson index can be converted by the formula \( 1/(1 - H_{GS}) \). Hill (1973) integrated species richness and the converted Shannon and Gini-Simpson measures into a class of diversity measures called “Hill numbers” of order q, or the “effective number of species”, defined as

\[ {}^{q}D = \left( \sum_{i=1}^{S} p_i^{q} \right)^{1/(1-q)}. \qquad \text{(3a)} \]

This measure is undefined for q = 1, but its limit as q tends to 1 exists and gives

\[ {}^{1}D = \lim_{q \to 1} {}^{q}D = \exp\left( -\sum_{i=1}^{S} p_i \log p_i \right) = \exp\left( H_{Sh} \right). \qquad \text{(3b)} \]

The relationship between the Hill number of order q (q ≠ 1) and the generalized entropy can be expressed as

\[ {}^{q}D = \left[ 1 - (q-1)\, {}^{q}H \right]^{1/(1-q)}. \qquad \text{(3c)} \]

When q = 0, the species abundances do not count at all and \( {}^{0}D = S \) is obtained. When q = 1, the species are weighted in proportion to their frequencies, and the measure \( {}^{1}D \) (in Eq. (3b)) can be interpreted as the effective number of common or “typical” species (i.e., species with typical abundances) in the assemblage. When q = 2, abundant species are favored and rare species are discounted; the measure \( {}^{2}D \) becomes the inverse Simpson concentration. The measure \( {}^{2}D \) can be interpreted as the effective number of dominant or very abundant species in the assemblage. In general, if \( {}^{q}D = x \), then the diversity of order q of this community is the same as that of an idealized reference community with x equally abundant species. All Hill numbers are in units of “species”. It is thus possible to plot them on a single graph as a continuous function of the parameter q. This diversity profile characterizes the species-abundance distribution of an assemblage and provides complete information about its diversity. The steepness of its slope graphically illustrates the degree of dominance in the assemblage. An example is given in section “An example”.
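To make the diversity profile concrete, the following sketch evaluates Hill numbers over a few orders q for two hypothetical four-species assemblages, one perfectly even and one dominated by a single species. It assumes the usual definition \( {}^{q}D = (\sum_i p_i^q)^{1/(1-q)} \), with the exponential of Shannon entropy at q = 1; the function name, the orders and the abundance vectors are illustrative.

```python
import math

def hill_number(p, q):
    """Hill number of order q for relative abundances p (assumed to sum to 1)."""
    p = [x for x in p if x > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit as q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

orders = [0, 0.5, 1, 2, 3]
even = [0.25, 0.25, 0.25, 0.25]   # hypothetical, completely even assemblage
uneven = [0.7, 0.1, 0.1, 0.1]     # hypothetical, strongly dominated assemblage

# A diversity profile is the list of Hill numbers as q increases.
profile_even = [hill_number(even, q) for q in orders]
profile_uneven = [hill_number(uneven, q) for q in orders]
```

The even assemblage gives a flat profile at 4 species for every order, whereas the profile of the dominated assemblage starts at 4 for q = 0 and declines as q increases, which is the steepness that signals dominance.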
Hill numbers differ fundamentally from Shannon entropy and the Gini-Simpson index in that they obey the replication principle. Hill (1973) proved a weak version of the doubling property: if two completely distinct assemblages (i.e., no species in common) have identical relative abundance distributions, then the Hill number doubles if the assemblages are combined with equal weights. Chiu et al. (2014, their Appendix B) recently proved a strong version of the doubling property: if two completely distinct assemblages have identical Hill numbers of order q (relative abundance distributions may be different, unlike the weak version), then the Hill number of the same order doubles if the two assemblages are combined with equal weights. Species richness is a Hill number (with q = 0) and obeys both versions of the doubling property, but most other diversity indices do not obey even the weak version. Because Hill numbers obey this replication principle, changes in their magnitude have simple interpretations, and the ratio of alpha diversity to gamma diversity accurately reflects the compositional similarity of the communities. The replication principle is best known in economics, where it has long been recognized as an important property of concentration and diversity measures (Hannah and Kay 1977). In ecology, the doubling property has been extensively discussed by many authors (MacArthur 1965, 1972; Hill 1973; Whittaker 1972; Routledge 1979; Peet 1974; Jost 2006, 2007, 2009; Ricotta and Szeidl 2009; Jost et al. 2010) and has been extended to phylogenetic measures (Chao et al. 2010); see below.
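Both versions of the doubling property can be illustrated numerically. The sketch below is only an example with hypothetical abundance vectors: the weak version pools two species-disjoint copies of the same distribution, and the strong version pools two different distributions that happen to share the same Hill number of order 2. The helper names are illustrative.

```python
import math

def hill_number(p, q):
    """Hill number of order q for relative abundances p (assumed to sum to 1)."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

def pool_equal_weights(p1, p2):
    """Pool two assemblages with no shared species, each given weight 1/2."""
    return [x / 2 for x in p1] + [x / 2 for x in p2]

# Weak version: identical abundance distributions, no species in common.
p = [0.7, 0.2, 0.1]
pooled_weak = pool_equal_weights(p, p)
weak_ratios = [hill_number(pooled_weak, q) / hill_number(p, q) for q in (0, 1, 2)]

# Strong version for q = 2: different distributions, both with sum of squares 0.5.
p1 = [0.5, 0.5]
p2 = [2 / 3, 1 / 6, 1 / 6]
pooled_strong = pool_equal_weights(p1, p2)
strong_ratio = hill_number(pooled_strong, 2) / hill_number(p1, 2)

# For contrast, the Gini-Simpson index does not double under the same pooling.
gs_single = 1.0 - sum(x * x for x in p)
gs_pooled = 1.0 - sum(x * x for x in pooled_weak)
```

In this example every ratio of Hill numbers equals 2, while the Gini-Simpson index moves from 0.46 to 0.73 rather than doubling to 0.92.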
Phylogenetic Hill Numbers and Related Measures
When the branch lengths are proportional to divergence time, all branch tips are the same distance from the root (the first node). Such trees are called “ultrametric” trees. We first discuss the phylogenetic diversity measures for ultrametric trees. The phylogenetic Hill numbers developed by Chao et al. (2010) for an ultrametric tree can be intuitively explained as the Hill number of a time-average of a tree’s generalized entropy over some evolutionary time interval of interest. Suppose the phylogenetic tree for an assemblage is calibrated to some relative or absolute timescale. We can slice this phylogenetic tree at any time t in the past; see the left panel of Fig. 1 (reproduced from Chao et al. 2010) for illustration and details about how to deal with shared lineages. The number of lineages at that time is the number of branch cuts, and the relative importance of each of these lineages for the present-day assemblage is the sum of the relative abundances of the branch’s descendants in the present-day assemblage. Using these relative importance values, we can calculate the generalized entropy of order q for the slice. The mean of these entropies, beginning at time −T (i.e., T years before present) and continuing until the present, is converted to a Hill number using Eq. (3c). This is the phylogenetic Hill number, which conveys information about the shape of the tree over the time interval of interest. Chao et al. (2010) symbolize it as \( {}^{q}\overline{D}(T) \), and also refer to it as the mean phylogenetic diversity of order q over T years (or simply the mean diversity for the interval [−T, 0]):

\[ {}^{q}\overline{D}(T) = \left\{ \sum_{i \in B_T} \frac{L_i}{T}\, a_i^{q} \right\}^{1/(1-q)}, \quad q \ge 0,\ q \ne 1; \qquad \text{(4a)} \]

\[ {}^{1}\overline{D}(T) = \lim_{q \to 1} {}^{q}\overline{D}(T) = \exp\left[ -\sum_{i \in B_T} \frac{L_i}{T}\, a_i \log a_i \right], \qquad \text{(4b)} \]

where \( B_T \) is the set of all branches in the time interval [−T, 0], \( L_i \) is the length of branch i in the set \( B_T \), and \( a_i \) is the total relative abundance descended from branch i. The mean diversity \( {}^{q}\overline{D}(T) \) is interpreted as “the effective number of equally abundant and equally distinct lineages all with branch length T during the time interval from T years ago to the present”. Here “equally distinct” also implies that the phylogenetic distance between any two species is T, so lineages are completely distinct (i.e., there are no shared branches).
The phylogenetic Hill numbers are invariant to the units used to measure branch lengths. When all lineages are completely distinct, the measure \( {}^{q}\overline{D}(T) \) reduces to the Hill number \( {}^{q}D = \left( \sum_i a_i^{q} \right)^{1/(1-q)} \). This includes the special case that T tends to zero, i.e., the case that we ignore phylogeny and only consider the present-day community. This shows that the framework based on Hill numbers provides a unified approach to integrate abundances and phylogeny. Also, here we have a simple idealized reference tree to understand the value of \( {}^{q}\overline{D}(T) = z \) for an arbitrary tree: the mean phylogenetic diversity of the tree over the time period [−T, 0] is the same as the diversity of an idealized assemblage consisting of z equally abundant and equally distinct lineages all with branch length T.
For q = 0, when T is chosen as the age of the root node, we have \( {}^{0}\overline{D}(T) = \mathrm{Faith's\ PD}/T \), which can be interpreted as lineage richness. Faith’s PD can thus be regarded as a phylogenetic generalization of species richness. We can roughly interpret \( {}^{1}\overline{D}(T) \) as the effective number of common lineages, and \( {}^{2}\overline{D}(T) \) as the effective number of dominant lineages in the time period [−T, 0]. When T is chosen as the age of the root node, a simple relationship exists between phylogenetic entropy \( H_P \) (Allen et al. 2009) and the measure \( {}^{1}\overline{D}(T) \):

\[ {}^{1}\overline{D}(T) = \exp\left( H_P / T \right). \qquad \text{(4c)} \]

For q = 2, when T is chosen as the age of the root node, there is a simple relationship between our measures and the widely used Rao’s quadratic entropy Q (Chao et al. 2010):

\[ {}^{2}\overline{D}(T) = \frac{1}{1 - Q/T}. \qquad \text{(4d)} \]
The branch or phylogenetic diversity \( {}^{q}PD(T) \) of order q during the time interval from T years ago to the present is defined as the product of \( {}^{q}\overline{D}(T) \) and T. It quantifies the amount of evolutionary history in the system over the interval [−T, 0], or “the effective total branch-length” (Chao et al. 2010):

\[ {}^{q}PD(T) = T \times {}^{q}\overline{D}(T) = \left\{ \sum_{i \in B_T} L_i \left( \frac{a_i}{T} \right)^{q} \right\}^{1/(1-q)}, \quad q \ge 0,\ q \ne 1; \qquad \text{(5a)} \]

\[ {}^{1}PD(T) = \lim_{q \to 1} {}^{q}PD(T) = T \exp\left[ -\sum_{i \in B_T} \frac{L_i}{T}\, a_i \log a_i \right]. \qquad \text{(5b)} \]

If q = 0 and T is the age of the root node, then \( {}^{0}PD(T) \) reduces to Faith’s PD, regardless of branching pattern or abundances. As explained by Chao et al. (2010), we could imagine that all the branch segments in the interval [−T, 0] form a single assemblage with relative abundance set \( \{a_i/T;\ i \in B_T\} \). In this assemblage, for each i there are \( L_i \) “branches” with relative abundance \( a_i/T \). Then the Hill number of order q for this assemblage is exactly the branch diversity \( {}^{q}PD(T) \) given in Eq. (5a). Dividing this Hill number by T, we obtain \( {}^{q}\overline{D}(T) \) given in Eq. (4a). Note that in our framework \( {}^{q}PD(T) \) is truly a class of Hill numbers (“the effective number of lineage-years”), whereas \( {}^{q}\overline{D}(T) \) (“the effective number of lineages”) denotes a (generalized) mean of Hill numbers. See Faith and Richards (2012) and Faith (2013) for extensions of the measure \( {}^{q}PD(T) \).
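The relationships among the mean diversity, the branch diversity, Faith’s PD, \( H_P \) and Rao’s Q can be checked on a small ultrametric tree. The sketch below assumes the forms \( {}^{q}\overline{D}(T) = \{\sum_i (L_i/T)\, a_i^q\}^{1/(1-q)} \), with \( \exp[-\sum_i (L_i/T)\, a_i \log a_i] \) at q = 1, and \( {}^{q}PD(T) = T \times {}^{q}\overline{D}(T) \). The tree, the abundances and the function names are hypothetical.

```python
import math

# Each branch is (L_i, a_i). Hypothetical ultrametric tree ((A:1, B:1):4, C:5)
# with abundances A=0.5, B=0.3, C=0.2; T is the age of the root node.
branches = [(1.0, 0.5), (1.0, 0.3), (4.0, 0.8), (5.0, 0.2)]
T = 5.0

def mean_phylo_diversity(branches, T, q):
    """Mean phylogenetic diversity of order q: effective number of lineages."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum((L / T) * a * math.log(a) for L, a in branches))
    return sum((L / T) * a ** q for L, a in branches) ** (1.0 / (1.0 - q))

def branch_diversity(branches, T, q):
    """Branch diversity of order q: effective total branch length."""
    return T * mean_phylo_diversity(branches, T, q)

faith_pd = sum(L for L, _ in branches)
h_p = -sum(L * a * math.log(a) for L, a in branches)   # phylogenetic entropy
rao_q = T - sum(L * a ** 2 for L, a in branches)       # Rao's Q on this ultrametric tree

d0 = mean_phylo_diversity(branches, T, 0)   # expected: Faith's PD / T
d1 = mean_phylo_diversity(branches, T, 1)   # expected: exp(H_P / T)
d2 = mean_phylo_diversity(branches, T, 2)   # expected: 1 / (1 - Q / T)
pd0 = branch_diversity(branches, T, 0)      # expected: Faith's PD
expected_d1 = math.exp(h_p / T)
expected_d2 = 1.0 / (1.0 - rao_q / T)
# Equivalent form of the branch diversity: Hill number of "branches" with abundance a_i / T.
pd2_alt = sum(L * (a / T) ** 2 for L, a in branches) ** (1.0 / (1.0 - 2.0))
```

For this tree the effective number of lineages falls from 2.2 at q = 0 to about 1.61 at q = 2, reflecting both the shared history of A and B and the unevenness of the abundances.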
Unlike previous phylogenetic diversity measures developed in the literature, \( {}^{q}\overline{D}(T) \) and \( {}^{q}PD(T) \) depend explicitly on two parameters, the abundance sensitivity parameter q and the time perspective (or time-depth) parameter T. The reasons we need this time-depth parameter and our suggestions for choosing a time perspective are as follows.

1. When we compare the phylogenetic diversities of several assemblages based on the measures \( {}^{q}\overline{D}(T) \) and \( {}^{q}PD(T) \), all measures should refer to the same time period to make meaningful comparisons. That is, the time-depth T should be kept the same for all assemblages. Therefore, a parameter is required to specify the time-depth.

2. The choice of time perspective should reflect an investigator’s aims and facilitate comparisons with other studies. We suggest that at least two selected time perspectives should be included: T = 0, and T = the age of the root node of a phylogenetic tree connecting all species in the study. For the case of T = 0, the phylogeny is ignored and the diversity profile reduces to the profile in the present-day assemblage based on the ordinary Hill numbers. If we choose T to be the age of the oldest node in the tree, we recover some of the standard measures of phylogenetic diversity (see Eqs. (4c) and (4d)).

3. As suggested in Chiu et al. (2014), other time perspectives can be selected, such as T = the age of the node at which the group of interest diverges from the rest of the species. This choice of T is independent of the species actually sampled, so it allows statistically robust comparisons across investigations and regions (unlike the conventional choice of T as the age of the root node of the tree containing the species actually observed). This choice also provides an accurate measure of the proportion of a taxonomic group’s evolutionary history preserved in a given assemblage. Another choice is the time of the most recent common ancestor of all taxa alive today. Other choices may be made, depending on the purpose of an investigation. The formula in Chiu et al. (2014, p. 42) can be used to convert phylogenetic diversity from one temporal perspective to another.
To see how the measures vary with q and time perspective T, we recommend using two types of profiles to completely characterize phylogenetic tree information and species abundances as described below. See section “An example” for examples. (1) The first type of diversity profile is obtained by plotting \( {}^{q}PD(T) \) or \( {}^{q}\overline{D}(T) \) as a function of order q as q varies from 0 to about 3 or 4 (beyond which there is usually little change), for some selected values of temporal perspective T. For this type of profile, \( {}^{q}PD(T) \) and \( {}^{q}\overline{D}(T) \) have similar patterns when T is fixed, so it is sufficient to plot the profile for only one measure. (2) The second type of diversity profile is obtained by plotting \( {}^{q}PD(T) \) and \( {}^{q}\overline{D}(T) \) as functions of T separately for q = 0, 1, and 2. This profile shows the effect of time-depth or evolutionary change on our diversity measures.
For the second type of profile, \( {}^{q}PD(T) \) and \( {}^{q}\overline{D}(T) \) generally exhibit different patterns (the profile of \( {}^{q}\overline{D}(T) \) is decreasing with T whereas the profile of \( {}^{q}PD(T) \) for q = 0 (Faith’s PD) is always increasing, and for q > 0 is generally increasing up to a certain point), so the profiles for both measures are informative. The parameter q gives the sensitivity of the two measures to present-day species relative abundances. As with the ordinary Hill numbers, the measures with q = 2 favor more abundant species, so they are useful in ecological studies that examine the phylogenetic relationships of the dominant species in a set of assemblages, or that examine functional diversity. The measures with q = 0 emphasize rare species, so they are useful when abundance information is not necessarily relevant (e.g., when ecologists try to identify past episodes of differentiation, or for some conservation biology applications). The measures with q = 1 weight species according to their frequencies and can be used in most applications when neither dominant nor rare species should be favored.
When evolutionary change is measured by the number of nucleotide base changes at a selected locus, or by the amount of functional or morphological differentiation from a common ancestor, the branches of the resulting tree will typically be uneven, so the tree is non-ultrametric. In this case, Chao et al. (2010) showed that the time parameter T in all formulas should be replaced by the mean base change or mean branch length \( \overline{T} \), the abundance-weighted mean of the distances from the tree base to each of the terminal branch tips (i.e., the mean evolutionary change per species over the interval of interest). See the right panel of Fig. 1 for an illustrative example. Let \( B_{\overline{T}} \) denote the set of branches connecting all focal species, with mean branch length \( \overline{T} \). Then we can express \( \overline{T} \) as \( \overline{T}=\sum_{i\in B_{\overline{T}}}L_i a_i \). The diversity of a non-ultrametric tree with mean evolutionary change \( \overline{T} \) is the same as that of an ultrametric tree with time parameter \( \overline{T} \). Therefore, the diversity formulas for a non-ultrametric tree are obtained by replacing T by \( \overline{T} \) in Eqs. (4a), (4b), (5a), and (5b). The resulting measures are denoted respectively as \( {}^q\overline{D}(\overline{T}) \), \( {}^1\overline{D}(\overline{T}) \), \( {}^qPD(\overline{T}) \) and \( {}^1PD(\overline{T}) \); see Chao et al. (2010) for details. When we compare the phylogenetic diversity of several non-ultrametric trees based on the measures \( {}^q\overline{D}(\overline{T}) \) and \( {}^qPD(\overline{T}) \), all measures should refer to the same mean base change \( \overline{T} \) to make the comparisons meaningful.
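As a small numerical check of the expression \( \overline{T}=\sum_i L_i a_i \), the sketch below computes the mean branch length of a hypothetical non-ultrametric tree in two ways: as the sum of branch length times node abundance, and as the abundance-weighted mean of root-to-tip distances. The tree, abundances and function name are assumptions made purely for illustration.

```python
def mean_branch_length(branch_lengths, node_abundances):
    """T-bar = sum over branches of L_i * a_i (a_i = node relative abundance)."""
    return sum(l * a for l, a in zip(branch_lengths, node_abundances))

# Hypothetical non-ultrametric tree (illustrative values only):
# tip A (length 2) and tip B (length 1) join an internal branch AB (length 3);
# tip C hangs directly from the root with length 5.
branch_lengths = [2.0, 1.0, 3.0, 5.0]    # A, B, AB, C
node_abundances = [0.5, 0.3, 0.8, 0.2]   # AB carries the abundance of A + B

t_bar = mean_branch_length(branch_lengths, node_abundances)

# Same quantity from the tips: weight each root-to-tip distance by abundance.
root_to_tip = {"A": 3.0 + 2.0, "B": 3.0 + 1.0, "C": 5.0}
tip_abundance = {"A": 0.5, "B": 0.3, "C": 0.2}
weighted_mean = sum(root_to_tip[s] * tip_abundance[s] for s in root_to_tip)

print(t_bar, weighted_mean)  # both are approximately 4.7
```

The two computations agree because each branch length is counted once for every unit of abundance that descends from it.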
Replication Principle for Phylogenetic Diversity Measures
The replication principle was generalized to a phylogenetic version in Chao et al. (2010). Suppose there are N equally large and completely phylogenetically distinct assemblages (no shared lineages across assemblages, though lineages within an assemblage may be shared); see Fig. 2 (reproduced from Chiu et al. 2014) for an illustrative example. Suppose these assemblages have the same phylogenetic Hill number X. If these assemblages are pooled, then the pooled assemblage must have a phylogenetic Hill number N × X. In the proof of this replication principle, Chao et al. (2010) assumed that these N assemblages have the same mean branch length. Here we relax this assumption and allow assemblages to have different mean branch lengths. (In the special case of ultrametric trees, this means that we allow different time perspectives for different assemblages.)
Suppose in assemblage k, the mean branch length is \( \overline{T}_k \), and the branch set is \( B_{\overline{T}_k,k} \) (we omit \( \overline{T}_k \) in the subscript and just use \( B_k \) in the following proof for notational simplicity) with branch lengths \( \{L_{ik};\ i\in B_k\} \) and the corresponding node abundances \( \{a_{ik};\ i\in B_k\} \), k = 1, 2, …, N. Assume that all assemblages have the same phylogenetic Hill number \( {}^q\overline{D}(\overline{T}_k)=X \), implying \( \sum_{i\in B_k}L_{ik}\,a_{ik}^q=X^{1-q}\,\overline{T}_k \) for all k = 1, 2, …, N. When the N trees are pooled with equal weight for each tree, each node abundance \( a_{ik} \) in the pooled tree becomes \( a_{ik}/N \), and the mean branch length becomes \( \overline{T}=(1/N)\sum_{k=1}^N\overline{T}_k \). Then the phylogenetic Hill number of order q for the pooled assemblage becomes
$$ {}^q\overline{D}(\overline{T})={\left\{\sum_{k=1}^N\sum_{i\in B_k}\frac{L_{ik}}{\overline{T}}{\left(\frac{a_{ik}}{N}\right)}^q\right\}}^{1/(1-q)}={\left\{\frac{1}{N^q\,\overline{T}}\sum_{k=1}^N X^{1-q}\,\overline{T}_k\right\}}^{1/(1-q)}={\left\{N^{1-q}X^{1-q}\right\}}^{1/(1-q)}=N\times X. \tag{6} $$
This proves a stronger version of the replication principle for phylogenetic Hill numbers. Note that the mean branch length in the pooled assemblage is the average of the individual mean branch lengths. For example, if \( {}^q\overline{D}(\overline{T}_1=2)={}^q\overline{D}(\overline{T}_2=6)=10 \), then in an effective sense, there are ten lineages with mean branch length 2 in Assemblage 1 and there are ten lineages with mean branch length 6 in Assemblage 2. The replication principle implies that there are 20 lineages in the pooled tree with mean branch length 4. Since \( {}^qPD(\overline{T}_k)={}^q\overline{D}(\overline{T}_k)\times \overline{T}_k \), the replication principle for the phylogenetic diversity \( {}^qPD(\overline{T}) \) does need the assumption that all assemblages have the same mean branch length \( (\overline{T}_1=\overline{T}_2=\dots =\overline{T}_N) \). The proof is parallel and thus omitted.
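The numerical example above (ten effective lineages with mean branch length 2 pooled with ten effective lineages with mean branch length 6) can be checked directly. The sketch below is illustrative only. It assumes the form \( {}^q\overline{D}(\overline{T})=\{\sum_i (L_i/\overline{T})\,a_i^q\}^{1/(1-q)} \), and it represents each assemblage by a hypothetical star tree of ten equally abundant lineages; the helper names are invented.

```python
import math

def phylo_hill_number(branch_lengths, node_abundances, q):
    """Return (qD(T-bar), T-bar), with T-bar = sum_i L_i a_i (assumed form)."""
    pairs = [(l, a) for l, a in zip(branch_lengths, node_abundances) if a > 0]
    t_bar = sum(l * a for l, a in pairs)
    if abs(q - 1.0) < 1e-12:
        d = math.exp(-sum((l / t_bar) * a * math.log(a) for l, a in pairs))
    else:
        d = sum((l / t_bar) * a ** q for l, a in pairs) ** (1.0 / (1.0 - q))
    return d, t_bar

def pool_distinct(assemblages):
    """Pool completely distinct assemblages with equal weight: a_ik -> a_ik / N."""
    n = len(assemblages)
    lengths, abundances = [], []
    for tree_lengths, tree_abundances in assemblages:
        lengths.extend(tree_lengths)
        abundances.extend(a / n for a in tree_abundances)
    return lengths, abundances

# Hypothetical star trees: ten equally abundant lineages each.
assemblage_1 = ([2.0] * 10, [0.1] * 10)   # mean branch length 2
assemblage_2 = ([6.0] * 10, [0.1] * 10)   # mean branch length 6
pooled = pool_distinct([assemblage_1, assemblage_2])

for q in (0, 0.5, 1, 2):
    d1, _ = phylo_hill_number(*assemblage_1, q)
    d2, _ = phylo_hill_number(*assemblage_2, q)
    d_pooled, t_pooled = phylo_hill_number(*pooled, q)
    print(q, round(d1, 6), round(d2, 6), round(d_pooled, 6), round(t_pooled, 6))
```

For every order q the pooled value is twice the common value of the two assemblages, and the pooled mean branch length is the average of 2 and 6.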
Decomposition of Phylogenetic Diversity Measures
Decomposition of species richness and its phylogenetic analogues into within- and between-group (alpha and beta) components is widely used (Whittaker 1972; Faith et al. 2009). However, these take no notice of abundance differences between sites. Conservationists using these measures cannot distinguish a site whose species are equally abundant from a site with the same species but with a highly skewed abundance distribution whose most phylogenetically distinctive species are rare. The former site would be a better bet for conservation. These considerations, and others, motivate the development of decomposition theory for abundance-based phylogenetic diversity measures. The decomposition also leads to abundance-sensitive measures of phylogenetic similarity and complementarity.
When there are N assemblages, the phylogenetic Hill number \( {}^q\overline{D}(T) \) (Eqs. 4a and 4b) and the phylogenetic diversity \( {}^qPD(T) \) (Eqs. 5a and 5b) of the pooled assemblage can be multiplicatively decomposed into independent alpha and beta components (Chiu et al. 2014). We briefly describe the decomposition of the measure \( {}^q\overline{D}(T) \) here for the ultrametric case, and only summarize the decomposition of the measure \( {}^qPD(T) \). The extension to the non-ultrametric case for both measures is obtained by simply replacing all T in the formulas with the mean branch length \( \overline{T} \) of the pooled assemblage.
To begin the partitioning, a pooled tree is constructed for the N assemblages. Assume that there are S species in the present-day assemblage (i.e., there are S tip nodes). For any tip node i, let \( z_{ik} \) denote any measure of species importance of the ith species in the kth assemblage, i = 1, 2, …, S, k = 1, 2, …, N. The measure \( z_{ik} \) is referred to as “abundance” for simplicity, although it can be absolute abundance, relative abundance, incidence, biomass, cover area or any other importance measure. Define \( z_{+k}=\sum_{i=1}^S z_{ik} \) (i.e., the “+” sign in \( z_{+k} \) denotes a sum over the tip nodes only) as the current size of the kth assemblage. Let \( z_{++}=\sum_{k=1}^N z_{+k} \) be the total abundance in the present-day pooled assemblage.
Now consider the phylogenetic tree in the time interval [−T, 0], and in the pooled assemblage define \( B_T \) and \( L_i \) as in section “Phylogenetic Hill numbers and related measures”. We extend the definition of \( z_{ik} \) to include all nodes and their corresponding branches by defining \( z_{ik} \) for all \( i\in B_T \) as the total abundance descended from branch i. (Here the index i can correspond to either a tip node or an internal node; if i is a tip node, then \( z_{ik} \) represents data of the current assemblage as defined in the preceding paragraph.) As shown in Fig. 2 of Chiu et al. (2014), the diversity for each individual assemblage can be computed from the pooled tree structure, and only the node abundances vary among assemblages.
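In the pooled assemblage, the node abundance for branch i (\( i\in B_T \)) is \( z_{i+}=\sum_{k=1}^N z_{ik} \) with branch relative abundance \( z_{i+}/z_{++} \), so the phylogenetic gamma diversity of order q can be calculated from Eq. (4a) as
$$ {}^q\overline{D}_{\gamma }(T)={\left\{\sum_{i\in B_T}\frac{L_i}{T}{\left(\frac{z_{i+}}{z_{++}}\right)}^q\right\}}^{1/(1-q)},\kern1em q\ge 0,\ q\ne 1. \tag{7a} $$
The limit when q approaches unity exists and is equal to
$$ {}^1\overline{D}_{\gamma }(T)=\exp \left[-\sum_{i\in B_T}\frac{L_i}{T}\left(\frac{z_{i+}}{z_{++}}\right)\log \left(\frac{z_{i+}}{z_{++}}\right)\right]. \tag{7b} $$
The gamma diversity is the effective number of equally abundant and equally distinct lineages, all with branch length T, in the pooled assemblage.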
Chiu et al. (2014) derived the following phylogenetic alpha diversity for q ≥ 0 and q ≠ 1:
$$ {}^q\overline{D}_{\alpha }(T)=\frac{1}{N}{\left\{\sum_{i\in B_T}\frac{L_i}{T}\sum_{k=1}^N{\left(\frac{z_{ik}}{z_{++}}\right)}^q\right\}}^{1/(1-q)}. \tag{8a} $$
For q = 1, we have
$$ {}^1\overline{D}_{\alpha }(T)=\exp \left[-\sum_{i\in B_T}\frac{L_i}{T}\sum_{k=1}^N\left(\frac{z_{ik}}{z_{++}}\right)\log \left(\frac{z_{ik}}{z_{++}}\right)-\log N\right]. \tag{8b} $$
The alpha diversity is interpreted as the effective number of equally abundant and equally distinct lineages, all with branch length T, in an individual assemblage. When normalized measures of species importance (like relative abundance or relative biomass) are used to quantify species importance, we have \( z_{++}=N \) in Eqs. (8a) and (8b). The alpha formula then reduces to a generalized mean of the local diversities with the following property: if all assemblages have the same diversity X, the alpha diversity is also X (Jost 2007). For non-normalized measures of species importance, like absolute abundance or biomass, this property does not hold. This is because when species absolute abundances are compared, a three-species assemblage with absolute abundances {2, 5, 8}, for example, will not be treated as identical to another three-species assemblage with absolute abundances {200, 500, 800}. However, these two assemblages are treated as identical when only relative abundances are compared.
Chiu et al. (2014) proved that the phylogenetic gamma Hill number (Eqs. 7a and 7b) is always greater than or equal to the phylogenetic alpha Hill number (Eqs. 8a and 8b) for all q ≥ 0 regardless of species abundances and tree structures. Based on a multiplicative partitioning, the phylogenetic beta diversity is the ratio of gamma diversity to alpha diversity:
$$ {}^q\overline{D}_{\beta }(T)=\frac{{}^q\overline{D}_{\gamma }(T)}{{}^q\overline{D}_{\alpha }(T)}. \tag{9} $$
When the N assemblages are identical in species identities and species abundances, then \( {}^q\overline{D}_{\beta }(T)=1 \) for any T. When the N assemblages are completely phylogenetically distinct (no shared lineages), then \( {}^q\overline{D}_{\beta }(T)=N \), no matter what the diversities or tree shapes of the assemblages. The measure \( {}^q\overline{D}_{\beta }(T) \) thus quantifies the effective number of completely phylogenetically distinct assemblages in the interval [−T, 0]. As proved by Chiu et al. (2014), the phylogenetic beta diversity \( {}^q\overline{D}_{\beta }(T) \) is always between unity and N for any given alpha value, implying that alpha and beta components are unrelated (or independent) for both measures, \( {}^q\overline{D}(T) \) and \( {}^qPD(T) \); see Chao et al. (2012) for a rigorous discussion of unrelatedness and independence of two measures. When all lineages in the pooled assemblage are completely distinct (no lineages shared) in the interval [−T, 0], the phylogenetic alpha, beta and gamma Hill numbers reduce to those based on ordinary Hill numbers. This includes the limiting case in which T tends to zero, so that phylogeny is ignored.
Parallel decomposition can be made for the phylogenetic diversity \( {}^qPD(T) \), and we summarize the following relations: \( {}^qPD_{\gamma }(T)={}^q\overline{D}_{\gamma }(T)\times T \) and \( {}^qPD_{\alpha }(T)={}^q\overline{D}_{\alpha }(T)\times T \). Under a multiplicative partitioning scheme, we have \( {}^qPD_{\beta }(T)={}^qPD_{\gamma }(T)/{}^qPD_{\alpha }(T)={}^q\overline{D}_{\beta }(T) \), i.e., the beta components from partitioning the phylogenetic Hill number \( {}^q\overline{D}(T) \) and the phylogenetic diversity \( {}^qPD(T) \) are identical, implying that the interpretation and the corresponding similarity or differentiation measures (in the next section) are also identical. Thus, it is sufficient to focus only on the measure \( {}^q\overline{D}_{\beta }(T) \), which will be referred to as the phylogenetic beta diversity or beta component for simplicity.
For each of the two measures, \( {}^q\overline{D}(T) \) and \( {}^qPD(T) \), alpha and gamma diversities obey the replication principle. Then the beta diversity formed by taking their ratio is replication-invariant (Chiu et al. 2014). That is, when assemblages are replicated, the beta diversity does not change. Therefore, when we pool equally-distinct subtrees, such as pooling equally-ancient subfamilies, the beta diversity is unchanged by pooling the subfamilies if all subfamilies show the same beta diversity (“consistency in aggregation”).
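The multiplicative decomposition can be illustrated with a short sketch. It is not the authors' implementation: it assumes the gamma and alpha forms written in the code comments (gamma from branch relative abundances \( z_{i+}/z_{++} \); alpha with the factor 1/N), and it recovers \( z_{++} \) from the identity \( \sum_i L_i z_{i+}=T\,z_{++} \), which holds when T is the time depth of an ultrametric tree or the mean branch length of the pooled tree. The function name and the two toy data sets are hypothetical.

```python
import math

def phylo_decomposition(branch_lengths, z, q, T):
    """Return (gamma, alpha, beta) phylogenetic Hill numbers of order q.

    z[i][k] is the node abundance of branch i in assemblage k.
    Assumed forms:
      gamma = {sum_i (L_i/T) (z_i+/z_++)^q}^(1/(1-q))
      alpha = (1/N) {sum_i (L_i/T) sum_k (z_ik/z_++)^q}^(1/(1-q))
      beta  = gamma / alpha
    with the usual exponential-entropy limits at q = 1.
    """
    n = len(z[0])
    z_plus = [sum(row) for row in z]                     # z_i+
    # Total tip abundance z_++ via sum_i L_i z_i+ = T * z_++.
    z_pp = sum(l * zi for l, zi in zip(branch_lengths, z_plus)) / T
    pooled = [(l, zi / z_pp) for l, zi in zip(branch_lengths, z_plus) if zi > 0]
    local = [(l, x / z_pp) for l, row in zip(branch_lengths, z) for x in row if x > 0]
    if abs(q - 1.0) < 1e-12:
        gamma = math.exp(-sum((l / T) * p * math.log(p) for l, p in pooled))
        alpha = math.exp(-sum((l / T) * p * math.log(p) for l, p in local) - math.log(n))
    else:
        gamma = sum((l / T) * p ** q for l, p in pooled) ** (1.0 / (1.0 - q))
        alpha = sum((l / T) * p ** q for l, p in local) ** (1.0 / (1.0 - q)) / n
    return gamma, alpha, gamma / alpha

# Two identical assemblages on a hypothetical tree (branches A, B, AB, C; T = 4).
identical = ([1.0, 1.0, 3.0, 4.0],
             [[0.5, 0.5], [0.3, 0.3], [0.8, 0.8], [0.2, 0.2]])
# Two completely distinct assemblages: each owns two branches of length 4.
distinct = ([4.0, 4.0, 4.0, 4.0],
            [[0.5, 0.0], [0.5, 0.0], [0.0, 0.6], [0.0, 0.4]])

for q in (0, 1, 2):
    print(q, phylo_decomposition(*identical, q, 4.0)[2],
             phylo_decomposition(*distinct, q, 4.0)[2])
```

In this toy example the beta component is 1 for the identical pair and 2 (= N) for the completely distinct pair at every order q, matching the two boundary cases described above.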
We now give the phylogenetic beta diversities for the special cases of q = 0, 1 and 2.

(a)
When q = 0, we have \( {}^0\overline{D}_{\beta }(T)=L_{\gamma }(T)/L_{\alpha }(T) \), where \( L_{\gamma }(T) \) denotes the total branch length of the pooled tree (the gamma component of Faith’s PD) and \( L_{\alpha }(T) \) denotes the average total branch length of the individual trees (the alpha component of Faith’s PD).

(b)
When q = 1, the phylogenetic beta diversity of order 1 is
$$ {}^1\overline{D}_{\beta }(T)=\exp \left[\left(H_{P,\gamma }-H_{P,\alpha}\right)/T+\sum_{k=1}^N\left(\frac{z_{+k}}{z_{++}}\right)\log \left(\frac{z_{+k}}{z_{++}}\right)+\log N\right], \tag{10a} $$
where \( H_{P,\gamma } \) and \( H_{P,\alpha } \) denote respectively the gamma and alpha phylogenetic entropy. When the species importance measure \( z_{ik} \) represents the ith species’ relative abundance in the kth current-time assemblage, then \( z_{+k}=1 \), \( z_{++}=N \) and \( z_{+k}/z_{++}=1/N \). In this special case, we have \( {}^1\overline{D}_{\beta }(T)=\exp \left[\left(H_{P,\gamma }-H_{P,\alpha}\right)/T\right] \). Thus an additive decomposition for phylogenetic entropy \( H_P \) holds (Pavoine et al. 2009; Mouchet and Mouillot 2011), as for ordinary Shannon entropy (Jost 2007).

(c)
When q = 2, the phylogenetic beta diversity can be expressed as
$$ {}^2\overline{D}_{\beta }(T)=\frac{N\sum_{i\in B_T}L_i\sum_{k=1}^N z_{ik}^2}{\sum_{i\in B_T}L_i\,z_{i+}^2}. $$
In the special case of \( z_{+k}=1 \), \( z_{++}=N \), this phylogenetic beta diversity of order 2 can be linked to quadratic entropy as
$$ {}^2\overline{D}_{\beta }(T)=\frac{1-Q_{\alpha }/T}{1-Q_{\gamma }/T}, \tag{10b} $$
where \( Q_{\gamma } \) and \( Q_{\alpha } \) denote respectively the gamma and alpha quadratic entropy. The above formula is also applicable to non-ultrametric trees by replacing all T with \( \overline{T} \), the mean branch length in the pooled assemblage; see Chiu et al. (2014, Appendix C) for a proof.
Normalized Phylogenetic Similarity Measures
For traditional abundance-based diversity, the most commonly used similarity measures include N-assemblage generalizations of the Jaccard, Sørensen, Horn (1966) and Morisita-Horn (Morisita 1959) measures. The latter three measures were integrated into a class of \( C_{qN} \) measures by Chao et al. (2008). Jost (2006, 2007), Chao et al. (2008, 2012), and Chiu et al. (2014) have demonstrated that all the above measures are monotonic transformations of beta diversity based on ordinary Hill numbers. This is an advantage of using the framework of Hill numbers: a direct link exists between diversity and similarity (or differentiation) among assemblages.
Chiu et al. (2014) extended this framework by proposing four classes of similarity (or differentiation) measures that are monotonic functions of phylogenetic beta diversity. The basic idea is that the phylogenetic beta diversity, a ratio of gamma and alpha phylogenetic Hill numbers, is independent of alpha and measures the pure differentiation among assemblages. The phylogenetic beta component always lies in the range [1, N] for any measure of species importance and all orders q ≥ 0. Since the range depends on N, the phylogenetic beta diversity cannot be used to compare phylogenetic differentiation among assemblages across multiple regions with different numbers of assemblages. To remove the dependence on N, several transformations can be used to map the phylogenetic beta component onto [0, 1] to measure local overlap, regional overlap, homogeneity and turnover. We give a summary of these four transformations below and tabulate formulas and the relationship with previous measures in Table 1 for the two most important classes. The formulas for the special cases of q = 0, 1 and 2 are also displayed there.

1.
A class of branch overlap measures from a local perspective:
$$ \overline{C}_{qN}(T)=\frac{N^{1-q}-{\left[{}^q\overline{D}_{\beta }(T)\right]}^{1-q}}{N^{1-q}-1}. \tag{11a} $$
This gives the effective average proportion of shared branches in an individual assemblage. This class of similarity measures extends the \( C_{qN} \) overlap measure derived in Chao et al. (2008) to a phylogenetic version. The corresponding differentiation measure \( 1-\overline{C}_{qN}(T) \) quantifies the effective average proportion of non-shared branches in an individual assemblage.

(1a)
For q = 0, this similarity measure is referred to as the “phylo-Sørensen” N-assemblage overlap measure because for N = 2, it reduces to the measure PhyloSør (phylo-Sørensen) developed by Bryant et al. (2008) and Ferrier et al. (2007).

(1b)
For q = 1, this measure \( \overline{C}_{1N}(T) \) is called the “phylo-Horn” N-assemblage overlap measure because it extends Horn’s (1966) two-assemblage measure to incorporate phylogenies for N assemblages.

(1c)
For q = 2, \( \overline{C}_{2N}(T) \) is called the “phylo-Morisita-Horn” N-assemblage similarity measure because it extends the Morisita-Horn measure (Morisita 1959) to incorporate phylogenies for N assemblages. When the species importance measure is relative abundance, the differentiation measure \( 1-\overline{C}_{2N}(T) \) reduces to the measure proposed by de Bello et al. (2010). However, their measure is valid only for ultrametric trees (p. 7 of de Bello et al. 2010). Here, the measure can be applied to non-ultrametric trees to obtain
$$ 1-\overline{C}_{2N}\left(\overline{T}\right)=\frac{1-\left[1/{}^2\overline{D}_{\beta}\left(\overline{T}\right)\right]}{1-1/N}=\frac{Q_{\gamma }-Q_{\alpha }}{\left(1-1/N\right)\left(\overline{T}-Q_{\alpha}\right)}, \tag{11b} $$
where \( Q_{\gamma } \) and \( Q_{\alpha } \) are respectively the gamma and alpha quadratic entropy, and \( \overline{T} \) is the mean branch length in the pooled assemblage. A general form for any species importance measure (including absolute abundances) is
$$ 1-\overline{C}_{2N}\left(\overline{T}\right)=\frac{\sum_{i\in B_{\overline{T}}}L_i\sum_{m>k}^N{\left(z_{im}-z_{ik}\right)}^2}{\left(N-1\right)\sum_{i\in B_{\overline{T}}}L_i\sum_{k=1}^N z_{ik}^2}. \tag{11c} $$
The above expression shows that the similarity index \( \overline{C}_{2N}\left(\overline{T}\right) \), as in all other abundance-sensitive similarity measures, is unity if and only if \( z_{ij}=z_{ik} \) (i.e., species importance measures are identical for any node i in the branch set and for any two assemblages j and k). This reveals that the similarity index \( \overline{C}_{2N}\left(\overline{T}\right) \) quantifies the node-by-node resemblance among the N abundance sets \( \{z_{ik};\ i\in B_{\overline{T}}\} \), k = 1, 2, …, N, from a local perspective. See Fig. 2 of Chiu et al. (2014) for a simple example of the framework.


2.
A class of branch overlap measures from a regional perspective:
$$ \overline{U}_{qN}(T)=\frac{{\left[1/{}^q\overline{D}_{\beta }(T)\right]}^{1-q}-{\left(1/N\right)}^{1-q}}{1-{\left(1/N\right)}^{1-q}}. \tag{12a} $$
This class of measures quantifies the effective proportion of shared branches in the pooled assemblage. The corresponding differentiation measure \( 1-\overline{U}_{qN}(T) \) quantifies the effective proportion of non-shared branches in the pooled assemblage.

(2a)
For q = 0, this measure is called the “phylo-Jaccard” N-assemblage measure because for N = 2 the measure \( 1-\overline{U}_{02}(T) \) reduces to the Jaccard-type UniFrac measure developed by Lozupone and Knight (2005) and the PD-dissimilarity measure developed by Faith et al. (2009).

(2b)
For q = 1, this measure is identical to the “phylo-Horn” N-assemblage overlap measure \( \overline{C}_{1N}(T) \); see Table 1.

(2c)
For q = 2, we refer to the measure \( \overline{U}_{2N}(T) \) as a “phylo-regional-overlap” measure. When the species importance measure is relative abundance, we have the following formula for non-ultrametric trees:
$$ 1-\overline{U}_{2N}\left(\overline{T}\right)=\frac{{}^2\overline{D}_{\beta}\left(\overline{T}\right)-1}{N-1}=\frac{Q_{\gamma }-Q_{\alpha }}{\left(N-1\right)\left(\overline{T}-Q_{\gamma}\right)}, $$
where \( \overline{T} \) denotes the mean branch length in the pooled assemblage. A general form for any species importance measure (including absolute abundances) is
$$ 1-\overline{U}_{2N}\left(\overline{T}\right)=\frac{\sum_{i\in B_{\overline{T}}}L_i\sum_{m>k}^N{\left(z_{im}-z_{ik}\right)}^2}{\left(N-1\right)\sum_{i\in B_{\overline{T}}}L_i\,z_{i+}^2}. $$
The numerator is the same as that in \( 1-\overline{C}_{2N}\left(\overline{T}\right) \), revealing that the similarity index \( \overline{U}_{2N}\left(\overline{T}\right) \) also quantifies the node-by-node resemblance among the N abundance sets \( \{z_{ik};\ i\in B_{\overline{T}}\} \), k = 1, 2, …, N; but here the denominator (for the purpose of normalization) is different and takes a regional perspective.


3.
A class of phylogenetic homogeneity measures:
$$ \overline{S}_{qN}(T)=\frac{1/{}^q\overline{D}_{\beta }(T)-1/N}{1-1/N}. \tag{12b} $$
This measure is linear in the proportion of regional phylogenetic diversity contained in a typical assemblage.

(3a)
For q = 0, it reduces to the “phylo-Jaccard” measure \( \overline{U}_{0N}(T) \), i.e., \( \overline{S}_{0N}(T)=\overline{U}_{0N}(T) \).

(3b)
For q = 1, this measure does not reduce to the “phylo-Horn” overlap measure.

(3c)
For q = 2, this measure is identical to \( \overline{C}_{2N}(T) \), the “phylo-Morisita-Horn” similarity measure, i.e., \( \overline{S}_{2N}(T)=\overline{C}_{2N}(T) \).


4.
A class of measures of the complement of “phylogenetic turnover rate”:
$$ \overline{V}_{qN}(T)=\frac{N-{}^q\overline{D}_{\beta }(T)}{N-1}=1-\frac{{}^q\overline{D}_{\beta }(T)-1}{N-1}. \tag{12c} $$
This measure is linear in the phylogenetic beta diversity, and the corresponding differentiation measure \( \left[{}^q\overline{D}_{\beta }(T)-1\right]/\left(N-1\right) \) quantifies the relative branch turnover rate per assemblage.

(4a)
For q = 0, the measure \( \overline{V}_{0N}(T) \) is identical to the “phylo-Sørensen” measure, i.e., \( \overline{V}_{0N}(T)=\overline{C}_{0N}(T) \).

(4b)
For q = 1, this measure does not reduce to the “phylo-Horn” overlap measure.

(4c)
For q = 2, this measure is identical to \( \overline{U}_{2N}(T) \), the “phylo-regional-overlap” measure. That is, \( \overline{V}_{2N}(T)=\overline{U}_{2N}(T) \).

As with the phylogenetic diversity measures, all the above similarity or differentiation measures are functions of two parameters: the sensitivity parameter q and the time perspective T. Thus, we suggest using the two types of profiles described in section “Phylogenetic Hill numbers and related measures” for the two major similarity measures \( \overline{C}_{qN}(T) \) and \( \overline{U}_{qN}(T) \) (or their complements) to convey complete information about the similarity or differentiation of a set of assemblages. An example showing the two types of profiles is given in section “An example”.
The lineage excess \( {}^q\overline{D}_{\gamma }(T)-{}^q\overline{D}_{\alpha }(T) \) and the phylogenetic diversity excess \( {}^qPD_{\gamma }(T)-{}^qPD_{\alpha }(T) \) can be interpreted as the effective number of regional lineages (or the regional phylogenetic diversity) not contained in a typical local assemblage. However, they cannot be directly applied to compare the similarity or differentiation across multiple regions because both depend not only on the number of assemblages, but also on their corresponding alpha diversity. Following Chao et al. (2012), Chiu et al. (2014, their Appendix D) proved that these dependences can be eliminated by using an appropriate normalization. After proper normalization, the two measures lead to the same four classes of normalized similarity and differentiation measures as those obtained from the phylogenetic beta diversity. This is another advantage of using the framework of phylogenetic Hill numbers. That is, a consensus can be achieved on phylogenetic similarity and differentiation measures, including N-assemblage phylogenetic generalizations of the classic Jaccard, Sørensen, Horn and Morisita-Horn measures, regardless of whether one prefers multiplicative or additive decompositions.
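The four transformations of the beta component can be sketched in a few lines. This is a hedged illustration rather than reference code. The function name is hypothetical, the inputs are a beta value in [1, N], the number of assemblages N and the order q, and the q = 1 case uses the limit \( 1-\log\beta/\log N \), which is an assumption obtained here by letting q tend to 1 in the local-overlap and regional-overlap formulas.

```python
import math

def similarity_classes(beta, n, q):
    """Map a phylogenetic beta diversity in [1, n] onto [0, 1] in four ways.

    C: local overlap, U: regional overlap, S: homogeneity,
    V: complement of the relative turnover rate.
    """
    if abs(q - 1.0) < 1e-12:
        # Assumed q -> 1 limit shared by the C and U classes.
        c = u = 1.0 - math.log(beta) / math.log(n)
    else:
        c = (n ** (1 - q) - beta ** (1 - q)) / (n ** (1 - q) - 1)
        u = ((1 / beta) ** (1 - q) - (1 / n) ** (1 - q)) / (1 - (1 / n) ** (1 - q))
    s = (1 / beta - 1 / n) / (1 - 1 / n)
    v = (n - beta) / (n - 1)
    return {"C": c, "U": u, "S": s, "V": v}

# Illustrative value only: three assemblages with beta = 1.5.
for q in (0, 1, 2):
    print(q, similarity_classes(1.5, 3, q))
```

With these definitions the coincidences listed in the four classes can be seen numerically: at q = 0 the S and U values agree and the V and C values agree, while at q = 2 the S and C values agree and the V and U values agree. All four measures equal 1 when beta = 1 and 0 when beta = N.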
An Example
We apply the phylogenetic diversity measures and similarity (or differentiation) measures considered in this chapter to a real conservation biology case discussed by Pavoine et al. (2009): a heavily-fished assemblage of 52 rockfish species of the genus Sebastes collected for 20 years over three decades (1980–1986, 1993–1994, 1996, 1998–2007) from the Southern California Bight, USA. The phylogenetic tree for these 52 species was obtained from Hyde and Vetter (2007); see Fig. 3a. The age of the root for these species is around 7.9 million years (Myr).
We separate the data into three decades: 1980s, 1990s and 2000s, which will be referred to as Assemblages (and Decades) I, II and III respectively. Within each decade’s assemblage, species abundances are pooled. The species relative abundances for the three assemblages are shown in Fig. 3a. There were 48, 44 and 39 species in Decades I, II and III, respectively. (Note that each data point here is a mean of many years’ observations.) A subtree containing only the six dominant species (those with relative abundance >8 % in at least one assemblage) is shown in Fig. 3b. All six species are shared in the three assemblages and four of them have been in isolated lineages for 6 Myr.
As suggested in section “Phylogenetic Hill numbers and related measures”, we present for each assemblage two types of profiles. In Fig. 4a, we plot the measure \( {}{}^q\overline{D}\;(T) \) as a function of order q, 0 ≤ q ≤ 3, for two selected values of temporal perspectives: T = 0 (phylogeny is ignored) and T = 7.9 Myr (whole phylogenetic tree in Fig. 3a is considered). In Fig. 4b, we plot \( {}{}^q\overline{D}\;(T) \) and ^{q} PD(T) as functions of T separately for q = 0, 1, and 2 for 0 ≤ T ≤ 10.
Based on our phylogenetic diversity measures, all profiles in Fig. 4 reveal that the diversity in the most recent decade (Decade III) is the lowest among the three decades in the rockfish assemblage. This implies an appreciable loss of species (as shown in the first type of profile for T = 0), loss of lineages (as shown in the second type of profile based on the measure \( {}{}^q\overline{D}\;(T) \)), and loss of evolutionary history (as shown in the second type of profiles based on the measure ^{q} PD(T)) over the three decades.
When species/lineage abundances are discounted (q = 0 in the left panels of Fig. 4b), both lineage richness (based on the measure \( {}^0\overline{D}(T) \)) and total branch length (based on the measure \( {}^0PD(T) \), i.e., Faith’s PD) exhibit the expected ordering: Decade I > Decade II > Decade III. When species/lineage abundances are counted (i.e., q = 1 and 2 in Fig. 4b), the profiles for Decades I and II cross because species abundances in the assemblage of Decade II are more even than those of Decade I (see the first type of profiles for T = 0 and Fig. 3a, b). Note that if the time-depth is greater than 6 Myr (including the age of the root), then all the abundance-sensitive phylogenetic measures for the three assemblages are very close because most of the dominant species began to diverge around 6 Myr ago (Fig. 3b). This also explains the closeness of the three profiles in the first type of profile for T = 7.9 Myr (the right panel in Fig. 4a).
To illustrate the phylogenetic differentiation among assemblages, we focus on measuring the phylogenetic differentiation between any two decades for three pairs (i.e., Decades I vs. II, Decades I vs. III and Decades II vs. III). To see how the phylogenetic differentiation measures vary with the order q and with the time perspective T, we show two types of profiles for each of the two differentiation measures \( 1-\overline{C}_{qN}(T) \) and \( 1-\overline{U}_{qN}(T) \) in Figs. 5 and 6. In Fig. 5a, we present the first type of profile, which plots the measure \( 1-\overline{C}_{qN}(T) \) as a function of q, where q is in the range [0, 3], for two time perspectives: T = 0 (non-phylogenetic case) and T = 7.9 Myr (the age of the root node). In Fig. 5b, the same type of differentiation profile is shown for the other measure \( 1-\overline{U}_{qN}(T) \). Then in Fig. 6a, b, we present the second type of profile, which shows the two measures as functions of the temporal perspective T, 0 ≤ T ≤ 10, for q = 0, 1 and 2 separately.
Based on the two phylogenetic differentiation measures, all profiles in Figs. 5 and 6 show consistent patterns. When species/lineages abundances are discounted (q = 0), the differences among the differentiation measures of the three pairs of assemblages are not appreciable, as shown in the two left panels in Fig. 6 and in the initial point in each of profiles in Fig. 5. When species/lineages abundances are counted (q > 0), the compositional differentiation between Decades I vs. II is generally close to that between Decades I vs. III, and the differentiation between two recent decades (Decades II vs. III) is much lower than any of the other two pairs. This implies that the composition of species/lineage abundances has changed after 1990. Examining the relative abundances for those dominant species listed in Fig. 3b, we see that the most abundant species S. paucispinis (23 %) in Decade I became less abundant in both Decade II (9 %) and Decade III (11 %); the second most abundant species S. mystinus (11 %) in Decade I became quite rare in both Decade II (4 %) and Decade III (5 %). Also, the species S. miniatus in Decade I was rare, but it became the most dominant species in both Decade II (12 %) and Decade III (25 %). These compositional changes for dominant species help explain the above findings.
As the time perspective T becomes large, more dominant shared lineages are added to the two assemblages, implying that the differentiation between any two assemblages should exhibit a non-increasing trend as T is increased. Our two differentiation measures for q > 0 in Fig. 6 show the expected decreasing trend, and the decline rates differ for q = 1 and q = 2. Based on Fig. 3b, we see that most of the dominant and isolated species began to diverge around 6 Myr ago. Thus, the two differentiation profiles for q = 1 and 2 start to decrease sharply around 6 Myr, especially for order q = 2. Since the node abundances near the root (where the differentiation values are near zero) are relatively high and dominant in the whole tree, all values of the phylogenetic differentiation measures for T = 7.9 Myr (the right panels of Fig. 5) are substantially lower than their corresponding non-phylogenetic differentiation measures, as can be seen by comparing the two panels (T = 0 and T = 7.9 Myr) in each row of Fig. 5. The two types of profiles (in Figs. 5a, b and 6a, b) demonstrate that the two differentiation measures \( 1-\overline{C}_{qN}(T) \) and \( 1-\overline{U}_{qN}(T) \) can incorporate the differences in both tree structure and lineage abundances.
In summary, our phylogenetic diversity measures have shown an appreciable loss of species, lineages and evolutionary history in the rockfish assemblage over time due to fishing pressure, and our phylogenetic differentiation measures show a pronounced change of species/lineage composition after 1990.
Conclusion

1.
To quantify the phylogenetic diversity of an assemblage, we suggest using two measures: (i) the phylogenetic Hill number \( {}^q\overline{D}(T) \) (Eqs. 4a and 4b), which measures “the effective number of equally abundant and equally distinct lineages all with branch length T”, and (ii) the phylogenetic or branch diversity \( {}^qPD(T) \) (Eqs. 5a and 5b), which measures the “effective total lineage-length”, i.e., the total evolutionary history of an assemblage since time T. These two measures depend explicitly on two parameters, the abundance sensitivity parameter q and the time perspective (or time-depth) parameter T.

2.
Two types of diversity profiles are recommended for considering species/branch abundances and phylogenetic information: (i) The first type of diversity profile is obtained by plotting ^{q} PD(T) or \( {}{}^q\overline{D}\;(T) \) as a function of order q, for some selected values of temporal perspective T including T = 0 (i.e., the nonphylogenetic profile based on the ordinary Hill number s ), and T = the age of the most basal node. See the upper panels of Fig. 4 for an example. It would be also informative to include T = the age of the divergence between the group under study and the rest of the tree. (ii) The second type of diversity profile is obtained by plotting ^{q} PD(T) and \( {}{}^q\overline{D}\;(T) \) as functions of T separately for q = 0, 1, and 2; see the middle and lower panels of Fig. 4 for an example. The second type of profile shows the effect of timedepth or evolution change on our diversity measures.

3.
When there are multiple assemblages, the phylogenetic gamma Hill number is the effective number of equally abundant and equally distinct lineages in the pooled assemblage; the phylogenetic alpha Hill number is the effective number of equally abundant and equally distinct lineages per assemblage. Thus the phylogenetic beta Hill number, as the ratio of gamma and beta, is interpreted as “the number of phylogenetically completely distinct assemblages”. In this case, alpha and beta are unrelated (or independent). The difference of phylogenetic gamma and alpha Hill numbers is lineage excess, which is dependent on both alpha and gamma. The phylogenetic beta Hill number and lineage excess lead to the same classes of similarity and differentiation measures, listed in section “Normalized phylogenetic similarity measures”. See Table 1 for the two major classes of phylogenetic overlap measures, \( {\overline{C}}_{qN}(T) \) from a local perspective and U̅ _{ qN }(T) from a regional perspective.

4.
To assess the phylogenetic resemblance or differentiation among assemblages, two types of similarity or differentiation profiles as those in Point 2 are suggested for the two major classes of measures, \( {\overline{C}}_{qN}(T) \) and U̅ _{ qN }(T) (Table 1); see Figs. 5 and 6 for examples.
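To make Points 1 and 3 concrete, the following Python sketch computes \( {}^{q}PD(T) \), \( {}^{q}\overline{D}(T) \) and the multiplicative gamma/alpha/beta decomposition for a small tree. It is an illustration under stated assumptions, not the code used for the results in this chapter (those came from R scripts). It assumes an ultrametric tree of depth T, relative abundances summing to one within each assemblage, equal assemblage weights, and formulas of the form published in Chao et al. (2010) and Chiu et al. (2014). All function names and the toy data layout are hypothetical.

```python
import math


def _effective_length(pairs, q):
    """[sum_i L_i * w_i^q]^(1/(1-q)), or exp(-sum_i L_i w_i log w_i) at q = 1.

    pairs: iterable of (branch length L_i, weight w_i). Zero weights are skipped.
    The q = 1 limit assumes sum_i L_i * w_i = 1 (ultrametric tree of depth T).
    """
    pairs = [(L, w) for L, w in pairs if w > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(L * w * math.log(w) for L, w in pairs))
    return sum(L * w ** q for L, w in pairs) ** (1.0 / (1.0 - q))


def phylo_pd(lengths, abundances, q, T):
    """Branch diversity qPD(T); abundances[i] is the branch abundance a_i."""
    return _effective_length(((L, a / T) for L, a in zip(lengths, abundances)), q)


def phylo_hill(lengths, abundances, q, T):
    """Phylogenetic Hill number: qPD(T) divided by the time depth T."""
    return phylo_pd(lengths, abundances, q, T) / T


def gamma_alpha_beta(lengths, z, q, T):
    """Phylogenetic gamma, alpha and beta Hill numbers.

    z[i][k] is the abundance of branch i in assemblage k (tips and internal
    branches alike). The tip total z_++ is recovered from sum_i L_i z_i+ = T z_++.
    """
    n = len(z[0])
    z_total = sum(L * sum(row) for L, row in zip(lengths, z)) / T
    gamma = _effective_length(
        ((L, sum(row) / (T * z_total)) for L, row in zip(lengths, z)), q) / T
    alpha = _effective_length(
        ((L, z_ik / (T * z_total)) for L, row in zip(lengths, z) for z_ik in row),
        q) / (T * n)
    return gamma, alpha, gamma / alpha


# Toy tree ((A:1, B:1):1, C:2) with root age T = 2.
# Branch order: A, B, ancestor of A and B, C.
LENGTHS = [1.0, 1.0, 1.0, 2.0]
# Assemblage 1 holds only A and B; assemblage 2 holds only C.
Z_DISTINCT = [[0.5, 0.0], [0.5, 0.0], [1.0, 0.0], [0.0, 1.0]]
```

In this toy example the two assemblages share no branches, so `gamma_alpha_beta(LENGTHS, Z_DISTINCT, q, 2.0)` returns a beta of 2 for any q, matching the interpretation of beta as the effective number of phylogenetically completely distinct assemblages. For q = 0, `phylo_pd` reduces to the total branch length (Faith's PD), which is 5 for this tree when all three species are present.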
References
Allen B, Kon M, Bar-Yam Y (2009) A new phylogenetic diversity measure generalizing the Shannon index and its application to phyllostomid bats. Am Nat 174(2):236–243
Bryant JA, Lamanna C, Morlon H, Kerkhoff AJ, Enquist BJ, Green JL (2008) Microbes on mountainsides: contrasting elevational patterns of bacterial and plant diversity. Proc Natl Acad Sci U S A 105:11505–11511
Cavender-Bares J, Ackerly DD, Kozak KH (2012) Integrating ecology and phylogenetics: the footprint of history in modern-day communities. Ecology 93(sp8):S1–S3
Cavender-Bares J, Kozak KH, Fine PV, Kembel SW (2009) The merging of community ecology and phylogenetic biology. Ecol Lett 12(7):693–715
Chao A (2005) Species estimation and applications. In: Kotz S, Balakrishnan N, Read CB, Vidakovic B (eds) Encyclopedia of statistical sciences, vol 12, 2nd edn. Wiley, New York, pp 7907–7916
Chao A, Jost L, Chiang SC, Jiang YH, Chazdon RL (2008) A two-stage probabilistic approach to multiple-community similarity indices. Biometrics 64(4):1178–1186
Chao A, Chiu CH, Jost L (2010) Phylogenetic diversity measures based on Hill numbers. Philos Trans R Soc Lond B 365:3599–3609
Chao A, Chiu CH, Hsieh TC (2012) Proposing a resolution to debates on diversity partitioning. Ecology 93(9):2037–2051
Chao A, Chiu CH, Hsieh TC, Davis T, Nipperess DA, Faith DP (2015) Rarefaction and extrapolation of phylogenetic diversity. Methods Ecol Evol 6:380–388
Chiu CH, Jost L, Chao A (2014) Phylogenetic beta diversity, similarity, and differentiation measures based on Hill numbers. Ecol Monogr 84(1):21–44
Colwell RK, Coddington JA (1994) Estimating terrestrial biodiversity through extrapolation. Philos Trans R Soc Lond B 345:101–118
Crozier R (1997) Preserving the information content of species: genetic diversity, phylogeny, and conservation worth. Annu Rev Ecol Syst 28:243–268
Daróczy Z (1970) Generalized information functions. Inf Control 16(1):36–51
de Bello F, Lavorel S, Díaz S, Harrington R, Cornelissen JH, Bardgett RD, Berg MP, Cipriotti P, Feld CK, Hering D (2010) Towards an assessment of multiple ecosystem processes and services via functional traits. Biodivers Conserv 19(10):2873–2893
Ellison AM (2010) Partitioning diversity. Ecology 91:1962–1963
Faith DP (1992) Conservation evaluation and phylogenetic diversity. Biol Conserv 61(1):1–10
Faith DP (2013) Biodiversity and evolutionary history: useful extensions of the PD phylogenetic diversity assessment framework. Ann NY Acad Sci 1289(1):69–89
Faith DP, Richards ZT (2012) Climate change impacts on the tree of life: changes in phylogenetic diversity illustrated for Acropora corals. Biology 1(3):906–932
Faith DP, Lozupone CA, Nipperess D, Knight R (2009) The cladistic basis for the phylogenetic diversity (PD) measure links evolutionary features to environmental gradients and supports broad applications of microbial ecology’s “phylogenetic beta diversity” framework. Int J Mol Sci 10:4723–4741
Ferrier S, Manion G, Elith J, Richardson K (2007) Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Divers Distrib 13(3):252–264
Gotelli NJ, Colwell RK (2011) Estimating species richness. In: Magurran AE, McGill BJ (eds) Biological diversity: frontiers in measurement and assessment. Oxford University Press, Oxford, pp 39–54
Hannah L, Kay JA (1977) Concentration in the modern industry: theory, measurement, and the U.K. experience. MacMillan, London
Havrda J, Charvát F (1967) Quantification method of classification processes: concept of structural α-entropy. Kybernetika 3(1):30–35
Helmus MR, Bland TJ, Williams CK, Ives AR (2007) Phylogenetic measures of biodiversity. Am Nat 169(3):E68–E83
Hill M (1973) Diversity and evenness: a unifying notation and its consequences. Ecology 54(2):427–432
Horn HS (1966) Measurement of “overlap” in comparative ecological studies. Am Nat 100:419–424
Hubbell SP (2001) A unified theory of biodiversity and biogeography. Princeton University Press, Princeton
Hurlbert SH (1971) The nonconcept of species diversity: a critique and alternative parameters. Ecology 52(4):577–586
Hyde JR, Vetter RD (2007) The origin, evolution, and diversification of rockfishes of the genus Sebastes (Cuvier). Mol Phylogenet Evol 44:790–811
Ives AR, Helmus MR (2010) Phylogenetic metrics of community similarity. Am Nat 176:128–142
Ives AR, Helmus MR (2011) Generalized linear mixed models for phylogenetic analyses of community structure. Ecol Monogr 81(3):511–525
Jost L (2006) Entropy and diversity. Oikos 113(2):363–375
Jost L (2007) Partitioning diversity into independent alpha and beta components. Ecology 88:2427–2439
Jost L (2009) Mismeasuring biological diversity: response to Hoffman and Hoffman (2008). Ecol Econ 68:925–927
Jost L (2010) Independence of alpha and beta diversities. Ecology 91:1969–1974
Jost L, DeVries P, Walla T, Greeney H, Chao A, Ricotta C (2010) Partitioning diversity for conservation analyses. Divers Distrib 16(1):65–76
Keylock C (2005) Simpson diversity and the Shannon-Wiener index as special cases of a generalized entropy. Oikos 109:203–207
Kimura M, Crow JF (1964) The number of alleles that can be maintained in a finite population. Genetics 49(4):725–738
Lozupone C, Knight R (2005) UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol 71(12):8228–8235
MacArthur RH (1965) Patterns of species diversity. Biol Rev 40(4):510–533
MacArthur RH (1972) Geographical ecology. Harper & Row, New York
MacArthur RH, Wilson EO (1967) The theory of island biogeography. Princeton University Press, Princeton
Magurran AE (2004) Measuring biological diversity. Blackwell, Oxford
Magurran AE, McGill BJ (eds) (2011) Biological diversity: frontiers in measurement and assessment. Oxford University Press, Oxford
McPeek MA, Miller TE (1996) Evolutionary biology and community ecology. Ecology 77(5):1319–1320
Morisita M (1959) Measuring of interspecific association and similarity between communities. Memoires of the Faculty of Science, Kyushu University. Ser E (Biol) 3:65–80
Mouchet MA, Mouillot D (2011) Decomposing phylogenetic entropy into α, β and γ components. Biol Lett 7(2):205–209
Nipperess DA, Matsen FA IV (2013) The mean and variance of phylogenetic diversity under rarefaction. Methods Ecol Evol 4:566–572
Patil GP, Taillie C (1979) A study on diversity profiles and orderings for a bird community in the vicinity of Colstrip, Montana. In: Patil GP, Rosenzweig M (eds) Contemporary quantitative ecology and related econometrics. International Cooperative Publishing House, Fairland, pp 23–48
Patil GP, Taillie C (1982) Diversity as a concept and its measurement. J Am Stat Assoc 77:548–561
Pavoine S, Love MS, Bonsall MB (2009) Hierarchical partitioning of evolutionary and ecological patterns in the organization of phylogenetically structured species assemblages: application to rockfish (genus: Sebastes) in the Southern California Bight. Ecol Lett 12(9):898–908
Pavoine S, Baguette M, Bonsall MB (2010) Decomposition of trait diversity among the nodes of a phylogenetic tree. Ecol Monogr 80(3):485–507
Peet RK (1974) The measurement of species diversity. Annu Rev Ecol Syst 5:285–307
Petchey OL, Gaston KJ (2002) Functional diversity (FD), species richness and community composition. Ecol Lett 5(3):402–411
Pielou EC (1975) Ecological diversity. Wiley, New York
Purvis A, Hector A (2000) Getting the measure of biodiversity. Nature 405(6783):212–219
Rao CR (1982) Diversity and dissimilarity coefficients: a unified approach. Theor Popul Biol 21(1):24–43
Rényi A (1961) On measures of entropy and information. In: Neyman J (ed) Proceedings of the fourth Berkeley symposium on mathematical statistics and probability, vol 1. University of California Press, Berkeley, pp 547–561
Ricotta C, Szeidl L (2009) Diversity partitioning of Rao’s quadratic entropy. Theor Popul Biol 76(4):299–302
Routledge R (1979) Diversity indices: which ones are admissible? J Theor Biol 76(4):503–515
Tilman D (2001) Functional diversity. In: Levin SA (ed) Encyclopedia of biodiversity, vol 3. Academic Press, San Diego, pp 109–120
Tsallis C (1988) Possible generalization of Boltzmann-Gibbs statistics. J Stat Phys 52(1):479–487
Vellend M, Cornwell WK, Magnuson-Ford K, Mooers A (2011) Measuring phylogenetic biodiversity. In: Magurran AE, McGill BJ (eds) Biological diversity: frontiers in measurement and assessment. Oxford University Press, Oxford, pp 194–207
Warwick R, Clarke K (1995) New ‘biodiversity’ measures reveal a decrease in taxonomic distinctness with increasing stress. Mar Ecol Prog Ser 129(1):301–305
Webb CO (2000) Exploring the phylogenetic structure of ecological communities: an example for rain forest trees. Am Nat 156:145–155
Webb CO, Ackerly DD, McPeek MA, Donoghue MJ (2002) Phylogenies and community ecology. Annu Rev Ecol Syst 33:475–505
Webb CO, Losos JB, Agrawal AA (2006) Integrating phylogenies into community ecology. Ecology 87(sp7):S1–S2
Weiher E (2011) A primer of trait and functional diversity. In: Magurran AE, McGill BJ (eds) Biological diversity: frontiers in measurement and assessment. Oxford University Press, Oxford, pp 175–193
Whittaker RH (1972) Evolution and measurement of species diversity. Taxon 21:213–251
Acknowledgements
The authors thank Roseli Pellens, Samuel Scheiner and an anonymous reviewer for helpful comments and suggestions. This work was supported by the Taiwan National Science Council under Contracts 1002118M007006 and 1032628M007007 for AC. CC is supported by a postdoctoral fellowship from National Tsing Hua University, Taiwan. LJ was supported by a donation from John V. Moore to the Population Biology Foundation. All numerical results presented in section “An example” of this chapter were obtained by expanding the R scripts provided in Pavoine et al. (2009, their Appendix S1) to include Hill numbers and our phylogenetic measures.
Open Access This chapter is distributed under the terms of the Creative Commons Attribution-Noncommercial 2.5 License (http://creativecommons.org/licenses/by-nc/2.5/) which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
The images or other third party material in this chapter are included in the work’s Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work’s Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.
Copyright information
© 2016 The Author(s)
About this chapter
Cite this chapter
Chao, A., Chiu, CH., Jost, L. (2016). Phylogenetic Diversity Measures and Their Decomposition: A Framework Based on Hill Numbers. In: Pellens, R., Grandcolas, P. (eds) Biodiversity Conservation and Phylogenetic Systematics. Topics in Biodiversity and Conservation, vol 14. Springer, Cham. https://doi.org/10.1007/978-3-319-22461-9_8
DOI: https://doi.org/10.1007/978-3-319-22461-9_8
Publisher Name: Springer, Cham
Print ISBN: 9783319224602
Online ISBN: 9783319224619
eBook Packages: Biomedical and Life Sciences (R0)