# The Rarefaction of Phylogenetic Diversity: Formulation, Extension and Application

## Abstract

Like other measures of diversity, Phylogenetic Diversity (PD) increases monotonically and asymptotically with increasing sample size. This relationship can be described by a rarefaction curve tracing the expected PD for a given number of accumulation units. Accumulation units represent individual organisms, collections of organisms (e.g. sites), or even species (or equivalent), giving individual-based, sample-based and species-based curves respectively. The formulation for the exact analytical solution for the rarefaction of PD is given in an expanded form to demonstrate congruence with the classic formulation for the rarefaction of species richness. Rarefaction is commonly applied as a standardisation for diversity values derived from differing numbers of sampling units. However, the solution can be simply extended to create measures of phylogenetic evenness, phylogenetic beta-diversity and phylogenetic dispersion, derived from individual-based, sample-based and species-based curves respectively. This extension, termed ∆PD, is simply the initial slope of the rarefaction curve and is related to entropy measures such as PIE (Probability of Interspecific Encounter) and Gini-Simpson entropy. The application of rarefaction of PD to sample standardisation and measurement of phylogenetic evenness, phylogenetic beta-diversity and phylogenetic dispersion is demonstrated. Future prospects for PD rarefaction include the recognition of evolutionary hotspots (independent of species richness), the basis for ecological theory such as phylogeny-area relationships, and the prediction of unseen biodiversity.

### Keywords

Alpha diversity Beta diversity Evenness Phylogeny Sampling curves## Introduction

Phylogenetic Diversity (PD) is a simple, intuitive and effective measure of biodiversity. The PD of a set of taxa, represented as the tips of a phylogenetic tree, is the sum of the branch lengths connecting those taxa (Faith 1992). PD is a particularly flexible measure because it can be applied to any set of relationships among entities that can be reasonably portrayed as a tree. Thus, the tips do not, by necessity, need to represent species but could be higher taxa, Operational Taxonomic Units, Evolutionarily Significant Units, individual organisms or unique haplotypes. Further, the tree itself might not portray evolutionary relationships but instead be, for example, a cluster dendrogram portraying functional relationships among taxa (Petchey and Gaston 2002).

Since the original formulation by Faith (1992), PD has come to be not just a single measure equating to a phylogenetically weighted form of richness, but rather a general class of measures dealing with various aspects of alpha and beta-diversity (Faith 2013). The common feature of this class of measures is the summation of branch lengths rather than the counting of tips. By substituting branch segments (intervals between nodes on a phylogenetic tree) for species, and including a weighting for the length of that segment, it is possible to modify many of the classic measures of Species Diversity (SD) to a PD equivalent (Faith 2013). By this means, phylogenetically weighted measures of endemism (Faith et al. 2004; Rosauer et al. 2009), ecological resemblance (Ferrier et al. 2007; Nipperess et al. 2010), and entropy (Chao et al. 2010, and chapter “Phylogenetic Diversity Measures and Their Decomposition: A Framework Based on Hill Numbers”) have been developed, for example.

Beside the units by which sampling effort is measured, Gotelli and Colwell (2001) distinguished between “accumulation curves” and “rarefaction curves”, based on the process by which the sampling curve is calculated. An accumulation curve plots a single ordering of individuals or samples (or species) against a cumulatively calculated concave diversity measure. The jagged shape of the resulting curve is highly dependent on the, often arbitrary, order of the accumulation units. To resolve this problem, rarefaction curves instead plot the *expected* value of the diversity measure against the corresponding number of accumulation units. Rarefaction can be achieved using an algorithmic procedure of repeated random sub-sampling of the full set of accumulation units and calculating the mean diversity (Gotelli and Colwell 2001). However, Hurlbert (1971) and Simberloff (1972) showed that expected diversity can be calculated using an exact analytical solution, obviating the need for computer-intensive repeated sub-sampling. Initially, this solution was for individuals-based rarefaction curves, but it has since been shown that the same solution applies to sample-based rarefaction (Kobayashi 1974; Ugland et al. 2003; Mao et al. 2005; Chiarucci et al. 2008).

The original purpose of rarefaction was to allow the comparison of datasets with differing amounts of sampling effort (Sanders 1968). Assemblages can be compared “fairly” when rarefied to the same number of accumulation units (Gotelli and Colwell 2001). However, rarefaction has broader application than this single purpose. Depending of the unit of accumulation, the shape of the rarefaction curve provides information on ecological evenness (Olszewski 2004) and beta-diversity (Crist and Veech 2006). Rarefaction of species richness also forms the basis of estimators of species richness, including unseen species (Colwell and Coddington 1994). In the case of PD, species-based rarefaction curves also allow for a measure of phylogenetic dispersion (Webb et al. 2002), effectively the expected PD for some given number of species (Nipperess and Matsen 2013). A solution for the rarefaction of PD is therefore desirable as it will allow for these applications to be realised for phylogenetically explicit datasets.

Rarefaction of Phylogenetic Diversity, using an algorithmic solution of repeated sub-sampling, has now been done several times (see for example Lozupone and Knight 2008; Turnbaugh et al. 2009; Yu et al. 2012). However, an analytical solution for PD rarefaction, similar to that determined by Hurlbert (1971) for species richness, is preferable both because its results are exact (not dependent on the number of repeated subsamples) and substantially more computationally efficient. Nipperess and Matsen (2013) recently published just such a solution for both the mean and variance of PD under rarefaction. This solution is quite general, being applicable to rooted and unrooted trees, and even allowing partition of the tree into smaller components than the individual branch segments. As a result, the solution is given in a very generalised form and its relationship with classic rarefaction formula for species richness is not immediately clear.

In this chapter, I provide a detailed formulation for the exact analytical solution for expected (mean) Phylogenetic Diversity for a given amount of sampling effort. This formulation is for the specific but common case of a rooted phylogenetic tree where whole branch segments are selected under rarefaction. I use the same form of expression as used by Hurlbert (1971) to demonstrate the direct relationship between rarefaction of PD and rarefaction of species richness. I do not include a solution for variance of PD under rarefaction due to its complexity when given in this form and instead refer the reader to Nipperess and Matsen (2013). I extend this framework to show how the initial slope of the rarefaction curve (∆*PD*) can be used as a flexible measure of phylogenetic evenness, phylogenetic beta-diversity or phylogenetic dispersion, depending on the unit of accumulation. I apply PD rarefaction and the derived ∆PD measure to real ecological datasets to demonstrate its usefulness in addressing ecological questions. Finally, I discuss some future directions for the extension and application of PD rarefaction.

## Formulation

*S*) for a given amount of sampling is simply the sum of probabilities (

*p*) of each species occurring in a subset of

*m*accumulation units (Eq. 1).

*p*) of each species being selected by a random draw of

*m*accumulation units from the total set of

*N*units. Regardless of whether the accumulation unit is an individual or a sample, this probability is a function of the frequency (

*n*) with which species

*i*occurs across the set of

*N*accumulation units (Chiarucci et al. 2008). Since

*N*is a set of finite size, random draws from that set should be without replacement and thus

*p*is defined by the hypergeometric distribution (Hurlbert 1971). Substituting into Eq. 1, the expected species richness is as follows (Eq. 2).

*p*in Eq. 1. Note that the expressions in curved brackets are binomial coefficients and not simple fractions, while the quantity subtracted from one within the square brackets is a fraction. The denominator in this fraction gives the number of distinct subsets of size

*m*that can be drawn from the total set of

*N*units. The numerator gives the number of distinct subsets of size

*m*that do not contain species

*i*. Equation 2 is the same as that originally proposed by Hurlbert (1971).

*S*species, there is a corresponding set of

*T*branch segments. Each branch segment (

*j*) has a length (

*L*) measured as sequence substitutions, millions of years, or some other biologically meaningful estimate of difference. Considering only rooted phylogenetic trees, PD is calculated as follows (Eq. 3).

*rooted PD*value of Pardi and Goldman (2007). The alternative, called

*unrooted PD*by Pardi and Goldman (2007), includes only the branch segments connecting a subset of species to their common ancestor, and thus a subset containing only a single species would have zero PD. The former definition, rooted PD, is adopted here because it allows for the straight-forward formulation of a whole class of derived PD measures (Faith 2013), and because it is concordant with the original idea of PD acting as a surrogate for the feature diversity of a set (Faith 1992; Faith et al. 2009). Obviously, rooted PD requires a rooted phylogenetic tree, even if the choice of root is arbitrary (Nipperess and Matsen 2013).

*m*accumulation units (Fig. 2). This is achieved by extending the classic rarefaction formula through a substitution of species for branch segments in a phylogenetic tree. Since PD is simply the sum of branch lengths, then the expected PD must also be the sum of branch lengths, each weighted by the probability (

*q*) of its occurrence in a subset of size

*m*(O’Dwyer et al. 2012). So, for a rooted phylogenetic tree represented as a set of

*T*branch segments, the expected PD is given as follows (Eq. 4).

*o*) depends on the frequency of occurrence of species that are descendent from that branch segment. Let

*x*be a binary value indicating whether species

*i*is (1) or is not (0) a descendant of branch segment

*j*. Multiplying

*x*by

*n*and summing across all species will give the total number of occurrences of branch segment

*j*among

*N*accumulation units (Eq. 5).

*expected*

*PD*of Faith (2013) but differs in that random draws are without replacement following the hypergeometric distribution.

*species*. A species, in this context, is simply a collection of individuals in much the same way as a sample is a collection of individuals, and the same equations apply. Under these circumstances,

*o*

_{j}is equal to the sum of

*x*

_{ij}(over all species) as

*n*

_{i}will always equal 1, and

*N*is equal to

*S*. Substituting into Eq. 6 gives the following formula for rarefaction by species (Eq. 7).

## Extension

*m*= 2) and the expected species richness for one individual (

*m*= 1), and is the probability that the second individual will be a different species from the first (Olszewski 2004). The PIE index is directly related to the Gini-Simpson index – the probability that two individuals selected at random will be different species. The difference between these two indices is in the form of random sampling – Gini-Simpson samples with replacement (thus assuming infinite population size) while PIE, just like rarefaction, samples without replacement. Following Olszewski (2004), PIE can be expressed as the following (Eq. 8) where

*E*[

*S*

_{1}] and

*E*[

*S*

_{2}] refer to the expected species richness of one and two randomly drawn individuals respectively. Note that

*E*[

*S*

_{1}] always equals one in this case.

*E*[

*S*

_{1}] and

*E*[

*S*

_{2}] is the expected number of species in the second sample that are not found in the first. Thus, the PIE index can be used to measure beta-diversity if applied to sample-based rarefaction. This interpretation is directly related to the additive partitioning of species diversity into alpha and beta components where alpha-diversity is the mean (expected) richness of a single sample and beta-diversity is the gain in species richness from a single sample to a larger set of samples and can be read directly from a rarefaction curve (Crist and Veech 2006).

*PD*) when adding a second accumulation unit to the first. Further, because PD rarefaction curves can also meaningfully use species as accumulation units, we can extend this idea to include a measure of phylogenetic dispersion where the gain in PD is the expected branch length in the lineage (path from tip to root) of a second randomly selected species that is not shared with the first. Thus, we can define a general measure (∆

*PD*) for phylogenetic evenness, phylogenetic beta-diversity or phylogenetic dispersion, depending on the accumulation units chosen (Eq. 9, see also Fig. 1). ∆

*PD*is very similar to the ∆

*PDq*measure of Faith (2013) although in that case, probabilities are not derived from the hypergeometric distribution. Further, ∆

*PDq*is specifically applied to the problem of estimating loss of PD from extinction – a problem that is mathematically similar to rarefaction.

*E*[

*PD*

_{1}] is the lineage length. When rarefaction is by units of samples,

*E*[

*PD*

_{1}] will equal the average PD of a sample and will be equal to ∆

*PD*in the extreme case where each sample shares no branch length with any other sample. Thus, whether referring to units of individuals, species or samples,

*E*[

*PD*

_{1}] represents the theoretical maximum of ∆

*PD*and can be used to standardise the measure as follows.

## Application

The following is a demonstration of the application of PD rarefaction, and the derived ∆PD statistics, to real ecological datasets. These applications are not intended to provide definitive answers to ecologically important questions but are, rather, simple demonstrations of how PD rarefaction can allow new analyses to be undertaken and, hopefully, new insights gained.

In all these applications, I have used published data on mammals. This is principally for convenience as mammals (Bininda-Emonds et al. 2007) and birds (Jetz et al. 2012) are the only major taxonomic groups for which comprehensive species-level supertrees are available. I have used an updated version of the mammal supertree of Bininda-Emonds et al. (2007) published as supplementary material by Fritz et al. (2009). In this supertree, all branch lengths are measured in units of time (millions of years between branching events), allowing for a straight-forward interpretation of PD as cumulative evolutionary history (Proches et al. 2006).

All analyses were conducted using the statistical software, *R* version 2.15.2 (R Core Team 2012). Phylogenetic information was processed using the *ape* package in R (Paradis et al. 2004). PD rarefaction analyses used the *phylodiv*, *phylocurve* and *phylorare* functions, written by the author and available from: http://davidnipperess.blogspot.com.au.

### Standardisation of Sampling

The most commonly used application for rarefaction is standardisation to allow comparisons to be made between datasets with differing amounts of sampling effort. Standardisation can be achieved by rarefying all datasets back to a common (typically the minimum) number of accumulation units (Sanders 1968; Gotelli and Colwell 2001).

Law et al. (1998) surveyed bats in ten State Forests of the south-west slopes region of New South Wales, Australia. Survey methods were a combination of ultra-sonic detectors, harp-traps, mist-nets and trip-lines. For the purposes of this demonstration, only data from the harp-traps will be used. A harp-trap is a rectangular frame, stringed vertically with nylon line, placed so as to intercept the flight path of low-flying bats (Tidemann and Woodside 1978). A bat striking the nylon lines of the trap will tumble down into a collecting bag at the bottom.

Comparison of diversity measures for bat assemblages for ten state forests of the south-west slopes region of New South Wales, Australia

State forest | Individuals ( | Species richness ( | Phylogenetic diversity ( | Standardised phylogenetic diversity ( |
---|---|---|---|---|

Bago | 99 | 6 | 159 | 132 |

Maragle | 208 | 8 | 170 | 136 |

Buccleuch | 100 | 7 | 211 | 150 |

Bungongo | 70 | 6 | 221 | 140 |

Woomargama | 121 | 7 | 198 | 133 |

Carabost | 153 | 8 | 198 | 155 |

Murraguldrie | 95 | 6 | 214 | 133 |

Ellerslie | 46 | 4 | 134 | 105 |

Tumblong | 77 | 7 | 188 | 151 |

Minjary | 15 | 3 | 113 | 113 |

### Phylogenetic Evenness

The extension of PD rarefaction to ∆PD allows for the measurement of phylogenetic evenness, which is essentially a measure of the distribution of individuals among branches in a phylogenetic tree (Webb and Pitman 2002). A phylogenetically even community is one where the most evolutionarily distinct species are also the most abundant. Because ∆PD will increase with both increasing phylogenetic evenness and phylogenetic diversity, it is more correctly a measure of entropy (Jost 2006), directly comparable to the PIE and Gini-Simpson indices. It has a particularly close relationship with the quadratic entropy measure of Rao (1982). Rao’s quadratic entropy measures the average distance between individuals in an assemblage. When that distance is measured as patristic distance (path length on a phylogenetic tree), ∆PD will be approximately half of Rao’s quadratic entropy. ∆PD is also similar in intent, but not in form, to the phylogenetic entropy index of Allen et al. (2009).

Comparison of diversity measures for bat assemblages from four habitats along a disturbance gradient in the Selva Lacandona, Chiapas, Mexico

Habitat | Individuals | Species richness | PIE | Phylogenetic diversity | Phylogenetic evenness (∆PD) | Phylogenetic component (∆PD/PIE) |
---|---|---|---|---|---|---|

Cornfield | 572 | 17 | 0.786 | 295 | 17.2 | 21.8 |

Oldfield | 690 | 20 | 0.809 | 469 | 18.1 | 22.4 |

Cacao | 699 | 21 | 0.851 | 493 | 18.2 | 21.3 |

Forest | 444 | 27 | 0.884 | 609 | 20.4 | 23.0 |

The trend in phylogenetic evenness may simply be reflecting the abundance distribution among species. To determine the phylogenetic contribution to phylogenetic evenness, ∆PD was divided by the PIE index (Table 2). Since PIE is the probability that the second randomly selected individual is a different species to the first, we can divide ∆PD by PIE to get the expected branch length of that species (conditional on the second individual being a different species). This value is related to phylogenetic dispersion (∆PD from a species-based rarefaction curve) but differs due to the conditional probability structure, and effectively measures the pure phylogenetic contribution to ∆PD independent of the abundance distributions among species. We see, in this case, that the phylogenetic component generally decreases with increasing disturbance (Cacao being the exception), supporting the notion that disturbance favours more closely related species.

### Phylogenetic Beta-Diversity

Phylogenetic beta-diversity is effectively the turnover of branch lengths between samples in space and/or time. Like its species-level equivalent, phylogenetic beta-diversity can be measured on a pair-wise basis (Lozupone and Knight 2005; Bryant et al. 2008; Nipperess et al. 2010) or as a single value for a set of samples (Anderson et al. 2010). Rarefaction of PD provides a means for deriving a single value of beta-diversity for a set of samples of any size via the ∆PD measure, which is a phylogenetic analogue of the additive partitioning approach of Crist and Veech (2006).

Comparison of diversity measures for small mammal assemblages of sites in the Tanami Desert and Uluru-Kata Tjuta National Park, Northern Territory, Australia

Region | No. of sites | Species richness (alpha) | Species richness (gamma) | Species beta diversity (additive) | Phylogenetic beta diversity (∆PD) |
---|---|---|---|---|---|

Tanami | 15 | 3.13 | 14 | 10.87 | 59.92 |

Uluru | 13 | 6.54 | 13 | 6.46 | 22.54 |

### Phylogenetic Dispersion

Phylogenetic dispersion is a measure of the average phylogenetic distance among species (or tips) (Webb et al. 2002) and is in effect a measure of tree shape (Davies and Buckley 2012). ∆PD provides a simple, intuitive measure of dispersion as the expected gain in PD of adding a second randomly selected species to the first. It can also be seen as a means of correcting for variation in species richness among samples, as it is well known that PD increases with species richness (Rodrigues and Gaston 2002).

I generated PD rarefaction curves and ∆PD values for the mammal faunas of 71 of the 79 terrestrial ecoregions recognised by Olson et al. (2001) as constituting the Australasian biogeographic realm. Data were sourced from the wildfinder database (http://worldwildlife.org/pages/wildfinder) of the World Wildlife Fund. Eight ecoregions were excluded from the analysis because they had less than two species and thus a ∆PD value could not be calculated.

Comparison of diversity measures for mammal assemblages of selected ecoregions of the Australasian biogeographic realm

Ecoregion | Species richness | Phylogenetic diversity (Ma) | Phylogenetic dispersion (∆PD) |
---|---|---|---|

New Caledonia dry forests | 7 | 347 | 51.3 |

Central range montane rainforests | 109 | 2768 | 103.6 |

Mount Lofty woodlands | 34 | 1504 | 110.7 |

## Future Directions

As demonstrated here, rarefaction of PD has a straightforward application in standardising PD across samples so that they can be compared directly. Further, depending on the accumulation unit, the rarefaction formula can be extended to the calculation of metrics of phylogenetic evenness, phylogenetic beta-diversity and phylogenetic dispersion. However, the application of the PD rarefaction formula and its extension to other metrics is still very much in its infancy. Here I will outline some future directions for PD rarefaction.

Rarefaction by units of species allows for the comparison of locations while controlling for variation in species richness. This can easily be done by either rarefying all locations to a given number of species (Nipperess and Matsen 2013) or via ∆PD as demonstrated here. This kind of correction has previously been done by including species richness as an explanatory variable in a statistical model and taking the residuals (Davies et al. 2008) or by comparison to a null model derived by repeated subsampling (Davies et al. 2007). The latter method is often used as a statistical test of phylogenetic dispersion (also known as phylogenetic structure) where random draws are taken from a species pool, representing a null community assembly process (Webb 2000). Such methods are no longer necessary as the exact relationship between species richness and PD is described by the rarefaction curve (Nipperess and Matsen 2013). Further, the exact analytical solution is computationally efficient, allowing for practical application to very large datasets.

By removing the effect of species richness, we can identify “evolutionary hotspots” with higher than expected phylogenetic diversity (Davies et al. 2008; Nipperess and Matsen 2013) on a regional or global scale. We can then use the standardised PD values (called relative PD by Davies et al. 2007) to explore the environmental, ecological and historical processes that lead to the observed patterns of high or low phylogenetic dispersion (Kooyman et al. 2013). Ultimately, we may be able to develop the theory to predict these patterns (Davies et al. 2007), in a similar vein to what has been done for species richness (Arrhenius 1921; MacArthur and Wilson 1963; Rosindell et al. 2011). For example, the relationship of species richness with area is well known but the phylogeny-area relationship has only recently begun to be explored (Morlon et al. 2011). Rarefaction curves have an obvious connection to species-area curves (Olszewski 2004) and thus the development of PD rarefaction may well improve understanding of the phylogeny-area relationship. In particular, species-based rarefaction of PD allows for the separation of species diversity effects from those purely explained by phylogeny.

It is possible to predict how much Phylogenetic Diversity is yet to be sampled from the observed rarefaction curve. Rarefaction is the basis of several species diversity estimators, which attempt to calculate total diversity (including unseen species) for a set of individuals or samples by effectively extending the curve beyond the observed sampling depth (Colwell and Coddington 1994). It follows that a useful extension of PD rarefaction would be a PD estimator that predicts unseen branch length, given the observed rate of accumulation of PD. It is important to note that PD rarefaction calculates the expected branch length gained by adding additional accumulation units but does not predict where on the tree these branches will come from. Similarly, a biodiversity estimator based on PD rarefaction may be able to predict the amount of PD not yet sampled but would not be able to predict where these unseen branches would be added to an existing tree. This would be, nevertheless, an exciting development.

It has recently been proposed that the standardisation of samples for species diversity should not be done by rarefaction to the same size (i.e. no. of individuals), but rather by sample completeness (Alroy 2010; Jost 2010; Chao and Jost 2012). Completeness, when measured by a statistic known as coverage (Good 1953), is the proportion of individuals in a community that are represented by species in a sample from that community (Chao and Jost 2012). When samples differ in their coverage, they should be standardised to equal coverage before a “fair” comparison can be made. Much like expected species richness, the coverage of a sample can be estimated from the sample size and the distribution of individuals among the species in the sample (Chao and Jost 2012). Given that standardisation by sample completeness has been shown to yield a less biased comparison of species richness between communities (Chao and Jost 2012), it would be desirable to have a similar method of standardisation for PD. Since rarefaction of coverage is mathematically related to rarefaction of sample size, the recent work on estimating PD from sample size will no doubt form the basis from which estimated PD for sample coverage will be developed.

Finally, a general issue when considering any PD measure is uncertainty regarding the length of branches and the topology (branching pattern) of the tree. All PD measures (including those presented here) assume that the branch lengths and their arrangement in the tree are perfectly known. This is obviously an abstraction, although PD can be surprisingly robust to this source of variation (Swenson 2009). One solution to this dilemma is to calculate PD, including rarefied PD, for a large number of possible trees and report the mean and confidence limits. The output from a Bayesian phylogenetic analysis is a large number of trees, each with their own topology and corresponding branch lengths (see for example Jetz et al. 2012) and so lends itself well to this approach. However, when the possible trees number in the thousands and tens of thousands, this is obviously computationally intensive. An analytical solution, directly incorporating uncertainty into the calculation, would therefore be desirable. This is not an easy extension of the PD rarefaction solution because both variation in branch length and topology (affecting the probability of encountering internal branches) would need to be taken into account. It is worth remembering that phylogenetic relationships are not the only source of uncertainty when investigating real ecological communities – neither the abundance, nor even the presence (occupancy), of species are necessarily known with precision.

## Conclusion

The formulation for the rarefaction of Phylogenetic Diversity (PD) is given in expanded form to show its simplicity and its connection to the classic formula for the rarefaction of species richness (Hurlbert 1971; Simberloff 1972). The method is exact and efficient and should be preferred over the algorithmic (Monte Carlo) solution involving repeated random sub-sampling. Further, the extension to the calculation of ∆PD provides a flexible and general framework for the measurement of biodiversity as phylogenetic evenness, phylogenetic beta-diversity or phylogenetic dispersion. The applications of PD rarefaction and ∆PD presented here are hopefully useful in improving understanding of the importance of rarefaction in ecology and in guiding future applications of the method. There are, I believe, exciting prospects for PD rarefaction in the future, including as a general method for standardising PD by removing variation with species richness, and for predicting unseen (i.e. un-sampled) PD. The recent availability of comprehensive phylogenies (Bininda-Emonds et al. 2007; Jetz et al. 2012) and rich data on species occurrences (Flemons et al. 2007), coupled with analytical advances such as PD rarefaction, allows us to better understand the distribution of Phylogenetic Diversity on the surface of the Earth and the processes giving rise to that distribution. This is valuable for its own sake but will also inform efforts to conserve as much of the Tree of Life as possible in the face of future extinctions (Rosauer and Mooers 2013).

## Notes

### Acknowledgements

I would like to thank Frederick Matsen (Fred Hutchinson Cancer Research Center), Daniel Faith (Australian Museum), Peter Wilson (Macquarie University), Anne Chao (National Tsing Hua University) and Robert Colwell (University of Connecticut) for their input regarding the rarefaction of PD and what it all might mean. Daniel Miranda-Esquivel (Universidad Industrial de Santander) and Pedro Cardoso (University of Helsinki) provided helpful feedback on an earlier version of this chapter. This research was supported by the Australian Research Council (DP0665761 and DP1095200).

### References

- Allen B, Kon M, Bar-Yam Y (2009) A new phylogenetic diversity measure generalizing the Shannon index and its application to phyllostomid bats. Am Nat 174:236–243CrossRefPubMedGoogle Scholar
- Alroy J (2010) The shifting balance of diversity among major marine animal groups. Science 329:1191–1194CrossRefPubMedGoogle Scholar
- Anderson MJ, Crist TO, Chase JM et al (2010) Navigating the multiple meanings of β diversity: a roadmap for the practicing ecologist. Ecol Lett 14:19–28CrossRefPubMedGoogle Scholar
- Arrhenius O (1921) Species and area. J Ecol 9:95–99CrossRefGoogle Scholar
- Bininda-Emonds ORP, Cardillo M, Jones KE et al (2007) The delayed rise of present-day mammals. Nature 446:507–512CrossRefPubMedGoogle Scholar
- Bryant JA, Lamanna C, Morlon H et al (2008) Microbes on mountainsides: contrasting elevational patterns of bacterial and plant diversity. Proc Natl Acad Sci 105:11505–11511CrossRefPubMedPubMedCentralGoogle Scholar
- Chao A, Jost L (2012) Coverage-based rarefaction and extrapolation: standardizing samples by completeness rather than size. Ecology 93:2533–2547CrossRefPubMedGoogle Scholar
- Chao A, Chiu CH, Jost L (2010) Phylogenetic diversity measures based on Hill numbers. Phil Trans R Soc B 365:3599–3609CrossRefPubMedPubMedCentralGoogle Scholar
- Chiarucci A, Bacaro G, Rocchini D, Fattorini L (2008) Discovering and rediscovering the sample-based rarefaction formula in the ecological literature. Community Ecol 9:121–123CrossRefGoogle Scholar
- Colwell RK, Coddington J (1994) Estimating terrestrial biodiversity through extrapolation. Phil Trans R Soc B 345:101–118CrossRefPubMedGoogle Scholar
- Crist TO, Veech JA (2006) Additive partitioning of rarefaction curves and species-area relationships: unifying alpha, beta and gamma-diversity with sample size and habitat area. Ecol Lett 9:923–932CrossRefPubMedGoogle Scholar
- Davies TJ, Buckley LB (2012) Exploring the phylogenetic history of mammal species richness. Glob Ecol Biogeogr 21:1096–1105CrossRefGoogle Scholar
- Davies RG, Orme CDL, Webster AJ et al (2007) Environmental predictors of global parrot (Aves: Psittaciformes) species richness and phylogenetic diversity. Glob Ecol Biogeogr 16:220–233CrossRefGoogle Scholar
- Davies TJ, Fritz SA, Grenyer R et al (2008) Phylogenetic trees and the future of mammalian biodiversity. Proc Natl Acad Sci 105(Suppl 1):11556–11563CrossRefPubMedPubMedCentralGoogle Scholar
- Faith DP (1992) Conservation evaluation and phylogenetic diversity. Biol Conserv 61:1–10CrossRefGoogle Scholar
- Faith DP (2013) Biodiversity and evolutionary history: useful extensions of the PD phylogenetic diversity assessment framework. Ann N Y Acad Sci 1289:69–89CrossRefPubMedGoogle Scholar
- Faith DP, Reid CAM, Hunter J (2004) Integrating phylogenetic diversity, complementarity, and endemism for conservation assessment. Conserv Biol 18:255–261CrossRefGoogle Scholar
- Faith DP, Lozupone CA, Nipperess DA, Knight R (2009) The cladistic basis for the phylogenetic diversity (PD) measure links evolutionary features to environmental gradients and supports broad applications of microbial ecology’s ‘Phylogenetic Beta Diversity’ framework. Int J Mol Sci 10:4723–4741CrossRefPubMedPubMedCentralGoogle Scholar
- Ferrier S, Manion G, Elith J, Richardson K (2007) Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Divers Distrib 13:252–264CrossRefGoogle Scholar
- Flemons P, Guralnick R, Krieger J et al (2007) A web-based GIS tool for exploring the world’s biodiversity: The Global Biodiversity Information Facility Mapping and Analysis Portal Application (GBIF-MAPA). Ecol Informa 2:49–60CrossRefGoogle Scholar
- Fritz SA, Bininda-Emonds ORP, Purvis A (2009) Geographical variation in predictors of mammalian extinction risk: big is bad, but only in the tropics. Ecol Lett 12:538–549CrossRefPubMedGoogle Scholar
- Good IJ (1953) The population frequencies of species and the estimation of population parameters. Biometrika 40:237–264CrossRefGoogle Scholar
- Gotelli NJ, Colwell RK (2001) Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecol Lett 4:379–391CrossRefGoogle Scholar
- Helmus MR, Keller W, Paterson MJ et al (2010) Communities contain closely related species during ecosystem disturbance. Ecol Lett 13:162–174CrossRefPubMedGoogle Scholar
- Hurlbert S (1971) The nonconcept of species diversity: a critique and alternative parameters. Ecology 52:577–586CrossRefGoogle Scholar
- Jetz W, Thomas GH, Joy JB et al (2012) The global diversity of birds in space and time. Nature 491:444–448CrossRefPubMedGoogle Scholar
- Jost L (2006) Entropy and diversity. Oikos 113:363–375CrossRefGoogle Scholar
- Jost L (2010) The relation between evenness and diversity. Diversity 2:207–232CrossRefGoogle Scholar
- Kobayashi S (1974) The species-area relation I. A model for discrete sampling. Res Popul Ecol 15:223–237CrossRefGoogle Scholar
- Kooyman RM, Rossetto M, Sauquet H, Laffan SW (2013) Landscape patterns in rainforest phylogenetic signal: isolated islands of refugia or structured continental distributions? PLoS One 8:e80685CrossRefPubMedPubMedCentralGoogle Scholar
- Lande R (1996) Statistics and partitioning of species diversity, and similarity among multiple communities. Oikos 76:5–13CrossRefGoogle Scholar
- Law B, Anderson J, Chidel M (1998) A bat survey in State Forests on the south-west slopes region of New South Wales with suggestions of improvements for future surveys. Aust Zool 30:467–479CrossRefGoogle Scholar
- Lozupone CA, Knight R (2005) UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol 71:8228–8235CrossRefPubMedPubMedCentralGoogle Scholar
- Lozupone CA, Knight R (2008) Species divergence and the measurement of microbial diversity. FEMS Microbiol Rev 32:557–578CrossRefPubMedPubMedCentralGoogle Scholar
- MacArthur RH, Wilson EO (1963) An equilibrium theory of insular zoogeography. Evolution 17:373–387CrossRefGoogle Scholar
- Mao C, Colwell RK, Chang J (2005) Estimating the species accumulation curve using mixtures. Biometrics 61:433–441CrossRefPubMedGoogle Scholar
- Medellin RA, Equihua M, Amin M (2000) Bat diversity and abundance as indicators of disturbance in neotropical rainforests. Conserv Biol 14:1666–1675CrossRefGoogle Scholar
- Morlon H, Schwilk DW, Bryant JA et al (2011) Spatial patterns of phylogenetic diversity. Ecol Lett 14:141–149CrossRefPubMedPubMedCentralGoogle Scholar
- Morton SR, Brown JH, Kelt DA, Reid JRW (1994) Comparisons of community structure among small mammals of North American and Australian deserts. Aust J Zool 42:501–525CrossRefGoogle Scholar
- Nipperess DA, Matsen FA IV (2013) The mean and variance of phylogenetic diversity under rarefaction. Methods Ecol Evol 4:566–572CrossRefPubMedPubMedCentralGoogle Scholar
- Nipperess DA, Faith DP, Barton K (2010) Resemblance in phylogenetic diversity among ecological assemblages. J Veg Sci 21:809–820CrossRefGoogle Scholar
- O’Dwyer JP, Kembel SW, Green JL (2012) Phylogenetic diversity theory sheds light on the structure of microbial communities. PLoS Comput Biol 8:e1002832CrossRefPubMedPubMedCentralGoogle Scholar
- Olson D, Dinerstein E, Wikramanayake E et al (2001) Terrestrial ecoregions of the world: a new map of life on Earth. Bioscience 51:933–938CrossRefGoogle Scholar
- Olszewski TD (2004) A unified mathematical framework for the measurement of richness and evenness within and among multiple communities. Oikos 104:377–387CrossRefGoogle Scholar
- Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290CrossRefPubMedGoogle Scholar
- Pardi F, Goldman N (2007) Resource-aware taxon selection for maximizing phylogenetic diversity. Syst Biol 56:431–444CrossRefPubMedGoogle Scholar
- Petchey OL, Gaston KJ (2002) Functional diversity(FD), species richness and community composition. Ecol Lett 5:402–411CrossRefGoogle Scholar
- Proches S, Wilson J, Cowling R (2006) How much evolutionary history in a 10 x 10 plot? Proc R Soc B 273:1143–1148CrossRefPubMedPubMedCentralGoogle Scholar
- R Core Team (2012) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/
- Rao CR (1982) Diversity and dissimilarity coefficients: a unified approach. Theor Popul Biol 21:24–43CrossRefGoogle Scholar
- Rodrigues ASL, Gaston KJ (2002) Maximising phylogenetic diversity in the selection of networks of conservation areas. Biol Conserv 105:103–111CrossRefGoogle Scholar
- Rosauer DF, Mooers AO (2013) Nurturing the use of evolutionary diversity in nature conservation. Trends Ecol Evol 28:322–323CrossRefPubMedGoogle Scholar
- Rosauer D, Laffan SW, Crisp MD et al (2009) Phylogenetic endemism: a new approach for identifying geographical concentrations of evolutionary history. Mol Ecol 18:4061–4072CrossRefPubMedGoogle Scholar
- Rosindell J, Hubbell SP, Etienne RS (2011) The unified neutral theory of biodiversity and biogeography at age ten. Trends Ecol Evol 26:340–348CrossRefPubMedGoogle Scholar
- Sanders HL (1968) Marine benthic diversity: a comparative study. Am Nat 102:243–282CrossRefGoogle Scholar
- Simberloff D (1972) Properties of the rarefaction diversity measurement. Am Nat 106:414–418CrossRefGoogle Scholar
- Swenson NG (2009) Phylogenetic resolution and quantifying the phylogenetic diversity and dispersion of communities. PLoS One 4:e4390CrossRefPubMedPubMedCentralGoogle Scholar
- Tidemann CR, Woodside DP (1978) A collapsible bat-trap and a comparison of results obtained with the trap and with mist-nets. Wildl Res 5:355–362CrossRefGoogle Scholar
- Turnbaugh PJ, Hamady M, Yatsunenko T et al (2009) A core gut microbiome in obese and lean twins. Nature 457:480–484CrossRefPubMedGoogle Scholar
- Ugland K, Gray JS, Ellingsen KE (2003) The species-accumulation curve and estimation of species richness. J Anim Ecol 72:888–897CrossRefGoogle Scholar
- Webb CO (2000) Exploring the phylogenetic structure of ecological communities: an example for rain forest trees. Am Nat 156:145–155CrossRefPubMedGoogle Scholar
- Webb CO, Pitman N (2002) Phylogenetic balance and ecological evenness. Syst Biol 51:898–907CrossRefPubMedGoogle Scholar
- Webb CO, Ackerly DD, McPeek M, Donoghue MJ (2002) Phylogenies and community ecology. Ann Rev Ecol Syst 33:475–505CrossRefGoogle Scholar
- Yu DW, Ji Y, Emerson BC, Wang X (2012) Biodiversity soup: metabarcoding of arthropods for rapid biodiversity assessment and biomonitoring. Methods Ecol Evol 3:613–623CrossRefGoogle Scholar

## Copyright information

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution-Noncommercial 2.5 License (http://creativecommons.org/licenses/by-nc/2.5/) which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

The images or other third party material in this chapter are included in the work’s Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work’s Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.