
Measures of Biological Diversity: Overview and Unified Framework

  • Vincenzo Crupi
Open Access
Chapter
Part of the History, Philosophy and Theory of the Life Sciences book series (HPTL, volume 24)

Abstract

A variety of statistical measures of diversity have been employed across biology and ecology, including Shannon entropy, the Gini-Simpson index, so-called effective numbers of species (aka Hill’s measures), and more besides. I will review several major options and then present a comprehensive formalism in which all these can be embedded as special cases, depending on the setting of two parameters, labelled degree and order. This mathematical framework is adapted from generalized information theory. A discussion of the theoretical meaning of the parameters in biological applications provides insight into the conceptual features and limitations of current approaches. The unified framework described also allows for the development of a tailored solution for the measurement of biological diversity that jointly satisfies otherwise divergent desiderata put forward in the literature.

Keywords

Diversity Richness Evenness Entropy Information theory 

Suppose that four different species, X, Y, W, and Z, are present in a given environment at a certain time, with exactly 50, 25, 15, and 10 organisms, respectively. At the same time in a different location (alternatively: at a later moment in the same area) the numbers are 40, 30, 30, and 0, respectively. In which of these two situations does one deal with a more diverse community?

This example, although drastically simplified, illustrates a rather general problem. With minor variations, X, Y, W, and Z might just as well be the firms operating in a sector of the economy (see, e.g., Chakravarty and Eichhorn 1991), the languages spoken in a region (see, e.g., Greenberg 1956), the types of television channels in a country (see, e.g., Aslama et al. 2004), or the parties in a political system (see, e.g., Golosov 2010) characterized by their shares of market, speakers, overall broadcasting, or parliamentary seats. In each of these domains, and still others, measuring diversity (or, conversely, concentration) has been a significant scientific issue. In biology and ecology, tracking diversity over space and time is of course a key topic for environmental concerns, but biological diversity also plays a relevant theoretical role for its connections with other variables of interest, such as stability, predation pressure, and so on.

In this chapter, I will review a variety of measures of biological diversity that have been employed and discussed across the scientific literature. Relying on a few intuitive illustrations, I will carry out an assessment of the strengths and limitations of some major options, including Shannon entropy, the Gini-Simpson index, so-called effective numbers of species, and more besides. I will also highlight the appeal of one specific measure, which may not have received adequate attention so far. Finally, I will describe a comprehensive formalism and point out how all diversity measures previously considered can be conveniently embedded in it as special cases, depending on the setting of two parameters, labelled order and degree. This unified mathematical framework is adapted from generalized information theory (Aczél 1984). As we will see, it provides insight into the conceptual features of current approaches and can allow for tailored technical solutions for the measurement of biological diversity.

6.1 Richness

For our current purposes, measuring diversity has to do with how a given quantity is distributed among some well-defined categories.¹ In biological applications, it is typical (although by no means necessary) that such categories amount to distinct species characterized by their relative abundance. The latter quantity, in turn, is often simply measured by the proportion of organisms of that species in the overall target population (but biomass can be employed as well, for instance). In what follows, we will denote diversity as D(p1, …, pn), where n species (s1, …, sn) are involved and pi is the relative abundance of the i-th species, si. With this bit of formal notation, we can thus represent our initial example with categories X, Y, W, and Z as concerning the comparison between D(0.50, 0.25, 0.15, 0.10) and D(0.40, 0.30, 0.30, 0). Note that, as a direct consequence of our definition, p1, …, pn will always be non-negative numbers summing to 1, so that (p1, …, pn) actually represents a probability distribution. In our canonical interpretation, pi equals the probability that a randomly selected individual from the target population belongs to species si.
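As a small illustration of this notation, the following Python sketch turns the raw counts of the opening example into relative abundances. The helper name `relative_abundances` is hypothetical and introduced here only for illustration.

```python
def relative_abundances(counts):
    """Convert raw organism counts into relative abundances p_1, ..., p_n."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one organism is required")
    return [c / total for c in counts]


# Counts for species X, Y, W, Z in the two situations of the opening example.
first = relative_abundances([50, 25, 15, 10])
second = relative_abundances([40, 30, 30, 0])

print(first)
print(second)
```

Both lists sum to 1 (up to floating-point rounding), so each can be read as a probability distribution over species.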

The simplest way to measure diversity, and a useful starting point for discussion, is to just count out the number of species with non-zero relative abundance. This straightforward approach relies on what is usually labelled richness, namely, how many different species are represented in an environment. In our formalism, it can be computed as follows (with the convenient convention that 0^0 = 0):
$$ Richness\left({p}_1,\dots, {p}_n\right)=\sum \limits_{i=1}^n{p_i}^0 $$
As pi^0 = 1 whenever pi > 0, Richness always takes an integer value corresponding to how many ps are strictly positive, i.e., how many species are effectively instantiated by some organism. In order to satisfy the appealing constraint that diversity is null (rather than 1) in the extreme case when only one species is present, the following minor variation is sometimes employed (Patil and Taille 1982, p. 551):
$$ {Richness}^{\ast}\left({p}_1,\dots, {p}_n\right)=\sum \limits_{i=1}^n{p_i}^0-1 $$
As austere as it may seem as a measure of diversity, Richness yields “an intuitive property that is implicit in much biological reasoning about diversity” (Chao and Jost 2012, p. 204). A basic illustration of such property (aka the replication principle, see Jost 2009, p. 927) is conveyed by the following example.

Test Case 1

(Jost 2006, p. 363). Let us consider communities consisting of n equally common species, like (1/5, …, 1/5) (with n = 5) and (1/10, …, 1/10) (with n = 10). Arguably, diversity in the latter case should just be twice as in the former. A compelling measure of diversity should recover such assessment. Richness clearly does, because Richness(1/10, …, 1/10) = 10 and Richness(1/5, …, 1/5) = 5.
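A minimal Python sketch of the richness measure and of Test case 1 follows. The function names are illustrative assumptions. Since Python evaluates `0 ** 0` as 1, the sketch counts strictly positive entries directly, which is what the convention 0^0 = 0 amounts to.

```python
def richness(p):
    """Number of species with strictly positive relative abundance."""
    return sum(1 for p_i in p if p_i > 0)


def richness_star(p):
    """Variant that is null when only one species is present."""
    return richness(p) - 1


five = [1 / 5] * 5      # five equally common species
ten = [1 / 10] * 10     # ten equally common species

# Replication principle in Test case 1: doubling the number of
# equally common species doubles the measured diversity.
ratio = richness(ten) / richness(five)
print(richness(five), richness(ten), ratio)
```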

The richness measure is of course completely transparent in its interpretation, but it is also simplistic in a fairly obvious way: it is entirely insensitive to how even/uneven the distribution is. Here is a second test case to clarify the point.

Test Case 2

(evenness sensitivity, see Pielou 1975, p. 7). For a given number of species n, a compelling measure of diversity should assign maximum value to a completely even distribution (with p1 = … = pn = 1/n), and a strictly lower value to a distribution which is much more skewed. Richness, however, clearly fails this condition. For instance, one has Richness(0.25, 0.25, 0.25, 0.25) = 4 = Richness(0.97, 0.01, 0.01, 0.01).

6.2 Entropies and Diversity

One traditional approach to meet the requirement underlying Test case 2 is to analyze biological diversity on the basis of entropy measures developed in information theory (Csiszár 2008). By far the most widely known such measure is Shannon’s (Shannon 1948). In our current notation, it amounts to the following:²
$$ {D}_{Shannon}\left({p}_1,\dots, {p}_n\right)=\sum \limits_{i=1}^n{p}_i\log \left(\frac{1}{p_i}\right) $$
How can this measure be interpreted in the biological context? The quantity \( \log \left(\frac{1}{p_i}\right) \) can be seen as representing potential surprise, to wit, how surprising it would be to find out that a randomly selected individual from the target population belongs to species si. In fact, such index of surprise is null in the extreme case when the outcome is already known for sure (so that pi = 1, and log(1) = 0) and it grows without bound as pi approaches 0. As a consequence, DShannon(p1, … , pn) quantifies the average (expected) surprise should one get to know the species to which a randomly sampled element will belong. Appropriately, such expected surprise will be low when a very uneven distribution, such as (0.97, 0.01, 0.01, 0.01), implies a low level of uncertainty about the outcome, because one species is (or few of them are) very likely to be instantiated in a random draw. On the other hand, expected surprise gets its maximum value (for given n) when a completely even distribution, namely (0.25, 0.25, 0.25, 0.25), implies the highest level of uncertainty, because each species is equally likely to be instantiated in a random draw. In fact, the Shannon index of diversity gets Test case 2 just right: in particular, DShannon(0.25, 0.25, 0.25, 0.25) = 1.386 > 0.168 = DShannon(0.97, 0.01, 0.01, 0.01).
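The Shannon computation for Test case 2 can be sketched in Python as follows. The function name `shannon` is an illustrative choice. The sketch uses the natural logarithm and assumes that species with zero abundance contribute nothing to the sum.

```python
import math


def shannon(p):
    """Expected surprise: sum of p_i * log(1 / p_i), natural logarithm.

    Species with zero relative abundance are skipped (they contribute 0).
    """
    return sum(p_i * math.log(1 / p_i) for p_i in p if p_i > 0)


even = shannon([0.25, 0.25, 0.25, 0.25])
skewed = shannon([0.97, 0.01, 0.01, 0.01])

print(round(even, 3), round(skewed, 3))
```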
Another very popular index of diversity which is appropriately sensitive to the unevenness of the abundance distribution is quadratic entropy (Vajda and Zvárová 2007), also widely known as the Gini or the Gini-Simpson index (after Gini 1912 and Simpson 1949):
$$ {D}_{Gini}\left({p}_1,\dots, {p}_n\right)=1-\sum \limits_{i=1}^n{p}_i^2 $$
DGini, too, can be given a convenient interpretation. It amounts to 1 minus the probability that two random draws (with replacement) from the background population instantiate the same category (in fact, the latter probability is pi × pi, for each species si). For this reason, in biological and ecological applications, DGini is often seen as the probability of interspecific encounter (see Patil and Taille 1982, pp. 548–550), to wit, the probability that two random draws do not instantiate the same species. Also, 1 − pi can be taken as a natural measure of the rarity of species si in the target environment. Then, DGini(p1, …, pn) computes the average (expected) rarity of the species to which a (randomly sampled) individual would belong, as emphasized by the following equivalent rendition:
$$ {D}_{Gini}\left({p}_1,\dots, {p}_n\right)=\sum \limits_{i=1}^n{p}_i\left(1-{p}_i\right) $$
Expected rarity will be low when a very uneven distribution—such as (0.97, 0.01, 0.01, 0.01)—implies that one very common species is (or few of them are) likely to be instantiated in a random draw. On the other hand, expected rarity gets its maximum value (for given n) when a completely even distribution—namely, (0.25, 0.25, 0.25, 0.25)—implies that the species instantiated in a random draw will always have the same, and substantial, degree of rarity. Accordingly, the Gini index of diversity also gets Test case 2 right: DGini(0.25, 0.25, 0.25, 0.25) = 0.75 > 0.06 = DGini(0.97, 0.01, 0.01, 0.01).
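The two equivalent renditions of the Gini-Simpson index can be compared numerically with a short Python sketch (function names are illustrative assumptions):

```python
def gini_simpson(p):
    """1 minus the probability that two draws with replacement coincide."""
    return 1 - sum(p_i ** 2 for p_i in p)


def expected_rarity(p):
    """Average of the rarity 1 - p_i, weighted by p_i."""
    return sum(p_i * (1 - p_i) for p_i in p)


even = [0.25, 0.25, 0.25, 0.25]
skewed = [0.97, 0.01, 0.01, 0.01]

for dist in (even, skewed):
    # Both formulas agree up to floating-point rounding.
    print(round(gini_simpson(dist), 2), round(expected_rarity(dist), 2))
```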

One important and well-known feature of both the Shannon and the Gini index of diversity is that they are concave functions. The formal definition of concavity involves the notion of a mixture \( {M}_{\alpha}^{P,{P}^{\ast }}=\left(\alpha {p}_1+\left(1-\alpha \right){p}_1^{\ast },\dots, \alpha {p}_n+\left(1-\alpha \right){p}_n^{\ast}\right) \) of two distributions of relative abundance P = (p1, … , pn) and \( {P}^{\ast }=\left({p}_1^{\ast },\dots, {p}_n^{\ast}\right) \), where α ∈ [0,1] determines the relative weights of the distributions combined. As a plain illustration, if P = (0.9, 0.1), P* = (0.7, 0.3), and α = 0.5, then the mixture is \( {M}_{\alpha}^{P,{P}^{\ast }} \) = (0.8, 0.2). A measure of diversity D is then said to be concave if and only if, for any P, P*, and any α ∈ [0,1], one has \( \alpha D(P)+\left(1-\alpha \right)D\left({P}^{\ast}\right)\le D\left({M}_{\alpha}^{P,{P}^{\ast }}\right) \). Concavity is sometimes advocated for measures of biological diversity as conveying the idea that, if one pools together two distinct populations X and Y characterized by possibly different abundance distributions (P vs. P*) over the same list of n species, then the aggregate should have a diversity that is at least as great as the average of the initial diversities of X and Y. Whether or not such condition is meant to be compelling in general, one significant implication of concavity is involved in the following example.
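The concavity inequality can be checked on the plain illustration above with a few lines of Python. This is only a spot check on one pair of distributions, not a proof; the helper names are illustrative.

```python
def mixture(p, p_star, alpha):
    """Componentwise mixture alpha * P + (1 - alpha) * P*."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(p, p_star)]


def gini_simpson(p):
    return 1 - sum(p_i ** 2 for p_i in p)


p, p_star, alpha = [0.9, 0.1], [0.7, 0.3], 0.5
m = mixture(p, p_star, alpha)

# Left-hand side: weighted average of the two diversities.
average_diversity = alpha * gini_simpson(p) + (1 - alpha) * gini_simpson(p_star)
# Right-hand side: diversity of the pooled (mixed) distribution.
pooled_diversity = gini_simpson(m)

print(m, average_diversity, pooled_diversity)
```

For this pair, the average of the two diversities is about 0.30 and the diversity of the mixture is about 0.32, in line with the inequality.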

Test Case 3

Consider three subsequent moments in time t1, t2, and t3, with corresponding relative abundance distributions of a population with n = 10, as follows.

t1: (0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1)

t2: (0.5,  0.1,  0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05)

t3: (0.9,  0.1,  0,   0,   0,   0,   0,   0,   0,   0)

Quite clearly, a drop in diversity occurred from t1 to t2. But, arguably, the loss of diversity was even greater from t2 to t3, essentially because most of the species (in fact, 80% of them) disappeared altogether. A compelling measure of diversity should recover such assessment, and concave measures such as DShannon and DGini do. DShannon takes values 2.303 at t1, 1.775 at t2, and then drops to 0.325 at t3. With DGini, we have 0.90, 0.72, and 0.18, respectively.
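The values quoted for Test case 3 can be reproduced with the following Python sketch (natural logarithm; function names are illustrative assumptions):

```python
import math


def shannon(p):
    return sum(p_i * math.log(1 / p_i) for p_i in p if p_i > 0)


def gini_simpson(p):
    return 1 - sum(p_i ** 2 for p_i in p)


t1 = [0.1] * 10
t2 = [0.5, 0.1] + [0.05] * 8
t3 = [0.9, 0.1] + [0.0] * 8

for name, measure in (("Shannon", shannon), ("Gini", gini_simpson)):
    v1, v2, v3 = measure(t1), measure(t2), measure(t3)
    # Under both measures the second drop (t2 -> t3) exceeds the first.
    print(name, round(v1, 3), round(v2, 3), round(v3, 3),
          "drops:", round(v1 - v2, 3), round(v2 - v3, 3))
```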

As we already know, because it lacks evenness sensitivity entirely, the Richness measure fails our Test case 2 above. Concerning Test case 3, Richness gets the main point right, namely that a larger drop in diversity occurs from t2 to t3 (although, again because of evenness insensitivity, it fails to detect any change in diversity from t1 to t2). Noting that DShannon and DGini do well in both cases 2 and 3, one might conclude that diversity should be quantified by these measures. And yet, the assessment of biological diversity on the basis of entropy measures such as DShannon and DGini has been forcefully questioned because such measures do not fulfil the replication principle, and in fact fail as basic a benchmark as Test case 1. As it turns out, DShannon(1/10, …, 1/10) = 2.303, which is definitely less than twice DShannon(1/5, …, 1/5) = 1.609, and DGini(1/10, …, 1/10) = 0.9, which is definitely less than twice DGini(1/5, …, 1/5) = 0.8. To emphasize the troubling consequences of these failures, Jost (2009, p. 926) put forward a variation that is even more extreme (notation slightly adapted):

Biologists frequently use measures of diversity to detect changes in the environment due to pollution, climate change, or other factors. […] Suppose a continent has a million equally common species, and a meteor impact kills 999,900 of the species, leaving 100 species untouched. Any biologist, if asked, would say that this meteor impact caused a large absolute and relative drop in diversity. Yet DGini only decreases from 0.999999 to 0.99, a drop of less than 1%. Evidently, the metric of this measure does not match the intuitive concept of diversity as used by biologists, and ecologists relying on DGini will often misjudge the magnitude of ecosystem change. This same problem arises when DShannon is equated with diversity.
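The arithmetic behind the meteor illustration can be sketched in Python. The sketch relies on the closed form of the Gini-Simpson index for n equally common species, 1 − 1/n, and the function name is an illustrative assumption.

```python
def gini_simpson_uniform(n):
    """Gini-Simpson index of n equally common species: 1 - n * (1/n)^2."""
    return 1 - 1 / n


species_before, species_after = 1_000_000, 100

before = gini_simpson_uniform(species_before)
after = gini_simpson_uniform(species_after)

# Relative drop according to the Gini-Simpson index versus plain richness.
relative_drop_gini = (before - after) / before
relative_drop_richness = (species_before - species_after) / species_before

print(f"Gini-Simpson: {before:.6f} -> {after:.2f}")
print(f"relative drop (Gini-Simpson): {relative_drop_gini:.4%}")
print(f"relative drop (richness):     {relative_drop_richness:.4%}")
```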

6.3 Effective Numbers

Consider Jost’s meteor illustration above. According to the replication principle (which Jost 2009 strongly advocates), diversity should be 10,000 times lower after the impact than it was before (1,000,000/100). Is it possible to define a measure such that (as it happens with richness but not with entropy) the replication principle is retained and at the same time (as it happens with entropy but not richness) the (un)evenness of the distribution is also integrated in the assessment of diversity?

A crucial step in this direction was made in classical work by MacArthur (1965) and Hill (1973). To introduce their proposal, consider an entropy measure such as DGini, and take one specific example such as DGini(0.4, 0.3, 0.2, 0.1), which amounts to 0.7. We now ask: how many species should a completely even population include in order for its diversity to be just the same (i.e., 0.7) according to DGini itself? Note that, for a completely even population of n species, DGini equals \( \sum \limits_{i=1}^n\frac{1}{n}\left(1-\frac{1}{n}\right)=1-\frac{1}{n} \). For such a hypothetical population of equally abundant species to have DGini-diversity of 0.7, it should then hold that \( 1-\frac{1}{n} \) = 0.7, by which we compute \( n=\frac{1}{1-0.7} \), and thus n = 3.333… So a hypothetical completely even population of 3.333… species would have the same DGini-diversity as our initial population with actual distribution (0.4, 0.3, 0.2, 0.1). Given a canonical diversity index such as DGini, this number of corresponding equally abundant categories is usually called the effective number of species (relative to the index at issue, DGini in this case). As our illustration shows, the effective number of species is a theoretical construct: often it will not be an integer. Generalizing from our computation above, one can see that the effective number corresponding to DGini is as follows:
$$ {D}_{Gini- EN}\left({p}_1,\dots, {p}_n\right)=\frac{1}{1-{D}_{Gini}\left({p}_1,\dots, {p}_n\right)}=\frac{1}{\sum_{i=1}^n{p}_i^2} $$
In this case, too, in order to have null diversity (rather than 1) in the extreme case when only one species is present, a minor variation can be employed:
$$ {D}_{Gini- EN}^{\ast}\left({p}_1,\dots, {p}_n\right)=\frac{1}{\sum_{i=1}^n{p}_i^2}-1 $$
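The worked example above, in which the distribution (0.4, 0.3, 0.2, 0.1) corresponds to 3.333… equally abundant species, can be reproduced with a brief Python sketch (function names are illustrative assumptions):

```python
def gini_simpson(p):
    return 1 - sum(p_i ** 2 for p_i in p)


def gini_effective_number(p):
    """Number of equally abundant species with the same Gini-Simpson index."""
    return 1 / sum(p_i ** 2 for p_i in p)


def gini_effective_number_star(p):
    """Variant that is null when only one species is present."""
    return gini_effective_number(p) - 1


p = [0.4, 0.3, 0.2, 0.1]
d = gini_simpson(p)                  # about 0.7
n_eff = gini_effective_number(p)     # about 3.333

# Solving 1 - 1/n = d for n gives the same effective number.
print(round(d, 3), round(n_eff, 3), round(1 / (1 - d), 3))
```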
An effective number measure like DGini-EN meets the requirement of combining the replication principle and evenness sensitivity. Indeed, it can be shown that DGini-EN(p1, …, pn) = Richness(p1, …, pn) whenever the abundance distribution is uniform, so that, for instance, DGini-EN(1/10, …, 1/10) = 10 and DGini-EN(1/5, …, 1/5) = 5 (see Test case 1 above). On the other hand, DGini-EN is a smooth and strictly increasing function of DGini, thus it retains the evenness sensitivity of the latter when distributions are not uniform: for instance, DGini-EN(0.25, 0.25, 0.25, 0.25) = 4 while DGini-EN(0.97, 0.01, 0.01, 0.01) = 1.062 (see Test case 2 above). To achieve the same results, one can alternatively generate an effective number measure from yet another evenness sensitive index of diversity, such as Shannon entropy. The general method is the same: take the actual value of DShannon(p1, … , pn), equate it to \( {D}_{Shannon}\left(\frac{1}{n},\dots, \frac{1}{n}\right) \), which amounts to log(n), then solve for n. The resulting measure is \( {D}_{Shannon- EN}\left({p}_1,\dots, {p}_n\right)={e}^{D_{Shannon}\left({p}_1,\dots, {p}_n\right)} \) (see, e.g., Jost 2006, pp. 364–365).
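The Shannon-based effective number can be sketched in Python along the same lines. The sketch assumes the natural logarithm, so that exponentiation with base e inverts it; the function names are illustrative.

```python
import math


def shannon(p):
    return sum(p_i * math.log(1 / p_i) for p_i in p if p_i > 0)


def shannon_effective_number(p):
    """Number of equally abundant species with the same Shannon entropy."""
    return math.exp(shannon(p))


uniform_five = shannon_effective_number([1 / 5] * 5)
uniform_ten = shannon_effective_number([1 / 10] * 10)
skewed = shannon_effective_number([0.97, 0.01, 0.01, 0.01])

# Uniform distributions recover the plain number of species,
# while a skewed distribution over four species scores well below 4.
print(uniform_five, uniform_ten, skewed)
```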
According to some authors, using an effective number measure is the one right way to quantify true diversity in biological and ecological applications (see Hoffmann and Hoffmann 2008, and again Jost 2009, for a debate). Following Hill (1973, pp. 429–430) and Ricotta (2003, pp. 191–192), one can highlight another attractive consequence of the replication principle, which is implied by all effective number measures. If a diversity measure D(p1, …, pn) satisfies the replication principle, then one can define a measure of evenness in a very natural way, as Evenness(p1, …, pn) = D(p1, …, pn)/n. A simple and compelling property of such a measure of evenness is that it attains a fixed maximum value of 1 (i.e., n/n) whenever the distribution P is uniform, regardless of the value of n. And a straightforward implication is that diversity can then be neatly factorized into richness and evenness as distinct and independent components, for instance as follows:
$$ {D}_{Gini- EN}\left({p}_1,\dots, {p}_n\right)= Richness\left({p}_1,\dots, {p}_n\right)\times {Evenness}_{Gini- EN}\left({p}_1,\dots, {p}_n\right) $$
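A short Python sketch of this factorization follows. It assumes that n in the definition of evenness is the number of species actually present, i.e., the richness; function names are illustrative.

```python
def richness(p):
    return sum(1 for p_i in p if p_i > 0)


def gini_effective_number(p):
    return 1 / sum(p_i ** 2 for p_i in p)


def evenness_gini_en(p):
    """Effective number divided by the number of species present."""
    return gini_effective_number(p) / richness(p)


p = [0.4, 0.3, 0.2, 0.1]
uniform = [0.25] * 4

# Diversity factorizes into richness times evenness.
product = richness(p) * evenness_gini_en(p)
print(gini_effective_number(p), product)

# Evenness reaches its maximum of 1 for a uniform distribution.
print(evenness_gini_en(uniform), evenness_gini_en(p))
```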
One should note, however, that effective number measures do not retain the concavity of their generating indexes. For instance, the concavity of DGini is not retained in DGini-EN (and the same applies to DShannon and DShannon-EN). One disturbing consequence is that Test case 3 above is not addressed in a convincing way. To illustrate, according to DGini-EN, diversity decreases from DGini-EN(0.1, …, 0.1) = 10 to DGini-EN(0.5, 0.1, 0.05, …, 0.05) = 3.57 between t1 and t2, but the drop is smaller from t2 to t3, with DGini-EN(0.9, 0.1, 0, …, 0) = 1.22. As pointed out above, intuition clearly goes in the opposite direction, given that from time t2 to t3 eight out of ten species have disappeared entirely.
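The comparison of the two drops under DGini-EN can be made explicit with a small Python sketch (function name is an illustrative assumption):

```python
def gini_effective_number(p):
    return 1 / sum(p_i ** 2 for p_i in p)


t1 = [0.1] * 10
t2 = [0.5, 0.1] + [0.05] * 8
t3 = [0.9, 0.1] + [0.0] * 8

drop_t1_t2 = gini_effective_number(t1) - gini_effective_number(t2)
drop_t2_t3 = gini_effective_number(t2) - gini_effective_number(t3)

# The first drop is the larger one, contrary to the intuition
# expressed in Test case 3.
print(round(drop_t1_t2, 2), round(drop_t2_t3, 2))
```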
Table 6.1 summarizes our results so far. On inspection, it naturally suggests the question whether there exists a measure of diversity yielding a satisfactory response to all of our test cases above. As a final remark in our comparative discussion, I would like to point out that such a measure does exist. One effective option is to adopt the following as a measure of diversity (see Arimoto 1971, p. 186, for an earlier occurrence in the information theory literature):
$$ {D}_{Root}\left({p}_1,\dots, {p}_n\right)={\left(\sum \limits_{i=1}^n\sqrt{p_i}\right)}^2 $$
Once again, a −1 correction can be employed to yield \( {D}_{Root}^{\ast}\left({p}_1,\dots, {p}_n\right) \), with null diversity (rather than 1) in the extreme case when only one species is present.³ DRoot is demonstrably evenness sensitive (see Crupi et al. 2018). It is also concave and it satisfies the replication principle (see below for this). It addresses Test case 1 appropriately, because (according to the replication principle) DRoot(1/10, …, 1/10) = 10 and DRoot(1/5, …, 1/5) = 5. Moreover, it gets Test case 2 right, because (according to evenness sensitivity) DRoot(0.25, 0.25, 0.25, 0.25) = 4 > 1.651 = DRoot(0.97, 0.01, 0.01, 0.01). And finally, in virtue of concavity, it also accommodates Test case 3, implying a moderate decrease in diversity between t1 and t2, from DRoot(0.1, …, 0.1) = 10 to DRoot(0.5, 0.1, 0.05, …, 0.05) = 7.908, and a much larger drop between t2 and t3, from DRoot(0.5, 0.1, 0.05, …, 0.05) = 7.908 to DRoot(0.9, 0.1, 0, …, 0) = 1.6.
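The behaviour of DRoot on the three test cases can be verified with the following Python sketch (the function name `root_diversity` is an illustrative assumption):

```python
import math


def root_diversity(p):
    """Square of the sum of the square roots of the relative abundances."""
    return sum(math.sqrt(p_i) for p_i in p) ** 2


# Test case 1: replication principle on uniform distributions.
print(root_diversity([1 / 5] * 5), root_diversity([1 / 10] * 10))

# Test case 2: evenness sensitivity.
print(root_diversity([0.25] * 4), round(root_diversity([0.97, 0.01, 0.01, 0.01]), 3))

# Test case 3: a moderate drop followed by a much larger one.
t1 = [0.1] * 10
t2 = [0.5, 0.1] + [0.05] * 8
t3 = [0.9, 0.1] + [0.0] * 8
drop_t1_t2 = root_diversity(t1) - root_diversity(t2)
drop_t2_t3 = root_diversity(t2) - root_diversity(t3)
print(round(drop_t1_t2, 3), round(drop_t2_t3, 3))
```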
Table 6.1

Some properties of alternative ways to quantify biological diversity.

| Measure | Test case 1 (replication principle) | Test case 2 (evenness sensitivity) | Test case 3 (concavity) |
| --- | --- | --- | --- |
| Richness | Yes | No | Yes |
| Entropy (DGini) | No | Yes | Yes |
| Effective number (DGini-EN) | Yes | Yes | No |

Test cases are explained in the text, and associated with formally relevant mathematical conditions (in parentheses). “Yes” / “No”: the diversity measure (in row) yields/does not yield an intuitively adequate result in the test case at issue (in column).

6.4 Parametric Measures of Diversity

The discussion above suggests that the statistical measure of diversity DRoot combines a number of appealing features, and it is good news, I submit, that such a measure exists. In general, however, the plurality of non-identical ways to quantify biological diversity need not be a reason for concern or skepticism, as in Hurlbert’s (1971, p. 585) complaint that “diversity per se does not exist”. As noted by Patil and Taille (1982, p. 551), the plurality of measures is a very mundane phenomenon in various domains: in statistics, for instance, mean and median are non-equivalent measures of “central tendency”; variance, mean absolute variation, and range are non-equivalent measures of “spread”, and so on. In fact, once their main distinctive properties become well understood, it is natural to think that different measures may be most useful relative to varying purposes or contexts. For this reason, several authors have put forward a comprehensive approach, based on parametric families of diversity measures. In the final part of this contribution, I would like to point out that all of the specific measures mentioned in the foregoing discussion can be embedded as special cases in a unified formalism taken from generalized information theory (Sharma and Mittal 1975; Hoffmann 2008), namely:
$$ {D}_{Sharma- Mittal}^{\left(r,t\right)}\left({p}_1,\dots, {p}_n\right)=\frac{1}{t-1}\left[1-{\left(\sum \limits_{i=1}^n{p}_i^r\right)}^{\frac{t-1}{r-1}}\right] $$
Parameters r and t of the Sharma and Mittal (1975) family of measures are usually taken to be non-negative (r,t ≥ 0), while for r → 1 and t → 1 \( {D}_{Sharma- Mittal}^{\left(r,t\right)}\left({p}_1,\dots, {p}_n\right) \) is known to yield the classical Shannon formula, DShannon(p1, … , pn) in our notation (see Crupi et al. 2018). Accordingly, it is customary to just posit \( {D}_{Sharma- Mittal}^{\left(1,1\right)}\left({p}_1,\dots, {p}_n\right)={D}_{Shannon}\left({p}_1,\dots, {p}_n\right) \). Other settings of parameters r, t (known as order and degree, respectively, in the information theory literature) generate all diversity measures mentioned above, as illustrated in Table 6.2.
Table 6.2

Some important special cases of the Sharma-Mittal framework for statistical measures of diversity

| (r,t)-setting | Diversity measure |
| --- | --- |
| r = 0, t = 0 | \( {D}_{Sharma- Mittal}^{\left(0,0\right)}\left({p}_1,\dots, {p}_n\right)={Richness}^{\ast}\left({p}_1,\dots, {p}_n\right)=\sum \limits_{i=1}^n{p_i}^0-1 \) |
| r → 1, t → 1 | \( {D}_{Sharma- Mittal}^{\left(1,1\right)}\left({p}_1,\dots, {p}_n\right)={D}_{Shannon}\left({p}_1,\dots, {p}_n\right)=\sum \limits_{i=1}^n{p}_i\log \left(\frac{1}{p_i}\right) \) |
| r = 2, t = 2 | \( {D}_{Sharma- Mittal}^{\left(2,2\right)}\left({p}_1,\dots, {p}_n\right)={D}_{Gini}\left({p}_1,\dots, {p}_n\right)=1-\sum \limits_{i=1}^n{p}_i^2 \) |
| r = 2, t = 0 | \( {D}_{Sharma- Mittal}^{\left(2,0\right)}\left({p}_1,\dots, {p}_n\right)={D}_{Gini- EN}^{\ast}\left({p}_1,\dots, {p}_n\right)=\frac{1}{\sum_{i=1}^n{p}_i^2}-1 \) |
| r → 1, t = 0 | \( {D}_{Sharma- Mittal}^{\left(1,0\right)}\left({p}_1,\dots, {p}_n\right)={D}_{Shannon- EN}^{\ast}\left({p}_1,\dots, {p}_n\right)={e}^{\sum \limits_{i=1}^n{p}_i\log \left(\frac{1}{p_i}\right)}-1 \) |
| r = ½, t = 0 | \( {D}_{Sharma- Mittal}^{\left(\frac{1}{2},0\right)}\left({p}_1,\dots, {p}_n\right)={D}_{Root}^{\ast}\left({p}_1,\dots, {p}_n\right)={\left(\sum \limits_{i=1}^n\sqrt{p_i}\right)}^2-1 \) |
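The special cases in Table 6.2 can be checked numerically with a single Python function for the two-parameter family. This is a minimal sketch under stated assumptions: the function name is illustrative, zero-abundance species are skipped (the convention 0^0 = 0), natural logarithms are used, and the cases r = 1 and t = 1 are handled through their analytic limits rather than by the general formula.

```python
import math


def sharma_mittal(p, r, t):
    """Sharma-Mittal diversity of order r and degree t (r, t >= 0)."""
    support = [p_i for p_i in p if p_i > 0]
    if r == 1:
        # Limit of log(sum p_i^r) / (r - 1) as r -> 1: minus the Shannon entropy.
        log_ratio = -sum(p_i * math.log(1 / p_i) for p_i in support)
    else:
        log_ratio = math.log(sum(p_i ** r for p_i in support)) / (r - 1)
    if t == 1:
        # Limit of the general formula as t -> 1.
        return -log_ratio
    return (1 - math.exp((t - 1) * log_ratio)) / (t - 1)


p = [0.4, 0.3, 0.2, 0.1]
shannon = sum(p_i * math.log(1 / p_i) for p_i in p)

print(sharma_mittal(p, 0, 0))      # Richness*: 4 - 1
print(sharma_mittal(p, 1, 1))      # Shannon entropy
print(sharma_mittal(p, 2, 2))      # Gini-Simpson index
print(sharma_mittal(p, 2, 0))      # Gini effective number minus 1
print(sharma_mittal(p, 1, 0))      # Shannon effective number minus 1
print(sharma_mittal(p, 0.5, 0))    # Root measure minus 1
```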

What is the meaning of the order (r) and degree (t) parameters in the Sharma-Mittal formalism when employed in the measurement of biological diversity?

The order parameter r is an index of the insensitivity to less abundant species. In fact, as r increases, diversity gets closer and closer to a simple (decreasing) function of one single element p* in the distribution (p1, …, pn), that is, the relative abundance of the most common species. As an illustration, on the basis of the limit for r → ∞ when t = 2, one has \( {D}_{Sharma- Mittal}^{\left(\infty, 2\right)}\left({p}_1,\dots, {p}_n\right)=1-{p}^{\ast } \) (see Crupi et al. 2018). When r = 0, on the contrary, diversity becomes a (increasing) function of the plain number of species with non-null relative abundance. The simplest illustration here is just \( {D}_{Sharma- Mittal}^{\left(0,0\right)}\left({p}_1,\dots, {p}_n\right)= Richnes{s}^{\ast}\left({p}_1,\dots, {p}_n\right) \) (see Table 6.2 and Fig. 6.1). This shows that the order parameter r indicates how much a diversity measure disregards relatively rare species. For order-0 measures, the actual distribution of relative abundance is neglected: non-zero abundance species are just counted, as if they were all equally important. For order-∞ measures, on the other hand, only the most abundant species matters, and all others are neglected altogether. The higher [lower] r is, the more [less] the common species are regarded and the rare species are discounted in the measurement of diversity.
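The effect of increasing the order can be illustrated with a small Python sketch for degree t = 2, where the general formula reduces to 1 minus the power sum raised to 1/(r − 1). The function name and the sample distribution are illustrative assumptions.

```python
def sharma_mittal_degree2(p, r):
    """Sharma-Mittal diversity of degree t = 2 and order r (r != 1)."""
    power_sum = sum(p_i ** r for p_i in p if p_i > 0)
    return 1 - power_sum ** (1 / (r - 1))


p = [0.5, 0.25, 0.25]
limit = 1 - max(p)   # value for r -> infinity: 1 minus the largest abundance

for r in (2, 10, 50, 200):
    # As r grows, the measure approaches 1 - p*, so the two rarer
    # species matter less and less.
    print(r, round(sharma_mittal_degree2(p, r), 4), limit)
```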
Fig. 6.1

The Sharma-Mittal family of diversity measures is represented in a Cartesian quadrant with values of the order parameter r and of the degree parameter t lying on the x- and y-axis, respectively. Each point in the quadrant corresponds to a specific measure. A line corresponds to a distinct one-parameter generalized diversity function. Several special cases are highlighted. A point in the plane represents a concave diversity measure unless it lies strictly below the dotted line where t = 2 − 1/r

Importantly, for extreme values of the order parameter, an otherwise natural idea of continuity fails in the measurement of diversity: when r goes to either zero or infinity, it is not the case that small (large) changes in the abundance distribution produce comparably small (large) changes in diversity. To illustrate, all order-0 entropies remain entirely invariant upon as large a change as that from, say, (1/3, 1/3, 1/3) to (0.98, 0.01, 0.01), while they yield clearly different values for as small a change as that from (0.98, 0.01, 0.01) to (0.99, 0.01, 0). Order-∞ entropies, in turn, remain entirely invariant upon as large a change as that from, say, (0.50, 0.25, 0.25) to (0.50, 0.50, 0), yet they still yield distinct values for as small a change as that from (0.50, 0.25, 0.25) to (0.52, 0.24, 0.24).
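These discontinuities can be made concrete with a short Python sketch. It uses one representative of each extreme, chosen here for illustration only: Richness* for order 0 (degree 0) and 1 − p* for order ∞ (degree 2).

```python
def richness_star(p):
    """Order-0 representative: number of species present, minus 1."""
    return sum(1 for p_i in p if p_i > 0) - 1


def order_infinity_degree2(p):
    """Order-infinity representative: 1 minus the largest abundance."""
    return 1 - max(p)


# Order 0: a large change goes unnoticed, a small one is registered.
print(richness_star([1 / 3, 1 / 3, 1 / 3]), richness_star([0.98, 0.01, 0.01]))
print(richness_star([0.98, 0.01, 0.01]), richness_star([0.99, 0.01, 0.0]))

# Order infinity: the same pattern with respect to the most common species.
print(order_infinity_degree2([0.50, 0.25, 0.25]), order_infinity_degree2([0.50, 0.50, 0.0]))
print(order_infinity_degree2([0.50, 0.25, 0.25]), order_infinity_degree2([0.52, 0.24, 0.24]))
```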

The role of the degree parameter t is somewhat more technical: it affects a few important metric properties. To appreciate this, it is useful to consider that all specific measures we considered earlier lie either (i) on the x-axis or (ii) on the diagonal line in Fig. 6.1. This is not by chance. Let us conclude our discussion by briefly considering cases (i) and (ii) in turn.
  (i) Measures lying on the x-axis are obtained by positing t = 0, thus yielding:
$$ {D}_{Sharma- Mittal}^{\left(r,0\right)}\left({p}_1,\dots, {p}_n\right)={\left(\sum \limits_{i=1}^n{p_i}^r\right)}^{\frac{1}{1-r}}-1 $$
For a diversity measure in the Sharma-Mittal family, having degree 0 (t = 0) is known to be a necessary and sufficient condition to satisfy the replication principle. In fact, this was a major reason for Hill (1973) to advocate this formalism as a one-parameter generalized approach to measure diversity. More precisely, the replication principle is satisfied by \( {D}_{Sharma- Mittal}^{\left(r,0\right)}\left({p}_1,\dots, {p}_n\right)+1 \) (for any r) (see Hoffmann 2008, pp. 20–21), and that is equivalent to the formula originally employed by Hill (1973, p. 428). The comprehensive approach presented here reveals one striking aspect of Hill’s measures: for any Sharma-Mittal measure of a specified order r (regardless of the concurrent value of the degree parameter t!) \( {D}_{Sharma- Mittal}^{\left(r,0\right)}\left({p}_1,\dots, {p}_n\right)+1 \) computes the corresponding effective number as defined earlier, i.e., the theoretical number of equally abundant categories that would be just as diverse as (p1, …, pn) is under that measure (see Crupi et al. 2018).
  (ii) As we have seen with some special cases like \( {D}_{Gini- EN}^{\ast}\left({p}_1,\dots, {p}_n\right) \), effective number measures of diversity may not be concave functions. Most Sharma-Mittal measures are concave, however: \( {D}_{Sharma- Mittal}^{\left(r,t\right)}\left({p}_1,\dots, {p}_n\right) \) generates a concave function as long as t ≥ 2 − 1/r (see Hoffmann 2008 for a proof). This implies, in particular, the concavity of all measures lying on the diagonal line in Fig. 6.1, which are obtained by positing r = t, thus yielding:
$$ {D}_{Sharma- Mittal}^{\left(t,t\right)}\left({p}_1,\dots, {p}_n\right)=\frac{1}{t-1}\left[1-\sum \limits_{i=1}^n{p_i}^t\right] $$
Measures of this kind are often labelled after Tsallis’s (1988, 2004) work in generalized thermodynamics. Partly because of the concavity property, the Tsallis one-parameter continuum has been recently advocated as a compelling approach to the measurement of biological diversity by Keylock (2005).
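Both points can be illustrated with a brief Python sketch. The function names are hypothetical; replication is modelled here as splitting every species into m equally abundant copies, and the condition t ≥ 2 − 1/r is treated as satisfied at r = 0 (an assumption based on its limit for r → 0). The final check is a single numerical counterexample to concavity for DGini-EN, not a general proof.

```python
def hill_number(p, r):
    """Degree-0 Sharma-Mittal measure plus 1, for r != 1."""
    power_sum = sum(p_i ** r for p_i in p if p_i > 0)
    return power_sum ** (1 / (1 - r))


def replicate(p, m):
    """Split every species into m equally abundant copies."""
    return [p_i / m for p_i in p for _ in range(m)]


def in_concave_region(r, t):
    """Condition t >= 2 - 1/r quoted in the text (assumed true at r = 0)."""
    return True if r == 0 else t >= 2 - 1 / r


p = [0.4, 0.3, 0.2, 0.1]
for r in (0, 0.5, 2):
    # Replication principle: m-fold replication multiplies diversity by m.
    print(r, hill_number(p, r), hill_number(replicate(p, 3), r))

# Tsallis measures (r = t) satisfy the condition; DGini-EN* (r = 2, t = 0) does not.
print(in_concave_region(2, 2), in_concave_region(0.5, 0), in_concave_region(2, 0))

# Numerical counterexample to concavity for the Gini effective number.
a, b, mix = [1.0, 0.0], [0.8, 0.2], [0.9, 0.1]
average = 0.5 * hill_number(a, 2) + 0.5 * hill_number(b, 2)
print(average, hill_number(mix, 2))   # the average exceeds the mixture value
```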

The relevance of statistical measures of diversity is an open issue for the theoretical biologist and the philosopher addressing the investigation of biodiversity, and indeed a matter of much debate (see, e.g., Barrantes and Sandoval 2009 and Blandin 2015). In this chapter, no claim has been made to the resolution of divergences in this respect. Consideration of the variety and integration of diversity measures remains important, however, for the debate to be adequately informed. Advocates of the measurement of diversity should of course be aware of the tools at their disposal. Opponents and skeptics, on the other hand, should be careful to make sure that their legitimate doubts are not inflated by too narrow an outlook on the ways in which the notion of biological diversity can be formally unpacked and assessed.

Footnotes

  1. Our use of the term category here is very general: essentially, categories in our current sense are the elements of any partition of interest. This terminology is thus not constrained by the more technical and specific distinction between “species category” and “species taxon” (e.g., Bock 2004). In particular, a set of different taxa can be treated as a partition of categories in our terms.

  2. The choice of a base for the logarithm is a matter of conventionally setting a unit of measurement. Usual options include 2, 10, and e. We will adopt the latter throughout our discussion, thus employing the natural logarithm in subsequent calculations.

  3. As pointed out by Arimoto (1971, p. 186), it also turns out that \( {D}_{Root}^{\ast}\left({p}_1,\dots, {p}_n\right)=\sum \limits_{i=1}^n\sum \limits_{j=1,j\ne i}^n\sqrt{p_i{p}_j} \).

References

  1. Aczél, J. (1984). Measuring information beyond communication theory: Why some generalized information measures may be useful, others not. Aequationes Mathematicae, 27, 1–19.
  2. Arimoto, S. (1971). Information-theoretical considerations on estimation problems. Information and Control, 19, 181–194.
  3. Aslama, M., Hellman, H., & Sauri, T. (2004). Does market-entry regulation matter? Gazette: The International Journal for Communication Studies, 66, 113–132.
  4. Barrantes, G., & Sandoval, L. (2009). Conceptual and statistical problems associated with the use of diversity indices in ecology. International Journal of Tropical Biology, 57, 451–460.
  5. Blandin, P. (2015). La diversità del vivente prima e dopo la biodiversità. Rivista di Estetica, 59, 63–92.
  6. Bock, W. J. (2004). Species: The concept, category, and taxon. Journal of Zoological Systematics and Evolutionary Research, 42, 178–190.
  7. Chakravarty, S., & Eichhorn, W. (1991). An axiomatic characterization of a generalized index of concentration. Journal of Productivity Analysis, 2, 103–112.
  8. Chao, A., & Jost, L. (2012). Diversity measures. In A. Hastings & L. J. Gross (Eds.), Encyclopedia of theoretical ecology (pp. 203–207). Berkeley: University of California Press.
  9. Crupi, V., Nelson, J., Meder, B., Cevolani, G., & Tentori, K. (2018). Generalized information theory meets human cognition: Introducing a unified framework to model uncertainty and information search. Cognitive Science, 42, 1410–1456.
  10. Csiszár, I. (2008). Axiomatic characterizations of information measures. Entropy, 10, 261–273.
  11. Gini, C. (1912). Variabilità e mutabilità. In Memorie di metodologia statistica, I: Variabilità e concentrazione (pp. 189–358). Milano: Giuffrè, 1939.
  12. Golosov, G. V. (2010). The effective number of parties: A new approach. Party Politics, 16, 171–192.
  13. Greenberg, J. H. (1956). The measurement of linguistic diversity. Language, 32, 109–115.
  14. Hill, M. (1973). Diversity and evenness: A unifying notation and its consequences. Ecology, 54, 427–431.
  15. Hoffmann, S. (2008). Generalized distribution-based diversity measurement: Survey and unification. Faculty of Economics and Management Magdeburg (Working Paper 23). http://www.ww.uni-magdeburg.de/fwwdeka/femm/a2008_Dateien/2008_23.pdf. Accessed 25 Sept 2018.
  16. Hoffmann, S., & Hoffmann, A. (2008). Is there a “true” diversity? Ecological Economics, 65, 213–215.
  17. Hurlbert, S. H. (1971). The nonconcept of species diversity: A critique and alternative parameters. Ecology, 52, 577–586.
  18. Jost, L. (2006). Entropy and diversity. Oikos, 113, 363–375.
  19. Jost, L. (2009). Mismeasuring biological diversity: Responses to Hoffmann and Hoffmann (2008). Ecological Economics, 68, 925–928.
  20. Keylock, J. C. (2005). Simpson diversity and the Shannon-Wiener index as special cases of a generalized entropy. Oikos, 109, 203–207.
  21. MacArthur, R. H. (1965). Patterns of species diversity. Biological Reviews of the Cambridge Philosophical Society, 40, 510–533.
  22. Patil, G., & Taillie, C. (1982). Diversity as a concept and its measurement. Journal of the American Statistical Association, 77, 548–561.
  23. Pielou, E. C. (1975). Ecological diversity. New York: Wiley.
  24. Ricotta, C. (2003). On parametric evenness measures. Journal of Theoretical Biology, 222, 189–197.
  25. Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423 and 623–656.
  26. Sharma, B., & Mittal, D. (1975). New non–additive measures of entropy for discrete probability distributions. Journal of Mathematical Sciences (Delhi), 10, 28–40.
  27. Simpson, E. H. (1949). Measurement of diversity. Nature, 163, 688.
  28. Tsallis, C. (1988). Possible generalization of Boltzmann-Gibbs statistics. Journal of Statistical Physics, 52, 479–487.
  29. Tsallis, C. (2004). What should a statistical mechanics satisfy to reflect nature? Physica D, 193, 3–34.
  30. Vajda, I., & Zvárová, J. (2007). On generalized entropies, Bayesian decisions, and statistical diversity. Kybernetika, 43, 675–696.

Copyright information

© The Author(s) 2019

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Center for Logic, Language, and Cognition, Department of Philosophy and Education, University of Turin, Turin, Italy
