Abstract
Lalande et al. (Microb. Ecol. 66(3):647–658, 2013) introduced a promising approach to quantify microbial diversity from fingerprinting profiles. Their analysis is based on extrapolating the abundance of the phylotypes detectable in a fingerprint towards the rare phylotypes of the community. By considering a set of reconstructed communities, Lalande et al. obtained a range of estimates for phylotype richness, Shannon diversity and Simpson diversity. They reported narrow ranges indicating accurate estimation, especially for Shannon and Simpson diversities. Here, we show that a much larger set of reconstructed communities than the one considered by Lalande et al. is consistent with the fingerprint. We find that the estimates for phylotype richness and Shannon diversity vary over orders of magnitude, but that the estimates for Simpson diversity are restricted to a narrow range (around 10 %). We conclude that only Simpson diversity can be estimated accurately from fingerprints.
References
1. Lalande J, Villemur R, Deschênes L (2013) A new framework to accurately quantify soil bacterial community diversity from DGGE. Microb Ecol 66(3):647–658
2. Loisel P, Harmand J, Zemb O, Latrille E, Lobry C, Delgenès JP, Godon JJ (2006) Denaturing gradient electrophoresis (DGE) and single-strand conformation polymorphism (SSCP) molecular fingerprintings revisited by simulation and used as a tool to measure microbial diversity. Environ Microbiol 8(4):720–731
3. Blackwood CB, Hudleston D, Zak DR, Buyer JS (2007) Interpreting ecological diversity indices applied to terminal restriction fragment length polymorphism data: insights from simulated microbial communities. Appl Environ Microbiol 73(16):5276–5283
4. Haegeman B, Hamelin J, Moriarty J, Neal P, Dushoff J, Weitz JS (2013) Robust estimation of microbial diversity in theory and in practice. ISME J 7(6):1092–1101
Acknowledgments
This work was supported by the SYSCOMM project DISCO (ANR-09-SYSC-003) and by the TULIP Laboratory of Excellence (ANR-10-LABX-41).
Appendix
Here, we describe the reconstructed communities of Fig. 1 and the diversity estimates shown in Fig. 2.
First, we extracted the fingerprint peak areas from Fig. 1 of Ref. [1]. The total area of the 34 extracted peaks equals 20 % of the total area under the fingerprinting profile (hence, the peak-to-signal ratio PSR = 0.20 in the terminology of Ref. [1]). The remaining 80 % of the area under the profile corresponds to the background (that is, the subpeak background percentage SBP = 0.80 in the terminology of Ref. [2]).
Second, we constructed four communities consistent with the fingerprint data. The 34 most abundant phylotypes correspond to the fingerprint peaks. The relative abundance of each of these phylotypes is equal to its peak area divided by the total area under the profile. Hence, the total relative abundance of the most abundant phylotypes is equal to 0.20. We chose the abundance distribution of the rare phylotypes such that the following conditions are satisfied: (1) the total relative abundance of the rare phylotypes is equal to 0.80 and (2) the abundance of a rare phylotype is smaller than the abundance of each of the most abundant phylotypes.
We report the abundance distribution of the rare phylotypes as rank-abundance curves, that is, we give the relationship between relative abundance $p_i$ and rank $i$ for the rare phylotypes (with rank $i > 34$):

The red community has $10^3$ phylotypes. Its rank-abundance curve is quadratic on a log-log plot, $\ln p_i = -3.391 - 0.8554 \ln i + 0.03750 (\ln i)^2$ for $34 < i \le 10^3$.

The yellow community has $10^4$ phylotypes. Its rank-abundance curve is linear on a log-log plot, $\ln p_i = -2.924 - 0.8535 \ln i$ for $34 < i \le 10^4$.

The green community has $10^5$ phylotypes. Its rank-abundance curve is linear on a log-log plot, $\ln p_i = -2.492 - 0.9750 \ln i$ for $34 < i \le 10^5$.

The blue community has $10^6$ phylotypes. Its rank-abundance curve is linear on a log-log plot, $\ln p_i = -2.294 - 1.0306 \ln i$ for $34 < i \le 10^6$.
For the yellow, green and blue communities, the abundance distribution of the rare phylotypes is a power law. For the red community, this distribution is approximately a power law (the rank-abundance curve is slightly convex, see Fig. 1, right-hand panel). For a community with $10^3$ phylotypes, a power-law distribution for the rare phylotypes does not connect smoothly to the abundances of the dominant phylotypes.
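As a consistency check on condition (1), the four rank-abundance curves can be summed numerically. The sketch below uses only the coefficients listed above; the function and variable names are illustrative choices, not taken from the original analysis. Each total should come out close to 0.80, with small deviations expected because the published coefficients are rounded.

```python
import math

# Coefficients (a, b, c) of ln p_i = a + b*ln(i) + c*(ln i)^2 and the total
# number of phylotypes, as listed in the Appendix for the four communities.
COMMUNITIES = {
    "red":    (-3.391, -0.8554, 0.03750, 10**3),
    "yellow": (-2.924, -0.8535, 0.0,     10**4),
    "green":  (-2.492, -0.9750, 0.0,     10**5),
    "blue":   (-2.294, -1.0306, 0.0,     10**6),
}
N_DOMINANT = 34  # number of fingerprint peaks (dominant phylotypes)


def rare_abundances(a, b, c, richness, n_dominant=N_DOMINANT):
    """Relative abundances p_i of the rare phylotypes, n_dominant < i <= richness."""
    abundances = []
    for i in range(n_dominant + 1, richness + 1):
        log_i = math.log(i)
        abundances.append(math.exp(a + b * log_i + c * log_i ** 2))
    return abundances


def rare_total(name):
    """Total relative abundance of the rare phylotypes of one community."""
    a, b, c, richness = COMMUNITIES[name]
    return math.fsum(rare_abundances(a, b, c, richness))


# Condition (1) requires each of these totals to be approximately 0.80.
RARE_TOTALS = {name: rare_total(name) for name in COMMUNITIES}
```

Condition (2) cannot be checked in the same way without the 34 peak areas themselves, which are not tabulated here.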
Third, we computed three diversity metrics for the four reconstructed communities: phylotype richness $D_0$, Shannon diversity $D_1$,
$$D_1 = \exp\Big(-\sum_i p_i \ln p_i\Big),$$
and Simpson diversity $D_2$,
$$D_2 = \frac{1}{\sum_i p_i^2}.$$
The notation $D_0$, $D_1$ and $D_2$ refers to Hill diversities of order 0, 1 and 2 (see Ref. [4] for details). Because Hill diversities can be interpreted as effective numbers of phylotypes, they are intercomparable. Therefore, we prefer to use the transformed diversity metrics $D_1$ and $D_2$ rather than the Shannon diversity index $H$ and the Simpson concentration index $C$. We find:

For the red community: $D_0 = 10^3$, $D_1 = 7.4 \times 10^2$ and $D_2 = 4.1 \times 10^2$.

For the yellow community: $D_0 = 10^4$, $D_1 = 2.8 \times 10^3$ and $D_2 = 5.0 \times 10^2$.

For the green community: $D_0 = 10^5$, $D_1 = 7.7 \times 10^3$ and $D_2 = 5.2 \times 10^2$.

For the blue community: $D_0 = 10^6$, $D_1 = 1.7 \times 10^4$ and $D_2 = 5.3 \times 10^2$.
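The three metrics are special cases of the Hill diversity of order q, which can be computed from a vector of relative abundances. The following is a minimal sketch under the standard Hill-number definitions, not the code used for the values above; the function name `hill_diversity` and the renormalisation step are assumptions made for illustration.

```python
import math


def hill_diversity(abundances, q):
    """Hill diversity of order q (effective number of phylotypes).

    q = 0 gives phylotype richness, q = 1 the exponential of the Shannon
    index, and q = 2 the inverse of the Simpson concentration index.
    """
    positive = [x for x in abundances if x > 0]
    total = math.fsum(positive)
    p = [x / total for x in positive]  # renormalise so that sum(p) == 1
    if q == 1:
        # Order 1 is defined as the limit q -> 1 of the general formula.
        return math.exp(-math.fsum(x * math.log(x) for x in p))
    return math.fsum(x ** q for x in p) ** (1.0 / (1.0 - q))
```

For a perfectly even community all orders coincide with the number of phylotypes, which is why the three values can be compared on one scale. Reproducing the values listed above would additionally require the 34 dominant abundances extracted from the fingerprint.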
Finally, we generalized the analysis to a much larger set of reconstructed communities. More precisely, we considered all reconstructed communities satisfying conditions (1) and (2) above. This set, although it contains unrealistic communities (for example, communities with an abrupt transition from dominant to rare phylotypes), is useful to obtain lower and upper bounds for the estimation range of the diversity metrics. Indeed, it is possible to determine the communities in this set yielding the lowest and highest diversity estimates. The lowest diversity estimate is obtained for a community in which all the rare phylotypes have the same abundance as the least abundant of the dominant phylotypes. The highest diversity estimate is obtained for a community in which there is a large number $R$ of rare phylotypes, all with the same relative abundance $0.80/R$ (so that their total relative abundance is 0.80, as required by condition (1)).
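The two extreme communities can be sketched as follows. The dominant abundances used here (`DOMINANT`) are a hypothetical placeholder that merely sums to 0.20; they are not the peak areas extracted from Ref. [1], and the function names are likewise illustrative. When 0.80 is not an exact multiple of the smallest dominant abundance, the sketch assigns the remainder to one extra rare phylotype, which is one possible convention.

```python
import math

RARE_TOTAL = 0.80

# HYPOTHETICAL placeholder for the 34 dominant abundances (sums to 0.20).
# These are NOT the peak areas of the actual fingerprint.
DOMINANT = [0.20 * k / 595 for k in range(34, 0, -1)]


def hill(p, q):
    """Hill diversity of order q for relative abundances p summing to 1."""
    if q == 1:
        return math.exp(-math.fsum(x * math.log(x) for x in p))
    return math.fsum(x ** q for x in p) ** (1.0 / (1.0 - q))


def lowest_diversity_community(dominant, rare_total=RARE_TOTAL):
    """Rare phylotypes as abundant as the least abundant dominant phylotype."""
    p_min = min(dominant)
    n_full = int(math.floor(rare_total / p_min + 1e-9))
    remainder = rare_total - n_full * p_min
    rare = [p_min] * n_full
    if remainder > 1e-12:
        rare.append(remainder)  # one extra phylotype takes up the remainder
    return list(dominant) + rare


def highest_diversity_community(dominant, n_rare, rare_total=RARE_TOTAL):
    """n_rare equally abundant rare phylotypes, each with rare_total / n_rare."""
    p_rare = rare_total / n_rare
    if p_rare >= min(dominant):
        raise ValueError("condition (2) violated: increase n_rare")
    return list(dominant) + [p_rare] * n_rare


low = lowest_diversity_community(DOMINANT)
high = highest_diversity_community(DOMINANT, 10**4)
BOUNDS = {q: (hill(low, q), hill(high, q)) for q in (0, 1, 2)}
```

`BOUNDS[q]` then holds the lower and upper estimate of the Hill diversity of order q for the placeholder fingerprint, with R = 10^4 for the upper estimate.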
The results of this further analysis are shown as the grey-shaded regions in Fig. 2. The lower ends of these regions are equal to the lowest diversity estimates. At the upper end, the shade of grey becomes gradually lighter, corresponding to the highest diversity estimate with $R$ ranging from $10^4$ to $10^7$. It is interesting to note the dependence of the highest diversity estimate on the number of rare phylotypes $R$ for the three diversity metrics: when $R$ is large, the estimate for phylotype richness increases proportionally to $R$, the estimate for Shannon diversity increases proportionally to $\ln R$ and the estimate for Simpson diversity tends to a fixed value. This provides another argument for why Simpson diversity can be estimated more accurately than Shannon diversity and phylotype richness.
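The dependence on $R$ can be made explicit with a short derivation. It assumes the highest-diversity community described above, in which each of the $R$ rare phylotypes has relative abundance $0.80/R$ by condition (1). The symbols $H_d$ and $S_d$, denoting the contributions of the 34 dominant phylotypes, are introduced here for convenience and do not appear in the original text.

```latex
\begin{align}
D_0(R) &= 34 + R, \\
\ln D_1(R) &= H_d + 0.80 \,\ln\frac{R}{0.80},
  & H_d &= -\sum_{i=1}^{34} p_i \ln p_i, \\
D_2(R) &= \frac{1}{S_d + 0.64/R}
  \;\xrightarrow[R \to \infty]{}\; \frac{1}{S_d},
  & S_d &= \sum_{i=1}^{34} p_i^2 .
\end{align}
```

Richness grows linearly in $R$ and the Shannon index $\ln D_1$ grows linearly in $\ln R$, whereas the rare phylotypes contribute only $0.64/R$ to the Simpson sum, so $D_2$ saturates at a value set by the dominant phylotypes alone.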
Haegeman, B., Sen, B., Godon, JJ. et al. Only Simpson Diversity can be Estimated Accurately from Microbial Community Fingerprints. Microb Ecol 68, 169–172 (2014). https://doi.org/10.1007/s00248-014-0394-5
DOI: https://doi.org/10.1007/s00248-014-0394-5