, Volume 74, Issue 1, pp 80-100
Date: 23 Nov 2012

A new estimator for the number of species in a population


We consider the classic problem of estimating T, the total number of species in a population, from repeated counts in a simple random sample. We first show that the frequently used Chao-Lee estimator can in fact be obtained by Bayesian methods with a Dirichlet prior, and then use such clarification to develop a new estimator; numerical tests and some real experiments show that the new estimator is more flexible than existing ones, in the sense that it adapts to changes in the normalized interspecies variance γ 2. Our method involves simultaneous estimation of T, γ 2, and of the parameter λ in the Dirichlet prior, and the only limitation seems to come from the required convergence of the prior which imposes the restriction γ 2 ≤ 1. We also obtain confidence intervals for T and an estimation of the species’ distribution. Some numerical examples are given, together with applications to sampling from a Census database closely following Benford’s law, showing good performances of the new estimator, even beyond γ 2 = 1. Tests on confidence intervals show that the coverage frequency appears to be in good agreement with the desired confidence level.