Abstract
We introduce an estimator for an unknown population size in a capture–recapture framework where the count of identifications follows a geometric distribution, which can be thought of as a Poisson count adjusted for exponentially distributed heterogeneity. The result is a new Turing-type estimator under the geometric distribution. It can be used in the many real-life capture–recapture settings in which the geometric distribution is more appropriate than the Poisson. On both simulated and real data, the proposed estimator behaves comparably to the maximum likelihood estimator. Its asymptotic variance is obtained by applying a conditional technique, and its empirical behavior is investigated through a large-scale simulation study. Comparisons with other well-established estimators are provided. Empirical applications in which the population size is known are also included to further corroborate the simulation results.
Acknowledgements
This work was developed under the PRIN2015-supported project “Environmental processes and human activities: capturing their interactions via statistical methods (EPHASTAT)”, funded by MIUR (Italian Ministry of Education, University and Scientific Research). Antonello Maruotti is grateful to the “Centro di Ateneo per la Ricerca e l’Internalizzazione” (LUMSA) for financial support.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Appendix: Proof of Proposition 1
According to the conditional technique, we have
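In its standard form, the conditional technique is the law of total variance applied to \(\widehat{N}_{\textit{TG}}\) given the number of observed units \(n\). Assuming this is the identity labelled (8), it reads

\[
\textit{Var}\big(\widehat{N}_{\textit{TG}}\big) = \textit{Var}_{n}\big\{ E(\widehat{N}_{\textit{TG}}|n) \big\} + E_{n}\big\{ \textit{Var}(\widehat{N}_{\textit{TG}}|n) \big\}. \tag{8}
\]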
Starting from the first term on the right-hand side of (8), by the delta method we have \(E(\widehat{N}_{\textit{TG}}|n)\approx \frac{n}{1-\kappa _0}\) and, accordingly,
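A sketch of this step, assuming \(n\) is binomial with size \(N\) and success probability \(1-\kappa_0\), so that \(\textit{Var}(n) = N\kappa_0(1-\kappa_0)\); this reconstructs what is presumably display (9):

\[
\textit{Var}_{n}\big\{ E(\widehat{N}_{\textit{TG}}|n) \big\} \approx \frac{\textit{Var}(n)}{(1-\kappa_0)^{2}} = \frac{N\kappa_0(1-\kappa_0)}{(1-\kappa_0)^{2}} = \frac{N\kappa_0}{1-\kappa_0}. \tag{9}
\]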
Since \(E(n) = N(1-{\kappa }_0)\) and \(\widehat{\kappa }_{0(TG)} = \sqrt{\frac{f_{1}}{S}} \), the variance in (9) can be estimated as:
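As a hedged reconstruction of the plug-in step: replacing \(N\) by \(n/(1-\kappa_0)\) and \(\kappa_0\) by \(\widehat{\kappa}_{0(TG)}\) in the first-term variance \(N\kappa_0/(1-\kappa_0)\) would give

\[
\widehat{\textit{Var}}_{n}\big\{ E(\widehat{N}_{\textit{TG}}|n) \big\} = \frac{n\,\widehat{\kappa}_{0(TG)}}{\big(1-\widehat{\kappa}_{0(TG)}\big)^{2}} = \frac{n\sqrt{\frac{f_{1}}{S}}}{\Big(1-\sqrt{\frac{f_{1}}{S}}\Big)^{2}}.
\]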
Additionally,
We know that \(\textit{Var}\Big ( \frac{1}{1-\sqrt{\frac{f_{1}}{S}}}\Big )\) can be approximated by the delta method. Let \(y= \frac{f_{1}}{S}\) and \(h(y)=\frac{1}{1-\sqrt{y}}\). Then,
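Worked out for this choice of \(h\), the derivative and the resulting first-order approximation are as follows (a reconstruction from the definitions of \(y\) and \(h\), not the original typeset display):

\[
h'(y) = \frac{1}{2\sqrt{y}\,(1-\sqrt{y})^{2}}, \qquad
\textit{Var}\big(h(y)\big) \approx \big\{h'(y)\big\}^{2}\,\textit{Var}(y) = \frac{S}{4 f_{1}\Big(1-\sqrt{\frac{f_{1}}{S}}\Big)^{4}}\,\textit{Var}\left( \frac{f_{1}}{S} \right).
\]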
Furthermore,
As a next step, using the conditional variance technique to estimate \(\textit{Var}\left( \frac{f_{1}}{S} \right) \), we have that
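Conditioning on \(f_{1}\), the decomposition presumably takes the same total-variance form as before (a sketch, consistent with the two terms treated in the following steps):

\[
\textit{Var}\left( \frac{f_{1}}{S} \right) = \textit{Var}_{f_{1}}\left\{ E\left( \frac{f_{1}}{S}\Big|f_{1} \right) \right\} + E_{f_{1}}\left\{ \textit{Var}\left( \frac{f_{1}}{S}\Big|f_{1} \right) \right\}.
\]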
With the approximation \(E\left( \frac{f_{1}}{S}|f_{1} \right) = f_{1}E(\frac{1}{S}) \approx \frac{f_{1}}{S}\), we have that
Again, estimating \(E_{f_{1}} \left\{ \textit{Var}\left( \frac{f_{1}}{S}|f_{1} \right) \right\} \) by \(\textit{Var}\left( \frac{f_{1}}{S}|f_{1} \right) \) we have that
Using the delta method, we obtain
Since \(X\sim \textit{Geo}(p)\) we have that \(E(X)=\frac{1-p}{p}\) and \(\textit{Var}(X)=\frac{1-p}{p^{2}}\).
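These moments also explain the form of \(\widehat{\kappa}_{0(TG)}\): with \(P(X=0)=p\) and \(P(X=1)=p(1-p)\), the ratio \(P(X=1)/E(X)\) equals \(p^{2}\), so \(\sqrt{f_{1}/S}\) estimates \(P(X=0)\). The Python sketch below illustrates this numerically. It is not the authors' code: the function names are hypothetical, and it assumes that \(S\) is the total number of identifications (the sum of the observed counts) and that the estimator is \(n/(1-\sqrt{f_{1}/S})\).

```python
import math
import random


def turing_geometric(counts):
    """Turing-type estimate of N under a geometric count model.

    `counts` holds one identification count per unit; zeros are dropped,
    since unobserved units never appear in real data.
    Assumes S is the total number of identifications (sum of counts).
    """
    observed = [c for c in counts if c > 0]
    n = len(observed)                         # units seen at least once
    f1 = sum(1 for c in observed if c == 1)   # units seen exactly once
    s = sum(observed)                         # total identifications (assumed S)
    if n == 0 or f1 == s:
        raise ValueError("estimate undefined: no data or every count equals 1")
    kappa0 = math.sqrt(f1 / s)                # estimated P(X = 0)
    return n / (1.0 - kappa0)


def simulate_counts(big_n, p, rng):
    """Draw big_n geometric counts on {0, 1, ...} with P(X = k) = p (1 - p)^k."""
    log_q = math.log(1.0 - p)
    # Inverse-transform sampling; 1 - rng.random() lies in (0, 1], so the log is finite.
    return [int(math.log(1.0 - rng.random()) / log_q) for _ in range(big_n)]


if __name__ == "__main__":
    rng = random.Random(1)
    draws = simulate_counts(5000, 0.3, rng)   # true N = 5000, zeros included
    print(round(turing_geometric(draws)))     # should land near 5000
```

For a quick hand check, the counts `[1, 1, 1, 1, 3, 3, 3, 3]` give \(f_{1}=4\), \(S=16\), \(n=8\), hence \(\sqrt{f_{1}/S}=0.5\) and an estimate of 16.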
Let us note that
Hence,
Substituting (11) and (12) into (10) leads to
We have that
Cite this article
Anan, O., Böhning, D. & Maruotti, A. On the Turing estimator in capture–recapture count data under the geometric distribution. Metrika 82, 149–172 (2019). https://doi.org/10.1007/s00184-018-0695-7
DOI: https://doi.org/10.1007/s00184-018-0695-7