On the Turing estimator in capture–recapture count data under the geometric distribution

Anan, Orasa; Böhning, Dankmar; Maruotti, Antonello

doi:10.1007/s00184-018-0695-7

Published: 12 November 2018

Metrika, Volume 82, pages 149–172 (2019)


Abstract

We introduce an estimator for an unknown population size in a capture–recapture framework where the count of identifications follows a geometric distribution. This can be thought of as a Poisson count adjusted for exponentially distributed heterogeneity. As a result, a new Turing-type estimator under the geometric distribution is obtained. This estimator can be used in many real-life capture–recapture situations in which the geometric distribution is more appropriate than the Poisson. The proposed estimator behaves comparably to the maximum likelihood estimator on both simulated and real data. Its asymptotic variance is obtained by applying a conditional technique, and its empirical behavior is investigated through a large-scale simulation study. Comparisons with other well-established estimators are provided. Empirical applications in which the population size is known are also included to further corroborate the simulation results.
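The link between the geometric model and an exponentially mixed Poisson count can be checked numerically. The sketch below is illustrative only: the rate parameter `theta`, the function names and the midpoint-rule integration are assumptions introduced here and do not come from the article. It integrates the Poisson probability against an exponential density with rate `theta` and compares the result with the geometric probability $p(1-p)^{x}$, where $p=\theta /(1+\theta )$.

```python
import math


def poisson_exp_mixture_pmf(x, theta, upper=60.0, steps=60_000):
    """Midpoint-rule approximation of the integral over lam of
    Poisson(x | lam) * theta * exp(-theta * lam)."""
    width = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * width
        poisson = math.exp(-lam) * lam ** x / math.factorial(x)
        total += poisson * theta * math.exp(-theta * lam) * width
    return total


def geometric_pmf(x, p):
    """Geometric probability on x = 0, 1, 2, ... with success probability p."""
    return p * (1.0 - p) ** x


theta = 0.5                    # hypothetical rate of the exponential mixing density
p = theta / (1.0 + theta)      # implied geometric parameter
for x in range(4):
    mixed = poisson_exp_mixture_pmf(x, theta)
    print(x, round(mixed, 6), round(geometric_pmf(x, p), 6))
```

The two printed columns should agree up to the integration error, which is the sense in which the geometric count is a Poisson count with exponentially distributed heterogeneity.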

Acknowledgements

This work was developed under the PRIN2015-supported project “Environmental processes and human activities: capturing their interactions via statistical methods (EPHASTAT)”, funded by MIUR (Italian Ministry of Education, University and Scientific Research). Antonello Maruotti is grateful to the “Centro di Ateneo per la Ricerca e l’Internalizzazione” (LUMSA) for financial support.

Author information

Authors and Affiliations

Department of Mathematics and Statistics, Faculty of Science, Thaksin University, Songkhla, Thailand
Orasa Anan
Southampton Statistical Sciences Research Institute and Mathematical Sciences, University of Southampton, Southampton, UK
Dankmar Böhning
Dipartimento di Giurisprudenza, Economia, Politica e Lingue Moderne, Libera Università Maria Ss. Assunta, Rome, Italy
Antonello Maruotti
Department of Mathematics, University of Bergen, Bergen, Norway
Antonello Maruotti

Corresponding author

Correspondence to Antonello Maruotti.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Appendix: Proof of Proposition 1

According to the conditional technique, we have

$$\begin{aligned} \textit{Var}(\hat{N}_{\textit{TG}}) = \textit{Var}_{n} \left\{ E(\widehat{N}_{\textit{TG}}|n) \right\} + E_{n} \left\{ \textit{Var}(\widehat{N}_{\textit{TG}}|n) \right\} . \end{aligned}$$

(8)

Starting from the first term on the right-hand side of (8), by the delta method we have $E(\widehat{N}_{\textit{TG}}|n)\approx \frac{n}{1-\kappa _0}$ and, accordingly,

$$\begin{aligned} \textit{Var}_{n} \left\{ E(\widehat{N}_{\textit{TG}}|n) \right\}\approx & {} \textit{Var}_{n} \left\{ \frac{n}{1-{\kappa }_0} \right\} = \frac{1}{(1-{\kappa }_0)^{2}} \textit{Var}(n) = \frac{N(1-{\kappa }_0){\kappa }_0}{(1-{\kappa }_0)^{2}}. \end{aligned}$$

(9)

Since $E(n) = N(1-{\kappa }_0)$ and $\widehat{\kappa }_{0(TG)} = \sqrt{\frac{f_{1}}{S}} $, the variance in (9) can be estimated as:

$$\begin{aligned} \widehat{\textit{Var}}_{n}\left\{ E(\widehat{N}_{\textit{TG}}|n) \right\} = \frac{n \sqrt{\frac{f_{1}}{S}}}{\left( 1- \sqrt{\frac{f_{1}}{S}}\right) ^{2} }. \end{aligned}$$
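As a minimal sketch of how these quantities could be computed from frequency data, the code below evaluates $\widehat{\kappa }_{0(TG)}$, the point estimate $n/(1-\widehat{\kappa }_{0(TG)})$ and the estimated first variance term. The function name, the dictionary layout and the example frequencies are hypothetical. It also assumes that $n$ is the number of observed units, $f_{1}$ the number of units identified exactly once, and $S$ the total number of identifications, which is how the symbols appear to be used in this appendix.

```python
import math


def turing_geometric_estimate(freq):
    """freq maps a count x >= 1 to f_x, the number of units seen exactly x times."""
    n = sum(freq.values())                       # number of observed units
    S = sum(x * fx for x, fx in freq.items())    # total number of identifications
    f1 = freq.get(1, 0)                          # units identified exactly once
    if f1 >= S:
        raise ValueError("estimator undefined when every unit is seen only once")
    kappa0 = math.sqrt(f1 / S)                   # estimated probability of a zero count
    N_hat = n / (1.0 - kappa0)                   # Turing-type estimate of N
    var_term1 = n * kappa0 / (1.0 - kappa0) ** 2 # estimate of Var_n{E(N_hat | n)}
    return {"n": n, "S": S, "f1": f1, "kappa0": kappa0,
            "N_hat": N_hat, "var_term1": var_term1}


# Hypothetical frequencies: 25 units seen once, 15 twice, 15 three times.
result = turing_geometric_estimate({1: 25, 2: 15, 3: 15})
print(result)
```

With these made-up frequencies, $f_{1}/S = 25/100$, so the estimated zero probability is 0.5 and the estimate of $N$ is twice the observed count.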

Additionally,

$$\begin{aligned} \textit{Var}(\widehat{N}_{\textit{TG}}|n)= & {} \textit{Var}\left( \frac{n}{1-\sqrt{\frac{f_{1}}{S}}}|n \right) = n^{2} \textit{Var}\left( \frac{1}{1-\sqrt{\frac{f_{1}}{S}}}|n\right) . \end{aligned}$$

We know that $\textit{Var}\Big ( \frac{1}{1-\sqrt{\frac{f_{1}}{S}}}\Big )$ can be approximated by the delta-method. Hence, let $y= \frac{f_{1}}{S}$ and we take $h(y)=\frac{1}{1-\sqrt{y}}$. Then,

$$\begin{aligned} h'(y)=-(1-y^{1/2})^{-2} \left( -\frac{1}{2}y^{-1/2} \right) =\frac{1}{2\sqrt{y}(1-\sqrt{y})^{2}}. \end{aligned}$$
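The derivative can be sanity-checked against a finite-difference approximation. This is only an illustrative check; the evaluation point `y = 0.25` and the step size are arbitrary choices made here.

```python
import math


def h(y):
    """h(y) = 1 / (1 - sqrt(y)), defined for 0 < y < 1."""
    return 1.0 / (1.0 - math.sqrt(y))


def h_prime(y):
    """Closed-form derivative from the display above."""
    return 1.0 / (2.0 * math.sqrt(y) * (1.0 - math.sqrt(y)) ** 2)


def central_difference(f, y, eps=1e-6):
    """Symmetric finite-difference approximation of f'(y)."""
    return (f(y + eps) - f(y - eps)) / (2.0 * eps)


y = 0.25  # arbitrary test point
print(h_prime(y), central_difference(h, y))
```

Both values should be close to 4 at this point, since $\sqrt{y}=0.5$ gives $1/(2\cdot 0.5\cdot 0.25)$.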

Furthermore,

$$\begin{aligned} \textit{Var}\left( \frac{1}{1-\sqrt{\frac{f_{1}}{S}}}|n\right)\approx & {} \left( \frac{1}{2\sqrt{y}(1-\sqrt{y})^{2}} \right) ^{2}{} \textit{Var}\left( \frac{f_{1}}{S} \right) \\= & {} \left( \frac{1}{4\frac{f_{1}}{S}\left( 1-\sqrt{\frac{f_{1}}{S}}\right) ^{4}} \right) \textit{Var}\left( \frac{f_{1}}{S} \right) . \end{aligned}$$

As the next step, we use the conditional variance technique to estimate $\textit{Var}\left( \frac{f_{1}}{S} \right) $:

$$\begin{aligned} \textit{Var}\left( \frac{f_{1}}{S} \right)= & {} \textit{Var}_{f_{1}}\left\{ E\left( \frac{f_{1}}{S}|f_{1}\right) \right\} +E_{f_{1}}\left\{ \textit{Var} \left( \frac{f_{1}}{S}|f_{1} \right) \right\} . \end{aligned}$$

(10)

With the approximation $E\left( \frac{f_{1}}{S}|f_{1} \right) = f_{1}E(\frac{1}{S}) \approx \frac{f_{1}}{S}$, we have that

$$\begin{aligned} \textit{Var}_{f_{1}}\left\{ E\left( \frac{f_{1}}{S} |f_{1}\right) \right\}\approx & {} \textit{Var}_{f_{1}}\left( \frac{f_{1}}{S} \right) = \frac{1}{S^{2}} \textit{Var}(f_{1}) = \frac{1}{S^{2}} Np_{1}(1-p_{1}) \nonumber \\= & {} \frac{1}{S^{2}}\left( N\frac{f_{1}}{N} \left( 1-\frac{f_{1}}{N}\right) \right) = \frac{f_{1}}{S^{2}}\left( 1-\frac{f_{1}}{N}\right) . \end{aligned}$$

(11)

Again, estimating $E_{f_{1}} \left\{ \textit{Var}\left( \frac{f_{1}}{S}|f_{1} \right) \right\} $ by $\textit{Var}\left( \frac{f_{1}}{S}|f_{1} \right) $ we have that

$$\begin{aligned} E_{f_{1}} \left\{ \textit{Var}\left( \frac{f_{1}}{S}|f_{1} \right) \right\}\approx & {} \textit{Var}\left( \frac{f_{1}}{S}|f_{1} \right) = f_{1}^{2} \textit{Var}\left( \frac{1}{S}\right) \end{aligned}$$

Using the delta method, we obtain

$$\begin{aligned} \textit{Var}\left( \frac{1}{S}\right)\approx & {} \frac{1}{S^{4}} \textit{Var}(N\bar{X}) = \frac{1}{S^{4}}N^{2} \textit{Var}(\bar{X}) = \frac{1}{S^{4}}N^{2} \frac{\textit{Var}(X)}{N}. \end{aligned}$$

Since $X\sim \textit{Geo}(p)$, we have $E(X)=\frac{1-p}{p}$ and $\textit{Var}(X)=\frac{1-p}{p^{2}}$, so that

$$\begin{aligned} \textit{Var}\left( \frac{1}{S}\right)\approx & {} \frac{1}{S^{4}}N^{2} \frac{ \left( \frac{1-p}{p^{2}}\right) }{N} = \frac{1}{S^{4}}N^{2} \frac{ \left( \frac{E(X)}{p}\right) }{N} =\frac{1}{S^{4}}N^{2} \frac{ \left( \frac{E(S/N)}{p}\right) }{N} \approx \frac{1}{pS^{3}}. \end{aligned}$$

Let us note that

$$\begin{aligned} E\left( \frac{S}{N}\right) = \frac{1-p}{p}; \quad \frac{S}{N} \approx \frac{1-p}{p} \quad \mathrm{or} \quad p(S+N)\approx & {} N \quad \mathrm{or} \quad \frac{1}{p} \approx \frac{S+N}{N}.\qquad \end{aligned}$$

(12)

Hence,

$$\begin{aligned} \widehat{\textit{Var}}\left( \frac{f_{1}}{S}|f_{1} \right) =\frac{f_{1}^{2}}{S^{3}}\left( \frac{S+N}{N} \right) . \end{aligned}$$
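The substitutions in the last few displays can be verified with exact rational arithmetic at the point where $S/N$ equals its expectation $(1-p)/p$. The values of `p`, `N` and `f1` below are hypothetical and chosen only so that the fractions stay simple.

```python
from fractions import Fraction

p = Fraction(1, 4)        # hypothetical geometric parameter
N = 120                   # hypothetical population size
S = N * (1 - p) / p       # S set equal to its expectation N(1-p)/p, here 360
f1 = 90                   # hypothetical number of singletons

# Relation (12): 1/p is approximated by (S + N)/N.
inv_p_from_12 = (S + N) / N

# Delta-method chain for Var(1/S) and its closed form 1/(p S^3).
var_X = (1 - p) / p ** 2
var_inv_S_chain = N ** 2 * (var_X / N) / S ** 4
var_inv_S_closed = 1 / (p * S ** 3)

# Conditional variance f1^2 Var(1/S) against the displayed estimator.
var_cond_chain = f1 ** 2 * var_inv_S_chain
var_cond_display = Fraction(f1 ** 2) / S ** 3 * (S + N) / N

print(inv_p_from_12, var_inv_S_chain == var_inv_S_closed,
      var_cond_chain == var_cond_display)
```

The equalities are exact here only because `S` was set to its expected value; with observed data they hold approximately, as the derivation indicates.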

Substituting (11) and (12) into (10) leads to

$$\begin{aligned} \widehat{\textit{Var}} \left( \frac{f_{1}}{S} \right)= & {} \frac{1}{S^{2}}\left\{ f_{1}\left( 1-\frac{f_{1}}{N} \right) \right\} +\frac{f_{1}^{2}}{S^{3}}\left( \frac{S+N}{N} \right) \\= & {} \frac{f_{1}}{S^{2}} \left\{ \frac{N-f_{1}}{N}+\frac{f_{1}}{S}\left( \frac{S+N}{N} \right) \right\} \nonumber \\= & {} \frac{f_{1}}{S^{2}}\left\{ \frac{NS-Sf_{1}+f_{1}S+f_{1}N}{NS} \right\} =\frac{f_{1}}{S^{2}} \left\{ \frac{N(S+f_{1})}{NS} \right\} =\frac{f_{1}S+f_{1}^{2}}{S^{3}}. \end{aligned}$$
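The cancellation of $N$ in this simplification can be confirmed with exact fractions. The function names and the numerical values below are hypothetical and serve only as an algebra check.

```python
from fractions import Fraction


def var_ratio_unsimplified(f1, S, N):
    """First line of the display: term from (11) plus the conditional term."""
    term_11 = Fraction(f1, S ** 2) * (1 - Fraction(f1, N))
    term_cond = Fraction(f1 ** 2, S ** 3) * Fraction(S + N, N)
    return term_11 + term_cond


def var_ratio_simplified(f1, S):
    """Last expression of the display: (f1*S + f1^2) / S^3."""
    return Fraction(f1 * S + f1 ** 2, S ** 3)


f1, S = 25, 100                      # hypothetical data summaries
for N in (110, 250, 1000):           # the result should not depend on N
    assert var_ratio_unsimplified(f1, S, N) == var_ratio_simplified(f1, S)
print(var_ratio_simplified(f1, S))
```

Because the unknown $N$ drops out, the estimated variance of $f_{1}/S$ depends on observable quantities only.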

Therefore,

$$\begin{aligned} \widehat{\textit{Var}} \left( \frac{1}{1-\sqrt{\frac{f_{1}}{S}}} \right)= & {} \left\{ \frac{1}{\frac{4f_{1}}{S} \left( 1-\sqrt{\frac{f_{1}}{S}} \right) ^{4}} \right\} \left\{ \frac{f_{1}S+f_{1}^{2}}{S^{3}} \right\} \nonumber \\= & {} \left\{ \frac{S}{4f_{1}\left( 1-\sqrt{\frac{f_{1}}{S}} \right) ^{4}} \right\} \left\{ \frac{f_{1}S+f_{1}^{2}}{S^{3}} \right\} = \frac{Sf_{1}+f_{1}^{2}}{4f_{1}S^{2}\left( 1-\sqrt{\frac{f_{1}}{S}} \right) ^{4}} \nonumber \\= & {} \frac{S+f_{1}}{4S^{2}\left( 1-\sqrt{\frac{f_{1}}{S}} \right) ^{4}} . \end{aligned}$$
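The statement of Proposition 1 is not reproduced in this section, so the combined formula in the sketch below is an assumption: it adds the two estimated pieces according to decomposition (8), with the second piece multiplied by $n^{2}$ as in the expression for $\textit{Var}(\widehat{N}_{\textit{TG}}|n)$. The function name and the input values are hypothetical.

```python
import math


def turing_geometric_variance(n, f1, S):
    """Assumed combination of the two terms in (8):
    n*k/(1-k)^2 + n^2 * (S + f1) / (4 * S^2 * (1-k)^4), with k = sqrt(f1/S)."""
    k = math.sqrt(f1 / S)
    if k >= 1.0:
        raise ValueError("variance undefined when f1 >= S")
    term1 = n * k / (1.0 - k) ** 2                                  # Var_n{E(N_hat | n)}
    term2 = n ** 2 * (S + f1) / (4.0 * S ** 2 * (1.0 - k) ** 4)     # E_n{Var(N_hat | n)}
    return term1, term2, term1 + term2


# Hypothetical summaries: 55 observed units, 25 singletons, 100 identifications.
term1, term2, variance = turing_geometric_variance(n=55, f1=25, S=100)
std_error = math.sqrt(variance)
print(term1, term2, variance, round(std_error, 3))
```

A standard error obtained this way could feed a normal-approximation interval around the point estimate, although whether that interval is adequate for a given data set is not addressed in this appendix.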

About this article

Cite this article

Anan, O., Böhning, D. & Maruotti, A. On the Turing estimator in capture–recapture count data under the geometric distribution. Metrika 82, 149–172 (2019). https://doi.org/10.1007/s00184-018-0695-7

Received: 14 November 2017
Published: 12 November 2018
Issue Date: 15 March 2019
DOI: https://doi.org/10.1007/s00184-018-0695-7
