Bootstrap Confidence Intervals for the Population Mean Under Inverse Sampling Design

  • Research Paper
Iranian Journal of Science and Technology, Transactions A: Science

Abstract

Inverse sampling is commonly used for surveying rare (but not clustered) populations. However, when the sample size from the rare group is chosen too small, the distribution of the customary unbiased estimator of the population mean tends to be highly skewed. In such cases, confidence intervals based on asymptotic normal theory have coverage rates below the nominal level. To overcome this problem, we propose two resampling methods, a with-replacement bootstrap (BWR) and a without-replacement bootstrap (BWO), for constructing confidence intervals for the population mean under simple inverse sampling without replacement. We carried out a simulation study to compare the proposed bootstrap methods with the normal approximation and logarithmic transformation methods. The simulation results suggest that the BWO method is preferable, since it provides intervals with coverage rates closer to the nominal level together with more balanced error rates.

Fig. 1

Author information

Corresponding author

Correspondence to Mohammad Mohammadi.

Appendices

Appendix 1: Unbiasedness and the Variance of \(\hat{\mu }^*_\mathrm{s}\) Under BWR Method

Let \(os^*\) denote the bootstrap inverse sample selected with replacement from the original inverse sample s, and let \(os^*_C\) and \(os^*_{\bar{C}}\) denote its parts in \(s_C\) and \(s_{\bar{C}}\), respectively. To prove the unbiasedness of \(\hat{\mu }^*_\mathrm{s}\), we use the law of iterated expectations, \(E_*(\hat{\mu }^*_\mathrm{s})=E_*\big (E_*(\hat{\mu }^*_\mathrm{s}|~n_{bs})\big )\). Given its size \(n_{bs}\), \(os^*\) is equivalent to a stratified random sample drawn with replacement, with sizes \((r_b,n_{bs}-r_b)\), from \((s_C,s_{\bar{C}})\), so that

$$\begin{aligned} E_*(\hat{\mu }^*_\mathrm{s}|~n_{bs})=\frac{\hat{M}_b}{N}\bar{y}_{s_C}+\left( 1-\frac{\hat{M}_b}{N}\right) \bar{y}_{s_{\bar{C}}}. \end{aligned} \tag{5}$$

Hence, \(E_*(\hat{\mu }^*_\mathrm{s})=E_*\big (\frac{\hat{M}_b}{N}\bar{y}_{s_C}+(1-\frac{\hat{M}_b}{N})\bar{y}_{s_{\bar{C}}}\big )\). Now since \(n_{bs}\) given s is distributed as a negative binomial with parameters \((r_b,\frac{r-1}{n_\mathrm{s}-1})\), it follows that \(E_*(\frac{\hat{M}_b}{N})=\frac{r-1}{n_\mathrm{s}-1}=\frac{\hat{M}}{N}\). So, \(E_*(\hat{\mu }^*_\mathrm{s})=\frac{\hat{M}}{N}\bar{y}_{s_C}+(1-\frac{\hat{M}}{N})\bar{y}_{s_{\bar{C}}}=\hat{\mu }_\mathrm{s}\).
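The negative binomial step above can be checked numerically. The sketch below assumes, by analogy with \(\hat{M}/N=(r-1)/(n_\mathrm{s}-1)\), that \(\hat{M}_b/N=(r_b-1)/(n_{bs}-1)\); this definition is not stated in the appendix and is an assumption, as are the function name and the numerical values. It sums the negative binomial probabilities directly, truncating the infinite sum at a large number of draws.

```python
from math import comb


def expected_share_estimate(r_b: int, p: float, max_trials: int = 5000) -> float:
    """Approximate E[(r_b - 1)/(n - 1)], where n is the number of
    with-replacement draws needed to reach the r_b-th success and each
    draw succeeds with probability p (negative binomial, r_b >= 2).
    The infinite sum is truncated at max_trials."""
    total = 0.0
    for n in range(r_b, max_trials + 1):
        # P(n draws are needed) = C(n-1, r_b-1) p^r_b (1-p)^(n-r_b)
        pmf = comb(n - 1, r_b - 1) * p**r_b * (1 - p) ** (n - r_b)
        total += (r_b - 1) / (n - 1) * pmf
    return total


# Hypothetical original sample: r = 5 rare units among n_s = 21 draws.
r, n_s = 5, 21
p = (r - 1) / (n_s - 1)  # success probability in the resample
print(expected_share_estimate(r_b=5, p=p), p)
```

Under these assumptions the two printed values should agree up to floating-point and truncation error, which is the identity \(E_*(\hat{M}_b/N)=(r-1)/(n_\mathrm{s}-1)\) used in the proof.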

To find the bootstrap variance of \(\hat{\mu }^*_\mathrm{s}\) given s, we may use the law of total variance:

$$\begin{aligned} v_b=\text {Var}_*(\hat{\mu }^*_\mathrm{s})=\text {Var}_*\big (E_*(\hat{\mu }^*_\mathrm{s}|~n_{bs})\big )+E_*\big (\text {Var}_*(\hat{\mu }^*_\mathrm{s}|~n_{bs})\big ). \end{aligned}$$

Using Eq. (5), the first term is \(\text {Var}_*\big (E_*({\hat{\mu }}^*_\mathrm{s}|~n_{bs})\big )=\text {Var}_*(\frac{{\hat{M}_b}}{N})(\bar{y}_{s_C}-\bar{y}_{s_{\bar{C}}})^2.\) For the inner variance in the second term, we have

$$\begin{aligned} \text {Var}_*(\hat{\mu }^*_\mathrm{s}|~n_{bs})=\bigg (\frac{\hat{M}_b}{N}\bigg )^2\text {Var}_*(\bar{y}^*_{os_C}|~n_{bs})+\bigg (1-\frac{\hat{M}_b}{N}\bigg )^2\text {Var}_*(\bar{y}^*_{os_{\bar{C}}}|~n_{bs}) \end{aligned}$$

Since, given \(n_{bs}\), \(\bar{y}^*_{os_C}\) and \(\bar{y}^*_{os_{\bar{C}}}\) are the means of independent with-replacement samples from \(s_C\) and \(s_{\bar{C}}\), we have \(\text {Var}_*(\bar{y}^*_{os_C}|~n_{bs})=\frac{(r-1)s^2_C}{rr_b}\) and \(\text {Var}_*(\bar{y}^*_{os_{\bar{C}}}|~n_{bs})=\frac{(n_\mathrm{s}-r-1)s^2_{\bar{C}}}{(n_\mathrm{s}-r)(n_{bs}-r_b)}\). Taking the expectation of these two terms, we obtain

$$\begin{aligned} E_*\big (\text {Var}_*\big ({\hat{\mu }}^*_\mathrm{s}|~n_{bs}\big )\big )=E_*\bigg [\bigg (\frac{\hat{M}_b}{N}\bigg )^2\bigg ]\frac{(r-1)s^2_C}{rr_b}+E_*\bigg [\frac{(1-\frac{\hat{M}_b}{N})^2}{n_{bs}-r_b}\bigg ]\frac{(n_\mathrm{s}-r-1)s^2_{\bar{C}}}{n_\mathrm{s}-r}. \end{aligned}$$

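Combining this with the first term gives

$$\begin{aligned} v_b=\text {Var}_*\bigg (\frac{\hat{M}_b}{N}\bigg )(\bar{y}_{s_C}-\bar{y}_{s_{\bar{C}}})^2+E_*\bigg [\bigg (\frac{\hat{M}_b}{N}\bigg )^2\bigg ]\frac{(r-1)s^2_C}{rr_b}+E_*\bigg [\frac{(1-\frac{\hat{M}_b}{N})^2}{n_{bs}-r_b}\bigg ]\frac{(n_\mathrm{s}-r-1)s^2_{\bar{C}}}{n_\mathrm{s}-r}, \end{aligned}$$

which completes the proof.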
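The conditional variance of the with-replacement mean used in this derivation, \((r-1)s^2_C/(rr_b)\), can be verified by exact enumeration for a small sample. The sketch below is illustrative only: the y-values and function names are hypothetical, and \(s^2_C\) is taken to be the sample variance with divisor \(r-1\), which is what the formula implies.

```python
from fractions import Fraction
from itertools import product


def wr_mean_variance(values, m):
    """Exact variance of the mean of m equally likely
    with-replacement draws from `values` (full enumeration)."""
    means = [Fraction(sum(draw), m) for draw in product(values, repeat=m)]
    centre = sum(means) / len(means)
    return sum((x - centre) ** 2 for x in means) / len(means)


def sample_variance(values):
    """Sample variance with divisor len(values) - 1."""
    centre = Fraction(sum(values), len(values))
    return sum((v - centre) ** 2 for v in values) / (len(values) - 1)


s_C = [2, 5, 11]      # hypothetical y-values of the rare units in s
r, r_b = len(s_C), 2  # r_b: number of rare units drawn in the resample

enumerated = wr_mean_variance(s_C, r_b)
formula = Fraction(r - 1, r * r_b) * sample_variance(s_C)
print(enumerated, formula)
```

Both quantities are computed in exact rational arithmetic, so they can be compared with `==` rather than with a tolerance. The same check applies to \(s_{\bar{C}}\) with \(n_\mathrm{s}-r\) units and \(n_{bs}-r_b\) draws.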

Appendix 2: Unbiasedness of the Bootstrap Estimator Under BWO Method

Let \(j\) denote the unit removed from \(s_C\), and let \(U^*\) denote the resulting bootstrap pseudo-population, with parts \(U^*_C\) and \(U^*_{\bar{C}}\) built from \(s_{C(-j)}\) and \(s_{\bar{C}}\), respectively. To prove the unbiasedness of \(\hat{\mu }^*_\mathrm{s}\), we condition on \(j\). Since the bootstrap sample is an inverse sample drawn without replacement from \(U^*\), the unbiasedness of (1) gives

$$\begin{aligned} E_*(\hat{\mu }^*_\mathrm{s})&=E_*\big (E_*(\hat{\mu }^*_\mathrm{s}~|~j)\big )\\ &=E_*\bigg (\frac{1}{N}\sum _{k\in {U^*}}y_k\bigg )\\ &=\frac{1}{N}E_*\Bigg (\sum _{k\in {U^*_C}}y_k+\sum _{k\in {U^*_{\bar{C}}}}y_k\Bigg )\\ &=\frac{1}{N}E_*\Bigg (\frac{N}{n_\mathrm{s}-1}\sum _{k\in {s_{C(-j)}}}y_k+\frac{N}{n_\mathrm{s}-1}\sum _{k\in {s_{\bar{C}}}}y_k\Bigg )\\ &=E_*\Bigg (\frac{r-1}{n_\mathrm{s}-1}\bar{y}_{s_{C(-j)}}+\frac{n_\mathrm{s}-r}{n_\mathrm{s}-1}\bar{y}_{s_{\bar{C}}}\Bigg )\\ &=\bigg (\frac{r-1}{n_\mathrm{s}-1}\bigg )E_*(\bar{y}_{s_{C(-j)}})+\bigg (\frac{n_\mathrm{s}-r}{n_\mathrm{s}-1}\bigg )\bar{y}_{s_{\bar{C}}}. \end{aligned}$$

Now, since \(E_*(\bar{y}_{s_{C(-j)}})=\bar{y}_{s_C}\), we have:

$$\begin{aligned} E_*(\hat{\mu }^*_\mathrm{s})=\hat{\mu }_\mathrm{s}, \end{aligned}$$

which guarantees the unbiasedness of the bootstrap estimator \(\hat{\mu }^*_\mathrm{s}\).
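The last two steps of this argument can be illustrated with a small exact calculation. The sketch below assumes that the removed unit \(j\) is equally likely to be any unit of \(s_C\), which is what the identity \(E_*(\bar{y}_{s_{C(-j)}})=\bar{y}_{s_C}\) requires; the y-values and function names are hypothetical.

```python
from fractions import Fraction


def mean(xs):
    """Exact arithmetic mean of a list of integers."""
    return Fraction(sum(xs), len(xs))


def mu_hat(s_C, s_Cbar):
    """Estimator with weight (r - 1)/(n_s - 1) on the rare-group mean."""
    r, n_s = len(s_C), len(s_C) + len(s_Cbar)
    w = Fraction(r - 1, n_s - 1)
    return w * mean(s_C) + (1 - w) * mean(s_Cbar)


def bwo_expectation(s_C, s_Cbar):
    """Average, over a uniformly chosen removed unit j (an assumption),
    of the conditional expectation given j derived in Appendix 2."""
    r, n_s = len(s_C), len(s_C) + len(s_Cbar)
    w = Fraction(r - 1, n_s - 1)
    conditional = []
    for j in range(r):
        s_C_minus_j = s_C[:j] + s_C[j + 1:]  # leave unit j out
        conditional.append(w * mean(s_C_minus_j) + (1 - w) * mean(s_Cbar))
    return sum(conditional) / r


s_C = [12, 7, 30]         # hypothetical y-values, rare group (r = 3)
s_Cbar = [1, 0, 2, 4, 3]  # hypothetical y-values, other units
print(bwo_expectation(s_C, s_Cbar), mu_hat(s_C, s_Cbar))
```

Because the leave-one-out means average to the full mean of \(s_C\), the two printed fractions coincide for any choice of y-values with \(r\ge 2\).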

Table 1 Standard deviation (top entry) and skewness (below that) of \(\hat{\mu }_\mathrm{s}\) under five underlying models described in the text
Table 2 Coverage and standardized average length (in parentheses) of the \(90\%\) confidence intervals described in the text for \(M=100\)
Table 3 Coverage and standardized average length (in parentheses) of the \(90\%\) confidence intervals described in the text for \(M=400\)
Table 4 Lower and upper miscoverage rates of the \(90\%\) confidence intervals described in the text for \(M=100\)
Table 5 Lower and upper miscoverage rates of the \(90\%\) confidence intervals described in the text for \(M=400\)

About this article

Cite this article

Mohammadi, M. Bootstrap Confidence Intervals for the Population Mean Under Inverse Sampling Design. Iran J Sci Technol Trans Sci 43, 1003–1009 (2019). https://doi.org/10.1007/s40995-018-0482-3

  • DOI: https://doi.org/10.1007/s40995-018-0482-3
