
Statistical inference with empty strata in judgment post stratified samples


Abstract

This article develops estimators for certain population characteristics using a judgment post stratified (JPS) sample. The paper first constructs a conditional JPS sample with a reduced set size K by conditioning on the ranks of the measured observations of the original JPS sample of set size \(H \ge K\). The paper shows that the estimators of the population mean, median and distribution function based on this conditional JPS sample are consistent and have limiting normal distributions. Unlike the ratio and regression estimators, which require a strong linearity assumption, the proposed estimators only need a monotonic relationship between the response and the auxiliary variable. For moderate sample sizes, the paper provides a bootstrap distribution to draw statistical inference. A small-scale simulation study shows that the proposed estimators based on a reduced set JPS sample perform better than the corresponding estimators based on a regular JPS sample.


Acknowledgments

The author thanks Professor Jesse Frey, who provided the sketch of the proof of the unbiasedness of the estimator in Theorem 1, and the Editor, the Associate Editor, and two anonymous referees for their helpful comments, which improved the presentation of the paper.

Author information


Corresponding author

Correspondence to Omer Ozturk.

Appendix

Proof of Theorem 1

For the proof of (i), using Eqs. (2) and (3) and the law of large numbers, we write

$$\begin{aligned} \lim _{n\rightarrow \infty } \left\{ \frac{1}{n} \sum _{i=1}^na_{k:K|R_i} g(X_i-b)\right\} = \frac{1}{K} E\{g(X_{[k:K]}-b)\}+o_p(1) \end{aligned}$$

and

$$\begin{aligned} \lim _{n\rightarrow \infty } \left\{ \frac{1}{n} \sum _{i=1}^na_{k:K|R_i} \right\} = \frac{1}{K}+o_p(1). \end{aligned}$$

It is now easy to observe that

$$\begin{aligned} \lim _{n\rightarrow \infty } \hat{\theta }_{g,K}= & {} \sum _{k=1}^K \lim _{n\rightarrow \infty } \left( \frac{I_k}{d_n}\right) E\{g(X_{[k:K]}-b)\} +o_p(1) \nonumber \\= & {} \sum _{k=1}^K\frac{ E\{g(X_{[k:K]}-b)\}}{K}+o_p(1)= \theta _{g}+o_p(1). \end{aligned}$$

For the proof of (ii), without loss of generality assume that H is an even integer and \(b=0\). We consider

$$\begin{aligned} E(\hat{\theta }_{g,K})= & {} E\left\{ E(\hat{\theta }_{g,K}|\varvec{R})\right\} =\sum _{h=1}^H E\left\{ \sum _{k=1}^K \frac{I_ka_{k:K|h}N_h}{d_n\sum _{t=1}^H a_{k:K|t}N_t}\right\} E(g(X_{[h:H]})) \\= & {} \sum _{h=1}^H E(w_h)E(g(X_{[h:H]}))=\sum _{h=1}^{H/2} \{E(w_h)E(g(X_{[h:H]}))\\&+\,E(w_{H+1-h})E(g(X_{[H+1-h:H]}))\}. \end{aligned}$$

In a JPS sample, since \(E(N_{h})= E(N_{H+1-h})\), the expected weights also preserve the equality \(E(w_h)=E(w_{H+1-h})\). By using the assumption of the theorem and the symmetry of the expected weights, we write

$$\begin{aligned} E(\hat{\theta }_{g,K})= & {} \sum _{h=1}^{H/2}E(w_h)\left\{ E(g(X_{[h:H]}-b))+E(g(X_{[H+1-h:H]}-b))\right\} \\= & {} 2E(g(X-b))\sum _{h=1}^{H/2}E(w_h) = E(g(X-b))\sum _{h=1}^H E(w_h)=E(g(X-b)). \end{aligned}$$

This completes the proof. \(\square \)
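
A small numerical illustration of the estimator analyzed in Theorem 1 may be helpful. The Python sketch below simulates a JPS sample with perfect rankings, forms the reduced-set estimator \(\hat{\theta }_{g,K}\) of the mean (g the identity, \(b=0\)), and compares it with the true mean. The conditional probabilities \(a_{k:K|h}\) are taken here to be the hypergeometric-type quantities \(\binom{h-1}{k-1}\binom{H-h}{K-k}/\binom{H-1}{K-1}\), which satisfy the identities \(\sum _{k=1}^K a_{k:K|h}=1\) and \(\sum _{h=1}^H a_{k:K|h}=H/K\) used throughout the proofs; this specific form, the function names, and all simulation settings are illustrative assumptions rather than the paper's own code.

import numpy as np
from math import comb

def a_matrix(H, K):
    # Assumed hypergeometric-type conditional probabilities a_{k:K|h}
    # (an illustrative choice satisfying sum_k a = 1 and sum_h a = H/K).
    return np.array([[comb(h - 1, k - 1) * comb(H - h, K - k) / comb(H - 1, K - 1)
                      for h in range(1, H + 1)] for k in range(1, K + 1)])

def jps_sample(n, H, rng):
    # One JPS observation: measure the first unit of a set of H and record its
    # rank within the set (perfect ranking).
    sets = rng.normal(loc=10.0, scale=2.0, size=(n, H))
    x = sets[:, 0]
    ranks = 1 + (sets < sets[:, [0]]).sum(axis=1)
    return x, ranks

def theta_hat(x, ranks, H, K):
    # Reduced-set JPS estimator: average of the K conditional-stratum means,
    # skipping (via I_k) any stratum whose total weight is zero.
    a = a_matrix(H, K)
    w = a[:, ranks - 1]                       # w[k, i] = a_{k+1:K | R_i}
    tot = w.sum(axis=1)
    keep = tot > 0
    return ((w[keep] * x).sum(axis=1) / tot[keep]).mean()

rng = np.random.default_rng(1)
x, ranks = jps_sample(n=200, H=5, rng=rng)
print(theta_hat(x, ranks, H=5, K=3))          # should be close to the true mean, 10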

Proof of Lemma 1

\(\bar{\varvec{T}}_{g}(b)\) is the mean of independent random vectors; hence, the central limit theorem gives the asymptotic normality. The expression for \(E(\varvec{\bar{T}}_g)\) follows from Eqs. (2) and (3). We now compute the variances and covariances. For \(\sigma _{r,s}\), \(1 \le r,s \le K\), we use the conditional covariance given \(R_i\)

$$\begin{aligned} \sigma _{r,s}= & {} \frac{1}{n}\sum _{h=1}^HE\left\{ \sum _{i=1}^nI(R_i=h)\mathrm{cov}(W_{r,R_i,g}(b),W_{s,R_i,g}(b)|R_i)\right\} \\&+ \frac{1}{n}\mathrm{cov}\left\{ \sum _{i=1}^n E(W_{r,R_i,g}(b)|R_i),\sum _{i=1}^n E(W_{s,R_i,g}(b)|R_i)\right\} \\= & {} A+B. \end{aligned}$$

In the above equation, the first term reduces to

$$\begin{aligned} A= & {} \sum _{h=1}^H E\left( \frac{N_h}{n}\right) a_{r:K|h}a_{s:K|h} \mathrm{var}(g(X_{[h:H]}-b)) \\= & {} \frac{1}{H} \sum _{h=1}^H a_{r:K|h}a_{s:K|h} \mathrm{var}(g(X_{[h:H]}-b)). \end{aligned}$$

By a similar argument, we have

$$\begin{aligned} B= & {} \frac{1}{n} \mathrm{cov}\left\{ \sum _{h=1}^H N_h a_{r:K|h} E(g(X_{[h:H]}-b)),\sum _{h=1}^H N_ha_{s:K|h} E(g(X_{[h:H]}-b))\right\} \\= & {} \sum _{h=1}^H\sum _{h'=1}^H \frac{\mathrm{cov}(N_h,N_{h'})}{n} a_{r:K|h} E(g(X_{[h:H]}-b))a_{s:K|h'} E(g(X_{[h':H]}-b)) \\= & {} \sum _{h=1}^H\frac{\mathrm{Var}(N_h)}{n} a_{r:K|h}a_{s:K|h} (E(g(X_{[h:H]}-b)))^2 \\&+ \sum _{h=1}^H\sum _{h'\ne h}^H\frac{\mathrm{cov}(N_h,N_{h'})}{n} a_{r:K|h}E(g(X_{[h:H]}-b))a_{s:K|h'} E(g(X_{[h':H]}-b)) \\= & {} \sum _{h=1}^H\frac{H-1}{H^2}a_{r:K|h}a_{s:K|h} (E(g(X_{[h:H]}-b)))^2 \\&+ \sum _{h=1}^H\sum _{h'\ne h}^H\frac{-1}{H^2} a_{r:K|h}E(g(X_{[h:H]}-b))a_{s:K|h'} E(g(X_{[h':H]}-b)) \\= & {} \frac{1}{H} \sum _{h=1}^H a_{r:K|h}a_{s:K|h} (E(g(X_{[h:H]}-b)))^2 - \frac{1}{K^2} E(g(X_{[r:K]}-b)) E(g(X_{[s:K]}-b)). \end{aligned}$$

The second term on the right side of the above expression follows from Eq. (2). Combining expressions A and B, we obtain

$$\begin{aligned} \sigma _{r,s}=\frac{1}{H}\sum _{h=1}^H a_{r:K|h}a_{s:K|h} Eg^2(X_{[h:H]}-b) - \frac{1}{K^2} E(g(X_{[r:K]}-b)) E(g(X_{[s:K]}-b)). \end{aligned}$$

For \(\tau _{r,s}\), \(1 \le r \le K\) and \(1\le s \le K-1\), we compute

$$\begin{aligned} \tau _{r,s}= & {} \frac{1}{n}\mathrm{cov}\left\{ \sum _{i=1}^nW_{r,R_i,g}(b),\sum _{i=1}^n a_{s:K|R_i}\right\} \\= & {} \frac{1}{n}E\left\{ \mathrm{cov} \left( \sum _{i=1}^nW_{r,R_i,g}(b), \sum _{i=1}^na_{s:K|R_i}|\varvec{R}\right) \right\} \nonumber \\&+\frac{1}{n}\mathrm{cov} \left\{ E\left( \sum _{i=1}^nW_{r,R_i,g}(b)|\varvec{R}\right) , E\left( \sum _{i=1}^na_{s:K|R_i}|\varvec{R}\right) \right\} \\= & {} 0 +\frac{1}{n}\mathrm{cov} \left\{ \sum _{h=1}^H N_hW_{r,h,g}(b),\sum _{h=1}^H N_h a_{s:K|h} \right\} \\= & {} \frac{1}{n} \sum _{h=1}^H \sum _{h'=1}^H \mathrm{cov}(N_h,N_{h'}) a_{r:K|h}a_{s:K|h'} Eg(X_{[h:H]}-b) \\= & {} \frac{1}{H}\sum _{h=1}^H a_{r:K|h}a_{s:K|h} Eg(X_{[h:H]}-b) -\frac{1}{K^2} Eg(X_{[r:K]}-b). \end{aligned}$$

Finally, \(\gamma _{r,s}\), \(1 \le r,s \le K-1\), follows from

$$\begin{aligned} \gamma _{r,s}= & {} \frac{1}{n}\mathrm{cov}\left\{ \sum _{i=1}^na_{r:K|R_i},\sum _{i=1}^na_{s:K|R_i} \right\} \\= & {} \frac{1}{n}E\left\{ \mathrm{cov} \left( \sum _{i=1}^n a_{r:K|R_i}, \sum _{i=1}^na_{s:K|R_i}|\varvec{R}\right) \right\} \nonumber \\&+\frac{1}{n}\mathrm{cov} \left\{ E\left( \sum _{i=1}^na_{r:K|R_i}|\varvec{R}\right) , E\left( \sum _{i=1}^na_{s:K|R_i}|\varvec{R}\right) \right\} \\= & {} 0+\frac{1}{H}\sum _{h=1}^H a_{r:K|h}a_{s:K|h} -\frac{1}{K^2}. \end{aligned}$$

The proof is completed. \(\square \)
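
The calculations above repeatedly use the facts that \(\sum _{k=1}^K a_{k:K|h}=1\) and, via Eq. (2), \(\sum _{h=1}^H a_{k:K|h}=H/K\). A minimal check of both identities for the hypergeometric-type form of \(a_{k:K|h}\) assumed in the sketch after Theorem 1 (again an illustrative assumption, not the paper's definition of the weights):

import numpy as np
from math import comb

H, K = 7, 4
# Assumed hypergeometric-type a_{k:K|h}; rows index k, columns index h.
a = np.array([[comb(h - 1, k - 1) * comb(H - h, K - k) / comb(H - 1, K - 1)
               for h in range(1, H + 1)] for k in range(1, K + 1)])
print(np.allclose(a.sum(axis=0), 1.0))      # sum over k equals 1 for every h
print(np.allclose(a.sum(axis=1), H / K))    # sum over h equals H/K for every k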

Proof of Theorem 2

Let Q be a transformation from \(\varvec{\bar{T}}_g(b)\) to \(\hat{\theta }_g(b)\)

$$\begin{aligned} \hat{\theta }_g(b)=\frac{1}{d_n} \sum _{k=1}^K \frac{I_k\bar{T}_{g,k}(b)}{\bar{T}_{g,K+k}(b)} \mathop {=}\limits ^{D}\frac{1}{K} \sum _{k=1}^K \frac{\bar{T}_{g,k}(b)}{\bar{T}_{g,K+k}(b)}=Q(\varvec{\bar{T}}_g(b)), \end{aligned}$$

where \(\bar{T}_{g,k}(b)\) is the \(k\)th component of the vector \(\varvec{\bar{T}}_g(b)\) and \(\bar{T}_{g,2K}(b)=1-\sum _{k=1}^{K-1} \bar{T}_{g,K+k}(b)\). The second equality in the above equation follows from the fact that \(I_k/d_n\) converges in probability to \(1/K\) for \(k=1,\ldots ,K\). It is clear that \(Q(E(\varvec{\bar{T}}_g(b)))=E(g(X-b))\). For notational convenience, let \(\varvec{\mu }_T=E(\varvec{\bar{T}}_g(b))\). By using a Taylor expansion of \(Q(\varvec{\bar{T}}_g(b))\) around \(\varvec{\mu }_T\) we write

$$\begin{aligned} \sqrt{n}\left( Q(\varvec{\bar{T}}_g(b))-E(g(X-b))\right) =\frac{1}{K}\varvec{L}_T^\top \sqrt{n}\left( \varvec{\bar{T}}_g(b) -\varvec{\mu }_T\right) +o_p(1), \end{aligned}$$

where \(\varvec{L}_T\) is the \((2K-1)\)-dimensional vector of partial derivatives of \(Q(\varvec{\mu }_T)\),

$$\begin{aligned} L_r= & {} \frac{\mathrm{d}}{\mathrm{d}\mu _T(r)} Q(\varvec{\mu }_T)\\= & {} \left\{ \begin{array}{ll} K &{} \quad r=1,\ldots , K \\ K\left( Eg(X_{[K:K]}-b)-Eg(X_{[r-K:K]}-b)\right) &{}\quad r=K+1,\ldots , 2K-1, \end{array} \right. \end{aligned}$$

and \(\mu _T(r)\) is the \(r\)th component of the vector \(\varvec{\mu }_T\). Let \(\varvec{L}_T^\top =(\varvec{L}_1^\top ,\varvec{L}_2^\top )\) with \(\varvec{L}_1^\top =(K,\ldots ,K)\) and \(\varvec{L}_2=K(Eg(X_{[K:K]}-b)-Eg(X_{[1:K]}-b),\ldots , Eg(X_{[K:K]}-b)-Eg(X_{[K-1:K]}-b))\). It is then easy to see that \(\sqrt{n}\left( Q(\varvec{\bar{T}}_g(b))-E(g(X-b))\right) \) converges to a normal distribution with mean zero and variance

$$\begin{aligned} \sigma ^2_{\hat{\theta }_g}= \frac{1}{K^2}\varvec{L}_1^\top \varvec{\Sigma }_{1,1}\varvec{L}_1+\frac{1}{K^2} 2\varvec{L}_1^\top \varvec{\Sigma }_{1,2}\varvec{L}_2+\frac{1}{K^2}\varvec{L}_2^\top \varvec{\Sigma }_{2,2}\varvec{L}_2. \end{aligned}$$

By using Lemma 1, we write

$$\begin{aligned} \frac{1}{K^2}\varvec{L}_1^\top \varvec{\Sigma }_{1,1}\varvec{L}_1= & {} \sum _{r=1}^K\sum _{s=1}^K\left\{ \frac{1}{H} \sum _{h=1}^H a_{r:K|h}a_{s:K|h}E\left\{ g^2(X_{[h:H]}-b)\right\} \right. \\&- \left. \frac{E\left\{ g(X_{[r:K]}-b)\right\} E\left\{ g(X_{[s:K]}-b)\right\} }{K^2}\right\} \\= & {} \frac{1}{H} \sum _{h=1}^H E\left\{ g^2(X_{[h:H]}-b)\right\} -\left\{ Eg(X-b)\right\} ^2=\sigma ^2_g. \end{aligned}$$

Let \(d_s=Eg(X_{[K:K]}-b)-Eg(X_{[s:K]}-b)\) for \(s=1,\ldots , K-1\). We then have

$$\begin{aligned} \frac{1}{K^2}\varvec{L}_1^\top \varvec{\Sigma }_{1,2}\varvec{L}_2= & {} \sum _{r=1}^{K}\sum _{s=1}^{K-1} d_s\tau _{r,s} \\= & {} \sum _{r=1}^{K}\sum _{s=1}^{K-1} d_s \left[ \left\{ \frac{1}{H}\sum _{h=1}^H a_{r:K|h}a_{s:K|h}Eg(X_{[h:H]}-b)\right\} -\frac{Eg(X_{[r:K]}-b)}{K^2}\right] \\= & {} \sum _{s=1}^{K-1}d_s\left\{ \frac{Eg(X_{[s:K]}-b)}{K}-\frac{ Eg(X-b)}{K}\right\} \\= & {} \sum _{s=1}^{K-1} \left\{ Eg(X_{[K:K]}-b)- Eg(X_{[s:K]}-b)\right\} \left\{ \frac{Eg(X_{[s:K]}-b)}{K}-\frac{ Eg(X-b)}{K}\right\} \\= & {} -\sum _{s=1}^{K}\left\{ Eg(X_{[s:K]}-b)-Eg(X-b)\right\} ^2/K. \end{aligned}$$

For the last expression, we consider

$$\begin{aligned} \frac{1}{K^2}\varvec{L}_2^\top \varvec{\Sigma }_{2,2}\varvec{L}_2= & {} \sum _{r=1}^{K-1}\sum _{s=1}^{K-1} d_rd_s \gamma _{r,s} \\= & {} \sum _{r=1}^{K-1}\sum _{s=1}^{K-1} \left\{ Eg(X_{[K:K]}-b)- Eg(X_{[r:K]}-b)\right\} \\&\times \left\{ Eg(X_{[K:K]}-b)- Eg(X_{[s:K]}-b)\right\} \gamma _{r,s} \\= & {} \sum _{r=1}^{K}\sum _{s=1}^{K} \left\{ Eg(X_{[K:K]}-b)- Eg(X_{[r:K]}-b)\right\} \\&\times \left\{ Eg(X_{[K:K]}-b)- Eg(X_{[s:K]}-b)\right\} \gamma _{r,s} \\= & {} \sum _{r=1}^{K}\sum _{s=1}^{K}\frac{1}{H}\sum _{h=1}^H a_{r:K|h}a_{s:K|h}Eg(X_{[r:K]}-b) Eg(X_{[s:K]}-b)\\&-\left\{ Eg(X-b)\right\} ^2. \end{aligned}$$

This completes the proof. \(\square \)

Proof of Corollary 1

We first rewrite \(\sigma ^2_{\hat{\theta }_{g:K}}\) as

$$\begin{aligned} \sigma ^2_{\hat{\theta }_{g:K}}= & {} \sigma ^2_g-\sum _{r=1}^K (Eg(X_{[r:K]}-b)-Eg(X-b))^2/K -\left\{ \sum _{r=1}^K (Eg(X_{[r:K]}-b))^2/K\right. \\&\left. -\sum _{r=1}^K \sum _{s=1}^K\sum _{h=1}^H\frac{ a_{r:K|h}a_{s:K|h}}{H}Eg(X_{[r:K]}-b)Eg(X_{[s:K]}-b)\right\} \\= & {} \sigma ^2_{\hat{\theta }_{g:RSS(K)}}- \left\{ \sum _{r=1}^K \frac{(Eg(X_{[r:K]}-b))^2}{K}\right. \\&\left. -\sum _{r=1}^K \sum _{s=1}^K\sum _{h=1}^H\frac{ a_{r:K|h}a_{s:K|h}}{H}Eg(X_{[r:K]}-b)Eg(X_{[s:K]}-b) \right\} \\= & {} \sigma ^2_{\hat{\theta }_{g:RSS(K)}}-A_d. \end{aligned}$$

We need to show that \(A_d\) is non-negative. We first consider

$$\begin{aligned} B_d= \sum _{r=1}^K \sum _{s=1}^K \frac{1}{H} \sum _{h=1}^H a_{r:K|h}a_{s:K|h}\left\{ E\left( g(X_{[r:K]}-b)\right) -E\left( g(X_{[s:K]}-b)\right) \right\} ^2 \ge 0. \end{aligned}$$

We rewrite the expression \(B_d\) as

$$\begin{aligned} B_d= & {} 2\sum _{r=1}^K\sum _{s=1}^K \frac{1}{H}\sum _{h=1}^Ha_{r:K|h}a_{s:K|h}\left\{ E\left( g(X_{[r:K]}-b)\right) \right\} ^2 \\&-2\sum _{r=1}^K\sum _{s=1}^K \frac{1}{H}\sum _{h=1}^Ha_{r:K|h}a_{s:K|h}\left\{ E\left( g(X_{[r:K]}-b)\right) \right\} \left\{ E\left( g(X_{[s:K]}-b)\right) \right\} . \end{aligned}$$

Using the equalities \(\sum _{s=1}^K a_{s:K|h}=1\) and \(\sum _{h=1}^H a_{s:K|h}=H/K\) in the first term of the above equation reduces \(B_d\) to \(2A_d\)

$$\begin{aligned} B_d= & {} \frac{2}{K}\sum _{r=1}^K\left\{ E\left( g(X_{[r:K]}-b)\right) \right\} ^2 \\&-2\sum _{r=1}^K\sum _{s=1}^K \frac{1}{H}\sum _{h=1}^Ha_{r:K|h}a_{s:K|h}\left\{ E\left( g(X_{[r:K]}-b)\right) \right\} \left\{ E\left( g(X_{[s:K]}-b)\right) \right\} \\= & {} 2A_d \ge 0. \end{aligned}$$

This completes the proof. \(\square \)
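
The non-negativity of \(A_d\) can also be checked numerically: the argument above uses only the two identities on \(a_{k:K|h}\), so any weight matrix satisfying them must give \(A_d \ge 0\). The sketch below, under the same illustrative assumptions as before (perfect rankings, hypergeometric-type \(a_{k:K|h}\), g the identity, \(b=0\)), estimates \(Eg(X_{[r:K]}-b)\) from simulated order statistics and evaluates \(A_d\) directly; the helper names and settings are hypothetical.

import numpy as np
from math import comb

def A_d(H, K, rng, reps=100_000):
    # Assumed hypergeometric-type a_{k:K|h}; rows index r (or s), columns index h.
    a = np.array([[comb(h - 1, r - 1) * comb(H - h, K - r) / comb(H - 1, K - 1)
                   for h in range(1, H + 1)] for r in range(1, K + 1)])
    # Under perfect ranking, E g(X_[r:K] - b) with g the identity and b = 0 is the
    # mean of the r-th order statistic of a sample of size K.
    m = np.sort(rng.normal(size=(reps, K)), axis=1).mean(axis=0)
    first = (m ** 2).sum() / K
    second = sum(a[r, h] * a[s, h] / H * m[r] * m[s]
                 for r in range(K) for s in range(K) for h in range(H))
    return first - second

rng = np.random.default_rng(2)
print(A_d(H=5, K=3, rng=rng))                 # non-negative, as shown in the proof above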

Proof of Theorem 3

Let \(U_n(b/\sqrt{n})=\left\{ S_n(\eta _p+b/\sqrt{n})-S_n(\eta _p)\right\} /(b/\sqrt{n})\). By using the conditional expectation given the rank vector \(\varvec{R}\), we obtain

$$\begin{aligned} E(U_n(b/\sqrt{n}))= \sum _{h=1}^H E \left\{ \sum _{k=1}^K\frac{I_kN_h a_{k:K|h}}{d_n \sum _{t=1}^H N_t a_{k:K|t}}\right\} \frac{F_{[h]}(\eta _p)-F_{[h]}(\eta _p+b/\sqrt{n})}{b/\sqrt{n}}. \end{aligned}$$

For large n, we can write

$$\begin{aligned} \lim _{n \rightarrow \infty }E\left\{ \frac{I_kN_h a_{k:K|h}}{d_n \sum _{t=1}^H N_t a_{k:K|t}}\right\}= & {} E\left\{ \lim _{n\rightarrow \infty }\frac{I_k}{d_n} \lim _{n\rightarrow \infty }\frac{N_h a_{k:K|h}}{d_n \sum _{t=1}^H N_t a_{k:K|t}}\right\} \\= & {} \frac{1}{K} \frac{a_{k:K|h}/H}{\sum _{t=1}^Ha_{k:K|t}/H} = a_{k:K|h}/H. \end{aligned}$$

It is now easy to observe that \(E(U_n(b/\sqrt{n}))\) has the limit \(-bf(\eta _p)\)

$$\begin{aligned} \lim _{n\rightarrow \infty } E(U_n(b/\sqrt{n}))= & {} \sum _{h=1}^H \sum _{k=1}^K \frac{a_{k:K|h}}{H} \lim _{n \rightarrow \infty } \frac{F_{[h]}(\eta _p)-F_{[h]}(\eta _p+b/\sqrt{n})}{b/\sqrt{n}}\\= & {} -b \sum _{k=1}^K \sum _{h=1}^H \frac{a_{k:K|h}}{H}f_{[h]}(\eta _p)=-b \sum _{k=1}^K \frac{H}{KH}f(\eta _p)=- bf(\eta _p). \end{aligned}$$

We now show that the variance of \(U_n(b/\sqrt{n})\) converges to zero as n goes to infinity. Without loss of generality, assume that \(b>0\) and \(\eta _p=0\). In this case, \(U_n(b/\sqrt{n})\) can be written as

$$\begin{aligned} U_n(b/\sqrt{n})= \sum _{i=1}^n \sum _{k=1}^K \frac{I_k a_{k:K|R_i}}{ d_n\sum _{j=1}^n a_{k:K|R_j}} I( 0 \le X_i \le b/\sqrt{n}). \end{aligned}$$

The variance of \(U_n(b/\sqrt{n})\) can be written as

$$\begin{aligned} \mathrm{Var}(U_n(b/\sqrt{n}))= \mathrm{Var}\left( E\left\{ U_n(b/\sqrt{n})|\varvec{R}\right\} \right) +E\left( \mathrm{Var}\left\{ U_n(b/\sqrt{n})|\varvec{R}\right\} \right) =D_n+G_n. \end{aligned}$$

The expression \(D_n\) is given by

$$\begin{aligned} D_n= & {} \frac{1}{b^2}\sum _{h=1}^H \sum _{h'=1}^H \left( F_{[h]}(b/\sqrt{n})-F_{[h]}(0)\right) \left( F_{[h']}(b/\sqrt{n})-F_{[h']}(0)\right) \\&\times \sum _{k=1}^K \sum _{k'=1}^K \mathrm{Cov}\left( \frac{\sqrt{n}I_kN_h a_{k:K|h}}{d_n \sum _{t=1}^H N_t a_{k:K|t}}, \frac{\sqrt{n}I_{k'} N_{h'} a_{k':K|h'}}{d_n \sum _{t=1}^H N_t a_{k':K|t}}\right) . \end{aligned}$$

In the above expression, the covariances are all finite and the double sum in the first term converges to zero as n gets large. Hence, \(D_n\) converges to 0. In a similar fashion, \(G_n\) can be written as

$$\begin{aligned} G_n= & {} \sum _{k=1}^K \sum _{h=1}^H E \left\{ \frac{I_k^2N_h a^2_{k:K|h}}{d^2_n (\sum _{t=1}^H N_t a_{k:K|t})^2}\right\} \left( F_{[h]}(b/\sqrt{n})-F_{[h]}(0)\right) \\&\times \left\{ 1- \left( F_{[h]}(b/\sqrt{n})-F_{[h]}(0)\right) \right\} . \end{aligned}$$

Since the expectation in the above expression has a finite limit, \(G_n\) converges to zero as n goes to infinity. We then conclude that the variance of \(U_n(b/\sqrt{n})\) goes to zero as n approaches infinity. This establishes the point-wise convergence

$$\begin{aligned} \sqrt{n}S_n(\eta _p+b/\sqrt{n})= \sqrt{n}S_n(\eta _p)-b f(\eta _p)+o_p(1). \end{aligned}$$

The uniform convergence follows from the monotonicity of \(S_n(b)\). \(\square \)
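
The limit of \(E(U_n(b/\sqrt{n}))\) above rests on the weights \(a_{k:K|h}/H\) averaging the judgment-class densities \(f_{[h]}\) back to the overall density f. A minimal numerical check, assuming perfect rankings (so that \(f_{[h]}\) is the density of the hth order statistic of a sample of size H), the same illustrative hypergeometric-type \(a_{k:K|h}\) as before, and the Uniform(0,1) distribution where \(f \equiv 1\):

from math import comb

H, K, x = 5, 3, 0.7                           # x is an arbitrary point in (0, 1)
# Assumed hypergeometric-type a_{k:K|h}; rows index k, columns index h.
a = [[comb(h - 1, k - 1) * comb(H - h, K - k) / comb(H - 1, K - 1)
      for h in range(1, H + 1)] for k in range(1, K + 1)]

def f_h(x, h, H):
    # Density of the h-th order statistic of H iid Uniform(0,1) variables.
    return H * comb(H - 1, h - 1) * x ** (h - 1) * (1 - x) ** (H - h)

mix = sum(a[k][h] / H * f_h(x, h + 1, H) for k in range(K) for h in range(H))
print(mix)                                    # equals 1.0, the Uniform(0,1) density at x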

Proof of Theorem 4

We first observe that the sample size vector \(\varvec{N}^{\top }= (N_1,\ldots , N_H)\) has a multinomial distribution with parameters n and \((1/H,\ldots , 1/H)\). The probability that \(W_n(q)\) equals K is

$$\begin{aligned} P(W_n(q)=K)= \left( \begin{array}{l} {H} \\ {K}\end{array} \right) P( N_1> q, \ldots , N_K> q, N_{K+1} \le q, \ldots , N_{H} \le q). \end{aligned}$$

Let

$$\begin{aligned} A_{K,q}=\{ N_1> q, \ldots , N_K> q\} \text{ and } B_{H-K,q}=\{ N_{K+1} \le q, \ldots , N_{H} \le q\}. \end{aligned}$$

For a fixed j, we also define \(B_{j,H-K,q}\) to be the event that each one of the j judgment classes in set \(B_{H-K,q}\) has exactly q observations. Note that by the definition of the set \(B_{H-K,q}\), \(B_{j,H-K,q}= B_{H-K,H-K,q}\) when \(q=0\). With this new notation we write

$$\begin{aligned} P(W_n(q)=K)=\left( \begin{array}{l} {H} \\ {K} \end{array}\right) \sum _{j=0}^{q(H-K)} \left( \begin{array}{c} {q(H-K)} \\ {j}\end{array}\right) P( A_{K,q}| B_{j,H-K,q}) P(B_{j,H-K,q}). \end{aligned}$$

It is clear that

$$\begin{aligned} P(B_{j,H-K,q})= & {} \frac{n!}{(n-qj)!}\left( \frac{K}{H}\right) ^{n-qj}\left( \frac{1}{H}\right) ^{qj} \end{aligned}$$

and

$$\begin{aligned} P( A_{K,q}| B_{j,H-K,q})=\sum _{n_1>q, \ldots , n_K>q} \left( \begin{array}{c} {n-qj}\\ {n_1,\ldots , n_K} \end{array}\right) (1/K)^{n-qj}. \end{aligned}$$

Let

$$\begin{aligned} A_{K,q}=\{A_{1:K,q} \cap \cdots \cap A_{K:K,q}\}, \end{aligned}$$

where \(A_{h:K,q}\) is the event that \(N_{h} > q\) for \(h \le K\). By using De Morgan’s law, we write

$$\begin{aligned} P( A_{K,q}| B_{j,H-K,q})= & {} 1- P\left( A^C_{K,q}| B_{j,H-K,q}\right) = 1- P\left( \cup _{h=1}^K A^C_{h:K,q} | B_{j,H-K,q}\right) \\= & {} 1- \sum _{i=1}^K \left( \begin{array}{l} K \\ i \end{array}\right) (-1)^{i-1}P\left( A^C_{1:K,q}\cap \cdots \cap A^C_{i:K,q}|B_{j,H-K,q}\right) . \end{aligned}$$ (11)

We now evaluate the following conditional probability

$$\begin{aligned} P\left( A^C_{1:K,q}\cap \cdots \cap A^C_{i:K,q}|B_{j,H-K,q}\right)= & {} \sum _{r_1=0}^q \cdots \sum _{r_i=0}^q \left( \begin{array}{c} {n-qj}\\ {r_1,\ldots , r_i, n-qj-T_i} \end{array}\right) \nonumber \\&\times (1/K)^{T_i}\{1-i/K\}^{(n-qj-T_i)}, \end{aligned}$$

where \(T_i= \sum _{y=1}^i r_y\). With some algebra, the above equation simplifies to

$$\begin{aligned} P\left( A^C_{1:K,q}\cap \cdots \cap A^C_{i:K,q}|B_{j,H-K,q}\right) =\sum _{s=0}^{iq} \left( \begin{array}{l} {qi} \\ s \end{array}\right) \left( \begin{array}{l} {n-qj} \\ {s} \end{array}\right) \frac{s! (K-i)^{n-qj-s}}{K^{n-qj}}. \end{aligned}$$

We complete the proof by substituting this expression into (11). \(\square \)
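
Theorem 4 gives a closed form for \(P(W_n(q)=K)\), the probability that exactly K of the H judgment classes contain more than q observations. For a quick sanity check of any implementation of that closed form, the probability can also be estimated directly by simulating the multinomial vector \(\varvec{N}\); the sketch below does this under illustrative settings (the sample size, seed, and number of replications are arbitrary choices, not values from the paper).

import numpy as np

def p_w_equals_K(n, H, K, q, reps=200_000, seed=3):
    # W_n(q) = number of judgment classes with N_h > q; estimate P(W_n(q) = K)
    # by Monte Carlo from the multinomial distribution of (N_1, ..., N_H).
    rng = np.random.default_rng(seed)
    N = rng.multinomial(n, [1.0 / H] * H, size=reps)
    W = (N > q).sum(axis=1)
    return (W == K).mean()

print(p_w_equals_K(n=30, H=5, K=4, q=1))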

About this article


Cite this article

Ozturk, O. Statistical inference with empty strata in judgment post stratified samples. Ann Inst Stat Math 69, 1029–1057 (2017). https://doi.org/10.1007/s10463-016-0572-y
