
Inference on a distribution function from ranked set samples

Annals of the Institute of Statistical Mathematics

Abstract

Consider independent observations \((X_i,R_i)\) with random or fixed ranks \(R_i\), while conditional on \(R_i\), the random variable \(X_i\) has the same distribution as the \(R_i\)-th order statistic within a random sample of size k from an unknown distribution function F. Such observation schemes are well known from ranked set sampling and judgment post-stratification. Within a general, not necessarily balanced setting we derive and compare the asymptotic distributions of three different estimators of the distribution function F: a stratified estimator, a nonparametric maximum-likelihood estimator and a moment-based estimator. Our functional central limit theorems generalize and refine previous asymptotic analyses. In addition, we discuss briefly pointwise and simultaneous confidence intervals for the distribution function with guaranteed coverage probability for finite sample sizes. The methods are illustrated with a real data example, and the potential impact of imperfect rankings is investigated in a small simulation experiment.

References

  • Balakrishnan, N., Li, T. (2006). Confidence intervals for quantiles and tolerance intervals based on ordered ranked set samples. Annals of the Institute of Statistical Mathematics, 58, 757–777.

  • Bhoj, D. S. (2001). Ranked set sampling with unequal samples. Biometrics, 57(3), 957–962.

  • Chen, Z. (2001). Non-parametric inferences based on general unbalanced ranked-set samples. Journal of Nonparametric Statistics, 13(2), 291–310.

  • Chen, Z., Bai, Z., Sinha, B. K. (2004). Ranked set sampling. Theory and applications. New York: Springer.

  • Clopper, C. J., Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26(4), 404–413.

  • Dastbaravarde, A., Arghami, N. R., Sarmad, M. (2016). Some theoretical results concerning non parametric estimation by using a judgment poststratification sample. Communications in Statistics, Theory and Methods, 45(8), 2181–2203.

  • David, H. A., Nagaraja, H. N. (2003). Order statistics (3rd ed.). Hoboken, NJ: Wiley-Interscience.

  • Dell, T. R., Clutter, J. L. (1972). Ranked set sampling theory with order statistics background. Biometrics, 28(2), 545–555.

  • Frey, J., Ozturk, O. (2011). Constrained estimation using judgement post-stratification. Annals of the Institute of Statistical Mathematics, 63, 769–789.

  • Ghosh, K., Tiwari, R. C. (2008). Estimating the distribution function using \(k\)-tuple ranked set samples. Journal of Statistical Planning and Inference, 138(4), 929–949.

  • Huang, J. (1997). Properties of the NPMLE of a distribution function based on ranked set samples. Annals of Statistics, 25(3), 1036–1049.

  • Kvam, P. H., Samaniego, F. J. (1994). Nonparametric maximum likelihood estimation based on ranked set samples. Journal of the American Statistical Association, 89(426), 526–537.

  • MacEachern, S. N., Stasny, E. A., Wolfe, D. A. (2004). Judgement post-stratification with imprecise rankings. Biometrics, 60, 207–215.

  • McIntyre, G. A. (1952). A method of unbiased selective sampling, using ranked sets. Australian Journal of Agricultural Research, 3, 385–390.

  • Presnell, B., Bohn, L. L. (1999). U-Statistics and imperfect ranking in ranked set sampling. Journal of Nonparametric Statistics, 10(2), 111–126.

  • R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/. Accessed July 2018.

  • Shorack, G. R., Wellner, J. A. (1986). Empirical processes with applications to statistics. New York: Wiley.

  • Stokes, S. L., Sager, T. W. (1988). Characterization of a ranked-set sample with application to estimating distribution functions. Journal of the American Statistical Association, 83(402), 374–381.

  • Terpstra, J. T., Miller, Z. A. (2006). Exact inference for a population proportion based on a ranked set sample. Communications in Statistics, Simulation and Computation, 35(1), 19–26.

  • Wang, X., Wang, K., Lim, J. (2012). Isotonized CDF estimation from judgement poststratification data with empty strata. Biometrics, 68(1), 194–202.

  • Wolfe, D. A. (2004). Ranked set sampling: An approach to more efficient data collection. Statistical Science, 19(4), 636–643.

  • Wolfe, D. A. (2012). Ranked set sampling: Its relevance and impact on statistical inference. ISRN Probability and Statistics, 2012, 568385. https://doi.org/10.5402/2012/568385.

Acknowledgements

Constructive comments by an associate editor and two referees are gratefully acknowledged.

Author information

Corresponding author

Correspondence to Ehsan Zamanzade.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1016 KB)

Appendix

We first recall two well-known facts about uniform empirical processes, see Shorack and Wellner (1986).

Proposition 6

Let \(U_1, U_2, U_3, \ldots \) be independent random variables with uniform distribution on [0, 1]. For \(N \in \mathbb {N}\) and \(u \in [0,1]\) define

$$\begin{aligned} \mathbb {V}^{(N)}(u) \ := \ N^{-1/2} \sum _{i=1}^N \bigl ( 1 \{U_i \le u\} - u \bigr ) . \end{aligned}$$

Then, as \(N \rightarrow \infty \), \(\mathbb {V}^{(N)}\) converges in distribution in \(\ell _\infty ([0,1])\) to a standard Brownian bridge \(\mathbb {V}\) on [0, 1]. Moreover, for any fixed \(\delta \in [0,1/2)\) and \(\epsilon > 0\),

$$\begin{aligned} \sup _{N \ge 1} \mathop {\mathrm {I\!P}}\nolimits \left( \sup _{u \in (0,1)} \frac{|\mathbb {V}^{(N)}(u)|}{u^\delta (1 - u)^\delta } \ge C \right)&\rightarrow \ 0 \quad \text {as} \ C \uparrow \infty , \\ \sup _{N \ge 1} \mathop {\mathrm {I\!P}}\nolimits \left( \sup _{u \in (0,c] \cup [1-c,1)} \frac{|\mathbb {V}^{(N)}(u)|}{u^\delta (1 - u)^\delta } \ge \epsilon \right)&\rightarrow \ 0 \quad \text {as} \ c \downarrow 0. \end{aligned}$$
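As an informal illustration of Proposition 6 (it plays no role in the proofs), the following Python sketch evaluates the uniform empirical process \(\mathbb {V}^{(N)}\) on a grid for simulated data. The function name `uniform_empirical_process`, the sample size, the grid and the seed are illustrative assumptions.

```python
import bisect
import math
import random


def uniform_empirical_process(uniforms, grid):
    """Evaluate V^(N)(u) = N^{-1/2} * sum_i (1{U_i <= u} - u) at each u in grid."""
    n = len(uniforms)
    ordered = sorted(uniforms)
    values = []
    for u in grid:
        count = bisect.bisect_right(ordered, u)  # number of U_i <= u
        values.append((count - n * u) / math.sqrt(n))
    return values


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
N = 2000
sample = [rng.random() for _ in range(N)]
grid = [j / 200 for j in range(201)]
path = uniform_empirical_process(sample, grid)

# The process vanishes at u = 0 and u = 1, like a Brownian bridge.
print(path[0], path[-1])
# Grid approximation of sup_u |V^(N)(u)|.
print(max(abs(v) for v in path))
```

Repeating the simulation for growing N gives a rough impression of the tightness statement: the supremum stays stochastically bounded rather than growing with N.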

For the estimators \(\widehat{F}_n^\mathrm{M}\), \(\widehat{F}_n^\mathrm{L}\) we need some basic facts and inequalities for the auxiliary functions \(w_r\) and \(B_r\), which are proved in the supplement:

Lemma 7

(a):

For \(r = 1,2,\ldots ,k\), the function \(w_r\) on (0, 1) may be written as \(w_r(t) = \widetilde{w}_r(t) / (t(1-t))\) with \(\widetilde{w}_r : [0,1] \rightarrow (0,\infty )\) continuously differentiable. Moreover, for \(r = 1,2,\ldots ,k\) and \(t \in (0,1)\),

$$\begin{aligned} 1 \ \le \ \widetilde{w}_r(t) \ \le \ \max (r,k+1-r). \end{aligned}$$
(b):

For any constant \(c \in (0,1)\) there exists a number \(c' = c'(k,c) > 0\) with the following property: If \(t,p \in (0,1)\) such that

$$\begin{aligned} \frac{|p - t|}{t(1-t)} \ \le \ c , \end{aligned}$$

then for \(r = 1,2,\ldots ,k\),

$$\begin{aligned} \max \left\{ \left| \frac{w_r(p)}{w_r(t)} - 1 \right| , \left| \frac{B_r(p) - B_r(t)}{\beta _r(t) (p - t)} - 1 \right| \right\} \ \le \ c' \frac{|p - t|}{t(1-t)}. \end{aligned}$$
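The bounds in Lemma 7 (a) can be checked numerically. The sketch below is a sanity check, not a proof; it assumes that \(\beta _r\) and \(B_r\) are the density and distribution function of the r-th order statistic of k independent uniforms, and that \(w_r = \beta _r / (B_r(1 - B_r))\) as in the proof of Theorem 5. The helper names are hypothetical.

```python
import math


def beta_density(r, k, t):
    """Density of the r-th order statistic of k uniforms, i.e. Beta(r, k+1-r)."""
    return r * math.comb(k, r) * t ** (r - 1) * (1 - t) ** (k - r)


def beta_cdf_pair(r, k, t):
    """Return (B_r(t), 1 - B_r(t)) via binomial sums, avoiding cancellation."""
    terms = [math.comb(k, j) * t ** j * (1 - t) ** (k - j) for j in range(k + 1)]
    return sum(terms[r:]), sum(terms[:r])


def w_tilde(r, k, t):
    """w~_r(t) = t (1 - t) w_r(t), with w_r = beta_r / (B_r (1 - B_r))."""
    lower, upper = beta_cdf_pair(r, k, t)
    return t * (1 - t) * beta_density(r, k, t) / (lower * upper)


k = 5
grid = [j / 100 for j in range(1, 100)]
for r in range(1, k + 1):
    values = [w_tilde(r, k, t) for t in grid]
    # Columns: r, observed minimum, observed maximum, claimed upper bound.
    print(r, round(min(values), 4), round(max(values), 4), max(r, k + 1 - r))
```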

Proof of Theorem 2

We start with the weight functions \(\gamma _{nr}^\mathrm{Z}\): Note that by Lemma 7,

$$\begin{aligned} \gamma _{nr}^\mathrm{S}(t)&= \ \frac{1}{k \sqrt{\pi _{nr}}} , \\ \gamma _{nr}^\mathrm{M}(t)&= \ \sqrt{\pi _{nr}} \bigr / \sum _{s=1}^k \pi _{ns} \beta _s(t) , \\ \gamma _{nr}^\mathrm{L}(t)&= \sqrt{\pi _{nr}}\, \widetilde{w}_r(t) \Big / \sum _{s=1}^k \pi _{ns} \widetilde{w}_s(t) \beta _s(t) \end{aligned}$$

with the probability weights \(\pi _{nr} := N_{nr}/n\) and continuous functions \(\widetilde{w}_r : [0,1] \rightarrow [1,k]\). Since the beta densities \(\beta _r\) are also continuous with \(\beta _1(0) = \beta _k(1) = k\), this shows that \(\gamma _{nr}^\mathrm{Z}\) is well-defined and continuous, provided that its denominator is strictly positive, i.e.,

$$\begin{aligned} {\left\{ \begin{array}{ll} \pi _{n1}, \ldots , \pi _{nk}> 0 &{} \text {if} \ \mathrm{Z} = \mathrm{S} , \\ \pi _{n1}, \pi _{nk} > 0 &{} \text {if} \ \mathrm{Z} = \mathrm{M}, \mathrm{L}. \end{array}\right. } \end{aligned}$$

For sufficiently large n this is the case, because \(\lim _{n\rightarrow \infty } \pi _{nr} = \pi _r\) for all r. The functions \(\gamma _r^\mathrm{Z}\) in Corollary 4 are continuous, too, and elementary considerations reveal that

$$\begin{aligned} \max _{t \in [0,1], \, 1 \le r \le k} \bigl | \gamma _{nr}^\mathrm{Z}(t) - \gamma _r^\mathrm{Z}(t) \bigr | \ \rightarrow \ 0 \end{aligned}$$
(5)

as \(n \rightarrow \infty \). In particular, \(\max _{t \in [0,1], 1 \le r \le k} \gamma _{nr}^\mathrm{Z}(t) = O(1)\).

Note that for \(n \ge 1\) and \(1 \le r \le k\), the empirical process \(\mathbb {V}_{nr}\) is distributed as \(\mathbb {V}^{(N_{nr})}\) in Proposition 6. Note also that the distribution functions \(B_r\) satisfy \(B_1 \ge B_2 \ge \cdots \ge B_k\), because for \(1 \le r < k\) the density ratio \(\beta _{r+1}/\beta _r\) is a positive multiple of \(t/(1 - t)\) and thus strictly increasing. Consequently, for \(1 \le r \le k\),

$$\begin{aligned} B_r(t) \ \le \ B_1(t) \ \le \ kt \quad \text {and}\quad 1 - B_r(t) \ \le \ 1 - B_k(t) \ \le \ k(1-t) , \end{aligned}$$

so

$$\begin{aligned} \frac{B_r(t)(1 - B_r(t))}{t(1-t)} \ \le \ k. \end{aligned}$$
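One possible way to see that the constant k suffices, assuming the representation \(w_r = \beta _r / (B_r(1 - B_r))\) from the proof of Theorem 5 together with Lemma 7 (a), is the following short computation, added here as an illustrative sketch:

$$\begin{aligned} \frac{B_r(t)(1 - B_r(t))}{t(1-t)} \ = \ \frac{\beta _r(t)}{\widetilde{w}_r(t)} \ \le \ \beta _r(t) \ \le \ \sum _{s=1}^k \beta _s(t) \ = \ k , \end{aligned}$$

where the first inequality uses \(\widetilde{w}_r \ge 1\). The bound \(B_1(t) \le kt\) is Bernoulli's inequality, \(1 - (1-t)^k \le kt\).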

Consequently,

$$\begin{aligned} \sup _{t \in (0,1)} \frac{|\mathbb {V}_{nr}(B_r(t))|}{t^\delta (1 - t)^\delta }&\le \ k^\delta \sup _{u \in (0,1)} \frac{|\mathbb {V}_{nr}(u)|}{u^\delta (1 - u)^\delta } \ = \ O_p(1) \quad \text {and} \\ \sup _{t \in (0,c] \cup [1-c,1)} \frac{|\mathbb {V}_{nr}(B_r(t))|}{t^\delta (1 - t)^\delta }&\le \ k^\delta \sup _{u \in (0,kc] \cup [1 - kc,1)} \frac{|\mathbb {V}_{nr}(u)|}{u^\delta (1 - u)^\delta } \ \rightarrow _p \ 0 \end{aligned}$$

as \(n \rightarrow \infty \) and \(c \downarrow 0\). All in all, we may conclude that

$$\begin{aligned} \sup _{t \in (0,1)} \, \frac{|\mathbb {V}_n^\mathrm{Z}(t)|}{t^\delta (1 - t)^\delta }&= \ O_p(1) , \end{aligned}$$
(6)
$$\begin{aligned} \sup _{t \in (0,c] \cup [1-c,1)} \, \frac{|\mathbb {V}_n^\mathrm{Z}(t)|}{t^\delta (1 - t)^\delta }&\rightarrow _p \ 0 \quad \text {as} \ n \rightarrow \infty \ \text {and}\ c \downarrow 0. \end{aligned}$$
(7)

It remains to be shown that the process \(\sqrt{n} (\widehat{B}_n^\mathrm{Z} - B)\) may be approximated by \(\mathbb {V}_n^\mathrm{Z}\). In case of \(\mathrm{Z} = \mathrm{S}\) it follows from \(\sum _{r=1}^k \beta _r \equiv k\) that \(\sum _{r=1}^k B_r = k B\), and this implies that

$$\begin{aligned} \sqrt{n} (\widehat{B}_n^\mathrm{S}- B) \ = \ \sum _{r=1}^k \frac{\sqrt{n} (\widehat{B}_{nr} - B_r)}{k} \ = \ \sum _{r=1}^k \gamma _{nr}^\mathrm{S} \, \mathbb {V}_{nr} \circ B_r \ = \ \mathbb {V}_n^\mathrm{S}. \end{aligned}$$
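The display above expresses the stratified estimator, on the uniform scale, as the plain average of the k within-rank empirical distribution functions \(\widehat{B}_{nr}\). A minimal Python sketch of this reading follows; it assumes perfect rankings and nonempty strata, and the function names and stratum sizes are hypothetical.

```python
import random


def draw_ranked_observation(r, k, rng):
    """Draw the r-th order statistic (1-based) of k independent uniforms."""
    return sorted(rng.random() for _ in range(k))[r - 1]


def stratified_estimate(samples_by_rank, t):
    """Average over ranks of the within-rank empirical distribution functions at t.

    Every stratum must be nonempty, otherwise the estimator is undefined.
    """
    k = len(samples_by_rank)
    total = 0.0
    for sample in samples_by_rank:
        total += sum(1 for x in sample if x <= t) / len(sample)
    return total / k


rng = random.Random(1)  # fixed seed, chosen only for reproducibility
k = 3
sizes = [400, 100, 250]  # hypothetical unbalanced stratum sizes N_{n1}, ..., N_{nk}
samples = [
    [draw_ranked_observation(r, k, rng) for _ in range(sizes[r - 1])]
    for r in range(1, k + 1)
]

# On the uniform scale the target is B(t) = t.
for t in (0.1, 0.5, 0.9):
    print(t, stratified_estimate(samples, t))
```

The division by each stratum size reflects why \(\mathrm{Z} = \mathrm{S}\) requires all \(\pi _{nr}\) to be strictly positive.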

For \(\mathrm{Z} = \mathrm{M}, \mathrm{L}\) it suffices to show that for any fixed number \(b \ne 0\) and

$$\begin{aligned} p_n^\mathrm{Z}(t) \ := \ t + \frac{\mathbb {V}_n^\mathrm{Z}(t) + b t^\delta (1-t)^\delta }{\sqrt{n}} \end{aligned}$$

the following statements are true: If \(b < 0\), then with asymptotic probability one,

$$\begin{aligned} \left. \begin{array}{c} \displaystyle \inf _{t \in (0,1)} \left( n \widehat{B}_n(t) - \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t)) \right) \\ \displaystyle \inf _{t \in (0,1)} \, L_n'(t,p_n^\mathrm{L}(t)) \end{array}\right\} \ \ge \ 0. \end{aligned}$$
(8)

If \(b > 0\), then with asymptotic probability one,

$$\begin{aligned} \left. \begin{array}{c} \displaystyle \sup _{t \in (0,1)} \left( n \widehat{B}_n(t) - \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t)) \right) \\ \displaystyle \sup _{t \in (0,1)} \, L_n'(t,p_n^\mathrm{L}(t)) \end{array}\right\} \ \le \ 0. \end{aligned}$$
(9)

Here we use the conventions that \(L_n'(t,\cdot ) := \infty \) and \(B_r := 0\) on \((-\infty ,0]\) while \(L_n'(t,\cdot ) := -\infty \) and \(B_r := 1\) on \([1,\infty )\).

To verify these claims, we split the interval (0, 1) into \((0,c_n]\), \([c_n,1-c_n]\) and \([1-c_n,1)\) with numbers \(c_n \in (0,1/2)\) to be specified later, where \(c_n \downarrow 0\).

On \([c_n,1-c_n]\) we utilize Lemma 7: For \(t \in [c_n, 1 - c_n]\) and \(p \in (0,1)\) such that \(|p - t| \le t(1-t)/2\) we may write

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p) \\&\quad = \ \sum _{r=1}^k \sqrt{N_{nr}} \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr}(B_r(p) - B_r(t)) \\&\quad = \ \sum _{r=1}^k \sqrt{N_{nr}} \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} \beta _r(t) (p - t) + \rho _n^\mathrm{M}(t,p) \\&\quad = \ \sum _{r=1}^k N_{nr} \beta _r(t) \left( \frac{\mathbb {V}_n^\mathrm{M}(t)}{\sqrt{n}} - (p - t) \right) + \rho _n^\mathrm{M}(t,p) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p)&= \ \sum _{r=1}^k \sqrt{N_{nr}} w_r(p) \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} w_r(p) (B_r(p) - B_r(t)) \\&= \ \sum _{r=1}^k \sqrt{N_{nr}} w_r(t) \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t) (p - t) + \rho _n^\mathrm{L}(t,p) \\&= \ \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t) \left( \frac{\mathbb {V}_n^\mathrm{L}(t)}{\sqrt{n}} - (p - t) \right) + \rho _n^\mathrm{L}(t,p) , \end{aligned}$$

where

$$\begin{aligned} |\rho _n^\mathrm{M}(t,p)|&\le \ \frac{O(n) |p - t|^2}{t(1-t)} , \\ |\rho _n^\mathrm{L}(t,p)|&\le \ \frac{O_p(\sqrt{n}) t^{\delta } (1 - t)^{\delta } |p - t|}{t(1-t)} + \frac{O(n) |p - t|^2}{t^2(1-t)^2}. \end{aligned}$$

Note that for \(t \in [c_n,1-c_n]\),

$$\begin{aligned} \frac{\bigl | p_n^\mathrm{Z}(t) - t \bigr |}{t(1-t)} \ \le \ \frac{O_p(1) t^\delta (1 - t)^{\delta }}{\sqrt{n} \, t(1-t)} \ \le \ \frac{O_p(1)}{\sqrt{n} \, c_n^{1-\delta }}. \end{aligned}$$

Hence we choose \(c_n\) such that \(c_n \downarrow 0\) but \(n c_n^{2(1-\delta )} \rightarrow \infty \). With this choice, we may conclude that uniformly in \(t \in [c_n,1-c_n]\),

$$\begin{aligned} \bigl | \rho _n^\mathrm{M}(t,p_n^\mathrm{M}(t)) \bigr |&\le \ O_p(c_n^{\delta -1}) t^\delta (1-t)^\delta , \\ \bigl | \rho _n^\mathrm{L}(t,p_n^\mathrm{L}(t)) \bigr |&\le \ O_p(c_n^{\delta -1}) t^{\delta -1}(1-t)^{\delta -1}. \end{aligned}$$
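For concreteness, one admissible choice (an illustration, not prescribed by the argument) is \(c_n := n^{-a}/2\) for \(n \ge 2\) with a fixed exponent \(0< a < 1/(2(1-\delta ))\). Then \(c_n \in (0,1/2)\), \(c_n \downarrow 0\), and

$$\begin{aligned} n c_n^{2(1-\delta )} \ = \ 2^{-2(1-\delta )} \, n^{1 - 2a(1-\delta )} \ \rightarrow \ \infty , \qquad c_n^{\delta -1} n^{-1/2} \ = \ 2^{1-\delta } \, n^{a(1-\delta ) - 1/2} \ \rightarrow \ 0 , \end{aligned}$$

so remainder factors of order \(O_p(c_n^{\delta -1} n^{-1/2})\) are \(o_p(1)\).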

On the other hand, since \(\beta _1(t) + \beta _k(t) \ge \beta _1(1/2) + \beta _k(1/2) = k 2^{2-k}\),

$$\begin{aligned} \sum _{r=1}^k N_{nr} \beta _r(t)&\ge \ k 2^{2 - k} \min \{N_{n1},N_{nk}\} , \\ \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t)&\ge \ \frac{k 2^{2-k} c_w}{t(1-t)} \, \min \{N_{n1},N_{nk}\}. \end{aligned}$$
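The lower bound for \(\beta _1 + \beta _k\) used here can be verified directly, assuming the explicit densities \(\beta _1(t) = k(1-t)^{k-1}\) and \(\beta _k(t) = k t^{k-1}\) of the minimum and maximum of k uniforms. Since \(x \mapsto x^{k-1}\) is convex on [0, 1],

$$\begin{aligned} \beta _1(t) + \beta _k(t) \ = \ 2k \cdot \frac{(1-t)^{k-1} + t^{k-1}}{2} \ \ge \ 2k \left( \frac{(1-t) + t}{2} \right) ^{k-1} \ = \ k 2^{2-k} , \end{aligned}$$

with equality at \(t = 1/2\).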

Consequently,

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t)) \\&\quad = \ \sum _{r=1}^k N_{nr} \beta _r(t) \frac{- b t^\delta (1-t)^\delta }{\sqrt{n}} + \rho _n^\mathrm{M}(t,p_n^\mathrm{M}(t)) \\&\quad = \ \sum _{r=1}^k N_{nr} \beta _r(t) \frac{t^\delta (1-t)^\delta }{\sqrt{n}} \left( - b + O_p(c_n^{\delta -1} n^{-1/2}) \kappa _n^\mathrm{M}(t) \right) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p_n^\mathrm{L}(t))&= \ \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t) \frac{-b t^\delta (1-t)^\delta }{\sqrt{n}} + \rho _n^\mathrm{L}(t,p_n^\mathrm{L}(t)) \\&= \ \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t) \frac{t^\delta (1-t)^\delta }{\sqrt{n}} \Bigg ( - b + O_p\left( c_n^{\delta -1} n^{-1/2}\right) \kappa _n^\mathrm{L}(t) \Bigg ) \end{aligned}$$

for some random functions \(\kappa _n^\mathrm{M}, \kappa _n^\mathrm{L} : [c_n,1-c_n] \rightarrow [-1,1]\). These considerations show that (8) and (9) are satisfied with \([c_n,1-c_n]\) in place of (0, 1).

It remains to verify (8) and (9) with \((0,c_n]\) in place of (0, 1); the interval \([1-c_n,1)\) may be treated analogously. Note first that for \(2 \le r \le k\),

$$\begin{aligned} B_r(t) \ \le \ B_2(t) \ \le \ k(k-1) t^2/2 \quad \text {and}\quad \beta _r(t) \ \le \ k 2^{k-1} t , \end{aligned}$$

so

$$\begin{aligned} \bigl | B_r(p) - B_r(t) \bigr | \ = \ \Bigl | \int _t^p \beta _r(u) \, \mathrm{d}u \Bigr | \ \le \ O(\max (p,t)) (p - t). \end{aligned}$$

Furthermore, since \(B_1(t) = 1 - (1 - t)^k\),

$$\begin{aligned} B_1(p) - B_1(t) \ = \ k (p - t) + O(\max (t,p)) (p - t). \end{aligned}$$

Hence for \(t \in (0,c_n]\) and \(p \in (0,2c_n]\),

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p) \\&\quad = \ \sum _{r=1}^k \sqrt{N_{nr}} \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr}(B_r(p) - B_r(t)) \\&\quad = \ - N_{n1} k (p - t) + \rho _n^\mathrm{M}(t,p) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p)&= \ \sum _{r=1}^k \sqrt{N_{nr}} w_r(p) \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} w_r(p) (B_r(p) - B_r(t)) \\&= \ - N_{n1} w_1(p) k (p - t) + \rho _n^\mathrm{L}(t,p) , \end{aligned}$$

where

$$\begin{aligned} |\rho _n^\mathrm{M}(t,p)|&\le \ o_p(\sqrt{n}) t^\delta + O(n c_n) (p - t) , \\ |\rho _n^\mathrm{L}(t,p)|&\le \ o_p(\sqrt{n}) p^{-1} t^\delta + O(n c_n) p^{-1} (p - t). \end{aligned}$$

Note also that

$$\begin{aligned} \sup _{t \in (0,c_n]} \Bigl | \frac{\sqrt{n}(p_n^\mathrm{Z}(t) - t)}{t^\delta (1-t)^\delta } - b \Bigr | \ \rightarrow _p \ 0. \end{aligned}$$

In particular, \(\sup _{t \in (0,c_n]} p_n^\mathrm{Z}(t) = c_n + O_p(n^{-1/2} c_n^\delta ) = c_n (1 + o_p(1))\), and in case of \(b > 0\), \(\mathop {\mathrm {I\!P}}\nolimits \bigl ( p_n^\mathrm{Z}(t) > 0 \ \text {for} \ 0 < t \le c_n \bigr ) \rightarrow 1\).

In case of \(b > 0\), these considerations show that for \(0 < t \le c_n\),

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t)) \\&\quad = \ - N_{n1} k (p_n^\mathrm{M}(t) - t) + \rho _n^\mathrm{M}(t,p_n^\mathrm{M}(t)) \\&\quad \le \ \frac{N_{n1} k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( -b + o_p(1) \bigr ) + o_p(\sqrt{n}) t^\delta + O(\sqrt{n} c_n) t^\delta \\&\quad \le \ \frac{N_{n1} k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p_n^\mathrm{L}(t))&= \ - N_{n1} w_1(p) k (p_n^\mathrm{L}(t) - t) + \rho _n^\mathrm{L}(t,p_n^\mathrm{L}(t)) \\&\le \ \frac{N_{n1} w_1(p) k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ) + o_p(\sqrt{n}) p^{-1} t^\delta \\&\quad + O(\sqrt{n} c_n) p^{-1} t^\delta \\&\le \ \frac{N_{n1} w_1(p) k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ). \end{aligned}$$

Analogously, in case of \(b < 0\), for any \(t \in (0,c_n]\) we obtain the inequalities

$$\begin{aligned} n \widehat{B}_n(t) \!-\! \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t))&\ge {\left\{ \begin{array}{ll} \displaystyle \frac{N_{n1} k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ) &{} \text {if} \ p_n^\mathrm{M}(t)> 0 , \\ 0 &{} \text {if} \ p_n^\mathrm{M}(t) \le 0 , \end{array}\right. } \\ L_n'(t,p_n^\mathrm{L}(t))&\ge {\left\{ \begin{array}{ll} \displaystyle \frac{N_{n1} w_1(p) k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ) &{} \text {if} \ p_n^\mathrm{L}(t) > 0 , \\ \infty &{} \text {if} \ p_n^\mathrm{L}(t) \le 0. \end{array}\right. } \end{aligned}$$

Hence, (8) and (9) are satisfied with \((0,c_n]\) in place of (0, 1). \(\square \)

Proof of Theorem 3

For symmetry reasons it suffices to prove the first part about the left tails. Let \((c_n)_n\) be a sequence of numbers in (0, 1 / 2] converging to zero. Then for \(t \in (0,c_n]\) and \(\delta := \kappa /2 \in (0,1/2)\),

$$\begin{aligned} \bigl | \sqrt{n} \bigl ( \widehat{B}_n^\mathrm{S}(t) - t \bigr ) - \mathbb {V}_n^{(\ell )}(t) \bigr | \ = \ \left| \sum _{r=2}^k \frac{\mathbb {V}_{nr}(B_r(t))}{k \sqrt{N_{nr}/n}} \right| \ \le \ t^{2 \delta } o_p(1) \ = \ t^\kappa o_p(1). \end{aligned}$$

Concerning \(\widehat{B}_n^\mathrm{M}\) and \(\widehat{B}_n^\mathrm{L}\), for any \(t \in (0,c_n]\) and \(p \in (0,1)\),

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p) \\&\quad = \ \sum _{r=1}^k \sqrt{N_{nr}} \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr}(B_r(p) - B_r(t)) \\&\quad = \ \sqrt{N_{n1}} \mathbb {V}_{n1}(B_1(t)) - N_{n1} k(p - t) + \rho _n^\mathrm{M}(t,p) \\&\quad = \ N_{n1} k \Bigl ( \frac{\mathbb {V}_{n1}(B_1(t))}{k \sqrt{N_{n1}}} - (p - t) \Bigr ) + \rho _n^\mathrm{M}(t,p) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p)&= \ \sum _{r=1}^k \sqrt{N_{nr}} w_r(p) \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} w_r(p) (B_r(p) - B_r(t)) \\&= \ \sqrt{N_{n1}} w_1(p) \mathbb {V}_{n1}(B_1(t)) - N_{n1} w_1(p) k (p - t) + \rho _n^\mathrm{L}(t,p) \\&= \ N_{n1} k w_1(p) \left( \frac{\mathbb {V}_{n1}(B_1(t))}{k \sqrt{N_{n1}}} - (p - t) \right) + \rho _n^\mathrm{L}(t,p) , \end{aligned}$$

where

$$\begin{aligned} |\rho _n^\mathrm{M}(t,p)|&\le \ o_p(\sqrt{n}) t^{2\delta } + O(n) \max (t,p) (p - t) , \\ |\rho _n^\mathrm{L}(t,p)|&\le \ o_p(\sqrt{n}) p^{-1} t^{2\delta } + O(n) p^{-1} \max (t,p) (p - t). \end{aligned}$$

Now we proceed similarly as in the proof of Theorem 2, defining

$$\begin{aligned} p_n(t) \ := \ t + \frac{\mathbb {V}_n^{(\ell )}(t) + b t^\kappa }{\sqrt{n}} \end{aligned}$$

for some fixed \(b \ne 0\). Note that for \(t \in (0,c_n]\),

$$\begin{aligned} |p_n(t) - t| \ \le \ o_p(n^{-1/2}) t^\delta + O(n^{-1/2}) t^\kappa \ = \ o_p(n^{-1/2}) t^\delta , \end{aligned}$$

because \(\kappa > \delta \). Note also that

$$\begin{aligned} t + \frac{\mathbb {V}_n^{(\ell )}(t)}{\sqrt{n}} \ = \ t + \frac{\mathbb {V}_{n1}(B_1(t))}{k \sqrt{N_{n1}}} \ = \ t - \frac{1 - (1-t)^k}{k} + \frac{\widehat{B}_{n1}(t)}{k} \ > \ 0 \quad \text {on} \ (0,1) , \end{aligned}$$

because \(\widehat{B}_{n1} \ge 0\) and \(t \mapsto t - (1 - (1-t)^k)/k\) is strictly convex on [0, 1] with derivative 0 at 0. Thus \(p_n(t) > 0\) for all \(t \in (0,c_n]\) in case of \(b > 0\).

In case of \(b > 0\), we may conclude that

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p_n(t)) \\&\quad = \ N_{n1} k \frac{-b t^\kappa }{\sqrt{n}} + \rho _n^\mathrm{M}(t,p_n(t)) \\&\quad \le \ \frac{N_{n1} k}{\sqrt{n}} \bigl ( -b t^\kappa + o_p(1) t^{2\delta } + O(1) (t + o_p(n^{-1/2}) t^\delta ) t^\delta \bigr ) \\&\quad \le \ \frac{N_{n1} k t^\kappa }{\sqrt{n}} \bigl ( -b + o_p(1) \bigr ) , \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p_n(t))&\le \ \frac{N_{n1} k w_1(p) t^\kappa }{\sqrt{n}} \bigl ( -b + o_p(1) \bigr ). \end{aligned}$$

Hence for any fixed \(b > 0\),

$$\begin{aligned} \mathop {\mathrm {I\!P}}\nolimits \left( \sqrt{n} (\widehat{B}_n^\mathrm{Z}(t) - t) \le \mathbb {V}_n^{(\ell )}(t) + b t^\kappa \ \text {for} \ t \in (0,c_n] \right) \ \rightarrow \ 1. \end{aligned}$$

Similarly we can show that for any fixed \(b < 0\), with asymptotic probability one, \(\sqrt{n} (\widehat{B}_n^\mathrm{Z}(t) - t) \ge \mathbb {V}_n^{(\ell )}(t) + b t^\kappa \) for all \(t \in (0,c_n]\). \(\square \)

Proof of Corollary 4

It follows from Proposition 6 that

$$\begin{aligned} \sup _{1 \le r \le k, \, u \in [0,1]} |\mathbb {V}_{nr}(u)| \ = \ O_p(1). \end{aligned}$$

Together with (5) this entails that \(\sup _{t \in [0,1]} \bigl | \mathbb {V}_n^\mathrm{Z}(t) - \widetilde{\mathbb {V}}_n^\mathrm{Z}(t) \bigr | \rightarrow _p 0\), where \(\widetilde{\mathbb {V}}_n^\mathrm{Z} := \sum _{r=1}^k \gamma _r^\mathrm{Z} \, \mathbb {V}_{nr} \circ B_r\). But \(\gamma _r^\mathrm{Z} \equiv 0\) whenever \(\pi _r = 0\). In case of \(\pi _r > 0\) it follows from Proposition 6 that \(\mathbb {V}_{nr}\) converges in distribution to \(\mathbb {V}_r\). Consequently, \(\widetilde{\mathbb {V}}_n^\mathrm{Z}\) converges in distribution to the Gaussian process \(\mathbb {V}^\mathrm{Z} = \sum _{r=1}^k \gamma _r^\mathrm{Z} \, \mathbb {V}_r \circ B_r\). \(\square \)

Proof of Theorem 5

The asserted inequalities follow from Jensen’s inequality. On the one hand, it follows from \(w_r = \beta _r / (B_r(1 - B_r))\) and \(\sum _{r=1}^k \beta _r \equiv k\) that

$$\begin{aligned} K^\mathrm{S}(t)&= \ \frac{1}{k} \sum _{r=1}^k \frac{\beta _r(t)}{k} \cdot (\pi _r w_r(t))^{-1} \\&\ge \ \frac{1}{k} \left( \sum _{r=1}^k \frac{\beta _r(t)}{k} \cdot \pi _r w_r(t) \right) ^{-1} \\&= \ \left( \sum _{r=1}^k \pi _r \beta _r(t) w_r(t) \right) ^{-1} \ = \ K^\mathrm{L}(t). \end{aligned}$$

Equality holds if, and only if,

$$\begin{aligned} \pi _1 w_1(t) = \pi _2 w_2(t) = \cdots = \pi _k w_k(t). \end{aligned}$$

But

$$\begin{aligned} w_1(t) \ = \ \frac{k}{(1-t)( 1 - (1 - t)^k)} \quad \text {and}\quad w_k(t) \ = \ \frac{k}{t(1 - t^k)} , \end{aligned}$$

so

$$\begin{aligned} \frac{w_k(t)}{w_1(t)} \ = \ \frac{(1-t)(1 - (1-t)^k)}{t(1-t^k)} \ = \ \frac{\sum _{j=0}^{k-1} (1-t)^j}{\sum _{j=0}^{k-1} t^j} \end{aligned}$$

is strictly decreasing in t. Hence there is at most one solution of the equation \(\pi _1 w_1(t) = \pi _k w_k(t)\).

Similarly, with \(a_r(t) := \pi _r \beta _r(t) \big / \sum _{s=1}^k \pi _s \beta _s(t)\),

$$\begin{aligned} K^\mathrm{M}(t)&= \ \sum _{r=1}^k \pi _r \beta _r(t) \cdot w_r(t)^{-1} \Big / \left( \sum _{s=1}^k \pi _s \beta _s(t) \right) ^2 \\&= \ \sum _{r=1}^k a_r(t) \cdot w_r(t)^{-1} \Big / \sum _{s=1}^k \pi _s \beta _s(t) \\&\ge \ \left( \sum _{r=1}^k a_r(t) w_r(t) \right) ^{-1} \Big / \sum _{s=1}^k \pi _s \beta _s(t) \\&= \ \left( \sum _{r=1}^k \pi _r \beta _r(t) w_r(t) \right) ^{-1} \ = \ K^\mathrm{L}(t). \end{aligned}$$

Here the inequality is strict unless

$$\begin{aligned} w_1(t) = w_2(t) = \cdots = w_k(t). \end{aligned}$$

But \(w_1(t) = w_k(t)\) implies that \(t = 1/2\). Moreover, \(w_1(1/2) = 2k/(1 - 2^{-k})\) and

$$\begin{aligned} w_{k-1}(1/2) \ = \ \frac{2k(k-1)}{(k+1)(1 - (k+1)2^{-k})} \end{aligned}$$

are identical if, and only if, \(k^2 + k + 2 = 2^{k+1}\). But \(2^{k+1} = 2 \sum _{j=0}^k \binom{k}{j}\) is strictly larger than \(2(1 + k + k(k-1)/2) = k^2 + k + 2\) if \(k \ge 3\).
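The two inequalities \(K^\mathrm{S} \ge K^\mathrm{L}\) and \(K^\mathrm{M} \ge K^\mathrm{L}\) can be explored numerically. The sketch below evaluates the three factors from the formulas in this proof, rewritten via \(\beta _r / w_r = B_r(1 - B_r)\); it assumes that \(\beta _r\) and \(B_r\) are the Beta(r, k+1-r) density and distribution function, and the function names and example proportions are hypothetical.

```python
import math


def beta_density(r, k, t):
    """Density of the r-th order statistic of k uniforms, i.e. Beta(r, k+1-r)."""
    return r * math.comb(k, r) * t ** (r - 1) * (1 - t) ** (k - r)


def beta_cdf(r, k, t):
    """B_r(t) = P(at least r of k independent uniforms are <= t)."""
    return sum(math.comb(k, j) * t ** j * (1 - t) ** (k - j) for j in range(r, k + 1))


def variance_factors(pi, t):
    """Return (K_S, K_M, K_L) at t for strictly positive proportions pi_1..pi_k."""
    k = len(pi)
    beta = [beta_density(r, k, t) for r in range(1, k + 1)]
    cdf = [beta_cdf(r, k, t) for r in range(1, k + 1)]
    var = [b * (1 - b) for b in cdf]  # B_r (1 - B_r) = beta_r / w_r
    k_s = sum(v / (k * k * p) for v, p in zip(var, pi))
    k_m = sum(p * v for p, v in zip(pi, var)) / sum(p * b for p, b in zip(pi, beta)) ** 2
    k_l = 1.0 / sum(p * b * b / v for p, b, v in zip(pi, beta, var))
    return k_s, k_m, k_l


# Balanced design with k = 3: the K_L entry is the smallest in each row.
for t in (0.1, 0.3, 0.5):
    print(t, variance_factors((1 / 3, 1 / 3, 1 / 3), t))

# For k = 2 and t = 1/2 the weights w_1, w_2 coincide, so K_M equals K_L.
print(variance_factors((0.3, 0.7), 0.5))
```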

As to the ratios \(E^\mathrm{Z}(t) := K^\mathrm{Z}(t)/K^\mathrm{L}(t)\), note first that

$$\begin{aligned} E^\mathrm{S}(t)&= \ \sum _{r=1}^k \frac{B_r(t)(1 - B_r(t))}{k^2 \pi _r} \sum _{s=1}^k \pi _s \beta _s(t) w_s(t) \\&\ge \ \min _{r,s=1,\ldots ,k} \frac{B_r(t)(1 - B_r(t)) \beta _s(t) w_s(t)}{k^2} \Big / \min _{r=1,\ldots ,k} \pi _r \\&\rightarrow \ \infty \quad \text {as} \ \min _{r=1,\ldots ,k} \pi _r \downarrow 0. \end{aligned}$$

On the other hand, with \(a_r(t)\) as above,

$$\begin{aligned} E^\mathrm{M}(t) \ = \ \sum _{r=1}^k a_r(t) w_r(t)^{-1} \sum _{s=1}^k a_s(t) w_s(t) \ = \ \mathop {\mathrm {I\!E}}\nolimits (W) \mathop {\mathrm {I\!E}}\nolimits (W^{-1}) \end{aligned}$$

with a random variable W with distribution \(\sum _{r=1}^k a_r(t) \delta _{w_r(t)}\). But with \(\ell (t) := \min _r w_r(t)\) and \(u(t) := \max _r w_r(t)\), convexity of \(w \mapsto w^{-1}\) on \([\ell (t),u(t)]\) implies that

$$\begin{aligned} W^{-1} \ \le \ \frac{W - \ell (t)}{u(t) - \ell (t)} u(t)^{-1} + \frac{u(t) - W}{u(t) - \ell (t)} \ell (t)^{-1} , \end{aligned}$$

so

$$\begin{aligned} \mathop {\mathrm {I\!E}}\nolimits (W) \mathop {\mathrm {I\!E}}\nolimits (W^{-1})&\le \ \mathop {\mathrm {I\!E}}\nolimits (W) \left( \frac{\mathop {\mathrm {I\!E}}\nolimits (W) - \ell (t)}{u(t) - \ell (t)} u(t)^{-1} + \frac{u(t) - \mathop {\mathrm {I\!E}}\nolimits (W)}{u(t) - \ell (t)} \ell (t)^{-1} \right) \\&= \ \frac{\mathop {\mathrm {I\!E}}\nolimits (W) (\ell (t) + u(t) - \mathop {\mathrm {I\!E}}\nolimits (W))}{\ell (t) u(t)} \\&\le \ \frac{(\ell (t) + u(t))^2}{4 \ell (t)u(t)} \ = \ \frac{\rho (t) + \rho (t)^{-1} + 2}{4}. \end{aligned}$$
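The bound \(\mathop {\mathrm {I\!E}}\nolimits (W) \mathop {\mathrm {I\!E}}\nolimits (W^{-1}) \le (\ell + u)^2/(4 \ell u)\) for a random variable W with values in \([\ell ,u]\) can be illustrated with a few randomly generated discrete distributions. This Python sketch is a numerical check only; the support points, weights and function names are arbitrary assumptions.

```python
import random


def product_of_means(weights, values):
    """Return E(W) * E(1/W) for W taking values[i] with probability weights[i]."""
    mean = sum(a * w for a, w in zip(weights, values))
    mean_inv = sum(a / w for a, w in zip(weights, values))
    return mean * mean_inv


def upper_bound(values):
    """(l + u)^2 / (4 l u) with l = min and u = max of the support."""
    low, high = min(values), max(values)
    return (low + high) ** 2 / (4 * low * high)


rng = random.Random(2)  # fixed seed, chosen only for reproducibility
for _ in range(5):
    values = [rng.uniform(1.0, 4.0) for _ in range(4)]
    raw = [rng.random() for _ in range(4)]
    weights = [x / sum(raw) for x in raw]
    print(round(product_of_means(weights, values), 4), "<=", round(upper_bound(values), 4))

# Equal weights on the two extreme points attain the bound.
print(product_of_means([0.5, 0.5], [1.0, 4.0]), upper_bound([1.0, 4.0]))
# prints "1.5625 1.5625"
```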

This upper bound for \(E^\mathrm{M}(t)\) is attained approximately, if the distribution of W approaches the uniform distribution on \(\{\ell (t),u(t)\}\). Hence we should choose \((\pi _r)_{r=1}^k\) as follows: Let r(1), r(2) be two different numbers in \(\{1,\ldots ,k\}\) such that \(w_{r(1)}(t) = \ell (t)\) and \(w_{r(2)}(t) = u(t)\). Then let

$$\begin{aligned} \pi _r \ \approx \ {\left\{ \begin{array}{ll} \beta _{r}(t)^{-1}/(\beta _{r(1)}(t)^{-1} + \beta _{r(2)}(t)^{-1}) &{}\text {for} \ r \in \{r(1),r(2)\} , \\ 0 &{}\text {for} \ r \not \in \{r(1),r(2)\}. \end{array}\right. } \end{aligned}$$

The inequality \(\rho (t) \le k\) follows from Lemma 7 and the fact that \(\rho (t)\) remains unchanged if we replace \(w_r(t)\) with \(\widetilde{w}_r(t) = t(1-t) w_r(t) \in [1,k]\). \(\square \)

About this article

Cite this article

Dümbgen, L., Zamanzade, E. Inference on a distribution function from ranked set samples. Ann Inst Stat Math 72, 157–185 (2020). https://doi.org/10.1007/s10463-018-0680-y

  • DOI: https://doi.org/10.1007/s10463-018-0680-y
