Inference on a distribution function from ranked set samples

Dümbgen, Lutz; Zamanzade, Ehsan

doi:10.1007/s10463-018-0680-y

Published: 20 July 2018

Volume 72, pages 157–185, (2020)
Annals of the Institute of Statistical Mathematics

Abstract

Consider independent observations $(X_i,R_i)$ with random or fixed ranks $R_i$, while conditional on $R_i$, the random variable $X_i$ has the same distribution as the $R_i$-th order statistic within a random sample of size k from an unknown distribution function F. Such observation schemes are well known from ranked set sampling and judgment post-stratification. Within a general, not necessarily balanced setting we derive and compare the asymptotic distributions of three different estimators of the distribution function F: a stratified estimator, a nonparametric maximum-likelihood estimator and a moment-based estimator. Our functional central limit theorems generalize and refine previous asymptotic analyses. In addition, we discuss briefly pointwise and simultaneous confidence intervals for the distribution function with guaranteed coverage probability for finite sample sizes. The methods are illustrated with a real data example, and the potential impact of imperfect rankings is investigated in a small simulation experiment.

References

Balakrishnan, N., Li, T. (2006). Confidence intervals for quantiles and tolerance intervals based on ordered ranked set samples. Annals of the Institute of Statistical Mathematics, 58, 757–777.
Bhoj, D. S. (2001). Ranked set sampling with unequal samples. Biometrics, 57(3), 957–962.
Chen, Z. (2001). Non-parametric inferences based on general unbalanced ranked-set samples. Journal of Nonparametric Statistics, 13(2), 291–310.
Chen, Z., Bai, Z., Sinha, B. K. (2004). Ranked set sampling. Theory and applications. New York: Springer.
Clopper, C. J., Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26(4), 404–413.
Dastbaravarde, A., Arghami, N. R., Sarmad, M. (2016). Some theoretical results concerning non parametric estimation by using a judgment poststratification sample. Communications in Statistics, Theory and Methods, 45(8), 2181–2203.
David, H. A., Nagaraja, H. N. (2003). Order statistics (3rd ed.). Hoboken, NJ: Wiley-Interscience.
Dell, T. R., Clutter, J. L. (1972). Ranked set sampling theory with order statistics background. Biometrics, 28(2), 545–555.
Frey, J., Ozturk, O. (2011). Constrained estimation using judgement post-stratification. Annals of the Institute of Statistical Mathematics, 63, 769–789.
Ghosh, K., Tiwari, R. C. (2008). Estimating the distribution function using $k$-tuple ranked set samples. Journal of Statistical Planning and Inference, 138(4), 929–949.
Huang, J. (1997). Properties of the NPMLE of a distribution function based on ranked set samples. Annals of Statistics, 25(3), 1036–1049.
Kvam, P. H., Samaniego, F. J. (1994). Nonparametric maximum likelihood estimation based on ranked set samples. Journal of the American Statistical Association, 89(426), 526–537.
MacEachern, S. N., Stasny, E. A., Wolfe, D. A. (2004). Judgement post-stratification with imprecise rankings. Biometrics, 60, 207–215.
McIntyre, G. A. (1952). A method of unbiased selective sampling, using ranked sets. Australian Journal of Agricultural Research, 3, 385–390.
Presnell, B., Bohn, L. L. (1999). U-Statistics and imperfect ranking in ranked set sampling. Journal of Nonparametric Statistics, 10(2), 111–126.
R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/. Accessed July 2018.
Shorack, G. R., Wellner, J. A. (1986). Empirical processes with applications to statistics. New York: Wiley.
Stokes, S. L., Sager, T. W. (1988). Characterization of a ranked-set sample with application to estimating distribution functions. Journal of the American Statistical Association, 83(402), 374–381.
Terpstra, J. T., Miller, Z. A. (2006). Exact inference for a population proportion based on a ranked set sample. Communications in Statistics, Simulation and Computation, 35(1), 19–26.
Wang, X., Wang, K., Lim, J. (2012). Isotonized CDF estimation from judgement poststratification data with empty strata. Biometrics, 68(1), 194–202.
Wolfe, D. A. (2004). Ranked set sampling: An approach to more efficient data collection. Statistical Science, 19(4), 636–643.
Wolfe, D. A. (2012). Ranked set sampling: Its relevance and impact on statistical inference. ISRN Probability and Statistics, 2012, 568385. https://doi.org/10.5402/2012/568385.

Acknowledgements

Constructive comments by an associate editor and two referees are gratefully acknowledged.

Author information

Authors and Affiliations

Institute of Mathematical Statistics and Actuarial Science, University of Bern, Alpeneggstrasse 22, 3012, Bern, Switzerland
Lutz Dümbgen
Department of Statistics, University of Isfahan, Isfahan, 81746-73441, Iran
Ehsan Zamanzade

Corresponding author

Correspondence to Ehsan Zamanzade.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1016 KB)

Appendix

We first recall two well-known facts about uniform empirical processes, see Shorack and Wellner (1986).

Proposition 6

Let $U_1, U_2, U_3, \ldots $ be independent random variables with uniform distribution on [0, 1]. For $N \in \mathbb {N}$ and $u \in [0,1]$ define

$$\begin{aligned} \mathbb {V}^{(N)}(u) \ := \ N^{-1/2} \sum _{i=1}^N \bigl ( 1 \{U_i \le u\} - u \bigr ) . \end{aligned}$$

Then, as $N \rightarrow \infty $, $\mathbb {V}^{(N)}$ converges in distribution in $\ell _\infty ([0,1])$ to a standard Brownian bridge $\mathbb {V}$ on [0, 1]. Moreover, for any fixed $\delta \in [0,1/2)$ and $\epsilon > 0$,

$$\begin{aligned} \sup _{N \ge 1} \mathop {\mathrm {I\!P}}\nolimits \left( \sup _{u \in (0,1)} \frac{|\mathbb {V}^{(N)}(u)|}{u^\delta (1 - u)^\delta } \ge C \right)&\rightarrow \ 0 \quad \text {as} \ C \uparrow \infty , \\ \sup _{N \ge 1} \mathop {\mathrm {I\!P}}\nolimits \left( \sup _{u \in (0,c] \cup [1-c,1)} \frac{|\mathbb {V}^{(N)}(u)|}{u^\delta (1 - u)^\delta } \ge \epsilon \right)&\rightarrow \ 0 \quad \text {as} \ c \downarrow 0. \end{aligned}$$
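As an informal illustration of Proposition 6 (not part of the original argument), the following Python sketch simulates the uniform empirical process and approximates the weighted supremum on a finite grid. The function name `weighted_sup_norm`, the grid size, the value $\delta = 0.25$, the sample sizes and the seed are illustrative choices made here; a grid can only approximate the supremum over $(0,1)$ from below.

```python
import bisect
import random
from math import sqrt

def weighted_sup_norm(sample, delta, grid_size=2000):
    """Grid approximation of sup_u |V^(N)(u)| / (u^delta (1 - u)^delta)."""
    n = len(sample)
    xs = sorted(sample)
    best = 0.0
    for j in range(1, grid_size):
        u = j / grid_size
        ecdf = bisect.bisect_right(xs, u) / n   # fraction of U_i <= u
        v = sqrt(n) * (ecdf - u)                # V^(N)(u)
        best = max(best, abs(v) / (u * (1 - u)) ** delta)
    return best

rng = random.Random(0)   # fixed seed, for reproducibility only
delta = 0.25             # any fixed value in [0, 1/2)
largest = {}
for n in (100, 1000, 10000):
    draws = [weighted_sup_norm([rng.random() for _ in range(n)], delta)
             for _ in range(20)]
    largest[n] = max(draws)   # largest weighted sup over 20 replications
    print(n, round(largest[n], 3))
```

If the first display of the proposition holds, the printed values should stay of the same order as $N$ grows rather than increase with $\sqrt{N}$.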

For the estimators $\widehat{F}_n^\mathrm{M}$, $\widehat{F}_n^\mathrm{L}$ we need some basic facts and inequalities for the auxiliary functions $w_r$ and $B_r$, $1 \le r \le k$, which are proved in the supplement:

Lemma 7

(a) For $r = 1,2,\ldots ,k$, the function $w_r$ on (0, 1) may be written as $w_r(t) = \widetilde{w}_r(t) / (t(1-t))$ with $\widetilde{w}_r : [0,1] \rightarrow (0,\infty )$ continuously differentiable. Moreover, for $r = 1,2,\ldots ,k$ and $t \in (0,1)$,

$$\begin{aligned} 1 \ \le \ \widetilde{w}_r(t) \ \le \ \max (r,k+1-r). \end{aligned}$$

(b) For any constant $c \in (0,1)$ there exists a number $c' = c'(k,c) > 0$ with the following property: If $t,p \in (0,1)$ such that

$$\begin{aligned} \frac{|p - t|}{t(1-t)} \ \le \ c , \end{aligned}$$

then for $r = 1,2,\ldots ,k$,

$$\begin{aligned} \max \left\{ \left| \frac{w_r(p)}{w_r(t)} - 1 \right| , \left| \frac{B_r(p) - B_r(t)}{\beta _r(t) (p - t)} - 1 \right| \right\} \ \le \ c' \frac{|p - t|}{t(1-t)}. \end{aligned}$$
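The bounds in Lemma 7(a) can be checked numerically. The sketch below assumes that $\beta_r$ and $B_r$ are the density and distribution function of the $r$-th order statistic of $k$ independent uniform variables (consistent with $B_1(t) = 1 - (1-t)^k$ used later) and that $w_r = \beta_r / (B_r (1 - B_r))$, as stated in the proof of Theorem 5. The helper names, the choice $k = 5$ and the grid are hypothetical, and a grid check is of course no substitute for the proof in the supplement.

```python
from math import comb

def beta_density(r, k, t):
    """Density beta_r(t) of the r-th order statistic of k uniform variables."""
    return k * comb(k - 1, r - 1) * t ** (r - 1) * (1 - t) ** (k - r)

def beta_cdf(r, k, t):
    """B_r(t) = P(Binomial(k, t) >= r)."""
    return sum(comb(k, j) * t ** j * (1 - t) ** (k - j) for j in range(r, k + 1))

def beta_sf(r, k, t):
    """1 - B_r(t), summed directly to avoid cancellation near t = 1."""
    return sum(comb(k, j) * t ** j * (1 - t) ** (k - j) for j in range(r))

def w_tilde(r, k, t):
    """t (1 - t) w_r(t) with w_r = beta_r / (B_r (1 - B_r))."""
    return (t * (1 - t) * beta_density(r, k, t)
            / (beta_cdf(r, k, t) * beta_sf(r, k, t)))

k = 5
grid = [j / 200 for j in range(1, 200)]
ranges = {}
for r in range(1, k + 1):
    values = [w_tilde(r, k, t) for t in grid]
    ranges[r] = (min(values), max(values))
    print(r, ranges[r], "claimed upper bound:", max(r, k + 1 - r))
```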

Proof of Theorem 2

We start with the weight functions $\gamma _{nr}^\mathrm{Z}$: Note that by Lemma 7,

$$\begin{aligned} \gamma _{nr}^\mathrm{S}(t)&= \ \frac{1}{k \sqrt{\pi _{nr}}} , \\ \gamma _{nr}^\mathrm{M}(t)&= \ \sqrt{\pi _{nr}} \bigr / \sum _{s=1}^k \pi _{ns} \beta _s(t) , \\ \gamma _{nr}^\mathrm{L}(t)&= \sqrt{\pi _{nr}}\, \widetilde{w}_r(t) \Big / \sum _{s=1}^k \pi _{ns} \widetilde{w}_s(t) \beta _s(t) \end{aligned}$$

with the probability weights $\pi _{nr} := N_{nr}/n$ and continuous functions $\widetilde{w}_r : [0,1] \rightarrow [1,k]$. Since the beta densities $\beta _r$ are also continuous with $\beta _1(0) = \beta _k(1) = k$, this shows that $\gamma _{nr}^\mathrm{Z}$ is well-defined and continuous, provided that its denominator is strictly positive, i.e.,

$$\begin{aligned} {\left\{ \begin{array}{ll} \pi _{n1}, \ldots , \pi _{nk}> 0 &{} \text {if} \ \mathrm{Z} = \mathrm{S} , \\ \pi _{n1}, \pi _{nk} > 0 &{} \text {if} \ \mathrm{Z} = \mathrm{M}, \mathrm{L}. \end{array}\right. } \end{aligned}$$

For sufficiently large n this is the case, because $\lim _{n\rightarrow \infty } \pi _{nr} = \pi _r$ for all r. The functions $\gamma _r^\mathrm{Z}$ in Corollary 4 are continuous, too, and elementary considerations reveal that

$$\begin{aligned} \max _{t \in [0,1], \, 1 \le r \le k} \bigl | \gamma _{nr}^\mathrm{Z}(t) - \gamma _r^\mathrm{Z}(t) \bigr | \ \rightarrow \ 0 \end{aligned}$$

(5)

as $n \rightarrow \infty $. In particular, $\max _{t \in [0,1], 1 \le r \le k} \gamma _{nr}^\mathrm{Z}(t) = O(1)$.

Note that for $n \ge 1$ and $1 \le r \le k$, the empirical process $\mathbb {V}_{nr}$ is distributed as $\mathbb {V}^{(N_{nr})}$ in Proposition 6. Note also that the distribution functions $B_r$ satisfy $B_1 \ge B_2 \ge \cdots \ge B_k$, because for $1 \le r < k$ the density ratio $\beta _{r+1}/\beta _r$ is a positive multiple of $t/(1 - t)$ and thus strictly increasing. Consequently, for $1 \le r \le k$,

$$\begin{aligned} B_r(t) \ \le \ B_1(t) \ \le \ kt \quad \text {and}\quad 1 - B_r(t) \ \le \ 1 - B_k(t) \ \le \ k(1-t) , \end{aligned}$$

so

$$\begin{aligned} \frac{B_r(t)(1 - B_r(t))}{t(1-t)} \ \le \ k. \end{aligned}$$
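For readers who want to see these inequalities in numbers, here is a small Python sketch. It assumes that $B_r(t) = \mathrm{I\!P}(\mathrm{Bin}(k,t) \ge r)$, the distribution function of the $r$-th uniform order statistic; the helper name, $k = 4$ and the grid are illustrative choices.

```python
from math import comb

def beta_cdf(r, k, t):
    """B_r(t) = P(Binomial(k, t) >= r)."""
    return sum(comb(k, j) * t ** j * (1 - t) ** (k - j) for j in range(r, k + 1))

k = 4
grid = [j / 100 for j in range(1, 100)]
ordered = True        # B_1 >= B_2 >= ... >= B_k on the grid
tails_ok = True       # B_1(t) <= k t and 1 - B_k(t) <= k (1 - t)
worst_ratio = 0.0     # largest value of B_r (1 - B_r) / (t (1 - t))
for t in grid:
    B = [beta_cdf(r, k, t) for r in range(1, k + 1)]
    ordered = ordered and all(B[i] >= B[i + 1] for i in range(k - 1))
    tails_ok = tails_ok and B[0] <= k * t and 1 - B[-1] <= k * (1 - t)
    for b in B:
        worst_ratio = max(worst_ratio, b * (1 - b) / (t * (1 - t)))
print(ordered, tails_ok, worst_ratio, "<=", k)
```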

Consequently,

$$\begin{aligned} \sup _{t \in (0,1)} \frac{|\mathbb {V}_{nr}(B_r(t))|}{t^\delta (1 - t)^\delta }&\le \ k^\delta \sup _{u \in (0,1)} \frac{|\mathbb {V}_{nr}(u)|}{u^\delta (1 - u)^\delta } \ = \ O_p(1) \quad \text {and} \\ \sup _{t \in (0,c] \cup [1-c,1)} \frac{|\mathbb {V}_{nr}(B_r(t))|}{t^\delta (1 - t)^\delta }&\le \ k^\delta \sup _{u \in (0,kc] \cup [1 - kc,1)} \frac{|\mathbb {V}_{nr}(u)|}{u^\delta (1 - u)^\delta } \ \rightarrow _p \ 0 \end{aligned}$$

as $n \rightarrow \infty $ and $c \downarrow 0$. All in all, we may conclude that

$$\begin{aligned} \sup _{t \in (0,1)} \, \frac{|\mathbb {V}_n^\mathrm{Z}(t)|}{t^\delta (1 - t)^\delta }&= \ O_p(1) , \end{aligned}$$

(6)

$$\begin{aligned} \sup _{t \in (0,c] \cup [1-c,1)} \, \frac{|\mathbb {V}_n^\mathrm{Z}(t)|}{t^\delta (1 - t)^\delta }&\rightarrow _p \ 0 \quad \text {as} \ n \rightarrow \infty \ \text {and}\ c \downarrow 0. \end{aligned}$$

(7)

It remains to be shown that the process $\sqrt{n} (\widehat{B}_n^\mathrm{Z} - B)$ may be approximated by $\mathbb {V}_n^\mathrm{Z}$. In case of $\mathrm{Z} = \mathrm{S}$ it follows from $\sum _{r=1}^k \beta _r \equiv k$ that $\sum _{r=1}^k B_r = k B$, and this implies that

$$\begin{aligned} \sqrt{n} (\widehat{B}_n^\mathrm{S}- B) \ = \ \sum _{r=1}^k \frac{\sqrt{n} (\widehat{B}_{nr} - B_r)}{k} \ = \ \sum _{r=1}^k \gamma _{nr}^\mathrm{S} \, \mathbb {V}_{nr} \circ B_r \ = \ \mathbb {V}_n^\mathrm{S}. \end{aligned}$$
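The two identities $\sum_{r=1}^k \beta_r \equiv k$ and $\sum_{r=1}^k B_r = kB$ used here are easy to confirm numerically. The sketch below assumes $B(t) = t$ (the uniform case) and the order-statistic forms of $\beta_r$ and $B_r$; the function names and the choice $k = 6$ are hypothetical.

```python
from math import comb, isclose

def beta_density(r, k, t):
    """Density of the r-th order statistic of k uniform variables."""
    return k * comb(k - 1, r - 1) * t ** (r - 1) * (1 - t) ** (k - r)

def beta_cdf(r, k, t):
    """B_r(t) = P(Binomial(k, t) >= r)."""
    return sum(comb(k, j) * t ** j * (1 - t) ** (k - j) for j in range(r, k + 1))

k = 6
checks = []
for t in (0.1, 0.35, 0.5, 0.9):
    dens_sum = sum(beta_density(r, k, t) for r in range(1, k + 1))
    cdf_sum = sum(beta_cdf(r, k, t) for r in range(1, k + 1))
    checks.append((t, dens_sum, cdf_sum))
    print(t, dens_sum, cdf_sum, k * t)
```

The second identity is the statement that the expected value of a $\mathrm{Bin}(k,t)$ variable equals $kt$, written as a sum of tail probabilities.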

For $\mathrm{Z} = \mathrm{M}, \mathrm{L}$ it suffices to show that for any fixed number $b \ne 0$ and

$$\begin{aligned} p_n^\mathrm{Z}(t) \ := \ t + \frac{\mathbb {V}_n^\mathrm{Z}(t) + b t^\delta (1-t)^\delta }{\sqrt{n}} \end{aligned}$$

the following statements are true: If $b < 0$, then with asymptotic probability one,

$$\begin{aligned} \left. \begin{array}{c} \displaystyle \inf _{t \in (0,1)} \left( n \widehat{B}_n(t) - \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t)) \right) \\ \displaystyle \inf _{t \in (0,1)} \, L_n'(t,p_n^\mathrm{L}(t)) \end{array}\right\} \ \ge \ 0. \end{aligned}$$

(8)

If $b > 0$, then with asymptotic probability one,

$$\begin{aligned} \left. \begin{array}{c} \displaystyle \sup _{t \in (0,1)} \left( n \widehat{B}_n(t) - \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t)) \right) \\ \displaystyle \sup _{t \in (0,1)} \, L_n'(t,p_n^\mathrm{L}(t)) \end{array}\right\} \ \le \ 0. \end{aligned}$$

(9)

Here we use the conventions that $L_n'(t,\cdot ) := \infty $ and $B_r := 0$ on $(-\infty ,0]$ while $L_n'(t,\cdot ) := -\infty $ and $B_r := 1$ on $[1,\infty )$.

To verify these claims, we split the interval (0, 1) into $(0,c_n]$, $[c_n,1-c_n]$ and $[1-c_n,1)$ with numbers $c_n \in (0,1/2)$ to be specified later, where $c_n \downarrow 0$.

On $[c_n,1-c_n]$ we utilize Lemma 7: For $t \in [c_n, 1 - c_n]$ and $p \in (0,1)$ such that $|p - t| \le t(1-t)/2$ we may write

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p) \\&\quad = \ \sum _{r=1}^k \sqrt{N_{nr}} \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr}(B_r(p) - B_r(t)) \\&\quad = \ \sum _{r=1}^k \sqrt{N_{nr}} \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} \beta _r(t) (p - t) + \rho _n^\mathrm{M}(t,p) \\&\quad = \ \sum _{r=1}^k N_{nr} \beta _r(t) \left( \frac{\mathbb {V}_n^\mathrm{M}(t)}{\sqrt{n}} - (p - t) \right) + \rho _n^\mathrm{M}(t,p) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p)&= \ \sum _{r=1}^k \sqrt{N_{nr}} w_r(p) \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} w_r(p) (B_r(p) - B_r(t)) \\&= \ \sum _{r=1}^k \sqrt{N_{nr}} w_r(t) \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t) (p - t) + \rho _n^\mathrm{L}(t,p) \\&= \ \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t) \left( \frac{\mathbb {V}_n^\mathrm{L}(t)}{\sqrt{n}} - (p - t) \right) + \rho _n^\mathrm{L}(t,p) , \end{aligned}$$

where

$$\begin{aligned} |\rho _n^\mathrm{M}(t,p)|&\le \ \frac{O(n) |p - t|^2}{t(1-t)} , \\ |\rho _n^\mathrm{L}(t,p)|&\le \ \frac{O_p(\sqrt{n}) t^{\delta } (1 - t)^{\delta } |p - t|}{t(1-t)} + \frac{O(n) |p - t|^2}{t^2(1-t)^2}. \end{aligned}$$

Note that for $t \in [c_n,1-c_n]$,

$$\begin{aligned} \frac{\bigl | p_n^\mathrm{Z}(t) - t \bigr |}{t(1-t)} \ \le \ \frac{O_p(1) t^\delta (1 - t)^{\delta }}{\sqrt{n} \, t(1-t)} \ \le \ \frac{O_p(1)}{\sqrt{n} \, c_n^{1-\delta }}. \end{aligned}$$

Hence we choose $c_n$ such that $c_n \downarrow 0$ but $n c_n^{2(1-\delta )} \rightarrow \infty $. With this choice, we may conclude that uniformly in $t \in [c_n,1-c_n]$,

$$\begin{aligned} \bigl | \rho _n^\mathrm{M}(t,p_n^\mathrm{M}(t)) \bigr |&\le \ O_p(c_n^{\delta -1}) t^\delta (1-t)^\delta , \\ \bigl | \rho _n^\mathrm{L}(t,p_n^\mathrm{L}(t)) \bigr |&\le \ O_p(c_n^{\delta -1}) t^{\delta -1}(1-t)^{\delta -1}. \end{aligned}$$
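One concrete family satisfying the two requirements on $c_n$ stated above is $c_n = n^{-a}$ with $0 < a < 1/(2(1-\delta))$, since then $n c_n^{2(1-\delta)} = n^{1 - 2a(1-\delta)}$. This particular choice is an illustration added here, not one prescribed by the original text; the values of `delta` and `a` below are arbitrary.

```python
delta = 0.25   # fixed value in [0, 1/2)
a = 0.5        # hypothetical exponent; must satisfy 0 < a < 1 / (2 (1 - delta))
assert 0 < a < 1 / (2 * (1 - delta))

rows = []
for n in (10 ** 2, 10 ** 4, 10 ** 6, 10 ** 8):
    c_n = n ** (-a)                          # tends to zero
    growth = n * c_n ** (2 * (1 - delta))    # should tend to infinity
    rows.append((n, c_n, growth))
    print(n, c_n, growth)
```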

On the other hand, since $\beta _1(t) + \beta _k(t) \ge \beta _1(1/2) + \beta _k(1/2) = k 2^{2-k}$,

$$\begin{aligned} \sum _{r=1}^k N_{nr} \beta _r(t)&\ge \ k 2^{2 - k} \min \{N_{n1},N_{nk}\} , \\ \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t)&\ge \ \frac{k 2^{2-k} c_w}{t(1-t)} \, \min \{N_{n1},N_{nk}\}. \end{aligned}$$

Consequently,

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t)) \\&\quad = \ \sum _{r=1}^k N_{nr} \beta _r(t) \frac{- b t^\delta (1-t)^\delta }{\sqrt{n}} + \rho _n^\mathrm{M}(t,p_n^\mathrm{M}(t)) \\&\quad = \ \sum _{r=1}^k N_{nr} \beta _r(t) \frac{t^\delta (1-t)^\delta }{\sqrt{n}} \left( - b + O_p(c_n^{\delta -1} n^{-1/2}) \kappa _n^\mathrm{M}(t) \right) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p_n^\mathrm{L}(t))&= \ \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t) \frac{-b t^\delta (1-t)^\delta }{\sqrt{n}} + \rho _n^\mathrm{L}(t,p_n^\mathrm{L}(t)) \\&= \ \sum _{r=1}^k N_{nr} w_r(t) \beta _r(t) \frac{t^\delta (1-t)^\delta }{\sqrt{n}} \Bigg ( - b + O_p\left( c_n^{\delta -1} n^{-1/2}\right) \kappa _n^\mathrm{L}(t) \Bigg ) \end{aligned}$$

for some random functions $\kappa _n^\mathrm{M}, \kappa _n^\mathrm{L} : [c_n,1-c_n] \rightarrow [-1,1]$. These considerations show that (8) and (9) are satisfied with $[c_n,1-c_n]$ in place of (0, 1).

It remains to verify (8) and (9) with $(0,c_n]$ in place of (0, 1); the interval $[1-c_n,1)$ may be treated analogously. Note first that for $2 \le r \le k$,

$$\begin{aligned} B_r(t) \ \le \ B_2(t) \ \le \ k(k-1) t^2/2 \quad \text {and}\quad \beta _r(t) \ \le \ k 2^{k-1} t , \end{aligned}$$

so

$$\begin{aligned} \bigl | B_r(p) - B_r(t) \bigr | \ = \ \Bigl | \int _t^p \beta _r(u) \, \mathrm{d}u \Bigr | \ \le \ O(\max (p,t)) (p - t). \end{aligned}$$

Furthermore, since $B_1(t) = 1 - (1 - t)^k$,

$$\begin{aligned} B_1(p) - B_1(t) \ = \ k (p - t) + O(\max (t,p)) (p - t). \end{aligned}$$

Hence for $t \in (0,c_n]$ and $p \in (0,2c_n]$,

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p) \\&\quad = \ \sum _{r=1}^k \sqrt{N_{nr}} \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr}(B_r(p) - B_r(t)) \\&\quad = \ - N_{n1} k (p - t) + \rho _n^\mathrm{M}(t,p) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p)&= \ \sum _{r=1}^k \sqrt{N_{nr}} w_r(p) \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} w_r(p) (B_r(p) - B_r(t)) \\&= \ - N_{n1} w_1(p) k (p - t) + \rho _n^\mathrm{L}(t,p) , \end{aligned}$$

where

$$\begin{aligned} |\rho _n^\mathrm{M}(t,p)|&\le \ o_p(\sqrt{n}) t^\delta + O(n c_n) (p - t) , \\ |\rho _n^\mathrm{L}(t,p)|&\le \ o_p(\sqrt{n}) p^{-1} t^\delta + O(n c_n) p^{-1} (p - t). \end{aligned}$$

Note also that

$$\begin{aligned} \sup _{t \in (0,c_n]} \Bigl | \frac{\sqrt{n}(p_n^\mathrm{Z}(t) - t)}{t^\delta (1-t)^\delta } - b \Bigr | \ \rightarrow _p \ 0. \end{aligned}$$

In particular, $\sup _{t \in (0,c_n]} p_n^\mathrm{Z}(t) = c_n + o_p(n^{-1/2} c_n^\delta ) = c_n (1 + o_p(1))$, and in case of $b > 0$, $\mathop {\mathrm {I\!P}}\nolimits \bigl ( p_n^\mathrm{Z}(t) > 0 \ \text {for} \ 0 < t \le c_n \bigr ) \rightarrow 1$.

In case of $b > 0$, these considerations show that for $0 < t \le c_n$,

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t)) \\&\quad = \ - N_{n1} k (p_n^\mathrm{M}(t) - t) + \rho _n^\mathrm{M}(t,p_n^\mathrm{M}(t)) \\&\quad \le \ \frac{N_{n1} k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( -b + o_p(1) \bigr ) + o_p(\sqrt{n}) t^\delta + O(\sqrt{n} c_n) t^\delta \\&\quad \le \ \frac{N_{n1} k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p_n^\mathrm{L}(t))&= \ - N_{n1} w_1(p) k (p_n^\mathrm{L}(t) - t) + \rho _n^\mathrm{L}(t,p_n^\mathrm{L}(t)) \\&\le \ \frac{N_{n1} w_1(p) k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ) + o_p(\sqrt{n}) p^{-1} t^\delta \\&\quad + O(\sqrt{n} c_n) p^{-1} t^\delta \\&\le \ \frac{N_{n1} w_1(p) k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ). \end{aligned}$$

Analogously, in case of $b < 0$, for any $t \in (0,c_n]$ we obtain the inequalities

$$\begin{aligned} n \widehat{B}_n(t) \!-\! \sum _{r=1}^k N_{nr} B_r(p_n^\mathrm{M}(t))&\ge {\left\{ \begin{array}{ll} \displaystyle \frac{N_{n1} k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ) &{} \text {if} \ p_n^\mathrm{M}(t)> 0 , \\ 0 &{} \text {if} \ p_n^\mathrm{M}(t) \le 0 , \end{array}\right. } \\ L_n'(t,p_n^\mathrm{L}(t))&\ge {\left\{ \begin{array}{ll} \displaystyle \frac{N_{n1} w_1(p) k t^\delta (1-t)^\delta }{\sqrt{n}} \bigl ( - b + o_p(1) \bigr ) &{} \text {if} \ p_n^\mathrm{L}(t) > 0 , \\ \infty &{} \text {if} \ p_n^\mathrm{L}(t) \le 0. \end{array}\right. } \end{aligned}$$

Hence, (8) and (9) are satisfied with $(0,c_n]$ in place of (0, 1). $\square $

Proof of Theorem 3

For symmetry reasons it suffices to prove the first part about the left tails. Let $(c_n)_n$ be a sequence of numbers in (0, 1 / 2] converging to zero. Then for $t \in (0,c_n]$ and $\delta := \kappa /2 \in (0,1/2)$,

$$\begin{aligned} \bigl | \sqrt{n} \bigl ( \widehat{B}_n^\mathrm{S}(t) - t \bigr ) - \mathbb {V}_n^{(\ell )}(t) \bigr | \ = \ \left| \sum _{r=2}^k \frac{\mathbb {V}_{nr}(B_r(t))}{k \sqrt{N_{nr}/n}} \right| \ \le \ t^{2 \delta } o_p(1) \ = \ t^\kappa o_p(1). \end{aligned}$$

Concerning $\widehat{B}_n^\mathrm{M}$ and $\widehat{B}_n^\mathrm{L}$, for any $t \in (0,c_n]$ and $p \in (0,1)$,

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p) \\&\quad = \ \sum _{r=1}^k \sqrt{N_{nr}} \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr}(B_r(p) - B_r(t)) \\&\quad = \ \sqrt{N_{n1}} \mathbb {V}_{n1}(B_1(t)) - N_{n1} k(p - t) + \rho _n^\mathrm{M}(t,p) \\&\quad = \ N_{n1} k \Bigl ( \frac{\mathbb {V}_{n1}(B_1(t))}{k \sqrt{N_{n1}}} - (p - t) \Bigr ) + \rho _n^\mathrm{M}(t,p) \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p)&= \ \sum _{r=1}^k \sqrt{N_{nr}} w_r(p) \mathbb {V}_{nr}(B_r(t)) - \sum _{r=1}^k N_{nr} w_r(p) (B_r(p) - B_r(t)) \\&= \ \sqrt{N_{n1}} w_1(p) \mathbb {V}_{n1}(B_1(t)) - N_{n1} w_1(p) k (p - t) + \rho _n^\mathrm{L}(t,p) \\&= \ N_{n1} k w_1(p) \left( \frac{\mathbb {V}_{n1}(B_1(t))}{k \sqrt{N_{n1}}} - (p - t) \right) + \rho _n^\mathrm{L}(t,p) , \end{aligned}$$

where

$$\begin{aligned} |\rho _n^\mathrm{M}(t,p)|&\le \ o_p(\sqrt{n}) t^{2\delta } + O(n) \max (t,p) (p - t) , \\ |\rho _n^\mathrm{L}(t,p)|&\le \ o_p(\sqrt{n}) p^{-1} t^{2\delta } + O(n) p^{-1} \max (t,p) (p - t). \end{aligned}$$

Now we proceed similarly as in the proof of Theorem 2, defining

$$\begin{aligned} p_n(t) \ := \ t + \frac{\mathbb {V}_n^{(\ell )}(t) + b t^\kappa }{\sqrt{n}} \end{aligned}$$

for some fixed $b \ne 0$. Note that for $t \in (0,c_n]$,

$$\begin{aligned} |p_n(t) - t| \ \le \ o_p(n^{-1/2}) t^\delta + O(n^{-1/2}) t^\kappa \ = \ o_p(n^{-1/2}) t^\delta , \end{aligned}$$

because $\kappa > \delta $. Note also that

$$\begin{aligned} t + \frac{\mathbb {V}_n^{(\ell )}(t)}{\sqrt{n}} \ = \ t + \frac{\mathbb {V}_{n1}(B_1(t))}{k \sqrt{N_{n1}}} \ = \ t - \frac{1 - (1-t)^k}{k} + \frac{\widehat{B}_{n1}(t)}{k} \ > \ 0 \quad \text {on} \ (0,1) , \end{aligned}$$

because $\widehat{B}_{n1} \ge 0$ and $t \mapsto t - (1 - (1-t)^k)/k$ is strictly convex on [0, 1] with derivative 0 at 0. Thus $p_n(t) > 0$ for all $t \in (0,c_n]$ in case of $b > 0$.

In case of $b > 0$, we may conclude that

$$\begin{aligned}&n \widehat{B}_n(t) \ - \sum _{r=1}^k N_{nr} B_r(p_n(t)) \\&\quad = \ N_{n1} k \frac{-b t^\kappa }{\sqrt{n}} + \rho _n^\mathrm{M}(t,p_n(t)) \\&\quad \le \ \frac{N_{n1} k}{\sqrt{n}} \bigl ( -b t^\kappa + o_p(1) t^{2\delta } + O(1) (t + o_p(n^{-1/2}) t^\delta ) t^\delta \bigr ) \\&\quad \le \ \frac{N_{n1} k t^\kappa }{\sqrt{n}} \bigl ( -b + o_p(1) \bigr ) , \end{aligned}$$

and

$$\begin{aligned} L_n'(t,p_n(t))&\le \ \frac{N_{n1} k w_1(p) t^\kappa }{\sqrt{n}} \bigl ( -b + o_p(1) \bigr ). \end{aligned}$$

Hence for any fixed $b > 0$,

$$\begin{aligned} \mathop {\mathrm {I\!P}}\nolimits \left( \sqrt{n} (\widehat{B}_n^\mathrm{Z}(t) - t) \le \mathbb {V}_n^{(\ell )}(t) + b t^\kappa \ \text {for all} \ t \in (0,c_n] \right) \ \rightarrow \ 1. \end{aligned}$$

Similarly we can show that for any fixed $b < 0$, with asymptotic probability one, $\sqrt{n} (\widehat{B}_n^\mathrm{Z}(t) - t) \ge \mathbb {V}_n^{(\ell )}(t) + b t^\kappa $ for all $t \in (0,c_n]$. $\square $

Proof of Corollary 4

It follows from Proposition 6 that

$$\begin{aligned} \sup _{1 \le r \le k, \, u \in [0,1]} |\mathbb {V}_{nr}(u)| \ = \ O_p(1). \end{aligned}$$

Together with (5) this entails that $\sup _{t \in [0,1]} \bigl | \mathbb {V}_n^\mathrm{Z}(t) - \widetilde{\mathbb {V}}_n^\mathrm{Z}(t) \bigr | \rightarrow _p 0$, where $\widetilde{\mathbb {V}}_n^\mathrm{Z} := \sum _{r=1}^k \gamma _r^\mathrm{Z} \, \mathbb {V}_{nr} \circ B_r$. But $\gamma _r^\mathrm{Z} \equiv 0$ whenever $\pi _r = 0$. In case of $\pi _r > 0$ it follows from Proposition 6 that $\mathbb {V}_{nr}$ converges in distribution to $\mathbb {V}_r$. Consequently, $\widetilde{\mathbb {V}}_n^\mathrm{Z}$ converges in distribution to the Gaussian process $\mathbb {V}^\mathrm{Z} = \sum _{r=1}^k \gamma _r^\mathrm{Z} \, \mathbb {V}_r \circ B_r$. $\square $

Proof of Theorem 5

The asserted inequalities follow from Jensen’s inequality. On the one hand, it follows from $w_r = \beta _r / (B_r(1 - B_r))$ and $\sum _{r=1}^k \beta _r \equiv k$ that

$$\begin{aligned} K^\mathrm{S}(t)&= \ \frac{1}{k} \sum _{r=1}^k \frac{\beta _r(t)}{k} \cdot (\pi _r w_r(t))^{-1} \\&\ge \ \frac{1}{k} \left( \sum _{r=1}^k \frac{\beta _r(t)}{k} \cdot \pi _r w_r(t) \right) ^{-1} \\&= \ \left( \sum _{r=1}^k \pi _r \beta _r(t) w_r(t) \right) ^{-1} \ = \ K^\mathrm{L}(t). \end{aligned}$$

Equality holds if, and only if,

$$\begin{aligned} \pi _1 w_1(t) = \pi _2 w_2(t) = \cdots = \pi _k w_k(t). \end{aligned}$$

But

$$\begin{aligned} w_1(t) \ = \ \frac{k}{(1-t)( 1 - (1 - t)^k)} \quad \text {and}\quad w_k(t) \ = \ \frac{k}{t(1 - t^k)} , \end{aligned}$$

so

$$\begin{aligned} \frac{w_k(t)}{w_1(t)} \ = \ \frac{(1-t)(1 - (1-t)^k)}{t(1-t^k)} \ = \ \frac{\sum _{j=0}^{k-1} (1-t)^j}{\sum _{j=0}^{k-1} t^j} \end{aligned}$$

is strictly decreasing in t. Hence there is at most one solution of the equation $\pi _1 w_1(t) = \pi _k w_k(t)$.

Similarly, with $a_r(t) := \pi _r \beta _r(t) \big / \sum _{s=1}^k \pi _s \beta _s(t)$,

$$\begin{aligned} K^\mathrm{M}(t)&= \ \sum _{r=1}^k \pi _r \beta _r(t) \cdot w_r(t)^{-1} \Big / \left( \sum _{s=1}^k \pi _s \beta _s(t) \right) ^2 \\&= \ \sum _{r=1}^k a_r(t) \cdot w_r(t)^{-1} \Big / \sum _{s=1}^k \pi _s \beta _s(t) \\&\ge \ \left( \sum _{r=1}^k a_r(t) w_r(t) \right) ^{-1} \Big / \sum _{s=1}^k \pi _s \beta _s(t) \\&= \ \left( \sum _{r=1}^k \pi _r \beta _r(t) w_r(t) \right) ^{-1} \ = \ K^\mathrm{L}(t). \end{aligned}$$

Here the inequality is strict unless

$$\begin{aligned} w_1(t) = w_2(t) = \cdots = w_k(t). \end{aligned}$$

But $w_1(t) = w_k(t)$ implies that $t = 1/2$. Moreover, $w_1(1/2) = 2k/(1 - 2^{-k})$ and

$$\begin{aligned} w_{k-1}(1/2) \ = \ \frac{2k(k-1)}{(k+1)(1 - (k+1)2^{-k})} \end{aligned}$$

are identical if, and only if, $k^2 + k + 2 = 2^{k+1}$. But $2^{k+1} = 2 \sum _{j=0}^k \left( {\begin{array}{c}k\\ j\end{array}}\right) $ is strictly larger than $2(1 + k + k(k-1)/2) = k^2 + k + 2$ if $k \ge 3$.
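The two Jensen-type inequalities $K^\mathrm{S} \ge K^\mathrm{L}$ and $K^\mathrm{M} \ge K^\mathrm{L}$ can be illustrated numerically. The following sketch evaluates the three expressions exactly as they appear in the displays above, assuming the order-statistic forms of $\beta_r$ and $B_r$ and $w_r = \beta_r/(B_r(1-B_r))$. The function `variance_factors`, the proportions `pi` and the evaluation points are hypothetical choices for illustration only.

```python
from math import comb

def beta_density(r, k, t):
    """Density of the r-th order statistic of k uniform variables."""
    return k * comb(k - 1, r - 1) * t ** (r - 1) * (1 - t) ** (k - r)

def beta_cdf(r, k, t):
    """B_r(t) = P(Binomial(k, t) >= r)."""
    return sum(comb(k, j) * t ** j * (1 - t) ** (k - j) for j in range(r, k + 1))

def variance_factors(pi, t):
    """Return (K^S, K^M, K^L) at t for proportions pi = (pi_1, ..., pi_k)."""
    k = len(pi)
    beta = [beta_density(r, k, t) for r in range(1, k + 1)]
    B = [beta_cdf(r, k, t) for r in range(1, k + 1)]
    w = [bt / (c * (1 - c)) for bt, c in zip(beta, B)]
    K_S = sum(bt / (k * k * p * wt) for bt, p, wt in zip(beta, pi, w))
    K_M = (sum(p * bt / wt for p, bt, wt in zip(pi, beta, w))
           / sum(p * bt for p, bt in zip(pi, beta)) ** 2)
    K_L = 1 / sum(p * bt * wt for p, bt, wt in zip(pi, beta, w))
    return K_S, K_M, K_L

pi = (0.5, 0.3, 0.2)   # an unbalanced design with k = 3
results = [(t, *variance_factors(pi, t)) for t in (0.05, 0.25, 0.5, 0.75, 0.95)]
for t, K_S, K_M, K_L in results:
    print(t, K_S, K_M, K_L)
```

In this example with $k = 3$, both $K^\mathrm{S}$ and $K^\mathrm{M}$ should exceed $K^\mathrm{L}$ at every evaluation point, in line with the strictness argument for $k \ge 3$.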

As to the ratios $E^\mathrm{Z}(t) := K^\mathrm{Z}(t)/K^\mathrm{L}(t)$, note first that

$$\begin{aligned} E^\mathrm{S}(t)&= \ \sum _{r=1}^k \frac{B_r(t)(1 - B_r(t))}{k^2 \pi _r} \sum _{s=1}^k \pi _s \beta _s(t) w_s(t) \\&\ge \ \min _{r,s=1,\ldots ,k} \frac{B_r(t)(1 - B_r(t)) \beta _s(t) w_s(t)}{k^2} \Big / \min _{r=1,\ldots ,k} \pi _r \\&\rightarrow \ \infty \quad \text {as} \ \min _{r=1,\ldots ,k} \pi _r \downarrow 0. \end{aligned}$$

On the other hand, with $a_r(t)$ as above,

$$\begin{aligned} E^\mathrm{M}(t) \ = \ \sum _{r=1}^k a_r(t) w_r(t)^{-1} \sum _{s=1}^k a_s(t) w_s(t) \ = \ \mathop {\mathrm {I\!E}}\nolimits (W) \mathop {\mathrm {I\!E}}\nolimits (W^{-1}) \end{aligned}$$

with a random variable W with distribution $\sum _{r=1}^k a_r(t) \delta _{w_r(t)}$. But with $\ell (t) := \min _r w_r(t)$ and $u(t) := \max _r w_r(t)$, convexity of $w \mapsto w^{-1}$ on $[\ell (t),u(t)]$ implies that

$$\begin{aligned} W^{-1} \ \le \ \frac{W - \ell (t)}{u(t) - \ell (t)} u(t)^{-1} + \frac{u(t) - W}{u(t) - \ell (t)} \ell (t)^{-1} , \end{aligned}$$

so

$$\begin{aligned} \mathop {\mathrm {I\!E}}\nolimits (W) \mathop {\mathrm {I\!E}}\nolimits (W^{-1})&\le \ \mathop {\mathrm {I\!E}}\nolimits (W) \left( \frac{\mathop {\mathrm {I\!E}}\nolimits (W) - \ell (t)}{u(t) - \ell (t)} u(t)^{-1} + \frac{u(t) - \mathop {\mathrm {I\!E}}\nolimits (W)}{u(t) - \ell (t)} \ell (t)^{-1} \right) \\&= \ \frac{\mathop {\mathrm {I\!E}}\nolimits (W) (\ell (t) + u(t) - \mathop {\mathrm {I\!E}}\nolimits (W))}{\ell (t) u(t)} \\&\le \ \frac{(\ell (t) + u(t))^2}{4 \ell (t)u(t)} \ = \ \frac{\rho (t) + \rho (t)^{-1} + 2}{4}. \end{aligned}$$

This upper bound for $E^\mathrm{M}(t)$ is attained approximately, if the distribution of W approaches the uniform distribution on $\{\ell (t),u(t)\}$. Hence we should choose $(\pi _r)_{r=1}^k$ as follows: Let r(1), r(2) be two different numbers in $\{1,\ldots ,k\}$ such that $w_{r(1)}(t) = \ell (t)$ and $w_{r(2)}(t) = u(t)$. Then let

$$\begin{aligned} \pi _r \ \approx \ {\left\{ \begin{array}{ll} \beta _{r}(t)^{-1}/(\beta _{r(1)}(t)^{-1} + \beta _{r(2)}(t)^{-1}) &{}\text {for} \ r \in \{r(1),r(2)\} , \\ 0 &{}\text {for} \ r \not \in \{r(1),r(2)\}. \end{array}\right. } \end{aligned}$$

The inequality $\rho (t) \le k$ follows from Lemma 7 and the fact that $\rho (t)$ remains unchanged if we replace $w_r(t)$ with $\widetilde{w}_r(t) = t(1-t) w_r(t) \in [1,k]$. $\square $
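The bound on $E^\mathrm{M}(t)$ and its attainment by a two-point design can be illustrated with a short Python sketch. It evaluates $E^\mathrm{M}(t) = \mathrm{I\!E}(W)\,\mathrm{I\!E}(W^{-1})$ with weights $a_r(t)$ and compares it with $(\ell(t) + u(t))^2 / (4 \ell(t) u(t))$, assuming the order-statistic forms of $\beta_r$ and $B_r$. The function name, $k = 4$, $t = 0.3$ and the balanced comparison design are illustrative assumptions, and the two-point design puts exactly zero mass on the remaining ranks, which is the limiting case of the approximation described above.

```python
from math import comb

def beta_density(r, k, t):
    """Density of the r-th order statistic of k uniform variables."""
    return k * comb(k - 1, r - 1) * t ** (r - 1) * (1 - t) ** (k - r)

def beta_cdf(r, k, t):
    """B_r(t) = P(Binomial(k, t) >= r)."""
    return sum(comb(k, j) * t ** j * (1 - t) ** (k - j) for j in range(r, k + 1))

def efficiency_ratio_M(pi, t):
    """Return E^M(t) = E(W) E(1/W) together with the lists w_r(t), beta_r(t)."""
    k = len(pi)
    beta = [beta_density(r, k, t) for r in range(1, k + 1)]
    B = [beta_cdf(r, k, t) for r in range(1, k + 1)]
    w = [bt / (c * (1 - c)) for bt, c in zip(beta, B)]
    total = sum(p * bt for p, bt in zip(pi, beta))
    a = [p * bt / total for p, bt in zip(pi, beta)]   # weights a_r(t)
    E_M = sum(ar * wr for ar, wr in zip(a, w)) * sum(ar / wr for ar, wr in zip(a, w))
    return E_M, w, beta

k, t = 4, 0.3
E_balanced, w, beta = efficiency_ratio_M([1 / k] * k, t)
lo, hi = min(w), max(w)
bound = (lo + hi) ** 2 / (4 * lo * hi)

# Two-point design: mass proportional to 1 / beta_r(t) on the ranks where
# w_r(t) is smallest and largest, zero elsewhere.
r1, r2 = w.index(lo), w.index(hi)
norm = 1 / beta[r1] + 1 / beta[r2]
pi_two = [0.0] * k
pi_two[r1] = (1 / beta[r1]) / norm
pi_two[r2] = (1 / beta[r2]) / norm
E_two, _, _ = efficiency_ratio_M(pi_two, t)

print(E_balanced, E_two, bound, hi / lo)
```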

About this article

Cite this article

Dümbgen, L., Zamanzade, E. Inference on a distribution function from ranked set samples. Ann Inst Stat Math 72, 157–185 (2020). https://doi.org/10.1007/s10463-018-0680-y

Received: 22 October 2013
Revised: 02 July 2018
Published: 20 July 2018
Issue Date: February 2020
DOI: https://doi.org/10.1007/s10463-018-0680-y
