Information content of partially rank-ordered set samples

Hatefi, Armin; Jozani, Mohammad Jafari

doi:10.1007/s10182-016-0277-9

Original Paper
Published: 01 August 2016

Volume 101, pages 117–149, (2017)
AStA Advances in Statistical Analysis

Armin Hatefi^1,2 & Mohammad Jafari Jozani^3


Abstract

Partially rank-ordered set (PROS) sampling is a generalization of ranked set sampling in which rankers are not required to fully rank the sampling units in each set, and hence have more flexibility in performing the necessary judgmental ranking process. PROS sampling has a wide range of applications, from environmental and ecological studies to medical research, and it has been shown to be superior to ranked set sampling and simple random sampling for estimating the population mean. We study the Fisher information content and uncertainty structure of PROS samples and compare them with those of simple random sample (SRS) and ranked set sample (RSS) counterparts of the same size from the underlying population. We study the uncertainty structure in terms of the Shannon entropy, Rényi entropy and Kullback–Leibler (KL) discrimination measures.

References

Arslan, G., Ozturk, O.: Parametric inference based on partially rank ordered set samples. J. Indian Stat. Assoc. 51(1), 1–24 (2013)
Barabesi, L., El-Shaarawi, A.: The efficiency of ranked set sampling for parameter estimation. Stat. Probab. Lett. 53(2), 189–199 (2001)
Barreto, M.C.M., Barnett, V.: Best linear unbiased estimators for the simple linear regression model using ranked set sampling. Environ. Ecol. Stat. 6(2), 119–133 (1999)
Chen, Z.: The efficiency of ranked-set sampling relative to simple random sampling under multi-parameter families. Stat. Sin. 10(1), 247–264 (2000)
Chen, Z., Wang, Y.G.: Efficient regression analysis with ranked-set sampling. Biometrics 60(4), 997–1004 (2004)
Chen, Z., Bai, Z., Sinha, B.K.: Ranked Set Sampling: Theory and Applications, vol. 176. Springer, New York (2004)
Dell, T.R., Clutter, J.L.: Ranked set sampling theory with order statistics background. Biometrics 28(2), 545–555 (1972)
Frey, J.: Nonparametric mean estimation using partially ordered sets. Environ. Ecol. Stat. 19(3), 309–326 (2012)
Hatefi, A., Jafari Jozani, M., Ozturk, O.: Mixture model analysis of partially rank ordered set samples. Scand. J. Stat. 42, 848–871 (2015)
Hatefi, A., Jafari Jozani, M.: Fisher information in different types of perfect and imperfect ranked set samples from finite mixture models. J. Multivar. Anal. 119, 16–31 (2013)
Hatefi, A., Jafari Jozani, M., Ziou, D.: Estimation and classification for finite mixture models under ranked set sampling. Stat. Sin. 24, 675–698 (2014)
Hill, B.M.: Information for estimating the proportions in mixtures of exponential and normal distributions. J. Am. Stat. Assoc. 58(304), 918–932 (1963)
Jafari Jozani, M., Ahmadi, J.: On uncertainty and information properties of ranked set samples. Inf. Sci. 260, 1–16 (2014)
Johnson, O.: Information Theory and the Central Limit Theorem. Imperial College Press, London (2004)
Lehmann, E.L., Casella, G.: Theory of Point Estimation, vol. 31. Springer, New York (1998)
McIntyre, G.A.: A method for unbiased selective sampling, using ranked sets. Crop Pasture Sci. 3(4), 385–390 (1952)
McIntyre, G.A.: A method for unbiased selective sampling, using ranked sets. Am. Stat. 59(3), 230 (2005)
Mode, N.A., Conquest, L.L., Marker, D.A.: Ranked set sampling for ecological research: accounting for the total costs of sampling. Environmetrics 10(2), 179–194 (1999)
Muttlak, H.A., McDonald, L.L.: Ranked set sampling and the line intercept method: a more efficient procedure. Biom. J. 34(3), 329–346 (1992)
Ozturk, O.: Sampling from partially rank-ordered sets. Environ. Ecol. Stat. 18(4), 757–779 (2011)
Ozturk, O.: Combining multi-observer information in partially rank-ordered judgment post-stratified and ranked set samples. Can. J. Stat. 41(2), 304–324 (2013)
Ozturk, O., Bilgin, O.C., Wolfe, D.A.: Estimation of population mean and variance in flock management: a ranked set sampling approach in a finite population setting. J. Stat. Comput. Simul. 75(11), 905–919 (2005)
Stokes, S.L.: Ranked set sampling with concomitant variables. Commun. Stat. Theory Methods 6(12), 1207–1211 (1977)
Stokes, S.L.: Estimation of variance using judgment ordered ranked set samples. Biometrics 36(1), 35–42 (1980)
Wang, Y.G., Ye, Y., Milton, D.A.: Efficient designs for sampling and subsampling in fisheries research based on ranked sets. ICES J. Marine Sci. J. du Conseil 66(5), 928–934 (2009)
Wolfe, D.A.: Ranked set sampling: its relevance and impact on statistical inference. ISRN Probab. Stat. 2012, 1–32 (2012)

Acknowledgments

We would like to thank two anonymous reviewers and an associate editor for their constructive comments and suggestions. Mohammad Jafari Jozani gratefully acknowledges the research support of NSERC Canada. Armin Hatefi acknowledges partial support through the University of Manitoba Graduate Fellowship, the Manitoba Graduate Scholarship (during his PhD program) and the Fields Ontario Postdoctoral Fellowship.

Author information

Authors and Affiliations

The Fields Institute for Research in Mathematical Sciences, Toronto, ON, Canada
Armin Hatefi
Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
Armin Hatefi
Department of Statistics, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
Mohammad Jafari Jozani

Corresponding author

Correspondence to Mohammad Jafari Jozani.

Appendix: FI of unbalanced PROS and the effect of misplacement errors

In this section, we study the FI matrix of the unbalanced PROS sampling design in a general setting in which the subsets are allowed to have different sizes. To obtain an unbalanced PROS sample, we first need to determine the sample size K and the set size S. The judgment sub-setting process is then applied to create K sets. We group these K sets into N cycles $G_i=\{S_{1,i},\ldots ,S_{n_i,i}\};\, i=1,\ldots ,N$, where $\sum _{i=1}^{N}n_i=K$. Let $D_{r,i}=\{{d_{r[1]i}},\ldots ,d_{r[n_i]i}\}$ be the design parameter associated with set $S_{r,i}$, where ${d_{r[l]i}}; l=1,\ldots ,n_i$ is the l-th judgment subset in the set $S_{r,i}$. In each cycle $G_i; i=1,\ldots ,N$, from each set $S_{r,i}; r=1,\ldots , n_i$, we randomly select a unit from the judgment subset $d_{r[r]i}$ for full measurement, say $X_{[d_r]i}$. The number of unranked units in subset $d_{r[r]i}$ is denoted by $m_{ri}; r=1,\ldots ,n_i; i=1,\ldots ,N$. The collection of measured observations $\{X_{[d_r]i};r=1,\ldots ,n_i;i=1,\ldots ,N\}$ is then an unbalanced PROS sample of size $K=\sum _{i=1}^{N}n_i$. Table 9 illustrates the construction of an unbalanced PROS sample of size $K=5$ with set size $S=6$ and $N=2$ cycles, where the sets are divided into three subsets ($n_1=3$) in the first cycle and two subsets ($n_2=2$) in the second cycle, so that subset sizes differ between cycles. In each set, $m_{ri}$ represents the number of unranked units in the selected subset. For more details about this kind of design, see Ozturk (2011).

Table 9 An example of unbalanced PROS design when $S=6, K=5, N=2, n_1=3, n_2=2$ and $m_{ri}$ represents size of the selected subset in each set
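The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ranking is taken to be perfect (units are sorted by their true values), the subsets in the example design are hypothetical and are not copied from Table 9, and the function names `draw_pros_unit` and `draw_unbalanced_pros` are invented for this sketch.

```python
import random

def draw_pros_unit(subsets, r, sampler, rng):
    """Draw one PROS observation from a fresh set of S units.

    subsets : list of lists of 1-based rank positions partitioning 1..S
    r       : 0-based index of the judgment subset to measure from
    sampler : function rng -> one draw from the population
    Assumption: perfect ranking, so units are ordered by their true values.
    """
    S = sum(len(d) for d in subsets)
    units = sorted(sampler(rng) for _ in range(S))
    # Units inside a subset are not ranked, so pick one rank uniformly at random.
    rank = rng.choice(subsets[r])
    return units[rank - 1]

def draw_unbalanced_pros(design, sampler, rng):
    """design: list of cycles; each cycle is a list of (subsets, r) pairs, one per set."""
    return [[draw_pros_unit(subsets, r, sampler, rng) for subsets, r in cycle]
            for cycle in design]

# Hypothetical design with S = 6, N = 2, n_1 = 3, n_2 = 2 (so K = 5).
three_subsets = [[1, 2], [3, 4], [5, 6]]   # m_{r1} = 2 in the first cycle
two_subsets = [[1, 2, 3], [4, 5, 6]]       # m_{r2} = 3 in the second cycle
design = [
    [(three_subsets, 0), (three_subsets, 1), (three_subsets, 2)],
    [(two_subsets, 0), (two_subsets, 1)],
]

rng = random.Random(2016)
sample = draw_unbalanced_pros(design, lambda g: g.gauss(0.0, 1.0), rng)
```

Each inner list of `sample` holds the measured observations of one cycle, so the total number of measured units equals $K=\sum_i n_i$.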

In light of Lemma 8 (pointed out in Ozturk 2011), which we prove through latent variables, we show the difference between the complete PROS (Sect. 2, for the case of perfect ranking) and incomplete PROS (Sect. 3.2).

Lemma 8

Let $Y_{ri}=X_{[d_r]i}$ be an observation from unbalanced PROS sampling design from a continuous distribution with pdf $f(\cdot ;{\varvec{\theta }})$. With knowledge of the design parameter $D_{r,i}$, the pdf of $Y_{ri}$ is given by

$$\begin{aligned} f_{[r;m_{ri}]}(y;{\varvec{\theta }})= \frac{1}{m_{ri}} \sum _{v\in d_{r[r]i}} f^{[v:S]}(y;{\varvec{\theta }}), \end{aligned}$$

where $f^{[v:S]}(y;{\varvec{\theta }})$ is the pdf of the v-th judgment order statistic among S units.
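As a rough numerical illustration of Lemma 8, the following Python sketch evaluates the subset density for a standard normal population. It assumes perfect ranking, so that the judgment order statistic pdf $f^{[v:S]}$ coincides with the usual order statistic pdf $f^{(v:S)}$; the function names are hypothetical.

```python
import math

def norm_pdf(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def norm_cdf(y):
    return 0.5 * math.erfc(-y / math.sqrt(2.0))

def order_stat_pdf(y, v, S, pdf=norm_pdf, cdf=norm_cdf):
    """pdf of the v-th order statistic among S i.i.d. draws."""
    F = cdf(y)
    return S * math.comb(S - 1, v - 1) * F ** (v - 1) * (1.0 - F) ** (S - v) * pdf(y)

def pros_subset_pdf(y, subset, S, pdf=norm_pdf, cdf=norm_cdf):
    """Lemma 8 under perfect ranking: average of the order-statistic
    pdfs over the rank positions contained in the selected subset."""
    return sum(order_stat_pdf(y, v, S, pdf, cdf) for v in subset) / len(subset)

# Example: set size S = 6 split into a lower and an upper subset of size 3.
y = 0.3
lower = pros_subset_pdf(y, [1, 2, 3], 6)
upper = pros_subset_pdf(y, [4, 5, 6], 6)
```

Because the order-statistic pdfs sum to $S f(y;{\varvec{\theta }})$, averaging the densities of equal-sized subsets that partition $\{1,\ldots ,S\}$ recovers the parent pdf, which gives a quick sanity check on any implementation.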

Proof

For each $Y_{ri}$ define the latent vector ${\varvec{\Delta }}^{[d_r]i}= (\Delta ^{[d_r]i}(v), v\in d_{r[r]i})$, where

$$\begin{aligned} \Delta ^{[d_r]i}(v) = \left\{ \begin{array}{ll} 1 & \quad \text{if } Y_{ri} \text{ is selected from the } v\text{-th position within the subset } d_{r[r]i}; \\ 0 & \quad \text{otherwise}, \end{array} \right. \end{aligned}$$

with $\sum _{v \in d_{r[r]i}} \Delta ^{[d_r]i}(v)=1$. The joint pdf of $(Y_{ri},{\varvec{\Delta }}^{[d_r]i})$ is given by

$$\begin{aligned} f(y,{\varvec{\delta }}^{[d_r]i};{\varvec{\theta }})= \prod _{r=1}^{n_i}\prod _{v \in d_{r[r]i}} \left\{ \frac{1}{m_{ri}} f^{[v:S]}(y;{\varvec{\theta }})\right\} ^{{\varvec{\delta }}^{[d_r]i}(v)}. \end{aligned}$$

Furthermore, by summing the joint distribution of $(Y_{ri},{\varvec{\Delta }}^{[d_r]i})$ over ${\varvec{\Delta }}^{[d_r]i}={\varvec{\delta }}^{[d_r]i}$, the marginal distribution of $Y_{ri}$ is obtained as follows:

$$\begin{aligned} f_{[r;m_{ri}]}(y;{\varvec{\theta }})= \sum _{{\varvec{\delta }}^{[d_r]i}} f(y,{\varvec{\delta }}^{[d_r]i};{\varvec{\theta }}) = \frac{1}{m_{ri}} \sum _{v\in d_{r[r]i}} f^{[v:S]}(y;{\varvec{\theta }}). \end{aligned}$$

$\square $

Using Lemma 8, the likelihood function under an unbalanced PROS design is now given by

$$\begin{aligned} L(\Omega )= & {} \prod _{i=1}^{N}\prod _{r=1}^{n_i} f_{[r;m_{ri}]}(y_{ri};{\varvec{\theta }}) = \prod _{i=1}^{N}\prod _{r=1}^{n_i} \left\{ \frac{1}{m_{ri}} \sum _{v\in d_{r[r]i}} f^{[v:S]}(y_{ri};{\varvec{\theta }}) \right\} \nonumber \\= & {} \prod _{i=1}^{N}\prod _{r=1}^{n_i} \left\{ \frac{1}{m_{ri}} \sum _{v\in d_{r[r]i}} \sum _{h=1}^{n_i} \sum _{u\in d_{h[h]i}} \frac{\alpha _{[d_r,d_h]i}}{m_{hi}} f^{(u:S)}(y_{ri};{\varvec{\theta }}) \right\} , \end{aligned}$$

(13)

where $\Omega = ({\varvec{\theta }}, \varvec{\alpha })$, $f^{(u:S)}(\cdot ;{\varvec{\theta }})$ is the pdf of the u-th order statistic and, in a similar vein to Sect. 3.2, $\alpha _{[d_r,d_h]i}$ is the misplacement probability of a unit from subset $d_{h[h]i}$ into subset $d_{r[r]i}$, so that $\sum _{h=1}^{n_i} \alpha _{[d_r,d_h]i}=\sum _{r=1}^{n_i} \alpha _{[d_r,d_h]i} =1; i=1,\ldots ,N$.
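The likelihood (13) can be evaluated along the following lines. This is a minimal sketch rather than the authors' implementation: it assumes a normal population with unknown location ${\theta }$ and unit variance, assumes that all sets within a cycle share the same partition into subsets, and takes the misplacement matrix `alpha` as given. All function names are hypothetical.

```python
import math

def order_stat_pdf(z, u, S):
    """pdf of the u-th order statistic among S standard normal draws."""
    F = 0.5 * math.erfc(-z / math.sqrt(2.0))   # cdf
    sf = 0.5 * math.erfc(z / math.sqrt(2.0))   # survival function 1 - F
    f = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return S * math.comb(S - 1, u - 1) * F ** (u - 1) * sf ** (S - u) * f

def judged_subset_pdf(y, r, subsets, alpha, theta=0.0):
    """Density of a unit judged to lie in subset r when misplacement is allowed.
    alpha[r][h] weights the case that the unit truly belongs to subset h."""
    S = sum(len(d) for d in subsets)
    z = y - theta
    return sum(
        alpha[r][h] / len(d_h) * sum(order_stat_pdf(z, u, S) for u in d_h)
        for h, d_h in enumerate(subsets)
    )

def log_likelihood(theta, cycles):
    """cycles: list of (subsets, alpha, observations) triples, where
    observations[r] is the unit measured from judgment subset r."""
    total = 0.0
    for subsets, alpha, observations in cycles:
        for r, y in enumerate(observations):
            total += math.log(judged_subset_pdf(y, r, subsets, alpha, theta))
    return total

# Example: one cycle, S = 6, three subsets of size 2, no misplacement.
subsets = [[1, 2], [3, 4], [5, 6]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ll_at_zero = log_likelihood(0.0, [(subsets, identity, [-1.0, 0.1, 0.9])])
```

With the identity matrix for `alpha` the expression reduces to the perfect-ranking density of Lemma 8, while a matrix with all entries equal to $1/n_i$ and equal-sized subsets makes the ranking uninformative and returns the parent pdf.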

Similarly, one can re-write the likelihood function of unbalanced PROS data (13) as:

$$\begin{aligned} L(\Omega ) = \prod _{i=1}^{N}\prod _{r=1}^{n_i} f_{[r;m_{ri}]}(y_{ri};{\varvec{\theta }}) = \prod _{i=1}^{N}\prod _{r=1}^{n_i} f(y_{ri};{\varvec{\theta }})~ g_{ri}(y_{ri};{\varvec{\theta }}), \end{aligned}$$

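where, by (13) and the standard form of the order statistic pdf,

$$\begin{aligned} g_{ri}(y;{\varvec{\theta }}) = \frac{f_{[r;m_{ri}]}(y;{\varvec{\theta }})}{f(y;{\varvec{\theta }})} = \sum _{h=1}^{n_i} \sum _{u\in d_{h[h]i}} \frac{\alpha _{[d_r,d_h]i}}{m_{hi}}\, S \binom{S-1}{u-1} \left\{ F(y;{\varvec{\theta }})\right\} ^{u-1} \left\{ 1-F(y;{\varvec{\theta }})\right\} ^{S-u}, \end{aligned}$$

(14)

and $F(\cdot ;{\varvec{\theta }})$ denotes the cdf corresponding to $f(\cdot ;{\varvec{\theta }})$.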

Similar to Sect. 3.2, to obtain the FI matrix of an unbalanced PROS sample and compare it with its SRS and RSS counterparts we need the following results whose proofs are easy and left to the reader.

Table 10 Values of $\mathrm{RE}_1$ and $\mathrm{RE}_2$ to compare the FI content of unbalanced PROS data with its SRS and RSS counterparts of the same size $n\in \{2,3\}$ for normal distribution when the set size in PROS design is $S=6$

Lemma 9

Let $Y_{r,i}=X_{[d_r]i}$, $r=1,\ldots , n_i; i=1,\ldots ,N$, be observed from a continuous distribution with pdf $f(\cdot ; {\varvec{\theta }})$ using an unbalanced PROS sampling design. Suppose $f_{[r;m_{ri}]}(\cdot ;{\varvec{\theta }})$ and $g_{ri}(\cdot ;{\varvec{\theta }})$ are defined as in Lemma 8 and (14), respectively. Under the regularity conditions of Chen et al. (2004b), we have

(i) $\sum _{i=1}^{N}\sum _{r=1}^{n_i}E \left\{ \frac{D^2_{{\varvec{\theta }}}g_{ri}(X_{[d_r]i};{\varvec{\theta }})}{g_{ri}(X_{[d_r]i};{\varvec{\theta }})} \right\} = \sum _{i=1}^{N}\sum _{r=1}^{n_i} E \left\{ {D^2_{{\varvec{\theta }}}g_{ri}(X;{\varvec{\theta }})} \right\} ,$
(ii) $\sum _{i=1}^{N}\sum _{r=1}^{n_i} E \left\{ \frac{[D_{{\varvec{\theta }}} g_{ri}(X_{[d_r]i};{\varvec{\theta }})][D_{{\varvec{\theta }}} g_{ri}(X_{[d_r]i};{\varvec{\theta }})]^{\top }}{g_{ri}^2(X_{[d_r]i};{\varvec{\theta }})} \right\} = { \sum _{i=1}^{N}\sum _{r=1}^{n_i}} E \left\{ \frac{[D_{{\varvec{\theta }}} g_{ri}(X;{\varvec{\theta }})][D_{{\varvec{\theta }}} g_{ri}(X;{\varvec{\theta }})]^{\top }}{g_{ri}(X;{\varvec{\theta }})} \right\} .$

Theorem 4

Under the conditions of Lemma 9, the FI matrix of an unbalanced PROS sample about unknown parameters $\Omega =(\varvec{\alpha },{\varvec{\theta }})$ is given by

$$\begin{aligned}&\mathbb {I}_{upros}(\Omega ) =\mathbb {I}_{srs}({\varvec{\theta }}) - \sum _{i=1}^{N}\sum _{r=1}^{n_i} E \left\{ {D^2_{{\varvec{\theta }}}g_{ri}(X;{\varvec{\theta }})} \right\} \\&\quad \qquad \qquad \qquad +\sum _{i=1}^{N}\sum _{r=1}^{n_i} E \left\{ \frac{[D_{{\varvec{\theta }}} g_{ri}(X;{\varvec{\theta }})][D_{{\varvec{\theta }}} g_{ri}(X;{\varvec{\theta }})]^{\top }}{g_{ri}(X;{\varvec{\theta }})} \right\} . \end{aligned}$$
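A simple numerical check of the kind of comparison that Theorem 4 supports can be sketched as follows. The sketch is deliberately restricted: it assumes perfect ranking (no misplacement parameters), a normal population with unit variance, and a single unknown location parameter, for which the FI of one observation with density $p$ is $\int (p')^2/p$. It does not evaluate the expression in Theorem 4 itself but integrates the squared score directly with a trapezoid rule; the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def subset_pdf_and_slope(z, subset, S):
    """Return p(z) and dp/dz, where p averages the standard normal
    order-statistic pdfs over the rank positions in `subset`."""
    F = 0.5 * math.erfc(-z / SQRT2)
    sf = 0.5 * math.erfc(z / SQRT2)
    f = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    p = 0.0
    dp = 0.0
    for u in subset:
        dens = S * math.comb(S - 1, u - 1) * F ** (u - 1) * sf ** (S - u) * f
        # d/dz log(dens) = (u - 1) f / F - (S - u) f / (1 - F) - z
        dp += dens * ((u - 1) * f / F - (S - u) * f / sf - z)
        p += dens
    m = len(subset)
    return p / m, dp / m

def location_fi(subset, S, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoid approximation of the FI about a location shift
    carried by one observation drawn from the given subset."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        p, dp = subset_pdf_and_slope(lo + k * h, subset, S)
        weight = 0.5 if k in (0, steps) else 1.0
        if p > 0.0:
            total += weight * dp * dp / p
    return total * h

# PROS sample of size 2 from sets of size S = 6 split into two subsets of size 3.
fi_pros = location_fi([1, 2, 3], 6) + location_fi([4, 5, 6], 6)
fi_srs = 2.0  # two i.i.d. N(theta, 1) observations carry FI 1 each
```

Under these assumptions `fi_pros` exceeds `fi_srs`, and treating the whole set as a single subset reproduces the SRS value of 1 per observation.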

Through numerical studies, Table 10 compares the FI content of unbalanced PROS data with that of its RSS (with set size $n\in \{2,3\}$ and cycle size $N=1$) and SRS counterparts in the case of the normal distribution. The parameters of the PROS design are cycle size $N=1$, set size $S=6$ and number of subsets $n\in \{2,3\}$ with subset sizes $m\in \{2, 3\}$. As in the previous simulation studies, the misplacement ranking error models are obtained following Dell and Clutter (1972) for $\rho \in \{0.25,0.5,0.75,0.9,1\}$.

About this article

Cite this article

Hatefi, A., Jozani, M.J. Information content of partially rank-ordered set samples. AStA Adv Stat Anal 101, 117–149 (2017). https://doi.org/10.1007/s10182-016-0277-9

Received: 14 January 2016
Accepted: 20 July 2016
Published: 01 August 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s10182-016-0277-9
