Information content of partially rank-ordered set samples

  • Original Paper
  • AStA Advances in Statistical Analysis

Abstract

Partially rank-ordered set (PROS) sampling is a generalization of ranked set sampling in which rankers are not required to fully rank the sampling units in each set, and hence have more flexibility in performing the necessary judgemental ranking process. PROS sampling has a wide range of applications in fields ranging from environmental and ecological studies to medical research, and it has been shown to be superior to ranked set sampling and simple random sampling for estimating the population mean. We study the Fisher information content and uncertainty structure of PROS samples and compare them with those of their simple random sample (SRS) and ranked set sample (RSS) counterparts of the same size from the underlying population. We study the uncertainty structure in terms of the Shannon entropy, Rényi entropy and Kullback–Leibler (KL) discrimination measures.


References

  • Arslan, G., Ozturk, O.: Parametric inference based on partially rank ordered set samples. J. Indian Stat. Assoc. 51(1), 1–24 (2013)

  • Barabesi, L., El-Shaarawi, A.: The efficiency of ranked set sampling for parameter estimation. Stat. Probab. Lett. 53(2), 189–199 (2001)

  • Barreto, M.C.M., Barnett, V.: Best linear unbiased estimators for the simple linear regression model using ranked set sampling. Environ. Ecol. Stat. 6(2), 119–133 (1999)

  • Chen, Z.: The efficiency of ranked-set sampling relative to simple random sampling under multi-parameter families. Stat. Sin. 10(1), 247–264 (2000)

  • Chen, Z., Wang, Y.G.: Efficient regression analysis with ranked-set sampling. Biometrics 60(4), 997–1004 (2004)

  • Chen, Z., Bai, Z., Sinha, B.K.: Ranked Set Sampling: Theory and Applications, vol. 176. Springer, New York (2004)

  • Dell, T.R., Clutter, J.L.: Ranked set sampling theory with order statistics background. Biometrics 28(2), 545–555 (1972)

  • Frey, J.: Nonparametric mean estimation using partially ordered sets. Environ. Ecol. Stat. 19(3), 309–326 (2012)

  • Hatefi, A., Jafari Jozani, M., Ozturk, O.: Mixture model analysis of partially rank ordered set samples. Scand. J. Stat. 42, 848–871 (2015)

  • Hatefi, A., Jafari Jozani, M.: Fisher information in different types of perfect and imperfect ranked set samples from finite mixture models. J. Multivar. Anal. 119, 16–31 (2013)

  • Hatefi, A., Jafari Jozani, M., Ziou, D.: Estimation and classification for finite mixture models under ranked set sampling. Stat. Sin. 24, 675–698 (2014)

  • Hill, B.M.: Information for estimating the proportions in mixtures of exponential and normal distributions. J. Am. Stat. Assoc. 58(304), 918–932 (1963)

  • Jafari Jozani, M., Ahmadi, J.: On uncertainty and information properties of ranked set samples. Inf. Sci. 260, 1–16 (2014)

  • Johnson, O.: Information Theory and the Central Limit Theorem. Imperial College Press, London (2004)

  • Lehmann, E.L., Casella, G.: Theory of Point Estimation, vol. 31. Springer, New York (1998)

  • McIntyre, G.A.: A method for unbiased selective sampling, using ranked sets. Crop Pasture Sci. 3(4), 385–390 (1952)

  • McIntyre, G.A.: A method for unbiased selective sampling, using ranked sets. Am. Stat. 59(3), 230 (2005)

  • Mode, N.A., Conquest, L.L., Marker, D.A.: Ranked set sampling for ecological research: accounting for the total costs of sampling. Environmetrics 10(2), 179–194 (1999)

  • Muttlak, H.A., McDonald, L.L.: Ranked set sampling and the line intercept method: A more efficient procedure. Biom. J. 34(3), 329–346 (1992)

  • Ozturk, O.: Sampling from partially rank-ordered sets. Environ. Ecol. Stat. 18(4), 757–779 (2011)

  • Ozturk, O.: Combining multi-observer information in partially rank-ordered judgment post-stratified and ranked set samples. Can. J. Stat. 41(2), 304–324 (2013)

  • Ozturk, O., Bilgin, O.C., Wolfe, D.A.: Estimation of population mean and variance in flock management: a ranked set sampling approach in a finite population setting. J. Stat. Comput. Simul. 75(11), 905–919 (2005)

  • Stokes, S.L.: Ranked set sampling with concomitant variables. Commun. Stat. Theory Methods 6(12), 1207–1211 (1977)

  • Stokes, S.L.: Estimation of variance using judgment ordered ranked set samples. Biometrics 36(1), 35–42 (1980)

  • Wang, Y.G., Ye, Y., Milton, D.A.: Efficient designs for sampling and subsampling in fisheries research based on ranked sets. ICES J. Marine Sci. J. du Conseil 66(5), 928–934 (2009)

  • Wolfe, D.A.: Ranked set sampling: its relevance and impact on statistical inference. ISRN Probab. Stat. 2012, 1–32 (2012)

Acknowledgments

We would like to thank two anonymous reviewers and an associate editor for their constructive comments and suggestions. Mohammad Jafari Jozani gratefully acknowledges the research support of NSERC Canada. Armin Hatefi acknowledges partial support through the University of Manitoba Graduate Fellowship, the Manitoba Graduate Scholarship (during his PhD program) and the Fields Ontario Postdoctoral Fellowship.

Author information

Corresponding author

Correspondence to Mohammad Jafari Jozani.

Appendix: FI of unbalanced PROS and the effect of misplacement errors

In this section, we study the FI matrix of the unbalanced PROS sampling design in a general setting where the subsets are allowed to be of different sizes. To obtain an unbalanced PROS sample, we first need to determine the sample size K and the set size S. The judgment sub-setting process is then applied to create K sets. We group these K sets into N cycles \(G_i=\{S_{1,i},\ldots ,S_{n_i,i}\};\, i=1,\ldots ,N\), where \(\sum _{i=1}^{N}n_i=K\). Let \(D_{r,i}=\{{d_{r[1]i}},\ldots ,d_{r[n_i]i}\}\) be the design parameter associated with set \(S_{r,i}\), where \({d_{r[l]i}}; l=1,\ldots ,n_i\) is the l-th judgment subset in the set \(S_{r,i}\). In each cycle \(G_i; i=1,\ldots ,N\), we randomly select a unit from each set \(S_{r,i}\) (specifically, from the judgment subset \(d_{r[r]i}; r=1,\ldots , n_i\)) for full measurement, say \(X_{[d_r]i}\), and we denote the number of unranked units in subset \(d_{r[r]i}\) by \(m_{ri}; r=1,\ldots ,n_i; i=1,\ldots ,N\). The collection of measured observations \(\{X_{[d_r]i};r=1,\ldots ,n_i;i=1,\ldots ,N\}\) is then an unbalanced PROS sample of size \(K=\sum _{i=1}^{N}n_i\). Table 9 illustrates the construction of an unbalanced PROS sample of size \(K=5\) with set size \(S=6\) and cycle size \(N=2\), in which we declare three subsets (\(n_1=3\)) in the first cycle and two subsets (\(n_2=2\)) in the second cycle, with subsets of different sizes. In each set, \(m_{ri}\) represents the number of unranked units in the selected subset. For more details about this kind of design, see Ozturk (2011).

Table 9 An example of an unbalanced PROS design when \(S=6, K=5, N=2, n_1=3, n_2=2\), where \(m_{ri}\) represents the size of the selected subset in each set
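The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the partitions in `DESIGN` are hypothetical (they are not the entries of Table 9), the sub-setting is taken to be error-free, and the names `draw_pros_unit` and `unbalanced_pros_sample` are introduced here for convenience. Under error-free sub-setting, picking a unit at random from the subset \(d_{r[r]i}\) is equivalent to picking one of the ranks in that subset uniformly at random.

```python
import random

def draw_pros_unit(subsets, r, draw, rng):
    """Return (measured value, m) for one set whose ranks 1..S are split into `subsets`."""
    S = sum(len(d) for d in subsets)
    ranked = sorted(draw(rng) for _ in range(S))  # assumption: error-free sub-setting
    v = rng.choice(subsets[r])                    # uniform rank within the selected subset
    return ranked[v - 1], len(subsets[r])

def unbalanced_pros_sample(design, draw, seed=0):
    """Measure one unit per set: from the r-th subset of the r-th set of each cycle."""
    rng = random.Random(seed)
    sample = []
    for i, cycle in enumerate(design, start=1):
        for r, subsets in enumerate(cycle):
            y, m = draw_pros_unit(subsets, r, draw, rng)
            sample.append({"cycle": i, "set": r + 1, "m": m, "y": y})
    return sample

# Hypothetical design with S = 6, N = 2, n_1 = 3, n_2 = 2 (so K = 5).
# Each inner list is the partition D_{r,i} of the ranks 1..6 for one set.
DESIGN = [
    [[[1], [2, 3, 4], [5, 6]],
     [[1, 2], [3, 4], [5, 6]],
     [[1, 2, 3], [4], [5, 6]]],
    [[[1, 2], [3, 4, 5, 6]],
     [[1, 2, 3], [4, 5, 6]]],
]

sample = unbalanced_pros_sample(DESIGN, lambda rng: rng.gauss(0.0, 1.0))
for obs in sample:
    print(obs)
```

Each record carries the cycle index i, the set index r, the subset size \(m_{ri}\) and the measured value, which is the information the likelihood below needs.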

Lemma 8, pointed out in Ozturk (2011), is proved here through latent variables. It highlights the difference between the complete PROS data (Sect. 2, for the case of perfect ranking) and the incomplete PROS data (Sect. 3.2).

Lemma 8

Let \(Y_{ri}=X_{[d_r]i}\) be an observation from unbalanced PROS sampling design from a continuous distribution with pdf \(f(\cdot ;{\varvec{\theta }})\). With knowledge of the design parameter \(D_{r,i}\), the pdf of \(Y_{ri}\) is given by

$$\begin{aligned} f_{[r;m_{ri}]}(y;{\varvec{\theta }})= \frac{1}{m_{ri}} \sum _{v\in d_{r[r]i}} f^{[v:S]}(y;{\varvec{\theta }}), \end{aligned}$$

where \(f^{[v:S]}(y;{\varvec{\theta }})\) is the pdf of the v-th judgment order statistic in a set of size S.

Proof

For each \(Y_{ri}\) define the latent vector \({\varvec{\Delta }}^{[d_r]i}= (\Delta ^{[d_r]i}(v), v\in d_{r[r]i})\), where

$$\begin{aligned} \Delta ^{[d_r]i}(v) = \left\{ \begin{array}{ll} 1 & \quad \text{if } Y_{ri} \text{ is selected from the } v\text{-th position within the subset } d_{r[r]i}; \\ 0 & \quad \text{otherwise}, \end{array} \right. \end{aligned}$$

with \(\sum _{v \in d_{r[r]i}} \Delta ^{[d_r]i}(v)=1\). The joint pdf of \((Y_{ri},{\varvec{\Delta }}^{[d_r]i})\) is given by

$$\begin{aligned} f(y,{\varvec{\delta }}^{[d_r]i};{\varvec{\theta }})= \prod _{r=1}^{n_i}\prod _{v \in d_{r[r]i}} \left\{ \frac{1}{m_{ri}} f^{[v:S]}(y;{\varvec{\theta }})\right\} ^{{\varvec{\delta }}^{[d_r]i}(v)}. \end{aligned}$$

Furthermore, by summing the joint distribution of \((Y_{ri},{\varvec{\Delta }}^{[d_r]i})\) over \({\varvec{\Delta }}^{[d_r]i}={\varvec{\delta }}^{[d_r]i}\), the marginal distribution of \(Y_{ri}\) is obtained as follows:

$$\begin{aligned} f_{[r;m_{ri}]}(y;{\varvec{\theta }})= \sum _{{\varvec{\delta }}^{[d_r]i}} f(y,{\varvec{\delta }}^{[d_r]i};{\varvec{\theta }}) = \frac{1}{m_{ri}} \sum _{v\in d_{r[r]i}} f^{[v:S]}(y;{\varvec{\theta }}). \end{aligned}$$

\(\square \)
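As a numerical illustration of Lemma 8, the following sketch evaluates the pdf of a PROS observation as the average of order-statistic densities. It assumes a standard normal population and perfect ranking, so that \(f^{[v:S]}\) coincides with the usual order-statistic pdf \(f^{(v:S)}\); the function names and the trapezoidal integrator are choices made here for illustration only.

```python
import math

def norm_pdf(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def norm_cdf(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def order_stat_pdf(y, v, S):
    """pdf of the v-th order statistic among S standard normal draws."""
    F = norm_cdf(y)
    return S * math.comb(S - 1, v - 1) * F ** (v - 1) * (1.0 - F) ** (S - v) * norm_pdf(y)

def pros_pdf(y, subset, S):
    """Lemma 8 under perfect ranking: average of f^{(v:S)} over the ranks v in the subset."""
    return sum(order_stat_pdf(y, v, S) for v in subset) / len(subset)

def integrate(fn, a=-10.0, b=10.0, steps=4000):
    """Plain trapezoidal rule; adequate for these smooth, fast-decaying integrands."""
    h = (b - a) / steps
    total = 0.5 * (fn(a) + fn(b)) + sum(fn(a + k * h) for k in range(1, steps))
    return total * h

# The subset {3, 4} of a set of size S = 6 gives a proper density centred at zero.
mass = integrate(lambda y: pros_pdf(y, [3, 4], 6))
mean = integrate(lambda y: y * pros_pdf(y, [3, 4], 6))
print(mass, mean)
```

When the subset contains all S ranks, the average reduces to the parent pdf, which corresponds to drawing a unit with no ranking information at all.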

Using Lemma 8, the likelihood function under an unbalanced PROS design is now given by

$$\begin{aligned} L(\Omega )= & {} \prod _{i=1}^{N}\prod _{r=1}^{n_i} f_{[r;m_{ri}]}(y_{ri};{\varvec{\theta }}) = \prod _{i=1}^{N}\prod _{r=1}^{n_i} \left\{ \frac{1}{m_{ri}} \sum _{v\in d_{r[r]i}} f^{[v:S]}(y_{ri};{\varvec{\theta }}) \right\} \nonumber \\= & {} \prod _{i=1}^{N}\prod _{r=1}^{n_i} \left\{ \frac{1}{m_{ri}} \sum _{v\in d_{r[r]i}} \sum _{h=1}^{n_i} \sum _{u\in d_{h[h]i}} \frac{\alpha _{[d_r,d_h]i}}{m_{hi}} f^{(u:S)}(y_{ri};{\varvec{\theta }}) \right\} , \end{aligned}$$
(13)

where \(\Omega = ({\varvec{\theta }}, \varvec{\alpha })\), \(f^{(u:S)}(\cdot ;{\varvec{\theta }})\) is the pdf of the u-th order statistic in a sample of size S and, as in Sect. 3.2, \(\alpha _{[d_r,d_h]i}\) is the probability that a unit from subset \(d_{h[h]i}\) is misplaced into subset \(d_{r[r]i}\), so that \(\sum _{h=1}^{n_i} \alpha _{[d_r,d_h]i}=\sum _{r=1}^{n_i} \alpha _{[d_r,d_h]i} =1; i=1,\ldots ,N\).
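A minimal sketch of how the likelihood (13) could be evaluated for one cycle is given below. It assumes a normal population with unit variance and location `mu`, and uses a hypothetical doubly stochastic matrix `ALPHA` and hypothetical selected subsets `SELECTED`; none of these values come from the paper. Because the summand in (13) does not depend on v and the subset \(d_{r[r]i}\) contains \(m_{ri}\) ranks, the leading average over v is omitted in the code.

```python
import math

def norm_pdf(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def norm_cdf(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def order_stat_pdf(y, u, S):
    """pdf of the u-th order statistic among S standard normal draws."""
    F = norm_cdf(y)
    return S * math.comb(S - 1, u - 1) * F ** (u - 1) * (1.0 - F) ** (S - u) * norm_pdf(y)

def subset_pdf(y, subset, S):
    """Average of f^{(u:S)} over the ranks u in one subset."""
    return sum(order_stat_pdf(y, u, S) for u in subset) / len(subset)

def imperfect_pros_pdf(y, r, selected, alpha, S):
    """Mixture over the subsets d_h with misplacement weights alpha[r][h]."""
    return sum(alpha[r][h] * subset_pdf(y, selected[h], S) for h in range(len(selected)))

def log_likelihood(data, selected, alpha, S, mu=0.0):
    """data holds (r, y) pairs for one cycle; mu is the normal location parameter."""
    return sum(math.log(imperfect_pros_pdf(y - mu, r, selected, alpha, S)) for r, y in data)

SELECTED = [[1, 2], [3, 4], [5, 6]]   # hypothetical d_{r[r]i}, r = 1, 2, 3, with S = 6
ALPHA = [[0.8, 0.2, 0.0],             # hypothetical misplacement probabilities;
         [0.2, 0.6, 0.2],             # rows and columns each sum to one
         [0.0, 0.2, 0.8]]

data = [(0, -1.0), (1, 0.1), (2, 0.9)]
print(log_likelihood(data, SELECTED, ALPHA, 6))
```

With equal subset sizes and a doubly stochastic `ALPHA`, the n densities in a cycle sum to n times the parent pdf at every point, which is a convenient sanity check on any implementation.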

Similarly, one can re-write the likelihood function of unbalanced PROS data (13) as:

$$\begin{aligned} L(\Omega ) = \prod _{i=1}^{N}\prod _{r=1}^{n_i} f_{[r;m_{ri}]}(y_{ri};{\varvec{\theta }}) = \prod _{i=1}^{N}\prod _{r=1}^{n_i} f(y_{ri};{\varvec{\theta }})~ g_{ri}(y_{ri};{\varvec{\theta }}), \end{aligned}$$

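where

$$\begin{aligned} g_{ri}(y;{\varvec{\theta }}) = \frac{f_{[r;m_{ri}]}(y;{\varvec{\theta }})}{f(y;{\varvec{\theta }})} = \frac{1}{m_{ri}} \sum _{v\in d_{r[r]i}} \sum _{h=1}^{n_i} \sum _{u\in d_{h[h]i}} \frac{\alpha _{[d_r,d_h]i}}{m_{hi}}\, S \binom{S-1}{u-1} [F(y;{\varvec{\theta }})]^{u-1} [1-F(y;{\varvec{\theta }})]^{S-u}, \end{aligned}$$
(14)

and \(F(\cdot ;{\varvec{\theta }})\) denotes the cdf associated with \(f(\cdot ;{\varvec{\theta }})\).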

Similar to Sect. 3.2, to obtain the FI matrix of an unbalanced PROS sample and compare it with its SRS and RSS counterparts, we need the following results, whose proofs are straightforward and left to the reader.

Table 10 Values of \(\mathrm{RE}_1\) and \(\mathrm{RE}_2\) to compare the FI content of unbalanced PROS data with its SRS and RSS counterparts of the same size \(n\in \{2,3\}\) for normal distribution when the set size in PROS design is \(S=6\)

Lemma 9

Let \(Y_{ri}=X_{[d_r]i}\), \(r=1,\ldots , n_i; i=1,\ldots ,N\), be observed from a continuous distribution with pdf \(f(\cdot ; {\varvec{\theta }})\) using an unbalanced PROS sampling design. Suppose \(f_{[r;m_{ri}]}(\cdot ;{\varvec{\theta }})\) and \(g_{ri}(\cdot ;{\varvec{\theta }})\) are defined as in Lemma 8 and (14), respectively. Under the regularity conditions of Chen et al. (2004b), we have

  (i) \(\sum _{i=1}^{N}\sum _{r=1}^{n_i}E \left\{ \frac{D^2_{{\varvec{\theta }}}g_{ri}(X_{[d_r]i};{\varvec{\theta }})}{g_{ri}(X_{[d_r]i};{\varvec{\theta }})} \right\} = \sum _{i=1}^{N}\sum _{r=1}^{n_i} E \left\{ {D^2_{{\varvec{\theta }}}g_{ri}(X;{\varvec{\theta }})} \right\} ,\)

  (ii) \(\sum _{i=1}^{N}\sum _{r=1}^{n_i} E \left\{ \frac{[D_{{\varvec{\theta }}} g_{ri}(X_{[d_r]i};{\varvec{\theta }})][D_{{\varvec{\theta }}} g_{ri}(X_{[d_r]i};{\varvec{\theta }})]^{\top }}{g_{ri}^2(X_{[d_r]i};{\varvec{\theta }})} \right\} = { \sum _{i=1}^{N}\sum _{r=1}^{n_i}} E \left\{ \frac{[D_{{\varvec{\theta }}} g_{ri}(X;{\varvec{\theta }})][D_{{\varvec{\theta }}} g_{ri}(X;{\varvec{\theta }})]^{\top }}{g_{ri}(X;{\varvec{\theta }})} \right\} .\)
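A brief sketch of why part (i) holds term by term may be helpful. It assumes, as in the factorization of the likelihood above, that \(X_{[d_r]i}\) has pdf \(f(\cdot ;{\varvec{\theta }})\,g_{ri}(\cdot ;{\varvec{\theta }})\) and that X denotes a single draw from \(f(\cdot ;{\varvec{\theta }})\); the regularity conditions are taken for granted.

$$\begin{aligned} E \left\{ \frac{D^2_{{\varvec{\theta }}}g_{ri}(X_{[d_r]i};{\varvec{\theta }})}{g_{ri}(X_{[d_r]i};{\varvec{\theta }})} \right\} = \int \frac{D^2_{{\varvec{\theta }}}g_{ri}(y;{\varvec{\theta }})}{g_{ri}(y;{\varvec{\theta }})}\, f(y;{\varvec{\theta }})\, g_{ri}(y;{\varvec{\theta }})\, dy = \int D^2_{{\varvec{\theta }}}g_{ri}(y;{\varvec{\theta }})\, f(y;{\varvec{\theta }})\, dy = E \left\{ D^2_{{\varvec{\theta }}}g_{ri}(X;{\varvec{\theta }}) \right\} . \end{aligned}$$

Summing over r and i gives (i). Part (ii) follows in the same way, with one of the two factors \(g_{ri}\) in the denominator cancelling against the density.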

Theorem 4

Under the conditions of Lemma 9, the FI matrix of an unbalanced PROS sample about unknown parameters \(\Omega =(\varvec{\alpha },{\varvec{\theta }})\) is given by

$$\begin{aligned}&\mathbb {I}_{upros}(\Omega ) =\mathbb {I}_{srs}({\varvec{\theta }}) - \sum _{i=1}^{N}\sum _{r=1}^{n_i} E \left\{ {D^2_{{\varvec{\theta }}}g_{ri}(X;{\varvec{\theta }})} \right\} \\&\quad \qquad \qquad \qquad +\sum _{i=1}^{N}\sum _{r=1}^{n_i} E \left\{ \frac{[D_{{\varvec{\theta }}} g_{ri}(X;{\varvec{\theta }})][D_{{\varvec{\theta }}} g_{ri}(X;{\varvec{\theta }})]^{\top }}{g_{ri}(X;{\varvec{\theta }})} \right\} . \end{aligned}$$
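To give a feel for the quantities involved, the sketch below evaluates the FI about a location parameter numerically for a normal population with unit variance under perfect ranking. It works directly from the definition \(\int (\partial _\mu f_{[r;m]})^2/f_{[r;m]}\,dy\) rather than from the decomposition in Theorem 4, uses a finite-difference derivative and a trapezoidal rule chosen here for simplicity, and is not intended to reproduce the entries of Table 10. The subsets used are hypothetical.

```python
import math

def norm_pdf(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def norm_cdf(y):
    return 0.5 * math.erfc(-y / math.sqrt(2.0))   # erfc keeps the lower tail accurate

def norm_sf(y):
    return 0.5 * math.erfc(y / math.sqrt(2.0))    # survival function, accurate upper tail

def pros_pdf(y, subset, S):
    """Average of the order-statistic pdfs f^{(v:S)} over the ranks v in the subset."""
    F, G = norm_cdf(y), norm_sf(y)
    weight = sum(S * math.comb(S - 1, v - 1) * F ** (v - 1) * G ** (S - v) for v in subset)
    return weight * norm_pdf(y) / len(subset)

def location_fi(subset, S, lo=-8.0, hi=8.0, steps=3200, eps=1e-5):
    """FI about the location of one PROS observation, by trapezoidal integration."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        y = lo + k * h
        dens = pros_pdf(y, subset, S)
        if dens <= 0.0:
            continue  # skip points where the density underflows to zero
        # For a location family, the derivative in mu is minus the derivative in y.
        slope = (pros_pdf(y + eps, subset, S) - pros_pdf(y - eps, subset, S)) / (2.0 * eps)
        edge = 0.5 if k in (0, steps) else 1.0
        total += edge * slope * slope / dens
    return total * h

fi_all = location_fi([1, 2, 3, 4, 5, 6], 6)   # no ranking information: one SRS observation
fi_min = location_fi([1], 6)
fi_max = location_fi([6], 6)
balanced = [[1, 2], [3, 4], [5, 6]]           # hypothetical subsets with S = 6
re_balanced = sum(location_fi(d, 6) for d in balanced) / len(balanced)
print(fi_all, fi_min, fi_max, re_balanced)
```

Here `re_balanced` is the ratio of the FI in the three PROS observations to the FI in three SRS observations, since each SRS observation from a unit-variance normal carries information 1 about the location.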

Through numerical studies, Table 10 compares the FI content of unbalanced PROS data with that of its RSS (with set size \(n\in \{2,3\}\) and cycle size \(N=1\)) and SRS counterparts in the case of the normal distribution. The parameters of the PROS design are cycle size \(N=1\), set size \(S=6\) and number of subsets \(n\in \{2,3\}\) with subset sizes \(m\in \{2, 3\}\). As in the previous simulation studies, the misplacement ranking error models are obtained following Dell and Clutter (1972) for \(\rho \in \{0.25,0.5,0.75,0.9,1\}\).
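One way misplacement probabilities could be obtained for a given \(\rho \) is by Monte Carlo, as sketched below. The sketch assumes a common reading of the Dell and Clutter (1972) model, in which units are ranked on a perceived value \(X+\varepsilon \) with independent normal noise scaled so that the correlation between X and the perceived value is \(\rho \); the function `estimate_alpha`, the partition and the number of replicates are illustrative choices, not the procedure used for Table 10.

```python
import math
import random

def estimate_alpha(subsets, rho, reps=2000, seed=1):
    """Estimate alpha[r][h]: the share of units judged into subset r whose true rank is in subset h."""
    rng = random.Random(seed)
    S = sum(len(d) for d in subsets)
    n = len(subsets)
    noise_sd = math.sqrt(1.0 / rho ** 2 - 1.0)    # gives corr(X, X + noise) = rho for Var(X) = 1
    subset_of = {v: k for k, d in enumerate(subsets) for v in d}
    counts = [[0] * n for _ in range(n)]
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(S)]
        perceived = [xi + rng.gauss(0.0, noise_sd) for xi in x]
        true_order = sorted(range(S), key=lambda k: x[k])
        true_rank = {unit: pos + 1 for pos, unit in enumerate(true_order)}
        judged_order = sorted(range(S), key=lambda k: perceived[k])
        for pos, unit in enumerate(judged_order, start=1):
            counts[subset_of[pos]][subset_of[true_rank[unit]]] += 1
    return [[c / sum(row) for c in row] for row in counts]

PARTITION = [[1, 2, 3], [4, 5, 6]]    # n = 2 subsets of size m = 3, with S = 6
alpha_perfect = estimate_alpha(PARTITION, rho=1.0)
alpha_noisy = estimate_alpha(PARTITION, rho=0.5)
print(alpha_perfect)
print(alpha_noisy)
```

With \(\rho =1\) the perceived and true rankings agree, so the estimated matrix is the identity; smaller values of \(\rho \) move mass to the off-diagonal entries.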

Cite this article

Hatefi, A., Jozani, M.J. Information content of partially rank-ordered set samples. AStA Adv Stat Anal 101, 117–149 (2017). https://doi.org/10.1007/s10182-016-0277-9

  • DOI: https://doi.org/10.1007/s10182-016-0277-9
