Group sequential comparison of two binomial proportions under ranked set sampling design

  • Original Paper
  • Computational Statistics

Abstract

Ranked set sampling (RSS) is a technique for incorporating auxiliary (concomitant) information into estimation and testing procedures right at the design stage. In this paper, we propose group sequential testing procedures for comparing two treatments with binary outcomes under an RSS scheme with perfect ranking. We compare the power, the average sample sizes and type I errors of the proposed tests to those of the group sequential tests based on simple random sampling schemes. We illustrate the usefulness of the methodology by using data from a clinical trial on leukemia.

References

  • Abu-Dayyeh WA, Muttlak HA (1996) Using ranked set sampling for testing hypotheses on the scale parameter for exponential and uniform distributions. Pak J Stat 12(2):131–138

  • Arnold BC, Balakrishnan N, Nagaraja HN (1992) A first course in order statistics. Wiley, New York

  • Chen H, Stasny EA, Wolfe DA (2005) Ranked set sampling for efficient estimation of a population proportion. Stat Med 24:3319–3329

  • Chen H, Stasny EA, Wolfe DA (2007) Improved procedures for estimation of disease prevalence using ranked set sampling. Biom J 49(4):530–538

  • Chen Z, Liu J, Shen L, Wang Y-G (2008) General ranked set sampling for efficient treatment comparison. Stat Sinica 18:91–94

  • Cui Y, Fu YJ, Hussein A (2009) Group sequential testing of homogeneity in genetic linkage analysis. Comput Stat Data Anal 53(10):3630–3639

  • Dell TR, Clutter JL (1972) Ranked set sampling theory with order statistics background. Biometrics 28:545–553

  • Emerson SS, Banks PL (1994) Interpretation of a leukemia trial stopped early. In: Lange N, Ryan L, Billard L, Brillinger D, Conquest L, Greenhouse J (eds) Case studies in biometry. Wiley-Interscience, New York

  • Fleming TR (1982) One-sample multiple testing procedure for Phase II clinical trials. Biometrics 38:143–151

  • Hussein A (2005) Sequential comparison of two treatments using weighted Wald-type statistics. Commun Stat Theory Methods 34(7):1631–1641

  • Hussein A, Carriere KC (2005) On group sequential procedures under variance heterogeneity. Stat Methods Med Res 14(2):121–128

  • Jennison C, Turnbull BW (2000) Group sequential methods with application to clinical trials. Chapman & Hall/CRC, Florida

  • Lam K, Sinha BK, Wu Z (1994) Estimation of parameters in two-parameter exponential distribution using ranked set sampling. Ann Inst Stat Math 46(4):723–736

  • Lan KK, DeMets DL (1983) Discrete sequential boundaries for clinical trials. Biometrika 70(3):659–663

  • Lehmann E (1999) Elements of large sample theory. Springer, New York

  • McIntyre GA (1952) A method for unbiased selective sampling, using ranked sets. Aust J Agric Res 3:385–390

  • Muttlak HA (1997) Median ranked set sampling. J Appl Stat Sci 6(4):245–255

  • Muttlak HA, Abu-Dayyeh WA (1997) Testing hypotheses for the normal distribution using ranked set sampling. J Inform Optim Sci 19(1):1–11

  • O’Brien PC, Fleming TR (1979) A multiple testing procedure for clinical trials. Biometrics 35:549–556

  • Pocock SJ (1977) Group sequential methods in the design and analysis of clinical trials. Biometrika 64:191–199

  • Rosenberger WF, Lachin JM (2002) Randomization in clinical trials: theory and practice. Wiley, New York

  • Roman P (2009) Groupseq: performing computations related to group sequential designs. R package version 1.3.1

  • Schultz JR, Nichol FR, Elfring GL, Weed SD (1973) Multiple-stage procedures for drug screening. Biometrics 29:293–300

  • Shaibu A-B, Muttlak HA (2004) Estimating the parameters of the normal, exponential and gamma distributions using median and extreme ranked set sampling. Statistica LXIV(1):76–98

  • Shen WH, Yuan W (1995) A test for a normal mean based on a modified partial ranked set sample. Pak J Stat 11(3):228–223

  • Sinha Bimal K, Sinha Bikas K (1996) On some aspects of ranked set sampling for estimation of normal and exponential parameters. Stat Decis 14:223–240

  • Stokes SL (1980) Inferences on the correlation coefficient in bivariate normal populations from ranked set samples. J Am Stat Assoc 75:989–995

  • Stokes SL (1977) Ranked set sampling with concomitant variables. Commun Stat Theory Methods A6:1207–1211

  • Takahasi K, Wakimoto K (1968) On the unbiased estimates of the population mean based on the sample stratified by means of ordering. Ann Inst Stat Math 20:1–31

  • Terpstra JT (2004) On estimating a population proportion via ranked set sampling. Biom J 46(2):264–272

  • Terpstra JT, Liudahl LA (2004) Concomitant-based rank set sampling proportion estimates. Stat Med 23:2061–2070

  • Terpstra JT, Miller ZA (2006) Exact inference for a population proportion based on a ranked set sample. Commun Stat Theory Methods 35(1):19–26

  • Terpstra JT, Nelson EJ (2005) Optimal rank set sampling estimates for a population proportion. J Stat Plan Inference 127:309–321

Acknowledgments

This work was supported by King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia, under the Fast Track project # IN100024. The first author was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information

Corresponding author

Correspondence to H. A. Muttlak.

Appendix: Proof of the result that the test statistics are approximately Brownian motions computed at the interim analysis

  1.

    First we prove result (6) for \(Z_{l}^{1}\). Notice that the ranking mechanism used in this manuscript is consistent, i.e., \(\frac{1}{k}\sum _{r=1}^{k} {p_{i[r]}} =p_i\), where \(p_i\) is the true proportion of the \(i\)th population and \(p_{i[r]}\) is the success proportion of the \(r\)th order statistic of the \(i\)th population (see Chen et al. 2005). Therefore, since by the Weak Law of Large Numbers (WLLN)

    $$\begin{aligned} \frac{1}{m_{il} }\sum _{j=1}^{m_{il}} x_{i[r]j} {\mathop {\longrightarrow }\limits ^{p}}p_{i[r]}\quad \text{as } m_{il} \rightarrow \infty , \end{aligned}$$

    we have by Slutsky’s theorem (Lehmann 1999) that

    $$\begin{aligned} \hat{p}_{il}^{rss} =\frac{1}{k}\sum _{r=1}^{k} \left(\frac{1}{m_{il}}\sum _{j=1}^{m_{il}}x_{i[r]j}\right) {\mathop {\longrightarrow }\limits ^{p}}\frac{1}{k} \sum _{r=1}^k {p_{i[r]}}=p_i, \end{aligned}$$

    hence, \(\hat{p}_{il}^{rss}{\mathop {\longrightarrow }\limits ^{p}}p_i\). Using similar arguments,

    $$\begin{aligned} \hat{\tau }_{0l}^{2}=\frac{1}{k^{2}}\sum _{r=1}^{k}\hat{p}_{0[r]l} (1-\hat{p}_{0[r]l}){\mathop {\longrightarrow }\limits ^{p}}\tau _0^2, \end{aligned}$$

    as

    $$\begin{aligned} m_{1L}, m_{2L} \rightarrow \infty \quad \text{and} \quad {m_{1L}}/{m_{2L}}\rightarrow \lambda _L \in (0,1),\quad {m_{il}}/{m_{iL}} \rightarrow \lambda _{1l} \in (0,1). \end{aligned}$$

    Now we can write:

    $$\begin{aligned} Z_{l}^{1}=\frac{\hat{p}_{1l}^{rss}-\hat{p}_{2l}^{rss}}{\sqrt{\hat{\tau }_{0l}^{2}\left(\frac{1}{m_{1l}} +\frac{1}{m_{2l}}\right)}}=\frac{\frac{1}{m_{1l}}\sum _{j=1}^{m_{1l}}{y_{1j}} -\frac{1}{m_{2l}}\sum _{j=1}^{m_{2l}} {y_{2j}}}{\sqrt{\tau _{0}^{2}\left( \frac{1}{m_{1l}}+\frac{1}{m_{2l}}\right)}}+o_p (1), \end{aligned}$$

    where \(y_{ij} =\frac{1}{k}\sum _{r=1}^k \left(x_{i[r]j} -p_{0[r]}\right)\) and \(p_{0[r]}\) is the proportion of success for the \(r\)th order statistic under the null hypothesis. Therefore it is enough to show the result for the vector \({\mathbf S}^{L}=\left(S_{1}^{L}, S_2^L ,\ldots , S_L^L\right)\) with

    $$\begin{aligned} S_{l}^{L} =\frac{\frac{1}{m_{1l}}\sum _{j=1}^{m_{1l}}{y_{1j}}-\frac{1}{m_{2l}}\sum _{j=1}^{m_{2l}} {y_{2j}}}{\sqrt{\tau _0^2 \left( {\frac{1}{m_{1l} }+\frac{1}{m_{2l} }} \right)}}. \end{aligned}$$

    We can represent this vector as

    $$\begin{aligned} {\mathbf S}^{L}={\mathbf C}_1^L {\mathbf V}_1^L -{\mathbf C}_2^L {\mathbf V}_2^L \end{aligned}$$

    where \({\mathbf V}_i^L\), for \(i=1,2\), is a vector of \(L\) mutually independent random variables of the form

    $$\begin{aligned} V_{il}^L =\left( {\frac{1}{\left( {m_{il}-m_{i\left( {l-1} \right)}}\right)\tau _0^2 }}\right)^{1/2}\sum _{j=m_{i\left( {l-1} \right)}+1 }^{m_{il}} {y_{ij}}, \quad m_{i0}=0. \end{aligned}$$

    \({\mathbf C}_{i}^{L}\) are \(L\times L\) matrices with elements

    $$\begin{aligned} C_{i(l,l^{\prime })}^L =\frac{\left( {m_{il^{\prime }} -m_{i(l^{\prime }-1)} } \right)^{1/2}}{m_{il} \left( {1/m_{1l} +1/m_{2l} } \right)^{1/2}} \end{aligned}$$

    for \(l^{\prime }\le l\) and zero otherwise, and \(y_{ij}\) is as defined above. Now, by the information balance assumption, and hence the assumption that \(\frac{m_{il} }{m_{iL} }\rightarrow t_l\), we have

    $$\begin{aligned} C_{i(l,l^{\prime })}^L =\frac{(m_{il^{\prime }} -m_{i(l^{\prime }-1)})^{1/2}}{m_{il} \left( {1/m_{1l} +1/m_{2l} } \right)^{1/2} }\rightarrow \frac{1}{\sqrt{2}}\left( {\frac{t_{l^{\prime }} -t_{(l^{\prime }-1)}}{t_l }} \right)^{1/2} =\frac{1}{\sqrt{2}}C^{L}_{(l,l^{\prime })}. \end{aligned}$$

    The vectors \({\mathbf V}_{1}^L\) and \({\mathbf V}_{2}^L\) converge in distribution to independent multivariate standard normal vectors, \({\mathbf Z}_1\) and \({\mathbf Z}_2\), and hence, by the multivariate Slutsky theorem (Lehmann 1999), we have

    $$\begin{aligned} {\mathbf S}^{L}={\mathbf C}_1^L {\mathbf V}_1^L -{\mathbf C}_2^L {\mathbf V}_2^L {\mathop {\longrightarrow }\limits ^{D}}\frac{1}{\sqrt{2}}{\mathbf C}^{L}{\mathbf Z}_1 -\frac{1}{\sqrt{2}}{\mathbf C}^{L}{\mathbf Z}_2. \end{aligned}$$

    Therefore, the asymptotic distribution of \({\mathbf S}^{L}\) is that of a standardized Brownian motion evaluated at the interim analyses, as desired.
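    The last step can be made explicit by working out the covariance of the limit. The following is a sketch under the assumptions that \(t_0=0\) and that \({\mathbf Z}_1\), \({\mathbf Z}_2\) are independent standard normal \(L\)-vectors; the symbols \({\mathbf Z}\), \(\Sigma ^{L}\) and \(B(\cdot )\) are introduced here only for illustration and do not appear in the original argument.

    ```latex
    \begin{aligned}
    \frac{1}{\sqrt{2}}{\mathbf C}^{L}{\mathbf Z}_1-\frac{1}{\sqrt{2}}{\mathbf C}^{L}{\mathbf Z}_2
      &\overset{D}{=} {\mathbf C}^{L}{\mathbf Z}, \qquad {\mathbf Z}\sim N({\mathbf 0},{\mathbf I}_L),\\
    \Sigma ^{L}_{(l,l^{\prime })}
      &=\sum _{u=1}^{\min (l,l^{\prime })} C^{L}_{(l,u)}\,C^{L}_{(l^{\prime },u)}
       =\sum _{u=1}^{\min (l,l^{\prime })}\frac{t_u-t_{u-1}}{\sqrt{t_l\,t_{l^{\prime }}}}
       =\frac{t_{\min (l,l^{\prime })}}{\sqrt{t_l\,t_{l^{\prime }}}}
       =\sqrt{\frac{t_l}{t_{l^{\prime }}}}\quad (l\le l^{\prime }).
    \end{aligned}
    ```

    This matches the covariance of \(B(t_l)/\sqrt{t_l}\) and \(B(t_{l^{\prime }})/\sqrt{t_{l^{\prime }}}\) for a standard Brownian motion \(B\), since \(\mathrm{Cov}(B(s),B(t))=\min (s,t)\); the diagonal entries equal one.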

  2.

    To prove result (6) for \(Z_l^2\) we notice that by large sample theory (Lehmann 1999, p. 470), asymptotically,

    $$\begin{aligned} \sqrt{m_{il} }\left(\hat{p}_{il}^{rssml}-p_i\right)\cong \frac{\sum _{j=1}^{m_{il}} {l^{{\prime }}_{ij} (p_i )} }{\sqrt{m_{il}}\, I_i (p_i )} \end{aligned}$$

    in the sense of having the same asymptotic distribution, where \(l^{\prime }_{ij} (p_i)\) is the first-order derivative of \(l_{ij} (p_i )\) in (3) with respect to \(p_i \). Also, both under the null hypothesis and for each of the two treatment populations, the estimators

    $$\begin{aligned} \hat{\gamma }_{il} (\hat{p}_{il}^{rssml})=I_i (\hat{p}_{il}^{rssml} ) \end{aligned}$$

    and

    $$\begin{aligned} \hat{\gamma }_{0l} (\hat{p}_{0l}^{rssml})=I_i (\hat{p}_{0l}^{rssml}) \end{aligned}$$

    are consistent for the corresponding theoretical quantities. As a consequence, the proof of result (6) for this case is identical to the proof above for \(Z_{l}^{1}\), obtained by simply setting

    $$\begin{aligned} y_{ij} =l^{{\prime }}_{ij} (p_i )/I_i (p_i ). \end{aligned}$$
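The two ingredients of part 1, consistency of the ranking mechanism and the Brownian covariance structure of the interim statistics, can be checked numerically. The sketch below is illustrative only and is not the authors' code. It assumes perfect ranking of \(k\) independent Bernoulli(\(p\)) outcomes, in which case the \(r\)th order statistic is a success exactly when at least \(k-r+1\) of the \(k\) outcomes are successes; it assumes equal numbers of cycles in the two arms; and it uses the known \(\tau _0^2\), so it simulates \(S_l^L\) rather than \(Z_l^1\). All function names are hypothetical.

```python
import math
import random


def order_stat_probs(p, k):
    """Success probabilities p_[r], r = 1..k, of the r-th order statistic in a
    perfectly ranked set of k independent Bernoulli(p) outcomes (assumption)."""
    pmf = [math.comb(k, s) * p**s * (1 - p) ** (k - s) for s in range(k + 1)]
    # The r-th smallest of k binary values is 1 iff at least k - r + 1 are 1.
    return [sum(pmf[k - r + 1:]) for r in range(1, k + 1)]


def tau_squared(p, k):
    """Variance of one cycle mean y = (1/k) * sum_r x_[r]."""
    return sum(q * (1 - q) for q in order_stat_probs(p, k)) / k**2


def simulate_s(p, k, looks, rng):
    """Simulate S_l at the cumulative cycle counts in `looks` under the null
    hypothesis (both arms share p), with equal cycle counts per arm."""
    probs = order_stat_probs(p, k)
    tau2 = tau_squared(p, k)
    diff, done, out = 0.0, 0, []
    for m in looks:
        for _ in range(m - done):
            y1 = sum(rng.random() < q for q in probs) / k  # arm 1 cycle mean
            y2 = sum(rng.random() < q for q in probs) / k  # arm 2 cycle mean
            diff += y1 - y2  # the centring terms p_0[r] cancel
        done = m
        out.append((diff / m) / math.sqrt(tau2 * (2.0 / m)))
    return out


def correlation(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    p, k, looks = 0.3, 3, [20, 40]
    print("mean of p_[r]:", sum(order_stat_probs(p, k)) / k, "vs p =", p)
    print("tau^2:", tau_squared(p, k), "vs SRS p(1-p)/k:", p * (1 - p) / k)
    rng = random.Random(2013)
    sims = [simulate_s(p, k, looks, rng) for _ in range(2000)]
    rho = correlation([s[0] for s in sims], [s[1] for s in sims])
    print("corr(S_1, S_2):", rho, "vs sqrt(t_1/t_2):", math.sqrt(looks[0] / looks[1]))
```

With two equally spaced looks, the empirical correlation between the first and second statistic should be close to \(\sqrt{1/2}\), in line with the limit derived in part 1. The comparison of \(\tau _0^2\) with \(p(1-p)/k\) shows the variance reduction of a ranked cycle over \(k\) simple random observations.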

Cite this article

Hussein, A., Muttlak, H.A. & Saleh, M. Group sequential comparison of two binomial proportions under ranked set sampling design. Comput Stat 28, 1169–1194 (2013). https://doi.org/10.1007/s00180-012-0347-8
