Abstract
Ranked set sampling (RSS) is a technique for incorporating auxiliary (concomitant) information into estimation and testing procedures at the design stage. In this paper, we propose group sequential testing procedures for comparing two treatments with binary outcomes under an RSS scheme with perfect ranking. We compare the power, average sample size, and type I error of the proposed tests with those of group sequential tests based on simple random sampling. We illustrate the usefulness of the methodology using data from a clinical trial on leukemia.
References
Abu-Dayyeh WA, Muttlak HA (1996) Using ranked set sampling for testing hypotheses on the scale parameter for exponential and uniform distributions. Pak J Stat 12(2):131–138
Arnold BC, Balakrishnan N, Nagaraja HN (1992) A first course in order statistics. Wiley, New York
Chen H, Stasny EA, Wolfe DA (2005) Ranked set sampling for efficient estimation of a population proportion. Stat Med 24:3319–3329
Chen H, Stasny EA, Wolfe DA (2007) Improved procedures for estimation of disease prevalence using ranked set sampling. Biomet J 49(4):530–538
Chen Z, Liu J, Shen L, Wang Y-G (2008) General ranked set sampling for efficient treatment comparison. Stat Sinica 18:91–94
Cui Y, Fu YJ, Hussein A (2009) Group sequential testing of homogeneity in genetic linkage analysis. Comput Stat Data Anal 53(10):3630–3639
Dell TR, Clutter JL (1972) Ranked set sampling theory with order statistics background. Biometrics 28:545–553
Emerson SS, Banks PL (1994) Interpretation of a leukemia trial stopped early. In: Lange N, Ryan L, Billard L, Brillinger D, Conquest L, Greenhouse J (eds) Case studies in biometry. Wiley-Interscience, New York
Fleming TR (1982) One-sample multiple testing procedure for Phase II clinical trials. Biometrics 38:143–151
Hussein A (2005) Sequential comparison of two treatments using weighted Wald-type statistics. Commun Stat Theory Methods 34(7):1631–1641
Hussein A, Carriere KC (2005) On group sequential procedures under variance heterogeneity. Stat Methods Med Res 14(2):121–128
Jennison C, Turnbull BW (2000) Group sequential methods with application to clinical trials. Chapman & Hall/CRC, Florida
Lam K, Sinha BK, Wu Z (1994) Estimation of parameters in two-parameter exponential distribution using ranked set sampling. Ann Inst Stat Math 46(4):723–736
Lan KK, DeMets DL (1983) Discrete sequential boundaries for clinical trials. Biometrika 70(3):659–663
Lehmann E (1999) Elements of large sample theory. Springer, New York
McIntyre GA (1952) A method for unbiased selective sampling, using ranked sets. Aust J Agric Res 3:385–390
Muttlak HA (1997) Median ranked set sampling. J Appl Stat Sci 6(4):245–255
Muttlak HA, Abu-Dayyeh WA (1997) Testing hypotheses for the normal distribution using ranked set sampling. J Inform Optim Sci 19(1):1–11
O’Brien PC, Fleming TR (1979) A multiple testing procedure for clinical trials. Biometrics 35:549–556
Pocock SJ (1977) Group sequential methods in the design and analysis of clinical trials. Biometrika 64:191–199
Rosenberger WF, Lachin JM (2002) Randomization in clinical trials: theory and practice. Wiley, New York
Roman P (2009) Groupseq: performing computations related to group sequential designs. R package version 1.3.1
Schultz JR, Nichol FR, Elfring GL, Weed SD (1973) Multiple-stage procedures for drug screening. Biometrics 29:293–300
Shaibu A-B, Muttlak HA (2004) Estimating the parameters of the normal, exponential and gamma distributions using median and extreme ranked set sampling. Statistica LXIV(1):76–98
Shen WH, Yuan W (1995) A test for a normal mean based on a modified partial ranked set sample. Pak J Stat 11(3):228–223
Sinha Bimal K, Sinha Bikas K (1996) On some aspects of ranked set sampling for estimation of normal and exponential parameters. Stat Decis 14:223–240
Stokes SL (1980) Inferences on the correlation coefficient in bivariate normal populations from ranked set samples. J Am Stat Assoc 75:989–995
Stokes SL (1977) Ranked set sampling with concomitant variables. Commun Stat Theory Methods A6:1207–1211
Takahasi K, Wakimoto K (1968) On the unbiased estimates of the population mean based on the sample stratified by means of ordering. Ann Inst Stat Math 20:1–31
Terpstra JT (2004) On estimating a population proportion via ranked set sampling. Biomet J 46(2):264–272
Terpstra JT, Liudahl LA (2004) Concomitant-based rank set sampling proportion estimates. Stat Med 23:2061–2070
Terpstra JT, Miller ZA (2006) Exact inference for a population proportion based on a ranked set sample. Commun Stat Theory Methods 35(1):19–26
Terpstra JT, Nelson EJ (2005) Optimal rank set sampling estimates for a population proportion. J Stat Plan Inference 127:309–321
Acknowledgments
This work was supported by King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia, under the Fast Track project # IN100024. The first author was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Appendix: Proof of the result that the test statistics are approximately Brownian motions computed at the interim analysis
1. First we prove result (6) for \(Z_{l}^{1}\). Notice that the ranking mechanism used in this manuscript is consistent, i.e., that \(\frac{1}{k}\sum _{r=1}^{k} {p_{i[r]}} =p_i\), where \(p_i\) is the true proportion of the ith population and \(p_{i[r]}\) is the success proportion for the \(r\)th order statistic of the ith population (see Chen et al. 2005). Therefore, since by the Weak Law of Large Numbers (WLLN)
$$\begin{aligned} \frac{1}{m_{il}}\sum _{j=1}^{m_{il}} x_{i[r]j} {\mathop {\longrightarrow }\limits ^{p}}p_{i[r]}\quad \text{as } m_{il} \rightarrow \infty , \end{aligned}$$
we have by Slutsky’s theorem (Lehmann 1999) that
$$\begin{aligned} \hat{p}_{il}^{rss} =\frac{1}{k}\sum _{r=1}^{k} \left(\frac{1}{m_{il}}\sum _{j=1}^{m_{il}}x_{i[r]j}\right) {\mathop {\longrightarrow }\limits ^{p}}\frac{1}{k} \sum _{r=1}^k {p_{i[r]}}=p_i. \end{aligned}$$
Using similar arguments,
$$\begin{aligned} \hat{\tau }_{0l}^{2}=\frac{1}{k^{2}}\sum _{r=1}^{k}\hat{p}_{0[r]l} (1-\hat{p}_{0[r]l}){\mathop {\longrightarrow }\limits ^{p}}\tau _0^2, \end{aligned}$$
as
$$\begin{aligned} m_{1L} ,m_{2L} \rightarrow \infty \quad \text{and} \quad {m_{1L}}/{m_{2L}}\rightarrow \lambda _L \in (0,1),\quad {m_{il}}/{m_{iL}} \rightarrow \lambda _{1l} \in (0,1). \end{aligned}$$
Now we can write:
$$\begin{aligned} Z_{l}^{1}=\frac{\hat{p}_{1l}^{rss}-\hat{p}_{2l}^{rss}}{\sqrt{\hat{\tau }_{0l}^{2}\left(\frac{1}{m_{1l}} +\frac{1}{m_{2l}}\right)}}=\frac{\frac{1}{m_{1l}}\sum _{j=1}^{m_{1l}}{y_{1j}} -\frac{1}{m_{2l}}\sum _{j=1}^{m_{2l}} {y_{2j}}}{\sqrt{\tau _{0}^{2}\left( \frac{1}{m_{1l}}+\frac{1}{m_{2l}}\right)}}+o_p (1), \end{aligned}$$
where \(y_{ij} =\frac{1}{k}\sum _{r=1}^k \left(x_{i[r]j} -p_{0[r]}\right)\) and \(p_{0[r]}\) is the proportion of success for the \(r\)th order statistic under the null hypothesis. Therefore it is enough to show the result for the vector \({\mathbf S}^{L}=\left(S_{1}^{L}, S_2^L ,\ldots , S_L^L\right)\) with
$$\begin{aligned} S_{l}^{L} =\frac{\frac{1}{m_{1l}}\sum _{j=1}^{m_{1l}}{y_{1j}}-\frac{1}{m_{2l}}\sum _{j=1}^{m_{2l}} {y_{2j}}}{\sqrt{\tau _0^2 \left( {\frac{1}{m_{1l} }+\frac{1}{m_{2l} }} \right)}}. \end{aligned}$$
We can represent this vector as
$$\begin{aligned} {\mathbf S}^{L}={\mathbf C}_1^L {\mathbf V}_1^L -{\mathbf C}_2^L {\mathbf V}_2^L \end{aligned}$$
where \({\mathbf V}_i^L\), for \(i=1,2\), is a vector of \(L\) mutually independent random variables of the form
$$\begin{aligned} V_{il}^L =\left( {\frac{1}{\left( {m_{il}-m_{i\left( {l-1} \right)}}\right)\tau _0^2 }}\right)^{1/2}\sum _{j=m_{i\left( {l-1} \right)}+1 }^{m_{il}} {y_{ij}}, \end{aligned}$$
with \(m_{i0}=0\), and \({\mathbf C}_{i}^{L}\) are \(L\times L\) matrices with elements
$$\begin{aligned} C_{i(l,l^{\prime })}^L =\frac{\left( {m_{il^{\prime }} -m_{i(l^{\prime }-1)} } \right)^{1/2}}{m_{il} \left( {1/m_{1l} +1/m_{2l} } \right)^{1/2}} \end{aligned}$$
for \(l^{\prime }\le l\) and zero otherwise, and \(y_{ij}\) is as defined above. Now, by the information balance assumption, and hence the assumption that \(\frac{m_{il} }{m_{iL} }\rightarrow t_l\), we have
$$\begin{aligned} C_{i(l,l^{\prime })}^L =\frac{(m_{il^{\prime }} -m_{i(l^{\prime }-1)})^{1/2}}{m_{il} \left( {1/m_{1l} +1/m_{2l} } \right)^{1/2} }\rightarrow \frac{1}{\sqrt{2}}\left( {\frac{t_{l^{\prime }} -t_{(l^{\prime }-1)}}{t_l }} \right)^{1/2} =\frac{1}{\sqrt{2}}C^{L}_{(l,l^{\prime })}. \end{aligned}$$
The vectors \({\mathbf V}_{i}^L\) converge in distribution to independent multivariate standard normal vectors, \({\mathbf Z}_1\) and \({\mathbf Z}_2\), and hence, by the multivariate Slutsky theorem (Lehmann 1999), we have
$$\begin{aligned} {\mathbf S}^{L}={\mathbf C}_1^L {\mathbf V}_1^L -{\mathbf C}_2^L {\mathbf V}_2^L {\mathop {\longrightarrow }\limits ^{D}}\frac{1}{\sqrt{2}}{\mathbf C}^{L}{\mathbf Z}_1 -\frac{1}{\sqrt{2}}{\mathbf C}^{L}{\mathbf Z}_2. \end{aligned}$$
Therefore, the asymptotic distribution of \({\mathbf S}^{L}\) is that of a standardized Brownian motion evaluated at the interim analyses, as desired.
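The limiting structure in this part of the proof can be checked numerically. The following is a minimal sketch and not part of the original argument. It assumes hypothetical information fractions \(t=(0.25, 0.5, 0.75, 1)\), builds the limiting matrix \({\mathbf C}^{L}\) with entries \(\left((t_{l^{\prime}}-t_{l^{\prime}-1})/t_l\right)^{1/2}\), and confirms that \({\mathbf C}^{L}({\mathbf C}^{L})^{T}\) has entries \(\sqrt{t_l/t_{l^{\prime}}}\) for \(l\le l^{\prime}\), which is the covariance of \(B(t_l)/\sqrt{t_l}\) for a standard Brownian motion \(B\). It also checks the consistency identity \(\frac{1}{k}\sum_r p_{[r]}=p\) for one assumed perfect-ranking mechanism, in which the \(r\)th order statistic of \(k\) Bernoulli(\(p\)) outcomes is a success exactly when at least \(k-r+1\) of the \(k\) outcomes are successes. All function names and numerical values are illustrative.

```python
from math import comb, sqrt


def order_stat_success_probs(p, k):
    """Success probabilities p_[r], r = 1..k, of the order statistics of
    k independent Bernoulli(p) outcomes (assumed perfect-ranking mechanism).
    The r-th smallest outcome equals 1 iff at least k - r + 1 outcomes are 1."""
    def upper_tail(s):
        # P(Binomial(k, p) >= s)
        return sum(comb(k, x) * p**x * (1 - p) ** (k - x) for x in range(s, k + 1))

    return [upper_tail(k - r + 1) for r in range(1, k + 1)]


def limiting_C(t):
    """Lower-triangular limit matrix with entries
    sqrt((t_{l'} - t_{l'-1}) / t_l) for l' <= l, taking t_0 = 0."""
    L = len(t)
    prev = [0.0] + list(t[:-1])
    return [
        [sqrt((t[c] - prev[c]) / t[r]) if c <= r else 0.0 for c in range(L)]
        for r in range(L)
    ]


def gram(C):
    """Return C C^T, the covariance of C Z for a standard normal vector Z."""
    L = len(C)
    return [
        [sum(C[a][j] * C[b][j] for j in range(L)) for b in range(L)]
        for a in range(L)
    ]


# Hypothetical inputs chosen only for illustration.
p, k = 0.3, 4
probs = order_stat_success_probs(p, k)
mean_prob = sum(probs) / k  # should recover p

t = [0.25, 0.5, 0.75, 1.0]
cov = gram(limiting_C(t))  # should equal sqrt(t_min / t_max) entrywise
```

The combination \(\frac{1}{\sqrt{2}}({\mathbf Z}_1-{\mathbf Z}_2)\) is itself a standard normal vector, so the covariance of the limit of \({\mathbf S}^{L}\) is exactly the matrix `cov` computed here.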
2. To prove result (6) for \(Z_l^2\), we note that, by large sample theory (Lehmann 1999, p. 470), asymptotically,
$$\begin{aligned} \sqrt{m_{il} }\left(\hat{p}_{il}^{rssml}-p_i\right)\cong \frac{\sum _{j=1}^{m_{il}} {l^{{\prime }}_{ij} (p_i )} }{\sqrt{m_{il}}\, I_i (p_i )} \end{aligned}$$
in the sense of having the same asymptotic distribution, where \(l^{\prime }_{ij} (p_i)\) is the first-order derivative of \(l_{ij} (p_i )\) in (3) with respect to \(p_i\). Also, both under the null hypothesis and for each of the two treatment populations, the estimators
$$\begin{aligned} \hat{\gamma }_{il} (\hat{p}_{il}^{rssml})=I_i (\hat{p}_{il}^{rssml} ) \end{aligned}$$and
$$\begin{aligned} \hat{\gamma }_{0l} (\hat{p}_{0l}^{rssml})=I_i (\hat{p}_{0l}^{rssml}) \end{aligned}$$
are consistent for their theoretical counterparts. As a consequence, the proof of result (6) in this case is identical to the proof above for \(Z_{l}^{1}\), simply setting
$$\begin{aligned} y_{ij} =l^{{\prime }}_{ij} (p_i )/I_i (p_i ). \end{aligned}$$
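To see what this substitution looks like in the simplest setting, consider, purely as an illustration, a single Bernoulli(\(p_i\)) observation \(x_{ij}\) under simple random sampling. This is an assumed special case and not the RSS likelihood in (3). Its log-likelihood, score, and Fisher information are
$$\begin{aligned} l_{ij}(p_i)=x_{ij}\log p_i+(1-x_{ij})\log (1-p_i),\quad l^{\prime }_{ij}(p_i)=\frac{x_{ij}-p_i}{p_i(1-p_i)},\quad I_i(p_i)=\frac{1}{p_i(1-p_i)}, \end{aligned}$$
so that
$$\begin{aligned} y_{ij}=\frac{l^{\prime }_{ij}(p_i)}{I_i(p_i)}=x_{ij}-p_i, \end{aligned}$$
a centered observation with mean zero and variance \(p_i(1-p_i)=1/I_i(p_i)\). More generally, under the usual regularity conditions, the score has mean zero and variance \(I_i(p_i)\), so \(y_{ij}\) has mean zero and variance \(1/I_i(p_i)\).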
Cite this article
Hussein, A., Muttlak, H.A. & Saleh, M. Group sequential comparison of two binomial proportions under ranked set sampling design. Comput Stat 28, 1169–1194 (2013). https://doi.org/10.1007/s00180-012-0347-8
DOI: https://doi.org/10.1007/s00180-012-0347-8