
Sample size calculation for clustered survival data under subunit randomization


Each cluster consists of multiple subunits from which outcome data are collected. In a subunit randomization trial, subunits are randomized into different intervention arms. Observations from subunits within each cluster tend to be positively correlated because of shared frailties, so the outcome data from a subunit randomization trial are dependent between arms as well as within each arm. For subunit randomization trials with a survival endpoint, few sample size methods have been proposed that show a clear relationship between the joint survival distribution of subunits and the sample size, especially when the number of subunits per cluster is variable. In this paper, we propose a closed-form sample size formula for weighted rank tests that compare the marginal survival distributions between intervention arms under subunit randomization, possibly with a variable number of subunits per cluster. We conduct extensive simulations to evaluate the performance of our formula under various design settings, and demonstrate our sample size calculation method with some real clinical trials.



  • Aalen O (1978) Nonparametric inference for a family of counting processes. Ann Stat 6:701–726


  • Batchelor J, Hackett M (1970) HL-A matching in treatment of burned patients with skin allografts. The Lancet 296(7673):581–583


  • Fleming TR, Harrington DP (2011) Counting processes and survival analysis, vol 169. Wiley, London


  • Freedman LS (1982) Tables of the number of patients required in clinical trials using the logrank test. Stat Med 1(2):121–129


  • Gangnon RE, Kosorok MR (2004) Sample-size formula for clustered survival data using weighted log-rank statistics. Biometrika 91(2):263–275


  • Gehan EA (1965) A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika 52(1–2):203–224


  • George SL, Desu M (1974) Planning the size and duration of a clinical trial studying the time to some critical event. J Chronic Dis 27(1–2):15–24


  • Gumbel EJ (1960) Bivariate exponential distributions. J Am Stat Assoc 55(292):698–707


  • Harper CC, Rocca CH, Thompson KM, Morfesis J, Goodman S, Darney PD, Westhoff CL, Speidel JJ (2015) Reductions in pregnancy rates in the USA with long-acting reversible contraception: a cluster randomised trial. The Lancet 386(9993):562–568


  • Harrington DP, Fleming TR (1982) A class of rank test procedures for censored survival data. Biometrika 69(3):553–566


  • Jeong JH, Jung SH (2006) Rank tests for clustered survival data when dependent subunits are randomized. Stat Med 25(3):361–373


  • Jung SH (2008) Sample size calculation for the weighted rank statistics with paired survival data. Stat Med 27(17):3350–3365


  • Kalbfleisch JD, Prentice RL (2011) The statistical analysis of failure time data, vol 360. Wiley, London


  • Lakatos E (1988) Sample sizes based on the log-rank statistic in complex clinical trials. Biometrics 44:229–241


  • Lakatos E, Lan KG (1992) A comparison of sample size methods for the logrank statistic. Stat Med 11(2):179–191


  • Lee EW, Wei L, Amato DA, Leurgans S (1992) Cox-type regression analysis for large numbers of small groups of correlated failure time observations. In: Klein JP, Goel PK (eds) Survival analysis: state of the art. Springer, pp 237–247

  • Li J, Jung SH (2020) Sample size calculation for cluster randomization trials with a time-to-event endpoint. Stat Med 39(25):3608–3623


  • Lin D, Ying Z (1993) A simple nonparametric estimator of the bivariate survival function under univariate censoring. Biometrika 80(3):573–581


  • Martens MJ, Logan BR (2021) A unified approach to sample size and power determination for testing parameters in generalized linear and time-to-event regression models. Stat Med 40(5):1121–1132


  • McNeil AJ (2008) Sampling nested archimedean copulas. J Stat Comput Simul 78(6):567–581


  • Nelson W (1969) Hazard plotting for incomplete failure data. J Qual Technol 1(1):27–52


  • Nolan J (2003) Stable distributions: models for heavy-tailed data. Birkhäuser, New York


  • Pals SL, Murray DM, Alfano CM, Shadish WR, Hannan PJ, Baker WL (2008) Individually randomized group treatment trials: a critical appraisal of frequently used design and analytic approaches. Am J Public Health 98(8):1418–1424


  • Schoenfeld DA (1983) Sample-size formula for the proportional-hazards regression model. Biometrics 39:499–503


  • Tarone RE, Ware J (1977) On distribution-free tests for equality of survival distributions. Biometrika 64(1):156–160




Corresponding author

Correspondence to Sin-Ho Jung.



Appendix A: Limiting distribution of the clustered rank statistic under \(H_1\)

For subunit j in cluster i that is randomized to arm k, let \(M_{ikj}(t)=N_{ikj}(t)-\int _0^t Y_{ikj}(s)d\Lambda _k(s)\) and \(M_{ik}(t)=\sum _{j=1}^{m_{ik}}M_{ikj}(t)\). By the definition of W,

$$\begin{aligned} W=&\sqrt{n}\left\{ \sum _{i=1}^{n} \int _0^\infty \frac{H(t)}{Y_1(t)}dM_{i1}(t) -\sum _{i=1}^{n}\int _0^\infty \frac{H(t)}{Y_2(t)}dM_{i2}(t)\right\} \\&\quad + \sqrt{n}\int _0^\infty H(t)\{d\Lambda _1(t)-d\Lambda _2(t)\} \end{aligned}$$

Let \(\tau =\max \{t:S_1(t)S_2(t)G(t)>0\}\). The upper limit of the support of the survival distributions usually exceeds the study period, which is the upper limit of the support of the censoring distribution, so \(\tau \) denotes the study period. For the log-rank statistic, as \(n\rightarrow \infty \), \(n^{-1}Y_k(t)\) and H(t) converge uniformly to \(y_k(t)={\bar{m}} p_kS_k(t)G(t)\) and

$$\begin{aligned} h(t)=\frac{{\bar{m}} p_1p_2S_1(t)S_2(t)G(t)}{p_1S_1(t)+p_2S_2(t)} \end{aligned}$$

in \([0,\tau ]\), respectively, so that we have

$$\begin{aligned} W=\frac{1}{\sqrt{n}}\sum _{i=1}^{n}\epsilon _{i} +\sqrt{n}\int _0^\infty h(t)\{d\Lambda _1(t)-d\Lambda _2(t)\}+o_p(1), \end{aligned}$$

where \(\epsilon _{i}=\epsilon _{i1} - \epsilon _{i2}\), \(\epsilon _{ik} = \sum _{j=1}^{m_{ik}}\epsilon _{ikj}\), and \(\epsilon _{ikj}=\int _0^\infty y_k(t)^{-1}h(t)dM_{ikj}(t)\).

Since \(\{\epsilon _{i}, i=1,...,n\}\) are independent random variables with mean 0, by the central limit theorem, W is approximately normal with mean \(\sqrt{n}\omega \), where \(\omega =\int _0^\infty h(t) \{d\Lambda _1(t)-d\Lambda _2(t)\}\), and variance \(\sigma ^2=\sigma _1+\sigma _2-2\sigma _{12}\), where

$$\begin{aligned} \sigma _k&={\bar{m}}p_k(\sigma _{k1}^2-c_{k}) + \bar{\bar{m}}p_k^2c_{k}, \\ \sigma _{k1}^2&=\text{ var }(\epsilon _{ikj})=\int _0^\infty \frac{h^2(t)}{y_k(t)}d\Lambda _k(t), \end{aligned}$$

and

$$\begin{aligned} c_k=\text{ cov }(\epsilon _{ikj},\epsilon _{ikj'}) =\int _0^\infty \int _0^\infty \frac{h(t_1)h(t_2)}{y_k(t_1)y_k(t_2)}E\{dM_{ikj}(t_1)dM_{ikj'}(t_2)\} \end{aligned}$$

We can derive \(c_k\) in a rather direct way. For \(j\ne j'\), by definition,

$$\begin{aligned}&dM_{ikj}(t_1)dM_{ikj'}(t_2) \\&\quad = dN_{ikj}(t_1)dN_{ikj'}(t_2) - Y_{ikj}(t_1)\lambda _k(t_1)dt_1dN_{ikj'}(t_2) \\&\qquad - Y_{ikj'}(t_2)\lambda _k(t_2)dt_2dN_{ikj}(t_1) + Y_{ikj}(t_1)\lambda _k(t_1)Y_{ikj'}(t_2)\lambda _k(t_2)dt_1dt_2 \end{aligned}$$

By similar arguments to those in the lemma of Jung (2008), we have

$$\begin{aligned}&E\{dN_{ikj}(t_1)dN_{ikj'}(t_2)\} \\&\quad = P(t_1\le T_{ikj}<t_1+dt_1, t_2\le T_{ikj'}<t_2+dt_2, \delta _{ikj}=1, \delta _{ikj'}=1) \\&\quad = y_k(t_1,t_2)\times \frac{P(t_1\le T_{ikj}<t_1+dt_1, t_2\le T_{ikj'}<t_2+dt_2, \delta _{ikj}=1, \delta _{ikj'}=1)}{y_k(t_1,t_2)}\\&\quad = y_k(t_1,t_2)\lambda _k(t_1,t_2)dt_1dt_2 \end{aligned}$$

where \(y_k(t_1,t_2) = E(Y_{ikj}Y_{ikj'})=G(t_1,t_2)S_k(t_1,t_2)\). We can also derive

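$$\begin{aligned}&E\{Y_{ikj}(t_1)\lambda _k(t_1)dt_1dN_{ikj'}(t_2)\}= y_k(t_1,t_2)\lambda _{k(2|1)}(t_1,t_2)\lambda _k(t_1)dt_1dt_2 \\&E\{Y_{ikj'}(t_2)\lambda _k(t_2)dt_2dN_{ikj}(t_1)\}= y_k(t_1,t_2)\lambda _{k(1|2)}(t_1,t_2)\lambda _k(t_2)dt_1dt_2 \end{aligned}$$

and

$$\begin{aligned}&E\{Y_{ikj}(t_1)\lambda _k(t_1)Y_{ikj'}(t_2)\lambda _k(t_2)dt_1dt_2\} = y_k(t_1,t_2)\lambda _k(t_1)\lambda _k(t_2)dt_1dt_2 \end{aligned}$$

Combining these four expectations with the expansion of \(dM_{ikj}(t_1)dM_{ikj'}(t_2)\), we obtain

$$\begin{aligned} c_k = \int _0^\infty \int _0^\infty \frac{h(t_1)h(t_2)}{y_k(t_1)y_k(t_2)}y_k(t_1,t_2)dA_k(t_1,t_2) \end{aligned}$$

where \(dA_k(t_1,t_2)=\{\lambda _k(t_1,t_2)-\lambda _{k(2|1)}(t_1,t_2)\lambda _k(t_1)-\lambda _{k(1|2)}(t_1,t_2)\lambda _k(t_2)+\lambda _k(t_1)\lambda _k(t_2)\}dt_1dt_2\).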

Similarly we have

$$\begin{aligned} \sigma _{12}&=\text{ cov }(\epsilon _{i1j},\epsilon _{i2j'})\\&=\int _0^\infty \int _0^\infty \frac{h(t_1)h(t_2)}{y_1(t_1)y_2(t_2)}E\{dM_{i1j}(t_1)dM_{i2j'}(t_2)\}\\&=\int _0^\infty \int _0^\infty \frac{h(t_1)h(t_2)}{y_1(t_1)y_2(t_2)}y(t_1,t_2)dA_{12}(t_1,t_2) \end{aligned}$$

where \(y(t_1,t_2) = E(Y_{i1j}Y_{i2j'})=G(t_1,t_2)S_{12}(t_1,t_2)\).

On the other hand, by definition,

$$\begin{aligned} \displaystyle {\hat{\sigma }}^2&=\frac{1}{n}\sum _{i=1}^n\left[ \int _0^\infty \frac{H(t)}{Y_1(t)} dM_{i1}(t) - \int _0^\infty \frac{H(t)}{Y_2(t)} dM_{i2}(t) \right. \\&\left. \quad +\int _0^\infty \frac{H(t)}{Y_1(t)}Y_{i1}(t)\{d\Lambda _1(t)-d{\hat{\Lambda }}(t)\} - \int _0^\infty \frac{H(t)}{Y_2(t)}Y_{i2}(t)\{d\Lambda _2(t)-d{\hat{\Lambda }}(t)\}\right] ^2 \end{aligned}$$

By the uniform convergence of \(n^{-1}Y_k(t)\) and \(Y_k(t)^{-1}dN_k(t)\) to \(y_k(t)\) and \(d\Lambda _k(t)\), respectively, \(d{\hat{\Lambda }}(t)\) uniformly converges to \(\{y_1(t)d\Lambda _1(t)+y_2(t)d\Lambda _2(t)\}/\{y_1(t)+y_2(t)\}\) in \([0,\tau ]\). Hence, we have

$$\begin{aligned} {\hat{\sigma }}^2=\frac{1}{n}\sum _{i=1}^n(\epsilon _{i}+\xi _{i})^2 +o_p(1) \end{aligned}$$

where the terms

$$\begin{aligned}&\xi _{i}=\int _0^{\infty } \frac{h(t)}{\{y_1(t)+y_2(t)\}}\left\{ Y_{i1}(t)\frac{y_2(t)}{y_1(t)} \right. \\&\qquad \left. + Y_{i2}(t)\frac{y_1(t)}{y_2(t)}\right\} \{d\Lambda _1(t)-d\Lambda _2(t)\} \end{aligned}$$

are negligible under a nearby alternative hypothesis. Therefore, \({\hat{\sigma }}^2=\frac{1}{n}\sum _{i=1}^n\epsilon _{i}^2 +o_p(1)\) converges to \(\sigma ^2\).

Appendix B: A simplified sample size formula under the nearby alternative hypothesis

We consider a proportional hazards model, \(\Delta =\lambda _1(t)/\lambda _2(t)\), and simplify the sample size formula under the nearby alternative hypothesis. Suppose \(S_1(t_1,t_2)\) and \(S_2(t_1,t_2)\) are both approximated by a common \(S(t_1,t_2)\). Under this assumption, we have \(\log \Delta \approx \Delta -1\) by the Taylor expansion and

$$\begin{aligned} \omega = (\Delta -1) \int _0^\infty S(t)G(t)d\Lambda (t) \approx (\log \Delta ) d \end{aligned}$$

where \(d=-\int _0^\infty G(t)dS(t)=P(T_{ij}<C_{ij})\) denotes the probability that a subunit experiences an event. Furthermore,

$$\begin{aligned}&\sigma _{k}^2 = p_{3-k}^2\int _0^\infty S(t)G(t)d\Lambda (t)=p_{3-k}^2d \\&c_k = p^2_{3-k} \int _0^\infty \int _0^\infty S(t_1,t_2)G(t_1, t_2)dA(t_1,t_2) \end{aligned}$$

where

$$\begin{aligned}&dA(t_1,t_2) = \{\lambda (t_1,t_2) - \lambda _{(1|2)}(t_1,t_2)\lambda (t_2) - \lambda _{(2|1)}(t_2,t_1)\lambda (t_1) + \lambda (t_1)\lambda (t_2)\}dt_1dt_2 \end{aligned}$$

Let \(c_w= \int _0^\infty \int _0^\infty S(t_1,t_2)G(t_1, t_2)dA(t_1,t_2)\) and \(c_b= \int _0^\infty \int _0^\infty S_{12}(t_1,t_2)G(t_1, t_2)dA_{12}(t_1,t_2)\). Then, we have

$$\begin{aligned} \sigma ^2=p_1p_2{\bar{m}} d\{1+(2p_1p_2\bar{\bar{m}}/{\bar{m}} -1)\rho _w - 2p_1p_2\rho _b\bar{\bar{m}}/{\bar{m}}\} \end{aligned}$$

where \(\rho _w=c_w/d\) and \(\rho _b=c_b/d\). Hence, under the nearby alternative hypothesis, (4) is expressed as

$$\begin{aligned} n = \frac{(z_{1-\alpha /2} + z_{1-\beta })^2}{{\bar{m}}dp_1p_2 (\log \Delta )^2}\text{ DE } \end{aligned}$$

where \(\text{ DE }=1+(2p_1p_2\bar{\bar{m}}/{\bar{m}} -1)\rho _w - 2p_1p_2\rho _b\bar{\bar{m}}/{\bar{m}}\).
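The simplified formula above is easy to evaluate directly. The following is a minimal Python sketch, not the authors' software: the function and argument names (`design_effect`, `n_clusters_srt`, `m_bar`, `m_bar2`, `hazard_ratio`) are illustrative choices, with `m_bar` standing for \({\bar{m}}\) and `m_bar2` for \(\bar{\bar{m}}\). The result is left unrounded; rounding up to a whole number of clusters is left to the user.

```python
from math import log
from statistics import NormalDist


def design_effect(rho_w, rho_b, m_bar, m_bar2, p1):
    """DE = 1 + (2 p1 p2 m_bar2/m_bar - 1) rho_w - 2 p1 p2 rho_b m_bar2/m_bar."""
    p2 = 1.0 - p1
    ratio = m_bar2 / m_bar  # second moment of cluster size over its mean
    return 1.0 + (2.0 * p1 * p2 * ratio - 1.0) * rho_w - 2.0 * p1 * p2 * rho_b * ratio


def n_clusters_srt(hazard_ratio, d, rho_w, rho_b, m_bar, m_bar2,
                   p1=0.5, alpha=0.05, power=0.9):
    """Number of clusters n from the simplified formula (two-sided alpha)."""
    normal = NormalDist()
    z_sum = normal.inv_cdf(1.0 - alpha / 2.0) + normal.inv_cdf(power)
    p2 = 1.0 - p1
    # Sample size ignoring clustering, expressed in clusters of mean size m_bar.
    base = z_sum ** 2 / (m_bar * d * p1 * p2 * log(hazard_ratio) ** 2)
    return base * design_effect(rho_w, rho_b, m_bar, m_bar2, p1)


if __name__ == "__main__":
    # Hypothetical design values, chosen only for illustration.
    n = n_clusters_srt(hazard_ratio=1.5, d=0.6, rho_w=0.1, rho_b=0.05,
                       m_bar=4.0, m_bar2=20.0)
    print(f"required clusters (unrounded): {n:.2f}")
```

One quick sanity check on the design effect: when every cluster has exactly two subunits (\({\bar{m}}=2\), \(\bar{\bar{m}}=4\)) and allocation is equal, the coefficient of \(\rho _w\) vanishes and DE reduces to \(1-\rho _b\).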

Appendix C: Calculation of parameters under practical settings given in Sect. 3.3

Under the assumption of common censoring within each cluster, we have \(G(t_1,t_2)=G(t_1\vee t_2)\). Further, with uniform accrual during accrual period a and with additional follow-up period b, we have

$$\begin{aligned} G(t) = {I}(t<a+b) - \frac{t-b}{a}{I}(b\le t<a+b) \end{aligned}$$
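As a rough illustration, the censoring survival function above and the event probability \(d=-\int _0^\infty G(t)dS(t)\) from Appendix B can be computed as follows. This Python sketch assumes an exponential marginal \(S(t)=e^{-\lambda t}\), which is the marginal model this appendix goes on to use; the function names and the midpoint-rule integration are illustrative choices, not part of the paper.

```python
from math import exp


def censoring_survival(t, a, b):
    """G(t) = I(t < a+b) - ((t-b)/a) I(b <= t < a+b), for t >= 0."""
    if t < b:
        return 1.0
    if t < a + b:
        return 1.0 - (t - b) / a
    return 0.0


def event_probability(lam, a, b, steps=20000):
    """d = int_0^{a+b} G(t) lam exp(-lam t) dt, by the midpoint rule."""
    width = (a + b) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        total += censoring_survival(t, a, b) * lam * exp(-lam * t) * width
    return total


def event_probability_closed_form(lam, a, b):
    """Same d, using that the censoring time is Uniform(b, a+b):
    d = 1 - E[exp(-lam C)] = 1 - (exp(-lam b) - exp(-lam (a+b))) / (lam a)."""
    return 1.0 - (exp(-lam * b) - exp(-lam * (a + b))) / (lam * a)


if __name__ == "__main__":
    # Hypothetical values: hazard 0.3 per year, 2 years accrual, 1 year follow-up.
    print(event_probability(0.3, 2.0, 1.0))
    print(event_probability_closed_form(0.3, 2.0, 1.0))
```

The closed form follows because \(G\) is the survival function of a censoring time that is uniform on \([b,a+b]\); the numerical version is included so that the same code can be adapted to a non-exponential marginal by swapping the density term.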

We assume Gumbel’s copula and exponential marginal distributions with hazard rate \(\lambda _k\) in arm k. Using the same notation as in Sect. 3.3, the within-arm joint survival function, its density, and its partial derivative are

$$\begin{aligned} S_k(t_1,t_2)&= \exp \left[ -\left\{ (\lambda _kt_1)^{1/\theta _w}+(\lambda _kt_2)^{1/\theta _w}\right\} ^{\theta _w}\right] \\ f_k(t_1,t_2)&= \lambda _k^2S_k(t_1,t_2)(\lambda _kt_1)^{1/\theta _w-1}(\lambda _kt_2)^{1/\theta _w-1}\left\{ (\lambda _kt_1)^{1/\theta _w}+(\lambda _kt_2)^{1/\theta _w}\right\} ^{2\theta _w-2} \\&\quad \times \left[ 1+\left( \frac{1}{\theta _w}-1\right) \left\{ (\lambda _kt_1)^{1/\theta _w}+(\lambda _kt_2)^{1/\theta _w}\right\} ^{-\theta _w}\right] \\ \frac{\partial S_k(t_1,t_2)}{\partial t_1}&= -\lambda _k S_k(t_1,t_2)(\lambda _kt_1)^{1/\theta _w-1}\left\{ (\lambda _kt_1)^{1/\theta _w}+(\lambda _kt_2)^{1/\theta _w}\right\} ^{\theta _w-1} \end{aligned}$$

Hence, we have

$$\begin{aligned} \lambda _k(t_1,t_2)&= \lambda _k^2(\lambda _kt_1)^{1/\theta _w-1}(\lambda _kt_2)^{1/\theta _w-1}\left\{ (\lambda _kt_1)^{1/\theta _w}+(\lambda _kt_2)^{1/\theta _w}\right\} ^{2\theta _w-2} \\&\quad \times \left[ 1+\left( \frac{1}{\theta _w}-1\right) \left\{ (\lambda _kt_1)^{1/\theta _w}+(\lambda _kt_2)^{1/\theta _w}\right\} ^{-\theta _w}\right] \\ \lambda _{k(1|2)}(t_1,t_2)&= \lambda _k(\lambda _kt_1)^{1/\theta _w-1}\left\{ (\lambda _kt_1)^{1/\theta _w}+(\lambda _kt_2)^{1/\theta _w}\right\} ^{\theta _w-1} \end{aligned}$$
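The closed forms above can be checked numerically. The sketch below codes the Gumbel joint survival function with exponential margins, the joint hazard \(f/S\), and the conditional hazard \(-(\partial S/\partial t_1)/S\), and compares the last two with central finite differences of \(S\). The function names are hypothetical, and the functions take two rates so that setting `lam1 == lam2` gives the within-arm case while different rates give the same functional form with two hazards.

```python
from math import exp


def _u(t1, t2, lam1, lam2, theta):
    """Inner sum (lam1 t1)^(1/theta) + (lam2 t2)^(1/theta)."""
    return (lam1 * t1) ** (1.0 / theta) + (lam2 * t2) ** (1.0 / theta)


def joint_survival(t1, t2, lam1, lam2, theta):
    """Gumbel-copula joint survival with exponential margins."""
    return exp(-_u(t1, t2, lam1, lam2, theta) ** theta)


def conditional_hazard_1(t1, t2, lam1, lam2, theta):
    """lambda_(1|2)(t1, t2) = -(dS/dt1) / S."""
    u = _u(t1, t2, lam1, lam2, theta)
    return lam1 * (lam1 * t1) ** (1.0 / theta - 1.0) * u ** (theta - 1.0)


def joint_hazard(t1, t2, lam1, lam2, theta):
    """lambda(t1, t2) = f(t1, t2) / S(t1, t2)."""
    u = _u(t1, t2, lam1, lam2, theta)
    return (lam1 * lam2
            * (lam1 * t1) ** (1.0 / theta - 1.0)
            * (lam2 * t2) ** (1.0 / theta - 1.0)
            * u ** (2.0 * theta - 2.0)
            * (1.0 + (1.0 / theta - 1.0) * u ** (-theta)))


def finite_difference_hazards(t1, t2, lam1, lam2, theta, h=1e-4):
    """Numerical (conditional hazard, joint hazard) from central differences of S."""
    def s(x, y):
        return joint_survival(x, y, lam1, lam2, theta)

    s0 = s(t1, t2)
    ds_dt1 = (s(t1 + h, t2) - s(t1 - h, t2)) / (2.0 * h)
    d2s = (s(t1 + h, t2 + h) - s(t1 + h, t2 - h)
           - s(t1 - h, t2 + h) + s(t1 - h, t2 - h)) / (4.0 * h * h)
    return -ds_dt1 / s0, d2s / s0


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    args = (0.7, 1.3, 0.5, 0.5, 0.6)
    print(conditional_hazard_1(*args), joint_hazard(*args))
    print(finite_difference_hazards(*args))
```

With `theta = 1` the joint survival function factorizes into the product of the two exponential margins and the joint hazard reduces to the product of the two rates, which is a convenient independence check.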

Similarly for the inter-arm distributions, we have

$$\begin{aligned} S_{12}(t_1,t_2)&= \exp \left[ -\left\{ (\lambda _1t_1)^{1/\theta _b}+(\lambda _2t_2)^{1/\theta _b}\right\} ^{\theta _b}\right] \\ \lambda _{12}(t_1,t_2)&= \lambda _1\lambda _2(\lambda _1t_1)^{1/\theta _b-1}(\lambda _2t_2)^{1/\theta _b-1}\left\{ (\lambda _1t_1)^{1/\theta _b}+(\lambda _2t_2)^{1/\theta _b}\right\} ^{2\theta _b-2}\\&\quad \times \left[ 1+\left( \frac{1}{\theta _b}-1\right) \left\{ (\lambda _1t_1)^{1/\theta _b}+(\lambda _2t_2)^{1/\theta _b}\right\} ^{-\theta _b}\right] \\ \lambda _{12(1|2)}(t_1,t_2)&= \lambda _1(\lambda _1t_1)^{1/\theta _b-1}\left\{ (\lambda _1t_1)^{1/\theta _b}+(\lambda _2t_2)^{1/\theta _b}\right\} ^{\theta _b-1} \end{aligned}$$

In addition, using the formulas given in Sect. 3.1, we have

$$\begin{aligned} \omega&= (\lambda _1-\lambda _2)\\&\quad \times \left\{ \int _0^{a+b} \frac{e^{-(\lambda _1-\lambda _2)t}}{(p_1e^{-\lambda _1t} + p_2e^{-\lambda _2t})^2}dt - \frac{1}{a}\int _b^{a+b} \frac{(t-b)e^{-(\lambda _1-\lambda _2)t}}{(p_1e^{-\lambda _1t} + p_2e^{-\lambda _2t})^2}dt \right\} \\ \sigma ^2_{k}&= p_{3-k}^2\lambda _k\\&\quad \times \left\{ \int _0^{a+b} \frac{e^{-(\lambda _k+2\lambda _{3-k})t}}{(p_1e^{-\lambda _1t} + p_2e^{-\lambda _2t})^2}dt - \frac{1}{a}\int _b^{a+b} \frac{(t-b)e^{-(\lambda _k+2\lambda _{3-k})t}}{(p_1e^{-\lambda _1t} + p_2e^{-\lambda _2t})^2}dt \right\} \end{aligned}$$

Appendix D: Relationship between sample sizes of cluster randomization study and subunit randomization study

For CRTs with time-to-event endpoint, Li and Jung (2020) proposed that the required total number of clusters \(n_c\) can be calculated with

$$\begin{aligned} n_c(\rho _w,{\bar{m}}, \bar{\bar{m}},p_1^c) = \frac{(z_{1-\alpha /2} + z_{1-\beta })^2}{{\bar{m}}dp^c_1p^c_2(\log \Delta )^2}\text{ IF } \end{aligned}$$

Subunit randomization and cluster randomization are equivalent in some special cases. First, for an equally allocated SRT with sample size \(n_s\) and mean cluster size \({\bar{m}}\), if the inter-treatment ICC \(\rho _b = 0\), it is equivalent to an equally allocated CRT with a total of \(2n_s\) clusters and mean cluster size \({\bar{m}}/2\). Since \(E\{(m_i/2)^2\} = E(m_i^2)/4 = \bar{\bar{m}}/4\), this indicates that

$$\begin{aligned} 2n_s(\rho _w, 0,{\bar{m}}, \bar{\bar{m}},1/2) = n_c(\rho _w,{\bar{m}}/2, \bar{\bar{m}}/4,1/2) \end{aligned}$$

In addition, for equally allocated CRTs, we have

$$\begin{aligned} n_c(\rho _w,{\bar{m}}, \bar{\bar{m}},1/2)&= \frac{4(z_{1-\alpha /2} + z_{1-\beta })^2}{{\bar{m}}d(\log \Delta )^2}\{1+(\frac{\bar{\bar{m}}}{{\bar{m}}}-1)\rho _w\} \\&= \frac{1}{2} \times \frac{4(z_{1-\alpha /2} + z_{1-\beta })^2}{\frac{{\bar{m}}}{2}d(\log \Delta )^2}\{1+(\frac{\bar{\bar{m}}/4}{{\bar{m}}/2}-1)\rho _w + \frac{\bar{\bar{m}}}{2{\bar{m}}}\rho _w\}\\&\ge \frac{1}{2} n_c(\rho _w,{\bar{m}}/2, \bar{\bar{m}}/4,1/2)\\&\ge n_s(\rho _w, \rho _b,{\bar{m}}, \bar{\bar{m}},1/2) \end{aligned}$$

The last inequality is based on the previous equation and the fact that \(n_s(\rho _w, \rho _b,{\bar{m}}, \bar{\bar{m}},p_1)\le n_s(\rho _w, 0,{\bar{m}}, \bar{\bar{m}},p_1)\) always holds.
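The identity and the chain of inequalities above can be checked numerically. This Python sketch restricts the CRT formula to equal allocation and takes the inflation factor as \(1+(\bar{\bar{m}}/{\bar{m}}-1)\rho _w\), read off the equal-allocation display; the SRT formula is the simplified one from Appendix B. The function names and the parameter values in the example are assumptions made for illustration.

```python
from math import log
from statistics import NormalDist


def _z_sum_squared(alpha, power):
    normal = NormalDist()
    return (normal.inv_cdf(1.0 - alpha / 2.0) + normal.inv_cdf(power)) ** 2


def n_srt(rho_w, rho_b, m_bar, m_bar2, p1, hazard_ratio, d,
          alpha=0.05, power=0.9):
    """Clusters needed under subunit randomization (Appendix B formula)."""
    p2 = 1.0 - p1
    ratio = m_bar2 / m_bar
    de = 1.0 + (2.0 * p1 * p2 * ratio - 1.0) * rho_w - 2.0 * p1 * p2 * rho_b * ratio
    base = _z_sum_squared(alpha, power) / (m_bar * d * p1 * p2 * log(hazard_ratio) ** 2)
    return base * de


def n_crt_equal(rho_w, m_bar, m_bar2, hazard_ratio, d, alpha=0.05, power=0.9):
    """Total clusters needed under equally allocated cluster randomization."""
    inflation = 1.0 + (m_bar2 / m_bar - 1.0) * rho_w
    return 4.0 * _z_sum_squared(alpha, power) / (m_bar * d * log(hazard_ratio) ** 2) * inflation


if __name__ == "__main__":
    # Hypothetical design: mean cluster size 4, second moment 20.
    hr, d, rho_w, rho_b, m, m2 = 1.5, 0.6, 0.1, 0.05, 4.0, 20.0
    print("2 n_s with rho_b = 0 :", 2.0 * n_srt(rho_w, 0.0, m, m2, 0.5, hr, d))
    print("n_c on halved clusters:", n_crt_equal(rho_w, m / 2.0, m2 / 4.0, hr, d))
    print("n_c on full clusters  :", n_crt_equal(rho_w, m, m2, hr, d))
    print("n_s with rho_b > 0    :", n_srt(rho_w, rho_b, m, m2, 0.5, hr, d))
```

The first two printed values should agree up to floating-point error, and for non-negative \(\rho _w\) and \(\rho _b\) the third should be no smaller than the fourth.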


Cite this article

Li, J., Jung, SH. Sample size calculation for clustered survival data under subunit randomization. Lifetime Data Anal 28, 40–67 (2022).


  • Censoring
  • Design effect
  • Intracluster correlation coefficient
  • Variable cluster size
  • Weighted rank test