Inference on win ratio for cluster-randomized semi-competing risk data

  • Original Paper
  • Recent Statistical Methods for Survival Analysis

Japanese Journal of Statistics and Data Science

Abstract

Cluster randomization has become increasingly popular among public health researchers conducting pragmatic clinical trials. Its main advantages include minimizing experimental contamination, increasing administrative efficiency, and achieving higher compliance. Semi-competing risks data arise when a terminal event censors a nonterminal event, but not vice versa, and an abundant literature exists on model-based methods to analyze such data. The win ratio is a purely nonparametric summary measure of a group effect in semi-competing risks data in which the components of the composite endpoint are prioritized. In this paper, we propose inference on the win ratio for cluster-randomized semi-competing risks data, which can be formulated as the ratio of two clustered U-statistics. First, the asymptotic joint distribution of the two clustered U-statistics is derived using the Cramér–Wold device; their variance and covariance estimators are then evaluated, and a test statistic for the win ratio for cluster-randomized semi-competing risks data is constructed. Simulation results are presented to assess the type I error probabilities and power of the test statistic. The proposed method is illustrated with a real data set from a multi-center phase III clinical trial on breast cancer, treating the clinics as randomized clusters.


References

  • Bebu, I., & Lachin, J. M. (2015). Large sample inference for a win ratio analysis of a composite outcome based on prioritized components. Biostatistics, 17(1), 178–187.

  • Chew, D. P., Astley, C. M., Luker, H., Alprandi-Costa, B., Hillis, G., Chow, C. K., et al. (2015). A cluster randomized trial of objective risk assessment versus standard care for acute coronary syndromes: Rationale and design of the Australian GRACE Risk Score Intervention Study (AGRIS). American Heart Journal, 170(5), 995–1004.

  • Do Ha, I., Xiang, L., Peng, M., Jeong, J. H., & Lee, Y. (2020). Frailty modelling approaches for semi-competing risks data. Lifetime Data Analysis, 26(1), 109–133.

  • Donner, A., & Klar, N. (2004). Pitfalls of and controversies in cluster randomization trials. American Journal of Public Health, 94(3), 416–422.

  • Ferreira-González, I., Alonso-Coello, P., Solà, I., Pacheco-Huergo, V., Domingo-Salvany, A., Alonso, J., et al. (2008). Composite endpoints in clinical trials. Revista Española de Cardiología (English Edition), 61(3), 283–290.

  • Fine, J. P., Jiang, H., & Chappell, R. (2001). On semi-competing risks data. Biometrika, 88(4), 907–919.

  • Fisher, B., Costantino, J., Redmond, C., Poisson, R., Bowman, D., Couture, J., et al. (1989). A randomized clinical trial evaluating tamoxifen in the treatment of patients with node-negative breast cancer who have estrogen-receptor-positive tumors. New England Journal of Medicine, 320(8), 479–484.

  • Griffin, S. J., Borch-Johnsen, K., Davies, M. J., Khunti, K., Rutten, G. E., Sandbæk, A., et al. (2011). Effect of early intensive multifactorial therapy on 5-year cardiovascular outcomes in individuals with type 2 diabetes detected by screening (ADDITION-Europe): A cluster-randomised trial. The Lancet, 378(9786), 156–167.

  • Hahn, S., Puffer, S., Torgerson, D., & Watson, J. (2005). Methodological bias in cluster randomised trials. BMC Medical Research Methodology, 5, 10. https://doi.org/10.1186/1471-2288-5-10.

  • Huque, M. F., Alosh, M., & Bhore, R. (2011). Addressing multiplicity issues of a composite endpoint and its components in clinical trials. Journal of Biopharmaceutical Statistics, 21(4), 610–634.

  • Jahn-Eimermacher, A., Ingel, K., & Schneider, A. (2013). Sample size in cluster-randomized trials with time to event as the primary endpoint. Statistics in Medicine, 32(5), 739–751.

  • Jeong, J. H., & Jung, S. H. (2006). Rank tests for clustered survival data when dependent subunits are randomized. Statistics in Medicine, 25(3), 361–373.

  • Jordhoy, M., Frayers, P., Ahlner-Elmqvist, M., & Kaasa, S. (2002). Lack of concealment may lead to selection bias in cluster randomized trials of palliative care. Palliative Medicine, 16, 43–49.

  • Kalia, S., Klar, N., & Donner, A. (2016). On the estimation of intracluster correlation for time-to-event outcomes in cluster randomized trials. Statistics in Medicine, 35(30), 5551–5560.

  • Kleist, P. (2007). Composite endpoints for clinical trials. International Journal of Pharmaceutical Medicine, 21(3), 187–198.

  • Lee, M. L. T., & Dehling, H. G. (2005). Generalized two-sample U-statistics for clustered data. Statistica Neerlandica, 59(3), 313–323.

  • Li, P., & Redden, D. T. (2015). Small sample performance of bias-corrected sandwich estimators for cluster-randomized trials with binary outcomes. Statistics in Medicine, 34(2), 281–296.

  • Liu, L., Wolfe, R. A., & Huang, X. (2004). Shared frailty models for recurrent events and a terminal event. Biometrics, 60(3), 747–756.

  • Luo, X., Tian, H., Mohanty, S., & Tsai, W. Y. (2015). An alternative approach to confidence interval estimation for the win ratio statistic. Biometrics, 71(1), 139–145.

  • Mao, L. (2017). On causal estimation using U-statistics. Biometrika, 105(1), 215–220.

  • Marshall, A. W., & Olkin, I. (1988). Families of multivariate distributions. Journal of the American Statistical Association, 83(403), 834–841.

  • Oakes, D. (2016). On the win-ratio statistic in clinical trials with multiple types of event. Biometrika, 103(3), 742–745.

  • Obuchowski, N. A. (1997). Nonparametric analysis of clustered ROC curve data. Biometrics, 53, 567–578.

  • Peng, L., & Fine, J. P. (2007). Regression modeling of semicompeting risks data. Biometrics, 63(1), 96–108.

  • Peng, M., Xiang, L., & Wang, S. (2018). Semiparametric regression analysis of clustered survival data with semi-competing risks. Computational Statistics & Data Analysis, 124, 53–70.

  • Pocock, S. J., Ariti, C. A., Collier, T. J., & Wang, D. (2011). The win ratio: A new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal, 33(2), 176–182.

  • Stolker, J. M., Spertus, J. A., Cohen, D. J., Jones, P. G., Jain, K. K., Bamberger, E., et al. (2014). Re-thinking composite endpoints in clinical trials: Insights from patients and trialists. Circulation, 130(15), 1254–1261.

  • Tong, B. C., Huber, J. C., Ascheim, D. D., Puskas, J. D., Ferguson, T. B., Blackstone, E. H., & Smith, P. K. (2012). Weighting composite endpoints in clinical trials: Essential evidence for the heart team. The Annals of Thoracic Surgery, 94(6), 1908–1913.

  • Wang, M., Kong, L., Li, Z., & Zhang, L. (2016). Covariance estimators for generalized estimating equations (GEE) in longitudinal analysis with small samples. Statistics in Medicine, 35(10), 1706–1721.

  • Wu, B. H., Michimae, H., & Emura, T. (2020). Meta-analysis of individual patient data with semi-competing risks under the Weibull joint frailty-copula model. Computational Statistics, 35(4), 1525–1552.

  • Xu, J., Kalbfleisch, J. D., & Tai, B. (2010). Statistical analysis of illness-death processes and semicompeting risks data. Biometrics, 66(3), 716–725.

  • Zeng, D., & Lin, D. (2009). Semiparametric transformation models with random effects for joint analysis of recurrent and terminal events. Biometrics, 65(3), 746–752.


Author information


Corresponding author

Correspondence to Jong-Hyeon Jeong.

Ethics declarations

Conflict of interest statement

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Disclaimer

Dr. Di Zhang contributed to this article in her personal capacity. This manuscript reflects the views of the author and should not be construed to represent FDA views or policies.


Appendix


This Appendix contains examples of between-subject comparisons using the win ratio approach (Sect. A.1), a proof of Theorem 1 (Sect. A.2), a detailed derivation of the covariance between the clustered \(U_1\) and \(U_2\) (Sect. A.3), evaluable formulas for the variance and covariance matrix of the joint distribution of the bivariate clustered U-statistics (Sect. A.4), the user manual of the R package cWR with examples (Sect. A.5), and additional simulation results of the Type I error analysis under large intra-cluster correlation (Sect. A.6).

1.1 A.1 Examples of between-subject comparisons using the win ratio approach

We present some examples of between-subject comparisons under the win ratio approach to help readers better understand the comparison procedure. For formal definitions of all possible comparison scenarios, we refer to Luo et al. (2015). Define \(T_{H_z}\), \(T_{D_z}\) and \(T_{C_z}\) as the time to the non-fatal event, the time to the fatal event, and the time to censoring, respectively, for treated (\(z=1\)) and control (\(z=0\)) patients. The horizontal lines in the plot represent follow-up timelines starting from time = 0 for all patients. We assume that the fatal event is clinically more important than the non-fatal event and therefore prioritize it. The fatal event can censor the non-fatal event, but not vice versa.

figure a

Example 1

Both the non-fatal and fatal events of the treated patient were observed, while only the non-fatal event was observed for the control patient, due to censoring. We first compare the times to the fatal event between the treated and control patients. Because the control patient was censored before the treated patient developed the fatal event (\(\{T_{C_1}>T_{D_1}\} \text { and } \{T_{D_0}>T_{C_0}>T_{H_0}\} \text { and } \{T_{C_0}<T_{D_1}\}\)), we cannot determine who wins based on the fatal event, so we move down the rank of importance and compare the times to the non-fatal event. Because the treated patient developed the non-fatal event before the control patient (\(T_{H_1}<T_{H_0}\)), we claim that the control patient wins this comparison. To further illustrate the construction of the indicator functions of the two kernels in Eqs. (3) and (4) of the main manuscript, we have \(\mathbf{1} \{(T_{C_1}>T_{D_1}) \text { and } (T_{D_0}>T_{C_0}>T_{H_0}) \text { and } (T_{C_0}<T_{D_1}) \text { and } (T_{H_1}<T_{H_0})\} =\phi _2(X_j^{(i)},Y_l^{(k)})=1-\phi _1(X_j^{(i)},Y_l^{(k)})=1. \)

Example 2

The treated patient was censored before developing any event, while the control patient developed the fatal event. We first compare the times to the fatal event between the treated and control patients. Because the control patient developed the fatal event before the treated patient was censored (\(\{\min (T_{H_1},T_{D_1})>T_{C_1}\}\text { and }\{\min (T_{H_0},T_{C_0})>T_{D_0}\}\text { and }\{T_{C_1}>T_{D_0}\}\)), we claim that the treated patient wins this comparison, without the need to consider the times to the non-fatal event. To construct the indicator functions of the two kernels, we have \(\mathbf{1} \{(\min (T_{H_1},T_{D_1})>T_{C_1})\text { and }(\min (T_{H_0},T_{C_0})>T_{D_0})\text { and }(T_{C_1}>T_{D_0})\}=\phi _1(X_j^{(i)},Y_l^{(k)})=1-\phi _2(X_j^{(i)},Y_l^{(k)})=1\).

Example 3

This example is similar to Example 2, in that the treated patient was censored before developing any event while the control patient developed the fatal event, except that here the treated patient was censored before the control patient developed the fatal event (\(T_{C_1}<T_{D_0}\)). We cannot determine who wins based on the fatal event. Because the non-fatal event is not observed for either patient, the comparison is inconclusive, which leads to a tie. To construct the indicator functions of the two kernels, we have \(\mathbf{1} \{(\min (T_{H_1},T_{D_1})>T_{C_1})\text { and }(\min (T_{H_0},T_{C_0})>T_{D_0})\text { and }(T_{C_1}<T_{D_0})\}=\phi _1(X_j^{(i)},Y_l^{(k)})=\phi _2(X_j^{(i)},Y_l^{(k)})=0\).
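The comparison rules illustrated in Examples 1–3 can be collected into a single pairwise kernel. The following is a minimal Python sketch (not the authors' R implementation), assuming the observed-data representation \((y_1, y_2, \delta_1, \delta_2)\) of Sect. A.5, where \(y_1=\min (T_H,T_D,T_C)\), \(y_2=\min (T_D,T_C)\), and \(\delta_1,\delta_2\) indicate whether \(T_H\) and \(T_D\) were observed:

```python
def win_loss(treated, control):
    """Prioritized pairwise comparison for semi-competing risks.

    Each argument is (y1, y2, delta1, delta2) with
    y1 = min(T_H, T_D, T_C), y2 = min(T_D, T_C), and
    delta1 / delta2 = 1 if T_H / T_D was observed.
    Returns (phi1, phi2): (1, 0) if the treated patient wins,
    (0, 1) if the treated patient loses, (0, 0) for a tie.
    """
    y1_t, y2_t, d1_t, d2_t = treated
    y1_c, y2_c, d1_c, d2_c = control
    # Step 1: compare the prioritized fatal event.
    if d2_c == 1 and y2_t > y2_c:   # control's death observed first -> treated wins
        return (1, 0)
    if d2_t == 1 and y2_c > y2_t:   # treated's death observed first -> treated loses
        return (0, 1)
    # Step 2: fatal comparison inconclusive; compare the non-fatal event.
    if d1_c == 1 and y1_t > y1_c:
        return (1, 0)
    if d1_t == 1 and y1_c > y1_t:
        return (0, 1)
    return (0, 0)                   # inconclusive -> tie

# Examples 1-3 encoded with illustrative times (e.g., T_H1=2, T_D1=5, T_C1=7):
print(win_loss((2, 5, 1, 1), (3, 4, 1, 0)))  # Example 1: control wins -> (0, 1)
print(win_loss((6, 6, 0, 0), (3, 3, 0, 1)))  # Example 2: treated wins -> (1, 0)
print(win_loss((2, 2, 0, 0), (3, 3, 0, 1)))  # Example 3: tie -> (0, 0)
```

The three calls reproduce Examples 1–3 above; the numeric times are hypothetical and chosen only to satisfy the stated orderings.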

1.2 A.2 Proof of Theorem 1

Theorem 1

Define \(A_1^{(i)}=\sum _{j=1}^{J_i} \phi _1(X_j^{(i)})\), \(B_1^{(k)}=\sum _{l=1}^{L_k} \phi _1(Y_l^{(k)})\), \(A_2^{(i)}=\sum _{j=1}^{J_i} \phi _2(X_j^{(i)})\), \(B_2^{(k)}=\sum _{l=1}^{L_k} \phi _2(Y_l^{(k)})\). Also define \({\bar{J}}_m=\frac{1}{m}\sum _{i=1}^{m}J_i\) and \({\bar{L}}_n=\frac{1}{n}\sum _{k=1}^{n}L_k\), where m and n are the numbers of clusters in two comparison groups. Then under the assumptions of independence between comparison groups and among clusters and exchangeability, as \(\min (m,n)\rightarrow \infty \), the U-statistics \(U_1-{\bar{J}}_m {\bar{L}}_n \theta _1\) and \(U_2-{\bar{J}}_m {\bar{L}}_n \theta _2\) asymptotically follow a mean 0 bivariate normal distribution with the variance–covariance matrix of

$$\begin{aligned} \Sigma =\frac{{\bar{L}}_n^2}{m}\Sigma _1+\frac{{\bar{J}}_m^2}{n}\Sigma _2, \end{aligned}$$

where

$$\begin{aligned} \Sigma _1= \begin{bmatrix} Var(A_1^{(i)}) &{} Cov(A_1^{(i)},A_2^{(i)}) \\ Cov(A_1^{(i)},A_2^{(i)}) &{} Var(A_2^{(i)}) \end{bmatrix}, \end{aligned}$$

and

$$\begin{aligned} \Sigma _2= \begin{bmatrix} Var(B_1^{(k)}) &{} Cov(B_1^{(k)},B_2^{(k)}) \\ Cov(B_1^{(k)},B_2^{(k)}) &{} Var(B_2^{(k)}) \end{bmatrix}. \end{aligned}$$

Proof

Lee and Dehling (2005) showed that, as \(\min (m,n)\rightarrow \infty \),

$$\begin{aligned} U_1-{\bar{J}}_m {\bar{L}}_n \theta _1 \approx {\bar{L}}_n \frac{1}{m} \sum _{i=1}^{m} \sum _{j=1}^{J_i} \phi _1(X_j^{(i)})+ {\bar{J}}_m \frac{1}{n} \sum _{k=1}^{n} \sum _{l=1}^{L_k} \phi _1(Y_l^{(k)}), \end{aligned}$$
(8)

and that the first term follows an asymptotic normal distribution with mean 0 and variance

$$\begin{aligned} \frac{{\bar{L}}_n^2}{m}Var\left( \sum _{j=1}^{J_i} \phi _1(X_j^{(i)})\right) , \end{aligned}$$

and the second term follows an asymptotic normal distribution with mean 0 and variance

$$\begin{aligned} \frac{{\bar{J}}_m^2}{n}Var\left( \sum _{l=1}^{L_k} \phi _1(Y_l^{(k)})\right) . \end{aligned}$$

Based on the assumptions of independent clusters and identical distributions among observations in the same group, \(\sum _{j=1}^{J_i} \phi _1(X_j^{(i)})\), \(i=1, 2, \ldots, m\), are iid random variables, as are \(\sum _{l=1}^{L_k} \phi _1(Y_l^{(k)})\), \(k=1, 2, \ldots, n\). The same argument applies to the statistic \(U_2-{\bar{J}}_m {\bar{L}}_n \theta _2\), with the second kernel in place of the first.

Noting that \(E(\phi _1(X_j^{(i)}))=E(\phi _1(Y_l^{(k)}))=0\), suppose that (\(A_1^{(i)}\), \(A_2^{(i)}\)), \(i=1,2,\ldots,m\), follow a bivariate distribution with mean \((0,0)^T\) and variance

$$\begin{aligned} \begin{bmatrix} Var(A_1^{(i)}) &{} Cov(A_1^{(i)}, A_2^{(i)})\\ Cov(A_1^{(i)}, A_2^{(i)}) &{} Var(A_2^{(i)}) \end{bmatrix}. \end{aligned}$$

Then we have

$$\begin{aligned} \sqrt{m}{\bar{L}}_n({\bar{A}}_1+ {\bar{A}}_2-0) \Longrightarrow AN\left( 0,{\bar{L}}_n^2(Var(A_1^{(i)}) + Var(A_2^{(i)}) + 2 Cov(A_1^{(i)}, A_2^{(i)}))\right) , \end{aligned}$$

where \({\bar{A}}_1=\sum _{i=1}^{m}A_1^{(i)}/m\) and \({\bar{A}}_2=\sum _{i=1}^{m}A_2^{(i)}/m\). Therefore, by the Cramer–Wold device

$$\begin{aligned} \sqrt{m} {\bar{L}}_n({\bar{A}}_1-0, {\bar{A}}_2-0) \Longrightarrow AN \left( \begin{bmatrix} 0\\ 0 \end{bmatrix}, {\bar{L}}_n^2 \begin{bmatrix} Var(A_1^{(i)}) &{} Cov(A_1^{(i)}, A_2^{(i)})\\ Cov(A_1^{(i)}, A_2^{(i)}) &{} Var(A_2^{(i)}) \end{bmatrix} \right) . \end{aligned}$$

Similarly, by defining \({\bar{B}}_1=\sum _{k=1}^{n}B_1^{(k)}/n\) and \({\bar{B}}_2=\sum _{k=1}^{n}B_2^{(k)}/n\),

$$\begin{aligned} \sqrt{n} {\bar{J}}_m({\bar{B}}_1-0, {\bar{B}}_2-0) \Longrightarrow AN \left( \begin{bmatrix} 0\\ 0 \end{bmatrix}, {\bar{J}}_m^2 \begin{bmatrix} Var(B_1^{(k)}) &{} Cov(B_1^{(k)}, B_2^{(k)})\\ Cov(B_1^{(k)}, B_2^{(k)}) &{} Var(B_2^{(k)}) \end{bmatrix} \right) . \end{aligned}$$

Finally, since \(Cov({\bar{A}}_r,{\bar{B}}_r)=0\) (\(r=1,2\)) due to the independence assumption among clusters, \((U_1-{\bar{J}}_m {\bar{L}}_n \theta _1, U_2-{\bar{J}}_m {\bar{L}}_n \theta _2)^T\) follows an asymptotic normal distribution with mean \((0,0)^T\) and variance \(\Sigma =({\bar{L}}_n^2/m)\Sigma _1+({\bar{J}}_m^2/n)\Sigma _2.\) \(\square \)
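As a numerical illustration of the building blocks of Theorem 1, the following Python sketch computes two clustered U-statistics for the simplified kernels \(\phi _1=\mathbf{1}\{x>y\}\) and \(\phi _2=\mathbf{1}\{x<y\}\) used in Sect. A.4, writing \(U_r=(mn)^{-1}\sum _{i,k}\sum _{j,l}\phi _r\) so that \(E(U_r)={\bar{J}}_m {\bar{L}}_n \theta _r\) as in the theorem. The toy data and this form of \(U_r\) are illustrative assumptions, not the trial data:

```python
from itertools import product

def clustered_u(x_clusters, y_clusters, kernel):
    """U = (1/(m*n)) * sum of kernel(x, y) over all between-group member pairs."""
    m, n = len(x_clusters), len(y_clusters)
    total = sum(kernel(x, y)
                for xc, yc in product(x_clusters, y_clusters)
                for x in xc for y in yc)
    return total / (m * n)

# Toy data: two treated clusters and two control clusters.
x_clusters = [[1, 4], [6]]
y_clusters = [[2], [3, 5]]

u1 = clustered_u(x_clusters, y_clusters, lambda x, y: 1 if x > y else 0)
u2 = clustered_u(x_clusters, y_clusters, lambda x, y: 1 if x < y else 0)
win_ratio = u1 / u2
print(u1, u2, win_ratio)  # 1.25 1.0 1.25
```

Here 5 of the 9 between-group pairs are wins and 4 are losses, so \(U_1=5/4\), \(U_2=4/4\), and the estimated win ratio is \(U_1/U_2=1.25\).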

1.3 A.3 Derivation of the covariance between \(U_1\) and \(U_2\)

Using the asymptotic representation of the U-statistics given in (8), write \(U_1-{\bar{J}}_m {\bar{L}}_n \theta _1 \approx \textcircled {a}+\textcircled {b}\) and \(U_2-{\bar{J}}_m {\bar{L}}_n \theta _2 \approx \textcircled {c}+\textcircled {d}\), where \(\textcircled {a}=({\bar{L}}_n/m) \sum _{i=1}^{m} \sum _{j=1}^{J_i} \phi _1(X_j^{(i)})\), \(\textcircled {b}=({\bar{J}}_m/n) \sum _{k=1}^{n} \sum _{l=1}^{L_k} \phi _1(Y_l^{(k)})\), \(\textcircled {c}=({\bar{L}}_n/m) \sum _{i=1}^{m} \sum _{j=1}^{J_i} \phi _2(X_j^{(i)})\), and \(\textcircled {d}=({\bar{J}}_m/n) \sum _{k=1}^{n} \sum _{l=1}^{L_k} \phi _2(Y_l^{(k)})\).

We can easily see that \(Cov(\textcircled {a},\textcircled {d})=Cov(\textcircled {b},\textcircled {c})=0\) due to independence between clusters, and

$$\begin{aligned} \begin{aligned}&Cov(\textcircled {a},\textcircled {c})=\frac{{\bar{L}}_n^2}{m^2}Cov\left( \sum _{i=1}^{m}\sum _{j=1}^{J_i}\phi _1(X_j^{(i)}),\sum _{i=1}^{m}\sum _{j=1}^{J_i}\phi _2(X_j^{(i)})\right) \\&\quad =\frac{{\bar{L}}_n^2}{m^2}\sum _{i=1}^{m}Cov\left( \sum _{j=1}^{J_i}\phi _1(X_j^{(i)}),\sum _{j=1}^{J_i}\phi _2(X_j^{(i)})\right) \text {, since clusters are independent.}\\&\quad =\frac{{\bar{L}}_n^2}{m^2}\sum _{i=1}^{m}\{J_iCov(\phi _1(X_j^{(i)}),\phi _2(X_j^{(i)}))+J_i(J_i-1)Cov(\phi _1(X_j^{(i)}),\phi _2(X_{j'}^{(i)}))\}\\&\quad =\frac{{\bar{L}}_n^2}{m}\{{\bar{J}}_mCov(\phi _1(X_j^{(i)}),\phi _2(X_j^{(i)}))+{\bar{J}}_m^{(2)}Cov(\phi _1(X_j^{(i)}),\phi _2(X_{j'}^{(i)}))\}. \end{aligned} \end{aligned}$$

Similarly,

$$\begin{aligned}&Cov(\textcircled {b},\textcircled {d})=\frac{{\bar{J}}_m^2}{n^2}Cov\left( \sum _{k=1}^{n}\sum _{l=1}^{L_k}\phi _1(Y_l^{(k)}),\sum _{k=1}^{n}\sum _{l=1}^{L_k}\phi _2(Y_l^{(k)})\right) \\&\quad =\frac{{\bar{J}}_m^2}{n^2}\sum _{k=1}^{n}Cov\left( \sum _{l=1}^{L_k}\phi _1(Y_l^{(k)}),\sum _{l=1}^{L_k}\phi _2(Y_l^{(k)})\right) \text {, since clusters are independent.}\\&\quad =\frac{{\bar{J}}_m^2}{n^2}\sum _{k=1}^{n}\{L_kCov(\phi _1(Y_l^{(k)}),\phi _2(Y_l^{(k)}))+L_k(L_k-1)Cov(\phi _1(Y_l^{(k)}),\phi _2(Y_{l'}^{(k)}))\}\\&\quad =\frac{{\bar{J}}_m^2}{n}\{{\bar{L}}_nCov(\phi _1(Y_l^{(k)}),\phi _2(Y_l^{(k)}))+{\bar{L}}_n^{(2)}Cov(\phi _1(Y_l^{(k)}),\phi _2(Y_{l'}^{(k)}))\}. \end{aligned}$$

Therefore, when \(m,n\rightarrow \infty \) the asymptotic covariance between \(U_1\) and \(U_2\) is

$$\begin{aligned} Cov(U_1,U_2)= & {} \frac{{\bar{L}}_n^2}{m}\{{\bar{J}}_mCov(\phi _1(X_j^{(i)}),\phi _2(X_j^{(i)}))+{\bar{J}}_m^{(2)}Cov(\phi _1(X_j^{(i)}),\phi _2(X_{j'}^{(i)}))\}\\&+\frac{{\bar{J}}_m^2}{n}\{{\bar{L}}_nCov(\phi _1(Y_l^{(k)}),\phi _2(Y_l^{(k)}))+{\bar{L}}_n^{(2)}Cov(\phi _1(Y_l^{(k)}),\phi _2(Y_{l'}^{(k)}))\}. \end{aligned}$$

1.4 A.4 Evaluable formulas of variance–covariance matrix

This section presents evaluable forms of each component of the variances and covariances derived in Sects. 3 and 4. Let us define

$$\begin{aligned} {\hat{\theta }}_1=\frac{1}{(\sum _{i=1}^{m} J_i)(\sum _{k=1}^{n} L_k)} \sum _{i=1}^{m}\sum _{k=1}^{n}\sum _{j=1}^{J_i}\sum _{l=1}^{L_k} \mathbf{1 }\{X_j^{(i)}>Y_l^{(k)}\}. \end{aligned}$$

The four components of \(Var(U_1)\) are estimated as follows:

$$\begin{aligned} \begin{aligned} {\widehat{Var}}&(\phi _1(X_j^{(i)}))\\&={\widehat{Var}}(P_Y(X_j^{(i)}>Y_l^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}>Y_{l'}^{(k')})-P(X_j^{(i)}>Y_l^{(k)})P(X_j^{(i)}>Y_{l'}^{(k')})\\&=P(X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}>Y_{l'}^{(k')})-[P(X_j^{(i)}>Y_l^{(k)})]^2\\&=\frac{\sum _{i=1}^{m} \sum _{j=1}^{J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}>Y_{l'}^{(k')}\}}{(\sum _{i=1}^{m}J_i)(\sum _{1\le k\ne k'\le n}L_kL_k')}-({\hat{\theta }}_1)^2\\&=\frac{\sum _{i=1}^{m} \sum _{j=1}^{J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}>Y_{l'}^{(k')}\}}{m{\bar{J}}_m(n^2({\bar{L}}_n)^{2}-n{\bar{L}}_n^{(2)}-n{\bar{L}}_n)}-({\hat{\theta }}_1)^2, \end{aligned} \end{aligned}$$

where \(1\le i\le m; 1\le k\ne k'\le n; 1\le j\le J_i; 1\le l \le L_k; 1\le l' \le L_k'\).

$$\begin{aligned} \begin{aligned} {\widehat{Var}}&(\phi _1(Y_l^{(k)}))\\&={\widehat{Var}}(P_X(X_j^{(i)}>Y_l^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l}^{(k)})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i')}>Y_{l}^{(k)})\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l}^{(k)})-[P(X_j^{(i)}>Y_l^{(k)})]^2\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{l=1}^{L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l}^{(k)}\}}{(\sum _{k=1}^{n}L_k)(\sum _{1\le i\ne i'\le m}J_iJ_i')}-({\hat{\theta }}_1)^2\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{l=1}^{L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l}^{(k)}\}}{n{\bar{L}}_n(m^2({\bar{J}}_m)^{2}-m{\bar{J}}_m^{(2)}-m{\bar{J}}_m)}-({\hat{\theta }}_1)^2, \end{aligned} \end{aligned}$$

where \(1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_i'; 1\le l \le L_k\).

$$\begin{aligned}&{\widehat{Cov}}(\phi _1(X_j^{(i)}),\phi _1(X_{j'}^{(i)}))\\&\qquad ={\widehat{Cov}}(P_Y(X_j^{(i)}>Y_l^{(k)}),P_Y(X_{j'}^{(i)}>Y_{l'}^{(k')}))\\&\qquad =P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}>Y_{l'}^{(k')})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i)}>Y_{l'}^{(k')})\\&\qquad =P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}>Y_{l'}^{(k')})-[P(X_j^{(i)}>Y_l^{(k)})]^2\\&\qquad =\frac{\sum _{i=1}^{m} \sum _{1\le j\ne j'\le J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}>Y_{l'}^{(k')}\}}{(\sum _{i=1}^{m}J_i(J_i-1))(\sum _{1\le k\ne k'\le n}L_kL_k')}-({\hat{\theta }}_1)^2\\&\qquad =\frac{\sum _{i=1}^{m} \sum _{1\le j\ne j'\le J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}>Y_{l'}^{(k')}\}}{m{\bar{J}}_m^{(2)}(n^2({\bar{L}}_n)^{2}-n{\bar{L}}_n^{(2)}-n{\bar{L}}_n)}-({\hat{\theta }}_1)^2, \end{aligned}$$

where \(1\le i\le m; 1\le k\ne k'\le n; 1\le j\ne j'\le J_i; 1\le l \le L_k; 1\le l' \le L_k'\).

$$\begin{aligned} \begin{aligned} {\widehat{Cov}}&(\phi _1(Y_l^{(k)}),\phi _1(Y_{l'}^{(k)}))\\&={\widehat{Cov}}(P_X(X_j^{(i)}>Y_l^{(k)}),P_X(X_{j'}^{(i')}>Y_{l'}^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l'}^{(k)})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i')}>Y_{l'}^{(k)})\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l'}^{(k)})-[P(X_j^{(i)}>Y_l^{(k)})]^2\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{1\le l\ne l'\le L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l'}^{(k)}\}}{(\sum _{k=1}^{n}L_k(L_k-1))(\sum _{1\le i\ne i'\le m}J_iJ_i')}-({\hat{\theta }}_1)^2\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{1\le l\ne l'\le L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l'}^{(k)}\}}{n{\bar{L}}_n^{(2)}(m^2({\bar{J}}_m)^{2}-m{\bar{J}}_m^{(2)}-m{\bar{J}}_m)}-({\hat{\theta }}_1)^2, \end{aligned} \end{aligned}$$

where \(1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_i'; 1\le l\ne l' \le L_k\).

The same formulation applies to \(Var(U_2)\).

The four components of \(Cov(U_1,U_2)\) can be estimated as follows:

$$\begin{aligned} {\widehat{Cov}}&(\phi _1(X_j^{(i)}),\phi _2(X_{j}^{(i)}))\\&={\widehat{Cov}}(P_Y(X_j^{(i)}>Y_l^{(k)}),P_Y(X_j^{(i)}<Y_{l'}^{(k')}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}<Y_{l'}^{(k')})-P(X_j^{(i)}>Y_l^{(k)})P(X_j^{(i)}<Y_{l'}^{(k')})\\&=\frac{\sum _{i=1}^{m} \sum _{j=1}^{J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}<Y_{l'}^{(k')}\}}{(\sum _{i=1}^{m}J_i)(\sum _{1\le k\ne k'\le n}L_kL_k')}-({\hat{\theta }}_1)({\hat{\theta }}_2)\\&=\frac{\sum _{i=1}^{m} \sum _{j=1}^{J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}<Y_{l'}^{(k')}\}}{m{\bar{J}}_m(n^2({\bar{L}}_n)^{2}-n{\bar{L}}_n^{(2)}-n{\bar{L}}_n)}-({\hat{\theta }}_1)({\hat{\theta }}_2), \end{aligned}$$

where \(1\le i\le m; 1\le k\ne k'\le n; 1\le j\le J_i; 1\le l \le L_k; 1\le l' \le L_k'\).

$$\begin{aligned} {\widehat{Cov}}&(\phi _1(Y_l^{(k)}),\phi _2(Y_l^{(k)}))\\&={\widehat{Cov}}(P_X(X_j^{(i)}>Y_l^{(k)}),P_X(X_{j'}^{(i')}<Y_l^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l}^{(k)})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i')}<Y_{l}^{(k)})\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{l=1}^{L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l}^{(k)}\}}{(\sum _{k=1}^{n}L_k)(\sum _{1\le i\ne i'\le m}J_iJ_i')}-({\hat{\theta }}_1)({\hat{\theta }}_2)\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{l=1}^{L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l}^{(k)}\}}{n{\bar{L}}_n(m^2({\bar{J}}_m)^{2}-m{\bar{J}}_m^{(2)}-m{\bar{J}}_m)}-({\hat{\theta }}_1)({\hat{\theta }}_2), \end{aligned}$$

where \(1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_i'; 1\le l \le L_k\).

$$\begin{aligned} {\widehat{Cov}}&(\phi _1(X_j^{(i)}),\phi _2(X_{j'}^{(i)}))\\&={\widehat{Cov}}(P_Y(X_j^{(i)}>Y_l^{(k)}),P_Y(X_{j'}^{(i)}<Y_{l'}^{(k')}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}<Y_{l'}^{(k')})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i)}<Y_{l'}^{(k')})\\&=\frac{\sum _{i=1}^{m} \sum _{1\le j\ne j'\le J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}<Y_{l'}^{(k')}\}}{(\sum _{i=1}^{m}J_i(J_i-1))(\sum _{1\le k\ne k'\le n}L_kL_k')}-({\hat{\theta }}_1)({\hat{\theta }}_2)\\&=\frac{\sum _{i=1}^{m} \sum _{1\le j\ne j'\le J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}<Y_{l'}^{(k')}\}}{m{\bar{J}}_m^{(2)}(n^2({\bar{L}}_n)^{2}-n{\bar{L}}_n^{(2)}-n{\bar{L}}_n)}-({\hat{\theta }}_1)({\hat{\theta }}_2), \end{aligned}$$

where \(1\le i\le m; 1\le k\ne k'\le n; 1\le j\ne j'\le J_i; 1\le l \le L_k; 1\le l' \le L_k'\).

$$\begin{aligned} {\widehat{Cov}}&(\phi _1(Y_l^{(k)}),\phi _2(Y_{l'}^{(k)}))\\&={\widehat{Cov}}(P_X(X_j^{(i)}>Y_l^{(k)}),P_X(X_{j'}^{(i')}<Y_{l'}^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l'}^{(k)})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i')}<Y_{l'}^{(k)})\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{1\le l\ne l'\le L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l'}^{(k)}\}}{(\sum _{k=1}^{n}L_k(L_k-1))(\sum _{1\le i\ne i'\le m}J_iJ_i')}\\&\quad -({\hat{\theta }}_1)({\hat{\theta }}_2)\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{1\le l\ne l'\le L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l'}^{(k)}\}}{n{\bar{L}}_n^{(2)}(m^2({\bar{J}}_m)^{2}-m{\bar{J}}_m^{(2)}-m{\bar{J}}_m)}\\&\quad -({\hat{\theta }}_1)({\hat{\theta }}_2), \end{aligned}$$

where \(1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_i'; 1\le l\ne l' \le L_k\).
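To make the plug-in formulas concrete, here is a Python sketch of \({\hat{\theta }}_1\) and of the first variance component \({\widehat{Var}}(\phi _1(X_j^{(i)}))\) for the kernel \(\mathbf{1}\{x>y\}\), written directly from the sums above (the toy data are illustrative; the loops mirror the index sets of this section):

```python
def theta1_hat(x_clusters, y_clusters):
    """theta1-hat = (# between-group pairs with x > y) / (total # of pairs)."""
    wins = sum(1 for xc in x_clusters for x in xc
                 for yc in y_clusters for y in yc if x > y)
    n_x = sum(len(xc) for xc in x_clusters)
    n_y = sum(len(yc) for yc in y_clusters)
    return wins / (n_x * n_y)

def var_phi1_x(x_clusters, y_clusters):
    """Var-hat(phi1(X_j^(i))): same subject x, controls from distinct clusters k != k'."""
    th1 = theta1_hat(x_clusters, y_clusters)
    num = den = 0
    for xc in x_clusters:
        for x in xc:
            for k, yk in enumerate(y_clusters):
                for kp, ykp in enumerate(y_clusters):
                    if k == kp:          # the sum runs over k != k' only
                        continue
                    for yl in yk:
                        for ylp in ykp:
                            num += 1 if (x > yl and x > ylp) else 0
                            den += 1     # den = (sum J_i) * (sum_{k != k'} L_k L_k')
    return num / den - th1 ** 2

x_clusters = [[1, 4], [6]]
y_clusters = [[2], [3, 5]]
print(theta1_hat(x_clusters, y_clusters))   # 5/9
print(var_phi1_x(x_clusters, y_clusters))   # 6/12 - (5/9)**2
```

The remaining components in this section follow the same pattern, changing only which indices are required to differ and which kernel (\(\phi _1\) or \(\phi _2\)) enters each factor.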

1.5 A.5 User’s manual of R package: cWR

Description

This package uses the win ratio (WR) as a summary statistic to compare composite time-to-event endpoints between two groups. Either clustered or independent time-to-event data can be specified.

Usage

cWR(treatment, cluster, y1, y2, delta1, delta2, null.WR=1, alpha.sig=0.05)

Arguments

treatment

An integer vector with code 0 for the control group and 1 for the treatment group for each subject.

cluster

An integer vector with a unique cluster ID for each cluster. When subjects are independent, the cluster ID is unique to each subject.

y1

Let \(T_H\), \(T_D\) and \(T_C\) be time to non-fatal event, time to fatal event and censoring time, respectively. y1 is a numeric vector with \(\min \)(\(T_H\), \(T_D\), \(T_C\)) for each subject.

y2

A numeric vector with \(\min \)(\(T_D\), \(T_C\)) for each subject.

delta1

An integer vector with code 1 indicating that \(T_H\) is observed, 0 otherwise.

delta2

An integer vector with code 1 indicating that \(T_D\) is observed, 0 otherwise.

null.WR

The null value of the WR statistic. The default is H0: WR = 1, i.e., log(WR) = 0.

alpha.sig

The significance level, with default value 0.05.

Output values

name

The test name

U1

First estimated clustered U-statistic

U2

Second estimated clustered U-statistic

logWR

Estimated WR on log scale

se

Estimated standard error of the WR on log scale

z

Test statistic

ci

100(1-alpha.sig)% confidence interval

p

P-value of the significance test

var_cov

Variance and covariance matrix of the first and second clustered U-statistics

Details

The function “cWR” performs a significance test comparing two composite time-to-event outcomes between groups. The win ratio summary statistic is built on the “unmatched” approach described by Pocock et al. (2011). We assume that the composite endpoints can be formulated as semi-competing risks data. Each individual in the study is measured on the time to a non-fatal (non-terminal) event (e.g., hospitalization) and the time to a fatal (terminal) event (e.g., death). The fatal event is considered clinically more important than the non-fatal event. Censoring is allowed, but the censoring time needs to be observed.

This function can handle independent as well as clustered data. The inference for clustered data is based on the generalized bivariate clustered U-statistics proposed in this paper, which account for potential correlations among subjects within a cluster. When the cluster size is 1, the setting reduces to the independent case and the inference coincides with the method proposed by Bebu and Lachin (2015).

Note: The arguments “treatment”, “cluster”, “y1”, “y2”, “delta1” and “delta2” are required and have no defaults. They must be vectors of the same length, and no missing values are allowed.
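The test statistic on the log scale combines the two clustered U-statistics through the delta method: a natural form is \({\widehat{Var}}(\log {\widehat{WR}})={\widehat{Var}}(U_1)/U_1^2+{\widehat{Var}}(U_2)/U_2^2-2\,{\widehat{Cov}}(U_1,U_2)/(U_1U_2)\). The following Python sketch assembles this (the numeric inputs are illustrative, not package output, and the delta-method form is our assumption about how the pieces combine):

```python
import math

def log_wr_test(u1, u2, var_u1, var_u2, cov_u12):
    """Delta-method z-statistic for H0: log(WR) = 0, with WR-hat = U1 / U2."""
    log_wr = math.log(u1 / u2)
    var_log = var_u1 / u1**2 + var_u2 / u2**2 - 2 * cov_u12 / (u1 * u2)
    se = math.sqrt(var_log)
    return log_wr, se, log_wr / se

# Illustrative values for U1, U2 and their estimated variance-covariance entries.
log_wr, se, z = log_wr_test(u1=1.25, u2=1.0, var_u1=0.04, var_u2=0.05, cov_u12=0.01)
print(round(log_wr, 4), round(se, 4), round(z, 4))
```

The z-statistic is then compared with a standard normal critical value at the chosen alpha.sig level.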

Example

Data generation for the clustered semi-competing risk data and the clustered win ratio analysis.

figure b
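The original example (figure b) shows R code for generating clustered semi-competing risks data and running the clustered win ratio analysis; it is not reproduced here. As a rough analogue, the following Python sketch generates clustered data with a shared cluster-level gamma frailty in the (treatment, cluster, y1, y2, delta1, delta2) format expected by cWR. All distributions, rates, and the frailty mechanism are illustrative assumptions, not the authors' simulation design:

```python
import random

def generate_cluster_data(n_clusters, cluster_size, treatment, frailty_shape=2.0,
                          rate_h=0.5, rate_d=0.3, rate_c=0.2, rng=None):
    """One row per subject: (treatment, cluster, y1, y2, delta1, delta2).

    A mean-1 gamma frailty shared within a cluster multiplies the event
    rates, inducing intra-cluster correlation; censoring is independent.
    """
    rng = rng or random.Random()
    rows = []
    for c in range(n_clusters):
        w = rng.gammavariate(frailty_shape, 1.0 / frailty_shape)  # mean-1 frailty
        for _ in range(cluster_size):
            t_h = rng.expovariate(w * rate_h)   # non-fatal event time
            t_d = rng.expovariate(w * rate_d)   # fatal event time
            t_c = rng.expovariate(rate_c)       # censoring time
            y1 = min(t_h, t_d, t_c)             # first of any event/censoring
            y2 = min(t_d, t_c)                  # fatal event or censoring
            delta1 = 1 if t_h < min(t_d, t_c) else 0
            delta2 = 1 if t_d < t_c else 0
            rows.append((treatment, c, y1, y2, delta1, delta2))
    return rows

data = (generate_cluster_data(25, 20, treatment=1, rng=random.Random(1)) +
        generate_cluster_data(25, 20, treatment=0, rng=random.Random(2)))
print(len(data))  # 1000 subjects
```

The resulting columns could then be passed to an analysis routine in the cWR style; by construction every row satisfies \(y_1 \le y_2\), matching the semi-competing risks structure.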

1.6 A.6 Simulation results of Type I error analysis for large intra-cluster correlation (ICC=0.5)

Simulation results of Type I error analysis for ICC\(=0.5\) and CV\(=0.71\), under the underlying true log(WR)=0:

\((\eta _H,\eta _D)=(0,0) [\log (WR)=0]\)

Number of clusters   Cluster size   Estimated log(WR)   Empirical SE   Estimated SE   Type I error
25-25                20             -0.007              0.207          0.196          0.069
25-25                50              0.004              0.195          0.185          0.062
25-25                100            -0.006              0.189          0.181          0.067
25-25                200            -0.001              0.183          0.178          0.058
50-50                20             -0.004              0.144          0.141          0.050
50-50                50             -0.001              0.133          0.133          0.055
50-50                100            -0.001              0.132          0.130          0.056
50-50                200             0.000              0.131          0.128          0.051
75-75                20              0.000              0.118          0.116          0.054
75-75                50              0.002              0.112          0.109          0.058
75-75                100            -0.002              0.109          0.107          0.047
75-75                200             0.000              0.103          0.106          0.053


About this article


Cite this article

Zhang, D., Jeong, JH. Inference on win ratio for cluster-randomized semi-competing risk data. Jpn J Stat Data Sci 4, 1263–1292 (2021). https://doi.org/10.1007/s42081-021-00131-1

