Inference on win ratio for cluster-randomized semi-competing risk data

Zhang, Di; Jeong, Jong-Hyeon

doi:10.1007/s42081-021-00131-1

Original Paper
Recent Statistical Methods for Survival Analysis
Published: 16 June 2021

Volume 4, pages 1263–1292, (2021)

Japanese Journal of Statistics and Data Science

Abstract

Cluster randomization has become increasingly popular in pragmatic clinical trials among public health researchers. Its main advantages include minimizing experimental contamination, increasing administrative efficiency, and improving compliance. Semi-competing risks data arise when a terminal event censors a nonterminal event, but not vice versa, and abundant literature exists on model-based methods to analyze such data. The win ratio is a purely nonparametric summary measure of a group effect in semi-competing risks data where the components of the composite endpoint are prioritized. In this paper, we propose inference on the win ratio for cluster-randomized semi-competing risks data, which can be formulated as the ratio of two clustered U-statistics. First, the asymptotic joint distribution of the two clustered U-statistics is derived using the Cramer–Wold device and their variance and covariance estimators are evaluated; a test statistic for the win ratio for cluster-randomized semi-competing risks data is then constructed. Simulation results are presented to assess the type I error probabilities and power of the test statistic. The proposed method is illustrated with a real data set from a multi-center phase III clinical trial on breast cancer, treating the clinics as randomized clusters.

References

Bebu, I., & Lachin, J. M. (2015). Large sample inference for a win ratio analysis of a composite outcome based on prioritized components. Biostatistics, 17(1), 178–187.
Chew, D. P., Astley, C. M., Luker, H., Alprandi-Costa, B., Hillis, G., Chow, C. K., et al. (2015). A cluster randomized trial of objective risk assessment versus standard care for acute coronary syndromes: Rationale and design of the australian grace risk score intervention study (agris). American Heart Journal, 170(5), 995–1004.
Do Ha, I., Xiang, L., Peng, M., Jeong, J. H., & Lee, Y. (2020). Frailty modelling approaches for semi-competing risks data. Lifetime data analysis, 26(1), 109–133.
Donner, A., & Klar, N. (2004). Pitfalls of and controversies in cluster randomization trials. American Journal of Public Health, 94(3), 416–422.
Ferreira-González, I., Alonso-Coello, P., Solà, I., Pacheco-Huergo, V., Domingo-Salvany, A., Alonso, J., et al. (2008). Composite endpoints in clinical trials. Revista Espanola de Cardiologia (English Edition), 61(3), 283–290.
Fine, J. P., Jiang, H., & Chappell, R. (2001). On semi-competing risks data. Biometrika, 88(4), 907–919.
Fisher, B., Costantino, J., Redmond, C., Poisson, R., Bowman, D., Couture, J., et al. (1989). A randomized clinical trial evaluating tamoxifen in the treatment of patients with node-negative breast cancer who have estrogen-receptor-positive tumors. New England Journal of Medicine, 320(8), 479–484.
Griffin, S. J., Borch-Johnsen, K., Davies, M. J., Khunti, K., Rutten, G. E., Sandbæk, A., et al. (2011). Effect of early intensive multifactorial therapy on 5-year cardiovascular outcomes in individuals with type 2 diabetes detected by screening (addition-europe): a cluster-randomised trial. The Lancet, 378(9786), 156–167.
Hahn, S., Puffer, S., Torgerson, D., & Watson, J. (2005). Methodological bias in cluster randomised trials. BMC Medical Research Methodology, 5, 10. https://doi.org/10.1186/1471-2288-5-10.
Huque, M. F., Alosh, M., & Bhore, R. (2011). Addressing multiplicity issues of a composite endpoint and its components in clinical trials. Journal of Biopharmaceutical Statistics, 21(4), 610–634.
Jahn-Eimermacher, A., Ingel, K., & Schneider, A. (2013). Sample size in cluster-randomized trials with time to event as the primary endpoint. Statistics in Medicine, 32(5), 739–751.
Jeong, J. H., & Jung, S. H. (2006). Rank tests for clustered survival data when dependent subunits are randomized. Statistics in Medicine, 25(3), 361–373.
Jordhoy, M., Frayers, P., Ahlner-Elmqvist, M., & Kaasa, S. (2002). Lack of concealment may lead to selection bias in cluster randomized trials of palliative care. Palliative Medicine, 16, 43–49.
Kalia, S., Klar, N., & Donner, A. (2016). On the estimation of intracluster correlation for time-to-event outcomes in cluster randomized trials. Statistics in Medicine, 35(30), 5551–5560.
Kleist, P. (2007). Composite endpoints for clinical trials. International Journal of Pharmaceutical Medicine, 21(3), 187–198.
Lee, M. L. T., & Dehling, H. G. (2005). Generalized two-sample u-statistics for clustered data. Statistica Neerlandica, 59(3), 313–323.
Li, P., & Redden, D. T. (2015). Small sample performance of bias-corrected sandwich estimators for cluster-randomized trials with binary outcomes. Statistics in Medicine, 34(2), 281–296.
Liu, L., Wolfe, R. A., & Huang, X. (2004). Shared frailty models for recurrent events and a terminal event. Biometrics, 60(3), 747–756.
Luo, X., Tian, H., Mohanty, S., & Tsai, W. Y. (2015). An alternative approach to confidence interval estimation for the win ratio statistic. Biometrics, 71(1), 139–145.
Mao, L. (2017). On causal estimation using U-statistics. Biometrika, 105(1), 215–220.
Marshall, A. W., & Olkin, I. (1988). Families of multivariate distributions. Journal of the American Statistical Association, 83(403), 834–841.
Oakes, D. (2016). On the win-ratio statistic in clinical trials with multiple types of event. Biometrika, 103(3), 742–745.
Obuchowski, N. A. (1997). Nonparametric analysis of clustered roc curve data. Biometrics, 53, 567–578.
Peng, L., & Fine, J. P. (2007). Regression modeling of semicompeting risks data. Biometrics, 63(1), 96–108.
Peng, M., Xiang, L., & Wang, S. (2018). Semiparametric regression analysis of clustered survival data with semi-competing risks. Computational Statistics & Data Analysis, 124, 53–70.
Pocock, S. J., Ariti, C. A., Collier, T. J., & Wang, D. (2011). The win ratio: A new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal, 33(2), 176–182.
Stolker, J. M., Spertus, J. A., Cohen, D. J., Jones, P. G., Jain, K. K., Bamberger, E., et al. (2014). Re-thinking composite endpoints in clinical trials: Insights from patients and trialists. Circulation, 130(15), 1254–1261.
Tong, B. C., Huber, J. C., Ascheim, D. D., Puskas, J. D., Ferguson, T. B., Blackstone, E. H., & Smith, P. K. (2012). Weighting composite endpoints in clinical trials: Essential evidence for the heart team. The Annals of Thoracic Surgery, 94(6), 1908–1913.
Wang, M., Kong, L., Li, Z., & Zhang, L. (2016). Covariance estimators for generalized estimating equations (gee) in longitudinal analysis with small samples. Statistics in Medicine, 35(10), 1706–1721.
Wu, B. H., Michimae, H., & Emura, T. (2020). Meta-analysis of individual patient data with semi-competing risks under the weibull joint frailty-copula model. Computational Statistics, 35(4), 1525–1552.
Xu, J., Kalbfleisch, J. D., & Tai, B. (2010). Statistical analysis of illness-death processes and semicompeting risks data. Biometrics, 66(3), 716–725.
Zeng, D., & Lin, D. (2009). Semiparametric transformation models with random effects for joint analysis of recurrent and terminal events. Biometrics, 65(3), 746–752.

Author information

Authors and Affiliations

Division of Biometrics VII, Office of Biostatistics, Office of Translational Science, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
Di Zhang
Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, 15261, USA
Jong-Hyeon Jeong

Corresponding author

Correspondence to Jong-Hyeon Jeong.

Ethics declarations

Conflict of interest statement

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Disclaimer

Dr. Di Zhang contributed to this article in her personal capacity. This manuscript reflects the views of the author and should not be construed to represent FDA views or policies.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

This Appendix contains examples of between-subject comparisons using the win ratio approach (Sect. A.1), the proof of Theorem 1 (Sect. A.2), a detailed derivation of the covariance between the clustered $U_1$ and $U_2$ (Sect. A.3), evaluable formulas for the variance–covariance matrix of the joint distribution of the bivariate clustered U-statistics (Sect. A.4), the user manual of the R package cWR with examples (Sect. A.5), and additional simulation results of the Type I error analysis under large intra-cluster correlation (Sect. A.6).

A.1 Examples of between-subject comparisons using the win ratio approach

We present some examples of between-subject comparisons in the win ratio approach to help readers better understand the comparison procedure. For formal definitions of all possible comparison scenarios, we defer to Luo et al. (2015). Define $T_{H_z}$, $T_{D_z}$ and $T_{C_z}$ as time to non-fatal event, time to fatal event and time to censoring, respectively, for treated ($z=1$) and control ($z=0$) patients. The horizontal lines in the plot represent follow-up timeline starting from time=0 for all patients. We assume that the fatal event is more important compared to the non-fatal event, and we want to prioritize the fatal event. The fatal event can censor the non-fatal event, but not vice versa.

Example 1

Both non-fatal and fatal events of the treated patient were observed, while only the non-fatal event was observed for the control patient, due to censoring. We first compare the times to fatal event between the treated and control patients. Because the control patient was censored before the treated patient developed the fatal event ($\{T_{C_1}>T_{D_1}\} \text { and } \{T_{D_0}>T_{C_0}>T_{H_0}\} \text { and } \{T_{C_0}<T_{D_1}\}$), we cannot determine who wins based on the fatal event, and we move down the rank of importance to compare the times to non-fatal event. Because the treated patient developed the non-fatal event before the control patient ($T_{H_1}<T_{H_0}$), we claim that the control patient wins in this comparison. To further illustrate the construction of the indicator functions of the two kernels in Eqs. (3) and (4) of the main manuscript, we have $\mathbf{1} \{(T_{C_1}>T_{D_1}) \text { and } (T_{D_0}>T_{C_0}>T_{H_0}) \text { and } (T_{C_0}<T_{D_1}) \text { and } (T_{H_1}<T_{H_0})\} =\phi _2(X_j^{(i)},Y_l^{(k)})=1-\phi _1(X_j^{(i)},Y_l^{(k)})=1$.

Example 2

The treated patient was censored before developing any events, while the control patient developed the fatal event. We first compare the times to fatal event between the treated and control patients. Because the control patient developed the fatal event before the treated patient was censored ($\{\min (T_{H_1},T_{D_1})>T_{C_1}\}\text { and }\{\min (T_{H_0},T_{C_0})>T_{D_0}\}\text { and }\{T_{C_1}>T_{D_0}\}$), we claim that the treated patient wins in this comparison without the need to consider the times to non-fatal event. To construct the indicator functions of the two kernels, we have $\mathbf{1} \{(\min (T_{H_1},T_{D_1})>T_{C_1})\text { and }(\min (T_{H_0},T_{C_0})>T_{D_0})\text { and }(T_{C_1}>T_{D_0})\}=\phi _1(X_j^{(i)},Y_l^{(k)})=1-\phi _2(X_j^{(i)},Y_l^{(k)})=1$.

Example 3

As in Example 2, the treated patient was censored before developing any events and the control patient developed the fatal event, except that here the treated patient was censored before the control patient developed the fatal event ($T_{C_1}<T_{D_0}$). We cannot determine who wins based on the fatal event. Because we do not observe the non-fatal event for either patient, this comparison is inconclusive, which leads to a tie. To construct the indicator functions of the two kernels, we have $\mathbf{1} \{(\min (T_{H_1},T_{D_1})>T_{C_1})\text { and }(\min (T_{H_0},T_{C_0})>T_{D_0})\text { and }(T_{C_1}<T_{D_0})\}=\phi _1(X_j^{(i)},Y_l^{(k)})=\phi _2(X_j^{(i)},Y_l^{(k)})=0$.
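
The three examples can be reproduced with a small pairwise comparison routine. The following is only a minimal sketch, assuming the observed-data coding used by the cWR arguments in Sect. A.5 (y1, y2, delta1, delta2); the names `Patient` and `compare` and the numeric times are hypothetical, and the formal comparison rules are those of Luo et al. (2015), not this simplified version.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    y1: float    # min(T_H, T_D, T_C)
    y2: float    # min(T_D, T_C)
    delta1: int  # 1 if the non-fatal event was observed
    delta2: int  # 1 if the fatal event was observed


def compare(treated: Patient, control: Patient) -> int:
    """Return 1 if the treated patient wins, -1 if the control wins, 0 for a tie."""
    # Priority 1: fatal event. A patient wins if the opponent's death is
    # observed while the patient is still under follow-up.
    if control.delta2 == 1 and treated.y2 > control.y2:
        return 1
    if treated.delta2 == 1 and control.y2 > treated.y2:
        return -1
    # Priority 2: non-fatal event, used only when the fatal event is undecided.
    if control.delta1 == 1 and treated.y1 > control.y1:
        return 1
    if treated.delta1 == 1 and control.y1 > treated.y1:
        return -1
    return 0


# Example 1: treated has T_H=2, T_D=5; control has T_H=3 and is censored at 4.
ex1 = compare(Patient(2, 5, 1, 1), Patient(3, 4, 1, 0))
# Example 2: treated is censored at 6 with no events; control dies at 4.
ex2 = compare(Patient(6, 6, 0, 0), Patient(4, 4, 0, 1))
# Example 3: as Example 2, but the treated patient is censored at 3.
ex3 = compare(Patient(3, 3, 0, 0), Patient(4, 4, 0, 1))
print(ex1, ex2, ex3)  # -1 1 0
```

In this coding, a return value of 1 corresponds to $\phi _1=1$, a value of $-1$ to $\phi _2=1$, and 0 to a tie with $\phi _1=\phi _2=0$.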

A.2 Proof of Theorem 1

Theorem 1

Define $A_1^{(i)}=\sum _{j=1}^{J_i} \phi _1(X_j^{(i)})$, $B_1^{(k)}=\sum _{l=1}^{L_k} \phi _1(Y_l^{(k)})$, $A_2^{(i)}=\sum _{j=1}^{J_i} \phi _2(X_j^{(i)})$, $B_2^{(k)}=\sum _{l=1}^{L_k} \phi _2(Y_l^{(k)})$. Also define ${\bar{J}}_m=\frac{1}{m}\sum _{i=1}^{m}J_i$ and ${\bar{L}}_n=\frac{1}{n}\sum _{k=1}^{n}L_k$, where m and n are the numbers of clusters in two comparison groups. Then under the assumptions of independence between comparison groups and among clusters and exchangeability, as $\min (m,n)\rightarrow \infty $, the U-statistics $U_1-{\bar{J}}_m {\bar{L}}_n \theta _1$ and $U_2-{\bar{J}}_m {\bar{L}}_n \theta _2$ asymptotically follow a mean 0 bivariate normal distribution with the variance–covariance matrix of

$$\begin{aligned} \Sigma =\frac{{\bar{L}}_n^2}{m}\Sigma _1+\frac{{\bar{J}}_m^2}{n}\Sigma _2, \end{aligned}$$

where

$$\begin{aligned} \Sigma _1= \begin{bmatrix} Var(A_1^{(i)}) &{} Cov(A_1^{(i)},A_2^{(i)}) \\ Cov(A_1^{(i)},A_2^{(i)}) &{} Var(A_2^{(i)}) \end{bmatrix}, \end{aligned}$$

and

$$\begin{aligned} \Sigma _2= \begin{bmatrix} Var(B_1^{(k)}) &{} Cov(B_1^{(k)},B_2^{(k)}) \\ Cov(B_1^{(k)},B_2^{(k)}) &{} Var(B_2^{(k)}) \end{bmatrix}. \end{aligned}$$

Proof

Lee and Dehling (2005) showed that, as $\min (m,n)\rightarrow \infty $,

$$\begin{aligned} U_1-{\bar{J}}_m {\bar{L}}_n \theta _1 \approx {\bar{L}}_n \frac{1}{m} \sum _{i=1}^{m} \sum _{j=1}^{J_i} \phi _1(X_j^{(i)})+ {\bar{J}}_m \frac{1}{n} \sum _{k=1}^{n} \sum _{l=1}^{L_k} \phi _1(Y_l^{(k)}), \end{aligned}$$

(8)

and that the first term follows an asymptotic normal distribution with mean 0 and variance

$$\begin{aligned} \frac{{\bar{L}}_n^2}{m}Var\left( \sum _{j=1}^{J_i} \phi _1(X_j^{(i)})\right) , \end{aligned}$$

and that the second term is asymptotically normal with mean 0 and variance

$$\begin{aligned} \frac{{\bar{J}}_m^2}{n}Var\left( \sum _{l=1}^{L_k} \phi _1(Y_l^{(k)})\right) . \end{aligned}$$

Based on the assumptions of independent clusters and identical distributions among observations in the same group, $\sum _{j=1}^{J_i} \phi _1(X_j^{(i)})$, $i=1, 2, ..., m$, are iid random variables, as well as $\sum _{l=1}^{L_k} \phi _1(Y_l^{(k)})$, $k=1, 2, ..., n$. The same applies to the statistic $U_2-{\bar{J}}_m {\bar{L}}_n \theta _2$, except that the latter uses the second kernel.

Noting that $E(\phi _1(X_j^{(i)}))=E(\phi _1(Y_l^{(k)}))=0$, suppose ($A_1^{(i)}$, $A_2^{(i)}$), $i=1,2,...,m$, follows a bivariate distribution with mean $(0,0)^T$ and variance

$$\begin{aligned} \begin{bmatrix} Var(A_1^{(i)}) &{} Cov(A_1^{(i)}, A_2^{(i)})\\ Cov(A_1^{(i)}, A_2^{(i)}) &{} Var(A_2^{(i)}) \end{bmatrix}. \end{aligned}$$

Then we have

$$\begin{aligned} \sqrt{m}{\bar{L}}_n({\bar{A}}_1+ {\bar{A}}_2-0) \Longrightarrow AN\left( 0,{\bar{L}}_n^2(Var(A_1^{(i)}) + Var(A_2^{(i)}) + 2 Cov(A_1^{(i)}, A_2^{(i)}))\right) , \end{aligned}$$

where ${\bar{A}}_1=\sum _{i=1}^{m}A_1^{(i)}/m$ and ${\bar{A}}_2=\sum _{i=1}^{m}A_2^{(i)}/m$. Therefore, by the Cramer–Wold device

$$\begin{aligned} \sqrt{m} {\bar{L}}_n({\bar{A}}_1-0, {\bar{A}}_2-0) \Longrightarrow AN \left( \begin{bmatrix} 0\\ 0 \end{bmatrix}, {\bar{L}}_n^2 \begin{bmatrix} Var(A_1^{(i)}) &{} Cov(A_1^{(i)}, A_2^{(i)})\\ Cov(A_1^{(i)}, A_2^{(i)}) &{} Var(A_2^{(i)}) \end{bmatrix} \right) . \end{aligned}$$

Similarly, by defining ${\bar{B}}_1=\sum _{k=1}^{n}B_1^{(k)}/n$ and ${\bar{B}}_2=\sum _{k=1}^{n}B_2^{(k)}/n$,

$$\begin{aligned} \sqrt{n} {\bar{J}}_m({\bar{B}}_1-0, {\bar{B}}_2-0) \Longrightarrow AN \left( \begin{bmatrix} 0\\ 0 \end{bmatrix}, {\bar{J}}_m^2 \begin{bmatrix} Var(B_1^{(k)}) &{} Cov(B_1^{(k)}, B_2^{(k)})\\ Cov(B_1^{(k)}, B_2^{(k)}) &{} Var(B_2^{(k)}) \end{bmatrix} \right) . \end{aligned}$$

Finally, since $Cov({\bar{A}}_r,{\bar{B}}_r)=0$ ($r=1,2$) due to the independence assumption among clusters, $(U_1-{\bar{J}}_m {\bar{L}}_n \theta _1, U_2-{\bar{J}}_m {\bar{L}}_n \theta _2)^T$ follows an asymptotic normal distribution with mean $(0,0)^T$ and variance $\Sigma =({\bar{L}}_n^2/m)\Sigma _1+({\bar{J}}_m^2/n)\Sigma _2.$ $\square $
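
As a remark on the Cramer–Wold step in the proof above: the display there shows the linear combination with unit coefficients only, whereas the device requires convergence for every fixed pair of coefficients. The following sketch, with $a$ and $b$ introduced here purely for illustration, indicates that the same central limit argument carries over, since $aA_1^{(i)}+bA_2^{(i)}$, $i=1,2,...,m$, are again iid with mean 0:

$$\begin{aligned} \sqrt{m}{\bar{L}}_n(a{\bar{A}}_1+b{\bar{A}}_2)&={\bar{L}}_n\frac{1}{\sqrt{m}}\sum _{i=1}^{m}\left( aA_1^{(i)}+bA_2^{(i)}\right) \\&\Longrightarrow AN\left( 0,{\bar{L}}_n^2\{a^2Var(A_1^{(i)})+b^2Var(A_2^{(i)})+2ab\,Cov(A_1^{(i)},A_2^{(i)})\}\right) \\&=AN\left( 0,{\bar{L}}_n^2\,(a,b)\,\Sigma _1\,(a,b)^T\right) . \end{aligned}$$

Setting $a=b=1$ recovers the display in the proof, and the limiting variance for general $(a,b)$ is the quadratic form of the bivariate normal limit claimed there.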

A.3 Derivation of the covariance between $U_1$ and $U_2$

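Using the definition of the asymptotic formula of the U statistic given in (4), applied to each kernel as in (8), the covariance between $U_1$ and $U_2$ decomposes as

$$\begin{aligned} Cov(U_1,U_2)&\approx Cov\Bigg( \underbrace{\frac{{\bar{L}}_n}{m}\sum _{i=1}^{m}\sum _{j=1}^{J_i}\phi _1(X_j^{(i)})}_{\textcircled {a}}+\underbrace{\frac{{\bar{J}}_m}{n}\sum _{k=1}^{n}\sum _{l=1}^{L_k}\phi _1(Y_l^{(k)})}_{\textcircled {b}},\ \underbrace{\frac{{\bar{L}}_n}{m}\sum _{i=1}^{m}\sum _{j=1}^{J_i}\phi _2(X_j^{(i)})}_{\textcircled {c}}+\underbrace{\frac{{\bar{J}}_m}{n}\sum _{k=1}^{n}\sum _{l=1}^{L_k}\phi _2(Y_l^{(k)})}_{\textcircled {d}}\Bigg) \\&=Cov(\textcircled {a},\textcircled {c})+Cov(\textcircled {a},\textcircled {d})+Cov(\textcircled {b},\textcircled {c})+Cov(\textcircled {b},\textcircled {d}). \end{aligned}$$

We can easily see that $Cov(\textcircled {a},\textcircled {d})=Cov(\textcircled {b},\textcircled {c})=0$ due to independence between clusters, and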

$$\begin{aligned} \begin{aligned}&Cov(\textcircled {a},\textcircled {c})=\frac{{\bar{L}}_n^2}{m^2}Cov\left( \sum _{i=1}^{m}\sum _{j=1}^{J_i}\phi _1(X_j^{(i)}),\sum _{i=1}^{m}\sum _{j=1}^{J_i}\phi _2(X_j^{(i)})\right) \\&\quad =\frac{{\bar{L}}_n^2}{m^2}\sum _{i=1}^{m}Cov\left( \sum _{j=1}^{J_i}\phi _1(X_j^{(i)}),\sum _{j=1}^{J_i}\phi _2(X_j^{(i)})\right) \text {, since clusters are independent.}\\&\quad =\frac{{\bar{L}}_n^2}{m^2}\sum _{i=1}^{m}\{J_iCov(\phi _1(X_j^{(i)}),\phi _2(X_j^{(i)}))+J_i(J_i-1)Cov(\phi _1(X_j^{(i)}),\phi _2(X_{j'}^{(i)}))\}\\&\quad =\frac{{\bar{L}}_n^2}{m}\{{\bar{J}}_mCov(\phi _1(X_j^{(i)}),\phi _2(X_j^{(i)}))+{\bar{J}}_m^{(2)}Cov(\phi _1(X_j^{(i)}),\phi _2(X_{j'}^{(i)}))\}. \end{aligned} \end{aligned}$$

Similarly,

$$\begin{aligned}&Cov(\textcircled {b},\textcircled {d})=\frac{{\bar{J}}_m^2}{n^2}Cov\left( \sum _{k=1}^{n}\sum _{l=1}^{L_k}\phi _1(Y_l^{(k)}),\sum _{k=1}^{n}\sum _{l=1}^{L_k}\phi _2(Y_l^{(k)})\right) \\&\quad =\frac{{\bar{J}}_m^2}{n^2}\sum _{k=1}^{n}Cov\left( \sum _{l=1}^{L_k}\phi _1(Y_l^{(k)}),\sum _{l=1}^{L_k}\phi _2(Y_l^{(k)})\right) \text {, since clusters are independent.}\\&\quad =\frac{{\bar{J}}_m^2}{n^2}\sum _{k=1}^{n}\{L_kCov(\phi _1(Y_l^{(k)}),\phi _2(Y_l^{(k)}))+L_k(L_k-1)Cov(\phi _1(Y_l^{(k)}),\phi _2(Y_{l'}^{(k)}))\}\\&\quad =\frac{{\bar{J}}_m^2}{n}\{{\bar{L}}_nCov(\phi _1(Y_l^{(k)}),\phi _2(Y_l^{(k)}))+{\bar{L}}_n^{(2)}Cov(\phi _1(Y_l^{(k)}),\phi _2(Y_{l'}^{(k)}))\}. \end{aligned}$$

Therefore, when $m,n\rightarrow \infty $ the asymptotic covariance between $U_1$ and $U_2$ is

$$\begin{aligned} Cov(U_1,U_2)= & {} \frac{{\bar{L}}_n^2}{m}\{{\bar{J}}_mCov(\phi _1(X_j^{(i)}),\phi _2(X_j^{(i)}))+{\bar{J}}_m^{(2)}Cov(\phi _1(X_j^{(i)}),\phi _2(X_{j'}^{(i)}))\}\\&+\frac{{\bar{J}}_m^2}{n}\{{\bar{L}}_nCov(\phi _1(Y_l^{(k)}),\phi _2(Y_l^{(k)}))+{\bar{L}}_n^{(2)}Cov(\phi _1(Y_l^{(k)}),\phi _2(Y_{l'}^{(k)}))\}. \end{aligned}$$
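
To make the final formula concrete, the sketch below assembles the asymptotic covariance from its four component covariances and the cluster sizes. It is an illustration under assumptions: the function name and argument names are hypothetical, and ${\bar{J}}_m^{(2)}$ and ${\bar{L}}_n^{(2)}$ are taken to be the averages of $J_i(J_i-1)$ and $L_k(L_k-1)$, which is what the last step of the derivation above implies.

```python
def asymptotic_cov_u1_u2(J, L, cov_x_same, cov_x_diff, cov_y_same, cov_y_diff):
    """Assemble the asymptotic Cov(U1, U2) from its four components.

    J, L: lists of cluster sizes in the two comparison groups.
    cov_x_same: Cov(phi1(X_j), phi2(X_j)) for the same subject.
    cov_x_diff: Cov(phi1(X_j), phi2(X_j')) for two subjects in one cluster.
    cov_y_same, cov_y_diff: the analogous quantities for the other group.
    """
    m, n = len(J), len(L)
    J_bar = sum(J) / m
    L_bar = sum(L) / n
    # Assumed definition: average number of ordered within-cluster pairs.
    J_bar2 = sum(size * (size - 1) for size in J) / m
    L_bar2 = sum(size * (size - 1) for size in L) / n
    term_x = L_bar**2 / m * (J_bar * cov_x_same + J_bar2 * cov_x_diff)
    term_y = J_bar**2 / n * (L_bar * cov_y_same + L_bar2 * cov_y_diff)
    return term_x + term_y


# Illustrative (made-up) inputs: two clusters of sizes 2 and 3 versus three
# clusters of sizes 1, 4 and 4.
cov12 = asymptotic_cov_u1_u2([2, 3], [1, 4, 4], 0.1, 0.02, -0.05, 0.01)
print(cov12)
```

With equal cluster sizes the within-cluster term is weighted by $J(J-1)$, so even a small between-subject covariance can dominate the total for large clusters.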

A.4 Evaluable formulas of variance–covariance matrix

This section presents evaluable forms of each component of the variances and covariances derived in Sects. 3 and 4. Let us define

$$\begin{aligned} {\hat{\theta }}_1=\frac{1}{(\sum _{i=1}^{m} J_i)(\sum _{k=1}^{n} L_k)} \sum _{i=1}^{m}\sum _{k=1}^{n}\sum _{j=1}^{J_i}\sum _{l=1}^{L_k} \mathbf{1 }\{X_j^{(i)}>Y_l^{(k)}\}. \end{aligned}$$

The four components of $Var(U_1)$ are estimated as follows:

$$\begin{aligned} \begin{aligned} {\widehat{Var}}&(\phi _1(X_j^{(i)}))\\&={\widehat{Var}}(P_Y(X_j^{(i)}>Y_l^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}>Y_{l'}^{(k')})-P(X_j^{(i)}>Y_l^{(k)})P(X_j^{(i)}>Y_{l'}^{(k')})\\&=P(X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}>Y_{l'}^{(k')})-[P(X_j^{(i)}>Y_l^{(k)})]^2\\&=\frac{\sum _{i=1}^{m} \sum _{j=1}^{J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}>Y_{l'}^{(k')}\}}{(\sum _{i=1}^{m}J_i)(\sum _{1\le k\ne k'\le n}L_kL_k')}-({\hat{\theta }}_1)^2\\&=\frac{\sum _{i=1}^{m} \sum _{j=1}^{J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}>Y_{l'}^{(k')}\}}{m{\bar{J}}_m(n^2({\bar{L}}_n)^{2}-n{\bar{L}}_n^{(2)}-n{\bar{L}}_n)}-({\hat{\theta }}_1)^2, \end{aligned} \end{aligned}$$

where $1\le i\le m; 1\le k\ne k'\le n; 1\le j\le J_i; 1\le l \le L_k; 1\le l' \le L_k'$.
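
The estimator above can be evaluated without an explicit five-fold loop. The following is a rough sketch, not the cWR implementation: it assumes scalar outcomes where `x > y` stands in for the win indicator, at least two clusters in the Y group, and a hypothetical function name. For each $X_j^{(i)}$ it counts wins per Y cluster and uses the identity $\sum _{k\ne k'}w_kw_{k'}=(\sum _k w_k)^2-\sum _k w_k^2$; the same identity applied to cluster sizes gives the denominator $n^2({\bar{L}}_n)^{2}-n{\bar{L}}_n^{(2)}-n{\bar{L}}_n$.

```python
def var_phi1_x(x_clusters, y_clusters):
    """Plug-in estimate of Var(phi_1(X)); returns (variance, theta1_hat)."""
    xs = [x for cluster in x_clusters for x in cluster]
    n_x = len(xs)                      # sum of J_i
    sizes = [len(cluster) for cluster in y_clusters]
    n_y = sum(sizes)                   # sum of L_k
    # Number of ordered pairs of Y subjects taken from different clusters.
    cross_pairs = n_y**2 - sum(s * s for s in sizes)

    wins_total = 0
    joint_total = 0
    for x in xs:
        # wins[k] = number of Y subjects in cluster k that x beats
        wins = [sum(1 for y in cluster if x > y) for cluster in y_clusters]
        w = sum(wins)
        wins_total += w
        # Sum over k != k' of wins[k] * wins[k']
        joint_total += w * w - sum(v * v for v in wins)

    theta1_hat = wins_total / (n_x * n_y)
    variance = joint_total / (n_x * cross_pairs) - theta1_hat**2
    return variance, theta1_hat


# Toy data: two X clusters and three Y clusters.
x_clusters = [[3.0, 1.5], [2.5]]
y_clusters = [[1.0, 2.0], [4.0], [0.5, 2.2, 3.5]]
var_hat, theta_hat = var_phi1_x(x_clusters, y_clusters)
print(var_hat, theta_hat)
```

The other components in this section follow the same pattern with the roles of the groups or the within-cluster index pairs changed.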

$$\begin{aligned} \begin{aligned} {\widehat{Var}}&(\phi _1(Y_l^{(k)}))\\&={\widehat{Var}}(P_X(X_j^{(i)}>Y_l^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l}^{(k)})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i')}>Y_{l}^{(k)})\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l}^{(k)})-[P(X_j^{(i)}>Y_l^{(k)})]^2\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{l=1}^{L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l}^{(k)}\}}{(\sum _{k=1}^{n}L_k)(\sum _{1\le i\ne i'\le m}J_iJ_i')}-({\hat{\theta }}_1)^2\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{l=1}^{L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l}^{(k)}\}}{n{\bar{L}}_n(m^2({\bar{J}}_m)^{2}-m{\bar{J}}_m^{(2)}-m{\bar{J}}_m)}-({\hat{\theta }}_1)^2, \end{aligned} \end{aligned}$$

where $1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_i'; 1\le l \le L_k$.

$$\begin{aligned}&{\widehat{Cov}}(\phi _1(X_j^{(i)}),\phi _1(X_{j'}^{(i)}))\\&\qquad ={\widehat{Cov}}(P_Y(X_j^{(i)}>Y_l^{(k)}),P_Y(X_{j'}^{(i)}>Y_{l'}^{(k')}))\\&\qquad =P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}>Y_{l'}^{(k')})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i)}>Y_{l'}^{(k')})\\&\qquad =P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}>Y_{l'}^{(k')})-[P(X_j^{(i)}>Y_l^{(k)})]^2\\&\qquad =\frac{\sum _{i=1}^{m} \sum _{1\le j\ne j'\le J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}>Y_{l'}^{(k')}\}}{(\sum _{i=1}^{m}J_i(J_i-1))(\sum _{1\le k\ne k'\le n}L_kL_k')}-({\hat{\theta }}_1)^2\\&\qquad =\frac{\sum _{i=1}^{m} \sum _{1\le j\ne j'\le J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}>Y_{l'}^{(k')}\}}{m{\bar{J}}_m^{(2)}(n^2({\bar{L}}_n)^{2}-n{\bar{L}}_n^{(2)}-n{\bar{L}}_n)}-({\hat{\theta }}_1)^2, \end{aligned}$$

where $1\le i\le m; 1\le k\ne k'\le n; 1\le j\ne j'\le J_i; 1\le l \le L_k; 1\le l' \le L_k'$.

$$\begin{aligned} \begin{aligned} {\widehat{Cov}}&(\phi _1(Y_l^{(k)}),\phi _1(Y_{l'}^{(k)}))\\&={\widehat{Cov}}(P_X(X_j^{(i)}>Y_l^{(k)}),P_X(X_{j'}^{(i')}>Y_{l'}^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l'}^{(k)})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i')}>Y_{l'}^{(k)})\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l'}^{(k)})-[P(X_j^{(i)}>Y_l^{(k)})]^2\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{1\le l\ne l'\le L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l'}^{(k)}\}}{(\sum _{k=1}^{n}L_k(L_k-1))(\sum _{1\le i\ne i'\le m}J_iJ_i')}-({\hat{\theta }}_1)^2\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{1\le l\ne l'\le L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}>Y_{l'}^{(k)}\}}{n{\bar{L}}_n^{(2)}(m^2({\bar{J}}_m)^{2}-m{\bar{J}}_m^{(2)}-m{\bar{J}}_m)}-({\hat{\theta }}_1)^2, \end{aligned} \end{aligned}$$

where $1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_i'; 1\le l\ne l' \le L_k$.

The same formulation applies to $Var(U_2)$.

The four components of $Cov(U_1,U_2)$ can be estimated as follows:

$$\begin{aligned} {\widehat{Cov}}&(\phi _1(X_j^{(i)}),\phi _2(X_{j}^{(i)}))\\&={\widehat{Cov}}(P_Y(X_j^{(i)}>Y_l^{(k)}),P_Y(X_j^{(i)}<Y_{l'}^{(k')}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}<Y_{l'}^{(k')})-P(X_j^{(i)}>Y_l^{(k)})P(X_j^{(i)}<Y_{l'}^{(k')})\\&=\frac{\sum _{i=1}^{m} \sum _{j=1}^{J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}<Y_{l'}^{(k')}\}}{(\sum _{i=1}^{m}J_i)(\sum _{1\le k\ne k'\le n}L_kL_k')}-({\hat{\theta }}_1)({\hat{\theta }}_2)\\&=\frac{\sum _{i=1}^{m} \sum _{j=1}^{J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_j^{(i)}<Y_{l'}^{(k')}\}}{m{\bar{J}}_m(n^2({\bar{L}}_n)^{2}-n{\bar{L}}_n^{(2)}-n{\bar{L}}_n)}-({\hat{\theta }}_1)({\hat{\theta }}_2), \end{aligned}$$

where $1\le i\le m; 1\le k\ne k'\le n; 1\le j\le J_i; 1\le l \le L_k; 1\le l' \le L_k'$.

$$\begin{aligned} {\widehat{Cov}}&(\phi _1(Y_l^{(k)}),\phi _2(Y_l^{(k)}))\\&={\widehat{Cov}}(P_X(X_j^{(i)}>Y_l^{(k)}),P_X(X_{j'}^{(i')}<Y_l^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l}^{(k)})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i')}<Y_{l}^{(k)})\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{l=1}^{L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l}^{(k)}\}}{(\sum _{k=1}^{n}L_k)(\sum _{1\le i\ne i'\le m}J_iJ_i')}-({\hat{\theta }}_1)({\hat{\theta }}_2)\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{l=1}^{L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l}^{(k)}\}}{n{\bar{L}}_n(m^2({\bar{J}}_m)^{2}-m{\bar{J}}_m^{(2)}-m{\bar{J}}_m)}-({\hat{\theta }}_1)({\hat{\theta }}_2), \end{aligned}$$

where $1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_i'; 1\le l \le L_k$.

$$\begin{aligned} {\widehat{Cov}}&(\phi _1(X_j^{(i)}),\phi _2(X_{j'}^{(i)}))\\&={\widehat{Cov}}(P_Y(X_j^{(i)}>Y_l^{(k)}),P_Y(X_{j'}^{(i)}<Y_{l'}^{(k')}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}<Y_{l'}^{(k')})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i)}<Y_{l'}^{(k')})\\&=\frac{\sum _{i=1}^{m} \sum _{1\le j\ne j'\le J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}<Y_{l'}^{(k')}\}}{(\sum _{i=1}^{m}J_i(J_i-1))(\sum _{1\le k\ne k'\le n}L_kL_k')}-({\hat{\theta }}_1)({\hat{\theta }}_2)\\&=\frac{\sum _{i=1}^{m} \sum _{1\le j\ne j'\le J_i} \sum _{1\le k\ne k'\le n} \sum _{l=1}^{L_k} \sum _{l'=1}^{L_k'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i)}<Y_{l'}^{(k')}\}}{m{\bar{J}}_m^{(2)}(n^2({\bar{L}}_n)^{2}-n{\bar{L}}_n^{(2)}-n{\bar{L}}_n)}-({\hat{\theta }}_1)({\hat{\theta }}_2), \end{aligned}$$

where $1\le i\le m; 1\le k\ne k'\le n; 1\le j\ne j'\le J_i; 1\le l \le L_k; 1\le l' \le L_k'$.

$$\begin{aligned} {\widehat{Cov}}&(\phi _1(Y_l^{(k)}),\phi _2(Y_{l'}^{(k)}))\\&={\widehat{Cov}}(P_X(X_j^{(i)}>Y_l^{(k)}),P_X(X_{j'}^{(i')}<Y_{l'}^{(k)}))\\&=P(X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l'}^{(k)})-P(X_j^{(i)}>Y_l^{(k)})P(X_{j'}^{(i')}<Y_{l'}^{(k)})\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{1\le l\ne l'\le L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l'}^{(k)}\}}{(\sum _{k=1}^{n}L_k(L_k-1))(\sum _{1\le i\ne i'\le m}J_iJ_i')}\\&\quad -({\hat{\theta }}_1)({\hat{\theta }}_2)\\&=\frac{\sum _{k=1}^{n} \sum _{1\le i\ne i'\le m} \sum _{1\le l\ne l'\le L_k} \sum _{j=1}^{J_i} \sum _{j'=1}^{J_i'} \mathbf{1} \{X_j^{(i)}>Y_l^{(k)}, X_{j'}^{(i')}<Y_{l'}^{(k)}\}}{n{\bar{L}}_n^{(2)}(m^2({\bar{J}}_m)^{2}-m{\bar{J}}_m^{(2)}-m{\bar{J}}_m)}\\&\quad -({\hat{\theta }}_1)({\hat{\theta }}_2), \end{aligned}$$

where $1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_i'; 1\le l\ne l' \le L_k$.

A.5 User’s manual of R package: cWR

Description

This package uses win ratio (WR) as a summary statistic to compare the composite endpoints of time-to-event data between two groups. Options of clustered or independent time-to-event data can be specified.

Usage

cWR(treatment, cluster, y1, y2, delta1, delta2, null.WR=1, alpha.sig=0.05)

Arguments

treatment	An integer vector coded 0 for the control group and 1 for the treatment group, with one entry per subject.
cluster	An integer vector giving the cluster ID of each subject. When subjects are independent, each subject has its own unique cluster ID.
y1	Let $T_H$, $T_D$ and $T_C$ be time to non-fatal event, time to fatal event and censoring time, respectively. y1 is a numeric vector with $\min $($T_H$, $T_D$, $T_C$) for each subject.
y2	A numeric vector with $\min $($T_D$, $T_C$) for each subject.
delta1	An integer vector with code 1 indicating that $T_H$ is observed, 0 otherwise.
delta2	An integer vector with code 1 indicating that $T_D$ is observed, 0 otherwise.
null.WR	The null hypothesis of the WR statistic. The default is H0: WR=1 or log(WR)=0.
alpha.sig	The significance level, with default value 0.05.
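
The definitions of y1, y2, delta1 and delta2 above can be illustrated by mapping latent times to observed data. This is only a sketch in Python, not part of the cWR package; the function name is hypothetical, and counting an event that ties with censoring as observed is an arbitrary choice made here.

```python
def observed_data(t_h, t_d, t_c):
    """Map latent times (T_H, T_D, T_C) to (y1, y2, delta1, delta2)."""
    y2 = min(t_d, t_c)
    delta2 = int(t_d <= t_c)   # fatal event observed
    y1 = min(t_h, y2)          # equals min(T_H, T_D, T_C)
    # The non-fatal event is observed only if it occurs no later than
    # both the fatal event and censoring.
    delta1 = int(t_h <= y2)
    return y1, y2, delta1, delta2


print(observed_data(2, 5, 8))  # (2, 5, 1, 1): both events observed
print(observed_data(3, 6, 4))  # (3, 4, 1, 0): non-fatal event, then censored
print(observed_data(9, 4, 7))  # (4, 4, 0, 1): death censors the non-fatal event
```

The third call shows the semi-competing structure: the fatal event censors the non-fatal event, so y1 and y2 coincide and delta1 is 0.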

Output values

name	The test name
U1	First estimated clustered U-statistic
U2	Second estimated clustered U-statistic
logWR	Estimated WR on log scale
se	Estimated standard error of the WR on log scale
z	Test statistic
ci	100(1-alpha.sig)% confidence interval
p	P-value of the significance testing
var_cov	Variance and covariance matrix of the first and second clustered U-statistics
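
How the reported quantities relate to one another can be sketched with a standard delta-method calculation on $\log (U_1/U_2)$. The manual does not spell out the internal computation, so the following is an assumption-laden illustration rather than the package's code: the function name is hypothetical, the inputs are made up, and the confidence interval is given on the log scale.

```python
import math
from statistics import NormalDist


def log_wr_inference(u1, u2, var_cov, null_wr=1.0, alpha_sig=0.05):
    """Delta-method inference for log(U1/U2).

    var_cov is the 2x2 variance-covariance matrix of (U1, U2),
    given as nested sequences.
    """
    (v11, v12), (_, v22) = var_cov
    log_wr = math.log(u1 / u2)
    # The gradient of log(u1) - log(u2) is (1/u1, -1/u2).
    var_log = v11 / u1**2 + v22 / u2**2 - 2.0 * v12 / (u1 * u2)
    se = math.sqrt(var_log)
    z = (log_wr - math.log(null_wr)) / se
    crit = NormalDist().inv_cdf(1.0 - alpha_sig / 2.0)
    ci = (log_wr - crit * se, log_wr + crit * se)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return {"logWR": log_wr, "se": se, "z": z, "ci": ci, "p": p}


# Made-up inputs with U1 = U2, so the estimated log(WR) is 0.
res = log_wr_inference(0.4, 0.4, [[0.0016, -0.0004], [-0.0004, 0.0016]])
print(res)
```

Because wins and losses for the same pair are mutually exclusive, the covariance between the two U-statistics is typically negative, which increases the variance of the log ratio relative to treating them as independent.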

Details

The function “cWR” performs a significance test comparing two composite time-to-event outcomes between groups. The win ratio summary statistic is built on the “unmatched” approach described by Pocock et al. (2011). We assume that the composite endpoints can be formulated as semi-competing risks data. Each individual in the study is measured on time to non-fatal (non-terminal) event (e.g. hospitalization) and time to fatal (terminal) event (e.g. death). Specifically, the fatal event is considered clinically more important than the non-fatal event. Censoring is allowed, but the censoring time needs to be observed.

This function can handle independent data as well as clustered data. The inference for clustered data is based on the generalized bivariate clustered U-statistics proposed in this paper. This clustered U-statistic accounts for the potential correlations among subjects within a cluster. When every cluster has size 1, the data are independent and the inference is the same as the method proposed by Bebu and Lachin (2015).

Note: The arguments “treatment”, “cluster”, “y1”, “y2”, “delta1” and “delta2” are required and have no defaults. They must be vectors of the same length, and missing values are not allowed.

Example

Data generation for the clustered semi-competing risk data and the clustered win ratio analysis.

A.6 Simulation results of Type I error analysis for large intra-cluster correlation (ICC=0.5)

Simulation results of Type I error analysis for ICC$=0.5$ and CV$=0.71$, under the underlying true log(WR)=0:

$(\eta _H,\eta _D)=(0,0) [\log (WR)=0]$
Number of clusters	Cluster size	Estimated $\log (WR)$	Empirical SE	Estimated SE	Type I error
25-25	20	$-0.007$	0.207	0.196	0.069
	50	0.004	0.195	0.185	0.062
	100	$-0.006$	0.189	0.181	0.067
	200	$-0.001$	0.183	0.178	0.058
50-50	20	$-0.004$	0.144	0.141	0.050
	50	$-0.001$	0.133	0.133	0.055
	100	$-0.001$	0.132	0.130	0.056
	200	0.000	0.131	0.128	0.051
75-75	20	0.000	0.118	0.116	0.054
	50	0.002	0.112	0.109	0.058
	100	$-0.002$	0.109	0.107	0.047
	200	0.000	0.103	0.106	0.053

About this article

Cite this article

Zhang, D., Jeong, JH. Inference on win ratio for cluster-randomized semi-competing risk data. Jpn J Stat Data Sci 4, 1263–1292 (2021). https://doi.org/10.1007/s42081-021-00131-1

Received: 15 February 2021
Revised: 01 June 2021
Accepted: 05 June 2021
Published: 16 June 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s42081-021-00131-1
