Abstract
Cluster randomization has become increasingly popular among public health researchers for pragmatic clinical trials. Its main advantages include minimizing experimental contamination, increasing administrative efficiency, and improving compliance. Semi-competing risks data arise when a terminal event censors a nonterminal event, but not vice versa, and an abundant literature exists on model-based methods for analyzing such data. The win ratio is a purely nonparametric summary measure of a group effect in semi-competing risks data in which the components of the composite endpoint are prioritized. In this paper, we propose inference on the win ratio for cluster-randomized semi-competing risks data, where the win ratio can be formulated as the ratio of two clustered U-statistics. First, the asymptotic joint distribution of the two clustered U-statistics is derived using the Cramer–Wold device and their variance and covariance estimators are evaluated; a test statistic for the win ratio is then constructed. Simulation results are presented to assess the type I error probabilities and power of the test statistic. The proposed method is illustrated with a real data set from a multi-center phase III clinical trial on breast cancer, treating the clinics as randomized clusters.
References
Bebu, I., & Lachin, J. M. (2015). Large sample inference for a win ratio analysis of a composite outcome based on prioritized components. Biostatistics, 17(1), 178–187.
Chew, D. P., Astley, C. M., Luker, H., Alprandi-Costa, B., Hillis, G., Chow, C. K., et al. (2015). A cluster randomized trial of objective risk assessment versus standard care for acute coronary syndromes: Rationale and design of the Australian GRACE Risk Score Intervention Study (AGRIS). American Heart Journal, 170(5), 995–1004.
Do Ha, I., Xiang, L., Peng, M., Jeong, J. H., & Lee, Y. (2020). Frailty modelling approaches for semi-competing risks data. Lifetime Data Analysis, 26(1), 109–133.
Donner, A., & Klar, N. (2004). Pitfalls of and controversies in cluster randomization trials. American Journal of Public Health, 94(3), 416–422.
Ferreira-González, I., Alonso-Coello, P., Solà, I., Pacheco-Huergo, V., Domingo-Salvany, A., Alonso, J., et al. (2008). Composite endpoints in clinical trials. Revista Espanola de Cardiologia (English Edition), 61(3), 283–290.
Fine, J. P., Jiang, H., & Chappell, R. (2001). On semi-competing risks data. Biometrika, 88(4), 907–919.
Fisher, B., Costantino, J., Redmond, C., Poisson, R., Bowman, D., Couture, J., et al. (1989). A randomized clinical trial evaluating tamoxifen in the treatment of patients with node-negative breast cancer who have estrogen-receptor-positive tumors. New England Journal of Medicine, 320(8), 479–484.
Griffin, S. J., Borch-Johnsen, K., Davies, M. J., Khunti, K., Rutten, G. E., Sandbæk, A., et al. (2011). Effect of early intensive multifactorial therapy on 5-year cardiovascular outcomes in individuals with type 2 diabetes detected by screening (ADDITION-Europe): A cluster-randomised trial. The Lancet, 378(9786), 156–167.
Hahn, S., Puffer, S., Torgerson, D., & Watson, J. (2005). Methodological bias in cluster randomised trials. BMC Medical Research Methodology, 5, 10. https://doi.org/10.1186/1471-2288-5-10.
Huque, M. F., Alosh, M., & Bhore, R. (2011). Addressing multiplicity issues of a composite endpoint and its components in clinical trials. Journal of Biopharmaceutical Statistics, 21(4), 610–634.
Jahn-Eimermacher, A., Ingel, K., & Schneider, A. (2013). Sample size in cluster-randomized trials with time to event as the primary endpoint. Statistics in Medicine, 32(5), 739–751.
Jeong, J. H., & Jung, S. H. (2006). Rank tests for clustered survival data when dependent subunits are randomized. Statistics in Medicine, 25(3), 361–373.
Jordhoy, M., Frayers, P., Ahlner-Elmqvist, M., & Kaasa, S. (2002). Lack of concealment may lead to selection bias in cluster randomized trials of palliative care. Palliative Medicine, 16, 43–49.
Kalia, S., Klar, N., & Donner, A. (2016). On the estimation of intracluster correlation for time-to-event outcomes in cluster randomized trials. Statistics in Medicine, 35(30), 5551–5560.
Kleist, P. (2007). Composite endpoints for clinical trials. International Journal of Pharmaceutical Medicine, 21(3), 187–198.
Lee, M. L. T., & Dehling, H. G. (2005). Generalized two-sample U-statistics for clustered data. Statistica Neerlandica, 59(3), 313–323.
Li, P., & Redden, D. T. (2015). Small sample performance of bias-corrected sandwich estimators for cluster-randomized trials with binary outcomes. Statistics in Medicine, 34(2), 281–296.
Liu, L., Wolfe, R. A., & Huang, X. (2004). Shared frailty models for recurrent events and a terminal event. Biometrics, 60(3), 747–756.
Luo, X., Tian, H., Mohanty, S., & Tsai, W. Y. (2015). An alternative approach to confidence interval estimation for the win ratio statistic. Biometrics, 71(1), 139–145.
Mao, L. (2017). On causal estimation using U-statistics. Biometrika, 105(1), 215–220.
Marshall, A. W., & Olkin, I. (1988). Families of multivariate distributions. Journal of the American Statistical Association, 83(403), 834–841.
Oakes, D. (2016). On the win-ratio statistic in clinical trials with multiple types of event. Biometrika, 103(3), 742–745.
Obuchowski, N. A. (1997). Nonparametric analysis of clustered ROC curve data. Biometrics, 53, 567–578.
Peng, L., & Fine, J. P. (2007). Regression modeling of semicompeting risks data. Biometrics, 63(1), 96–108.
Peng, M., Xiang, L., & Wang, S. (2018). Semiparametric regression analysis of clustered survival data with semi-competing risks. Computational Statistics & Data Analysis, 124, 53–70.
Pocock, S. J., Ariti, C. A., Collier, T. J., & Wang, D. (2011). The win ratio: A new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal, 33(2), 176–182.
Stolker, J. M., Spertus, J. A., Cohen, D. J., Jones, P. G., Jain, K. K., Bamberger, E., et al. (2014). Re-thinking composite endpoints in clinical trials: Insights from patients and trialists. Circulation, 130(15), 1254–1261.
Tong, B. C., Huber, J. C., Ascheim, D. D., Puskas, J. D., Ferguson, T. B., Blackstone, E. H., & Smith, P. K. (2012). Weighting composite endpoints in clinical trials: Essential evidence for the heart team. The Annals of Thoracic Surgery, 94(6), 1908–1913.
Wang, M., Kong, L., Li, Z., & Zhang, L. (2016). Covariance estimators for generalized estimating equations (GEE) in longitudinal analysis with small samples. Statistics in Medicine, 35(10), 1706–1721.
Wu, B. H., Michimae, H., & Emura, T. (2020). Meta-analysis of individual patient data with semi-competing risks under the Weibull joint frailty-copula model. Computational Statistics, 35(4), 1525–1552.
Xu, J., Kalbfleisch, J. D., & Tai, B. (2010). Statistical analysis of illness-death processes and semicompeting risks data. Biometrics, 66(3), 716–725.
Zeng, D., & Lin, D. (2009). Semiparametric transformation models with random effects for joint analysis of recurrent and terminal events. Biometrics, 65(3), 746–752.
Ethics declarations
Conflict of interest statement
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Disclaimer
Dr. Di Zhang contributed to this article in her personal capacity. This manuscript reflects the views of the author and should not be construed to represent FDA views or policies.
Appendix
This Appendix contains examples of between-subject comparisons using the win ratio approach (Sect. A.1), the proof of Theorem 1 (Sect. A.2), a detailed derivation of the covariance between the clustered \(U_1\) and \(U_2\) (Sect. A.3), the evaluable formulas for the variance–covariance matrix of the joint distribution of the bivariate clustered U-statistics (Sect. A.4), the user manual of the R package cWR with examples (Sect. A.5), and additional simulation results for the Type I error under large intra-cluster correlation (Sect. A.6).
A.1 Examples of between-subject comparisons using the win ratio approach
We present some examples of between-subject comparisons in the win ratio approach to help readers better understand the comparison procedure. For formal definitions of all possible comparison scenarios, we refer readers to Luo et al. (2015). Define \(T_{H_z}\), \(T_{D_z}\) and \(T_{C_z}\) as the time to the non-fatal event, the time to the fatal event and the time to censoring, respectively, for treated (\(z=1\)) and control (\(z=0\)) patients. The horizontal lines in the plot represent the follow-up timeline, starting from time 0 for all patients. We assume that the fatal event is more important than the non-fatal event, so the fatal event is prioritized. The fatal event can censor the non-fatal event, but not vice versa.
Example 1
Both the non-fatal and fatal events of the treated patient were observed, while only the non-fatal event was observed for the control patient, due to censoring. We first compare the times to the fatal event between the treated and control patients. Because the control patient was censored before the treated patient developed the fatal event (\(\{T_{C_1}>T_{D_1}\} \text { and } \{T_{D_0}>T_{C_0}>T_{H_0}\} \text { and } \{T_{C_0}<T_{D_1}\}\)), we cannot determine who wins based on the fatal event, and we move down the rank of importance to compare the times to the non-fatal event. Because the treated patient developed the non-fatal event before the control patient (\(T_{H_1}<T_{H_0}\)), we claim that the control patient wins in this comparison. To further illustrate the construction of the indicator functions of the two kernels in Eqs. (3) and (4) of the main manuscript, we have \(\mathbf{1} \{(T_{C_1}>T_{D_1}) \text { and } (T_{D_0}>T_{C_0}>T_{H_0}) \text { and } (T_{C_0}<T_{D_1}) \text { and } (T_{H_1}<T_{H_0})\} =\phi _2(X_j^{(i)},Y_l^{(k)})=1-\phi _1(X_j^{(i)},Y_l^{(k)})=1. \)
Example 2
The treated patient was censored before developing any events, while the control patient developed the fatal event. We first compare the times to the fatal event between the treated and control patients. Because the control patient developed the fatal event before the treated patient was censored (\(\{\min (T_{H_1},T_{D_1})>T_{C_1}\}\text { and }\{\min (T_{H_0},T_{C_0})>T_{D_0}\}\text { and }\{T_{C_1}>T_{D_0}\}\)), we claim that the treated patient wins in this comparison without needing to consider the times to the non-fatal event. To construct the indicator functions of the two kernels, we have \(\mathbf{1} \{(\min (T_{H_1},T_{D_1})>T_{C_1})\text { and }(\min (T_{H_0},T_{C_0})>T_{D_0})\text { and }(T_{C_1}>T_{D_0})\}=\phi _1(X_j^{(i)},Y_l^{(k)})=1-\phi _2(X_j^{(i)},Y_l^{(k)})=1\).
Example 3
As in Example 2, the treated patient was censored before developing any events, while the control patient developed the fatal event. Here, however, the treated patient was censored before the control patient developed the fatal event (\(T_{C_1}<T_{D_0}\)), so we cannot determine who wins based on the fatal event. Because the non-fatal event is not observed for either patient, this comparison is inconclusive, which leads to a tie. To construct the indicator functions of the two kernels, we have \(\mathbf{1} \{(\min (T_{H_1},T_{D_1})>T_{C_1})\text { and }(\min (T_{H_0},T_{C_0})>T_{D_0})\text { and }(T_{C_1}<T_{D_0})\}=\phi _1(X_j^{(i)},Y_l^{(k)})=\phi _2(X_j^{(i)},Y_l^{(k)})=0\).
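The three examples can be reproduced with a short Python sketch of the prioritized pairwise comparison. This is an illustration only, not the authors' implementation: the names `Subject`, `observe` and `compare`, the numeric event times, and the choice to treat exact time ties as inconclusive are all assumptions made here. The observed quantities follow the definitions used for the package arguments in Sect. A.5, namely \(\min(T_H,T_D,T_C)\), \(\min(T_D,T_C)\) and the two event indicators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    y1: float    # min(T_H, T_D, T_C)
    y2: float    # min(T_D, T_C)
    delta1: int  # 1 if the non-fatal event time T_H is observed
    delta2: int  # 1 if the fatal event time T_D is observed


def observe(t_h: float, t_d: float, t_c: float) -> Subject:
    """Turn latent times into observed semi-competing risks data (hypothetical helper)."""
    y2 = min(t_d, t_c)
    y1 = min(t_h, y2)
    return Subject(y1, y2, int(t_h < y2), int(t_d <= t_c))


def compare(treated: Subject, control: Subject) -> int:
    """Return 1 if the treated patient wins, -1 if the control patient wins, 0 for a tie."""
    # Priority 1: the fatal event. A patient wins if the other patient's death
    # is observed while the patient is still under follow-up.
    if control.delta2 == 1 and control.y2 < treated.y2:
        return 1
    if treated.delta2 == 1 and treated.y2 < control.y2:
        return -1
    # Priority 2: the non-fatal event, used only when the fatal event is inconclusive.
    if control.delta1 == 1 and control.y1 < treated.y1:
        return 1
    if treated.delta1 == 1 and treated.y1 < control.y1:
        return -1
    return 0


# Example 1: control censored before the treated patient's death; treated has the
# non-fatal event first, so the control patient wins.
example1 = compare(observe(t_h=2, t_d=5, t_c=8), observe(t_h=3, t_d=9, t_c=4))
# Example 2: control dies before the treated patient is censored; treated wins.
example2 = compare(observe(t_h=9, t_d=10, t_c=6), observe(t_h=7, t_d=4, t_c=8))
# Example 3: treated censored before the control patient's death; tie.
example3 = compare(observe(t_h=9, t_d=10, t_c=3), observe(t_h=7, t_d=4, t_c=8))
print(example1, example2, example3)
```

In terms of the kernels, a return value of 1 corresponds to \(\phi_1=1\), a value of \(-1\) to \(\phi_2=1\), and 0 to \(\phi_1=\phi_2=0\).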
A.2 Proof of Theorem 1
Theorem 1
Define \(A_1^{(i)}=\sum _{j=1}^{J_i} \phi _1(X_j^{(i)})\), \(B_1^{(k)}=\sum _{l=1}^{L_k} \phi _1(Y_l^{(k)})\), \(A_2^{(i)}=\sum _{j=1}^{J_i} \phi _2(X_j^{(i)})\), \(B_2^{(k)}=\sum _{l=1}^{L_k} \phi _2(Y_l^{(k)})\). Also define \({\bar{J}}_m=\frac{1}{m}\sum _{i=1}^{m}J_i\) and \({\bar{L}}_n=\frac{1}{n}\sum _{k=1}^{n}L_k\), where m and n are the numbers of clusters in the two comparison groups. Then, under the assumptions of independence between comparison groups and among clusters, and of exchangeability, as \(\min (m,n)\rightarrow \infty \), the U-statistics \(U_1-{\bar{J}}_m {\bar{L}}_n \theta _1\) and \(U_2-{\bar{J}}_m {\bar{L}}_n \theta _2\) asymptotically follow a bivariate normal distribution with mean 0 and variance–covariance matrix
\[\Sigma =\frac{{\bar{L}}_n^2}{m}\Sigma _1+\frac{{\bar{J}}_m^2}{n}\Sigma _2,\]
where
\[\Sigma _1=\begin{pmatrix} Var(A_1^{(i)}) & Cov(A_1^{(i)},A_2^{(i)}) \\ Cov(A_1^{(i)},A_2^{(i)}) & Var(A_2^{(i)}) \end{pmatrix}\]
and
\[\Sigma _2=\begin{pmatrix} Var(B_1^{(k)}) & Cov(B_1^{(k)},B_2^{(k)}) \\ Cov(B_1^{(k)},B_2^{(k)}) & Var(B_2^{(k)}) \end{pmatrix}.\]
Proof
Lee and Dehling (2005) showed that, as \(\min (m,n)\rightarrow \infty \),
and that the first term follows an asymptotic normal distribution with mean 0 and variance
and the second term does with mean 0 and variance
Based on the assumptions of independent clusters and identical distributions among observations in the same group, \(\sum _{j=1}^{J_i} \phi _1(X_j^{(i)})\), \(i=1, 2, ..., m\), are iid random variables, as are \(\sum _{l=1}^{L_k} \phi _1(Y_l^{(k)})\), \(k=1, 2, ..., n\). The same applies to the statistic \(U_2-{\bar{J}}_m {\bar{L}}_n \theta _2\), except that it uses the second kernel.
Noting that \(E(\phi _1(X_j^{(i)}))=E(\phi _1(Y_l^{(k)}))=0\), suppose (\(A_1^{(i)}\), \(A_2^{(i)}\)), \(i=1,2,...,m\), follows a bivariate distribution with mean \((0,0)^T\) and variance
Then we have
where \({\bar{A}}_1=\sum _{i=1}^{m}A_1^{(i)}/m\) and \({\bar{A}}_2=\sum _{i=1}^{m}A_2^{(i)}/m\). Therefore, by the Cramer–Wold device
Similarly, by defining \({\bar{B}}_1=\sum _{k=1}^{n}B_1^{(k)}/n\) and \({\bar{B}}_2=\sum _{k=1}^{n}B_2^{(k)}/n\),
Finally, since \(Cov({\bar{A}}_r,{\bar{B}}_r)=0\) (\(r=1,2\)) due to the independence assumption among clusters, \((U_1-{\bar{J}}_m {\bar{L}}_n \theta _1, U_2-{\bar{J}}_m {\bar{L}}_n \theta _2)^T\) follows an asymptotic normal distribution with mean \((0,0)^T\) and variance \(\Sigma =({\bar{L}}_n^2/m)\Sigma _1+({\bar{J}}_m^2/n)\Sigma _2.\) \(\square \)
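As a reading aid, the chain of displays that this proof relies on can be sketched as follows. This is a reconstruction under stated assumptions, not the authors' typeset equations: the first-order approximation sign \(\approx\) stands in for the Lee and Dehling (2005) decomposition, whose exact remainder term is not reproduced here, and \(\Sigma_1\), \(\Sigma_2\) are taken to be the covariance matrices of \((A_1^{(i)},A_2^{(i)})\) and \((B_1^{(k)},B_2^{(k)})\), which is what the final expression for \(\Sigma\) requires.

```latex
% Step 1 (assumed form of the first-order decomposition), for r = 1, 2:
\[
U_r-\bar{J}_m\bar{L}_n\theta_r \;\approx\; \bar{L}_n\bar{A}_r+\bar{J}_m\bar{B}_r,
\qquad
\bar{A}_r=\frac{1}{m}\sum_{i=1}^{m}A_r^{(i)},\quad
\bar{B}_r=\frac{1}{n}\sum_{k=1}^{n}B_r^{(k)} .
\]
% Step 2: variances of the two terms, using that the cluster sums are iid with mean 0:
\[
Var(\bar{L}_n\bar{A}_r)=\frac{\bar{L}_n^2}{m}\,Var(A_r^{(i)}),
\qquad
Var(\bar{J}_m\bar{B}_r)=\frac{\bar{J}_m^2}{n}\,Var(B_r^{(k)}) .
\]
% Step 3: Cramer-Wold device. For any fixed c = (c_1, c_2)^T, the central limit theorem gives
\[
\sqrt{m}\,(c_1\bar{A}_1+c_2\bar{A}_2)
=\frac{1}{\sqrt{m}}\sum_{i=1}^{m}\bigl(c_1A_1^{(i)}+c_2A_2^{(i)}\bigr)
\;\xrightarrow{d}\; N\bigl(0,\;c^{T}\Sigma_1 c\bigr),
\]
% hence sqrt(m) (A-bar_1, A-bar_2)^T converges to N(0, Sigma_1); the same argument
% gives sqrt(n) (B-bar_1, B-bar_2)^T converging to N(0, Sigma_2).
% Step 4: the two groups are independent, so the covariance matrices add:
\[
\Sigma=\frac{\bar{L}_n^2}{m}\,\Sigma_1+\frac{\bar{J}_m^2}{n}\,\Sigma_2 .
\]
```

Step 3 is the only place where joint normality is needed; Steps 1 and 2 apply to each kernel separately.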
A.3 Derivation of the covariance between \(U_1\) and \(U_2\)
Using the definition of the asymptotic formula of the U statistic given in (4),
We can easily see that \(Cov(\textcircled {a},\textcircled {d})=Cov(\textcircled {b},\textcircled {c})=0\) due to independence between clusters, and
Similarly,
Therefore, when \(m,n\rightarrow \infty \) the asymptotic covariance between \(U_1\) and \(U_2\) is
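One reading of this derivation that is consistent with Theorem 1 can be sketched as follows. The assignment of the circled labels is an assumption made here: \(\textcircled{a}\) and \(\textcircled{b}\) are taken to be the treated-group and control-group terms of \(U_1\), and \(\textcircled{c}\) and \(\textcircled{d}\) the corresponding terms of \(U_2\), which is the labelling under which \(Cov(\textcircled{a},\textcircled{d})=Cov(\textcircled{b},\textcircled{c})=0\) follows from independence.

```latex
% Assumed labelling of the four terms:
\[
U_1-\bar{J}_m\bar{L}_n\theta_1\approx
\underbrace{\bar{L}_n\bar{A}_1}_{(a)}+\underbrace{\bar{J}_m\bar{B}_1}_{(b)},
\qquad
U_2-\bar{J}_m\bar{L}_n\theta_2\approx
\underbrace{\bar{L}_n\bar{A}_2}_{(c)}+\underbrace{\bar{J}_m\bar{B}_2}_{(d)} .
\]
% Same-group terms: cluster sums are iid across clusters, so only i = i' contributes.
\[
Cov\bigl((a),(c)\bigr)=\frac{\bar{L}_n^2}{m^2}\sum_{i=1}^{m}Cov\bigl(A_1^{(i)},A_2^{(i)}\bigr)
=\frac{\bar{L}_n^2}{m}\,Cov\bigl(A_1^{(i)},A_2^{(i)}\bigr),
\]
\[
Cov\bigl((b),(d)\bigr)=\frac{\bar{J}_m^2}{n}\,Cov\bigl(B_1^{(k)},B_2^{(k)}\bigr).
\]
% Adding the two non-zero pieces gives the off-diagonal entry of Sigma in Theorem 1:
\[
Cov(U_1,U_2)\approx\frac{\bar{L}_n^2}{m}\,Cov\bigl(A_1^{(i)},A_2^{(i)}\bigr)
+\frac{\bar{J}_m^2}{n}\,Cov\bigl(B_1^{(k)},B_2^{(k)}\bigr).
\]
```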
A.4 Evaluable formulas of variance–covariance matrix
This section presents evaluable forms of each component of the variances and covariances derived in Sects. 3 and 4. Let us define
The four components of \(Var(U_1)\) are estimated as follows:
where \(1\le i\le m; 1\le k\ne k'\le n; 1\le j\le J_i; 1\le l \le L_k; 1\le l' \le L_{k'}\).
where \(1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_{i'}; 1\le l \le L_k\).
where \(1\le i\le m; 1\le k\ne k'\le n; 1\le j\ne j'\le J_i; 1\le l \le L_k; 1\le l' \le L_{k'}\).
where \(1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_{i'}; 1\le l\ne l' \le L_k\).
The same formulation applies to \(Var(U_2)\).
The four components of \(Cov(U_1,U_2)\) can be estimated as follows:
where \(1\le i\le m; 1\le k\ne k'\le n; 1\le j\le J_i; 1\le l \le L_k; 1\le l' \le L_{k'}\).
where \(1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_{i'}; 1\le l \le L_k\).
where \(1\le i\le m; 1\le k\ne k'\le n; 1\le j\ne j'\le J_i; 1\le l \le L_k; 1\le l' \le L_{k'}\).
where \(1\le i\ne i'\le m; 1\le k\le n; 1\le j\le J_i; 1\le j' \le J_{i'}; 1\le l\ne l' \le L_k\).
A.5 User’s manual of R package: cWR
Description
This package uses win ratio (WR) as a summary statistic to compare the composite endpoints of time-to-event data between two groups. Options of clustered or independent time-to-event data can be specified.
Usage
cWR(treatment, cluster, y1, y2, delta1, delta2, null.WR=1, alpha.sig=0.05)
Arguments
treatment | An integer vector coded 0 for the control group and 1 for the treatment group, with one entry per subject. |
cluster | An integer vector giving the cluster ID of each subject. When subjects are independent, each subject has its own unique cluster ID. |
y1 | Let \(T_H\), \(T_D\) and \(T_C\) be time to non-fatal event, time to fatal event and censoring time, respectively. y1 is a numeric vector with \(\min \)(\(T_H\), \(T_D\), \(T_C\)) for each subject. |
y2 | A numeric vector with \(\min \)(\(T_D\), \(T_C\)) for each subject. |
delta1 | An integer vector with code 1 indicating that \(T_H\) is observed, 0 otherwise. |
delta2 | An integer vector with code 1 indicating that \(T_D\) is observed, 0 otherwise. |
null.WR | The null hypothesis of the WR statistic. The default is H0: WR=1 or log(WR)=0. |
alpha.sig | The significance level, with default value 0.05. |
Output values
name | The test name |
U1 | First estimated clustered U-statistic |
U2 | Second estimated clustered U-statistic |
logWR | Estimated WR on log scale |
se | Estimated standard error of the WR on log scale |
z | Test statistic |
ci | 100(1-alpha.sig)% confidence interval |
p | P-value of the significance testing |
var_cov | Variance and covariance matrix of the first and second clustered U-statistics |
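To show how the listed outputs relate to one another, the following Python sketch derives `logWR`, `se`, `z`, `ci` and `p` from `U1`, `U2` and the entries of `var_cov` using the standard delta method for a log ratio. It is a sketch under stated assumptions rather than the package's code: the function name `log_wr_inference` and its argument names are hypothetical, and reporting the confidence interval on the log scale is an assumption, since the manual does not state the scale.

```python
import math
from statistics import NormalDist


def log_wr_inference(u1, u2, var_u1, var_u2, cov_u12, null_wr=1.0, alpha=0.05):
    """Delta-method inference for log(U1/U2) given the (co)variances of U1 and U2."""
    log_wr = math.log(u1 / u2)
    # First-order variance of log(U1) - log(U2).
    var_log = var_u1 / u1**2 + var_u2 / u2**2 - 2.0 * cov_u12 / (u1 * u2)
    se = math.sqrt(var_log)
    z = (log_wr - math.log(null_wr)) / se
    normal = NormalDist()
    q = normal.inv_cdf(1.0 - alpha / 2.0)
    p = 2.0 * (1.0 - normal.cdf(abs(z)))  # two-sided p-value
    return {
        "logWR": log_wr,
        "se": se,
        "z": z,
        "ci": (log_wr - q * se, log_wr + q * se),  # assumed to be on the log scale
        "p": p,
    }


# Hypothetical inputs, chosen only to exercise the function.
result = log_wr_inference(u1=0.4, u2=0.4, var_u1=0.01, var_u2=0.01, cov_u12=0.0)
print(result)
```

Exponentiating the two interval limits gives an interval for the win ratio itself.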
Details
The function “cWR” performs significance testing to compare two composite time-to-event outcomes between groups. The win ratio summary statistic is built on the “unmatched” approach described by Pocock et al. (2011). We assume that the composite endpoints can be formulated as semi-competing risks data. Each individual in the study is measured on the time to a non-fatal (non-terminal) event (e.g. hospitalization) and the time to a fatal (terminal) event (e.g. death). Specifically, the fatal event is considered clinically more important than the non-fatal event. Censoring is allowed, but the censoring time must be observed.
This function can handle independent data as well as clustered data. The inference for clustered data is based on the generalized bivariate clustered U-statistics proposed in this paper, which account for the potential correlations among subjects within a cluster. When every cluster has size 1, the data are independent and the inference coincides with the method proposed by Bebu and Lachin (2015).
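The point estimate behind this description can be sketched in Python as follows. This is not the cWR source: the function names, the tuple layout `(y1, y2, delta1, delta2)`, the toy data, and the \(1/(mn)\) normalization (inferred from the centering \({\bar{J}}_m{\bar{L}}_n\theta\) in Theorem 1) are assumptions. The win ratio \(U_1/U_2\) does not depend on the normalization, and the sketch covers only the point estimate, not the clustered variance.

```python
def kernel(t, c):
    """Return (phi1, phi2) for one treated/control pair of (y1, y2, delta1, delta2) tuples."""
    ty1, ty2, td1, td2 = t
    cy1, cy2, cd1, cd2 = c
    if cd2 == 1 and cy2 < ty2:   # control's death observed first: treated wins
        return (1, 0)
    if td2 == 1 and ty2 < cy2:   # treated's death observed first: control wins
        return (0, 1)
    if cd1 == 1 and cy1 < ty1:   # fatal event inconclusive: compare non-fatal event
        return (1, 0)
    if td1 == 1 and ty1 < cy1:
        return (0, 1)
    return (0, 0)                # tie


def clustered_u_statistics(treated_clusters, control_clusters):
    """Sum the kernels over all treated/control pairs and divide by m * n clusters."""
    m, n = len(treated_clusters), len(control_clusters)
    wins = losses = 0
    for treated in treated_clusters:
        for control in control_clusters:
            for t in treated:
                for c in control:
                    phi1, phi2 = kernel(t, c)
                    wins += phi1
                    losses += phi2
    return wins / (m * n), losses / (m * n)


# Toy data: two treated clusters (sizes 2 and 1) and two control clusters (size 1 each).
treated_clusters = [[(2, 5, 1, 1), (6, 6, 0, 0)], [(3, 3, 0, 0)]]
control_clusters = [[(3, 4, 1, 0)], [(4, 4, 0, 1)]]
u1, u2 = clustered_u_statistics(treated_clusters, control_clusters)
win_ratio = u1 / u2
print(u1, u2, win_ratio)
```

With every cluster of size 1, the loops reduce to the usual comparison of all treated-control pairs for independent subjects.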
Note: The arguments “treatment”, “cluster”, “y1”, “y2”, “delta1” and “delta2” are required and have no defaults. They must be vectors of the same length, and missing values are not allowed.
Example
Data generation for the clustered semi-competing risk data and the clustered win ratio analysis.
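As a language-neutral illustration of the input layout, the Python sketch below generates clustered semi-competing risks data and arranges them into the six required vectors. The data-generating mechanism is an assumption chosen for simplicity and is not claimed to match the simulation design of the paper: a shared gamma frailty with mean 1 induces within-cluster correlation, event times are exponential, and censoring is uniform. The function `simulate_arm` and all parameter values are hypothetical.

```python
import random


def simulate_arm(n_clusters, cluster_size, treatment, rng, first_cluster_id=0,
                 frailty_var=0.5, rate_h=0.10, rate_d=0.05, max_follow_up=10.0):
    """Simulate one arm; returns one dict per subject with the six cWR-style fields."""
    rows = []
    for c in range(n_clusters):
        # Shared gamma frailty (shape 1/var, scale var, so mean 1) for the cluster.
        w = rng.gammavariate(1.0 / frailty_var, frailty_var)
        for _ in range(cluster_size):
            t_h = rng.expovariate(rate_h * w)    # latent time to non-fatal event
            t_d = rng.expovariate(rate_d * w)    # latent time to fatal event
            t_c = rng.uniform(0.0, max_follow_up)  # censoring time
            y2 = min(t_d, t_c)
            y1 = min(t_h, y2)                    # death or censoring truncates T_H
            rows.append({
                "treatment": treatment,
                "cluster": first_cluster_id + c,
                "y1": y1,
                "y2": y2,
                "delta1": int(t_h < y2),
                "delta2": int(t_d <= t_c),
            })
    return rows


rng = random.Random(2021)
data = (simulate_arm(5, 4, treatment=0, rng=rng, first_cluster_id=0)
        + simulate_arm(5, 4, treatment=1, rng=rng, first_cluster_id=5))

# Column vectors of equal length, in the order of the cWR arguments.
columns = {name: [row[name] for row in data]
           for name in ("treatment", "cluster", "y1", "y2", "delta1", "delta2")}
print(len(data), sorted(set(columns["cluster"])))
```

Both arms share the same event rates here, so the data are generated under the null hypothesis WR = 1; giving the two arms different rates would introduce a treatment effect.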
A.6 Simulation results of Type I error analysis for large intra-cluster correlation (ICC=0.5)
Simulation results of Type I error analysis for ICC\(=0.5\) and CV\(=0.71\), under the underlying true log(WR)=0:
\((\eta _H,\eta _D)=(0,0)\) [\(\log (WR)=0\)]

Number of clusters | Cluster size | Estimated \(\log (WR)\) | Empirical SE | Estimated SE | Type I error |
---|---|---|---|---|---|
25-25 | 20 | \(-0.007\) | 0.207 | 0.196 | 0.069 |
25-25 | 50 | 0.004 | 0.195 | 0.185 | 0.062 |
25-25 | 100 | \(-0.006\) | 0.189 | 0.181 | 0.067 |
25-25 | 200 | \(-0.001\) | 0.183 | 0.178 | 0.058 |
50-50 | 20 | \(-0.004\) | 0.144 | 0.141 | 0.050 |
50-50 | 50 | \(-0.001\) | 0.133 | 0.133 | 0.055 |
50-50 | 100 | \(-0.001\) | 0.132 | 0.130 | 0.056 |
50-50 | 200 | 0.000 | 0.131 | 0.128 | 0.051 |
75-75 | 20 | 0.000 | 0.118 | 0.116 | 0.054 |
75-75 | 50 | 0.002 | 0.112 | 0.109 | 0.058 |
75-75 | 100 | \(-0.002\) | 0.109 | 0.107 | 0.047 |
75-75 | 200 | 0.000 | 0.103 | 0.106 | 0.053 |
Cite this article
Zhang, D., Jeong, JH. Inference on win ratio for cluster-randomized semi-competing risk data. Jpn J Stat Data Sci 4, 1263–1292 (2021). https://doi.org/10.1007/s42081-021-00131-1
DOI: https://doi.org/10.1007/s42081-021-00131-1