Optimal Weighted Wilcoxon–Mann–Whitney Test for Prioritized Outcomes

New Frontiers of Biostatistics and Bioinformatics

Abstract

We consider a two-group randomized clinical trial of prioritized endpoints, where mortality affects the assessment of a follow-up continuous outcome. With the continuous outcome as the principal outcome, we combine it with mortality via the worst-rank paradigm into a single composite endpoint. Then, we develop a weighted Wilcoxon–Mann–Whitney test statistic to analyze the data. We determine the optimal weights for the Wilcoxon–Mann–Whitney test statistic that maximize its power. We provide the rationale for the weights and their implications in the application of the method. In addition, we derive a formula for its power and demonstrate its accuracy in simulations. Finally, we apply the method to data from an acute ischemic stroke clinical trial of normobaric oxygen therapy.

References

  • Adams, H., Jr., Davis, P., Leira, E., Chang, K., Bendixen, B., Clarke, W., et al. (1999). Baseline NIH stroke scale score strongly predicts outcome after stroke: A report of the Trial of Org 10172 in Acute Stroke Treatment (TOAST). Neurology, 53(1), 126.

  • Ahmad, Y., Nijjer, S., Cook, C. M., El-Harasis, M., Graby, J., Petraco, R., et al. (2015). A new method of applying randomised control study data to the individual patient: A novel quantitative patient-centred approach to interpreting composite end points. International Journal of Cardiology, 195, 216–224.

  • Allen, L. A., Hernandez, A. F., O’Connor, C. M., & Felker, G. M. (2009). End points for clinical trials in acute heart failure syndromes. Journal of the American College of Cardiology, 53(24), 2248–2258.

  • Anker, S. D., & McMurray, J. J. (2012). Time to move on from ‘time-to-first’: Should all events be included in the analysis of clinical trials? European Heart Journal, 33(22), 2764–2765.

  • Anker, S. D., Schroeder, S., Atar, D., Bax, J. J., Ceconi, C., Cowie, M. R., et al. (2016). Traditional and new composite endpoints in heart failure clinical trials: Facilitating comprehensive efficacy assessments and improving trial efficiency. European Journal of Heart Failure, 18(5), 482–489.

  • Anstrom, K. J., & Eisenstein, E. L. (2011). From batting average to wins above replacement to composite end points-refining clinical research using baseball statistical methods. American Heart Journal, 161(5), 805–806.

  • Armstrong, P. W., & Westerhout, C. M. (2013). The power of more than one. Circulation, 127, 665–667.

  • Armstrong, P. W., & Westerhout, C. M. (2017). Composite end points in clinical research. Circulation, 135(23), 2299–2307.

  • Armstrong, P. W., Westerhout, C. M., Van de Werf, F., Califf, R. M., Welsh, R. C., Wilcox, R. G., et al. (2011). Refining clinical trial composite outcomes: An application to the assessment of the safety and efficacy of a new thrombolytic–3 (assent-3) trial. American Heart Journal, 161(5), 848–854.

  • Bakal, J. A., Roe, M. T., Ohman, E. M., Goodman, S. G., Fox, K. A., Zheng, Y., et al. (2015). Applying novel methods to assess clinical outcomes: Insights from the trilogy ACS trial. European Heart Journal, 36(6), 385–392.

  • Bakal, J. A., Westerhout, C. M., & Armstrong, P. W. (2012). Impact of weighted composite compared to traditional composite endpoints for the design of randomized controlled trials. Statistical Methods in Medical Research, 24(6), 980–988. https://doi.org/10.1177/0962280211436004

  • Bakal, J. A., Westerhout, C. M., Cantor, W. J., Fernández-Avilés, F., Welsh, R. C., Fitchett, D., et al. (2012). Evaluation of early percutaneous coronary intervention vs. standard therapy after fibrinolysis for st-segment elevation myocardial infarction: Contribution of weighting the composite endpoint. European Heart Journal, 34(12), 903–908.

  • Bebu, I., & Lachin, J. M. (2015). Large sample inference for a win ratio analysis of a composite outcome based on prioritized components. Biostatistics, 17(1), 178–187.

  • Berry, J. D., Miller, R., Moore, D. H., Cudkowicz, M. E., Van Den Berg, L. H., Kerr, D. A., et al. (2013). The combined assessment of function and survival (CAFS): A new endpoint for ALS clinical trials. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration, 14(3), 162–168.

  • Bonate, P. L. (2000). Analysis of pretest-posttest designs. Boca Raton: CRC Press.

  • Braunwald, E., Antman, E. M., Beasley, J. W., Califf, R. M., Cheitlin, M. D., Hochman, J. S., et al. (2002). ACC/AHA 2002 guideline update for the management of patients with unstable angina and non–st-segment elevation myocardial infarction–summary article: A report of the American college of cardiology/American heart association task force on practice guidelines (committee on the management of patients with unstable angina). Journal of the American College of Cardiology, 40(7), 1366–1374.

  • Brittain, E., Palensky, J., Blood, J., & Wittes, J. (1997). Blinded subjective rankings as a method of assessing treatment effect: A large sample example from the systolic hypertension in the elderly program (SHEP). Statistics in Medicine, 16(6), 681–693.

  • Brown, P. M., Anstrom, K. J., Felker, G. M., & Ezekowitz, J. A. (2016). Composite end points in acute heart failure research: Data simulations illustrate the limitations. Canadian Journal of Cardiology, 32(11), 1356.e21–1356.e28.

  • Brunner, E., & Munzel, U. (2000). The nonparametric Behrens-Fisher problem: Asymptotic theory and a small-sample approximation. Biometrical Journal, 42(1), 17–25.

  • Bruno, A., Saha, C., & Williams, L. S. (2006). Using change in the national institutes of health stroke scale to measure treatment effect in acute stroke trials. Stroke, 37(3), 920–921.

  • Buyse, M. (2010). Generalized pairwise comparisons of prioritized outcomes in the two-sample problem. Statistics in Medicine, 29(30), 3245–3257.

  • Campbell, D. T., & Kenny, D. A. (1999). A primer on regression artifacts. New York: Guilford Publications.

  • Chung, E., & Romano, J. P. (2016). Asymptotically valid and exact permutation tests based on two-sample U-statistics. Journal of Statistical Planning and Inference, 168, 97–105.

  • Claggett, B., Wei, L.-J., & Pfeffer, M. A. (2013). Moving beyond our comfort zone. European Heart Journal, 34(12), 869–871.

  • Cordoba, G., Schwartz, L., Woloshin, S., Bae, H., & Gotzsche, P. (2010). Definition, reporting, and interpretation of composite outcomes in clinical trials: Systematic review. British Medical Journal, 341, c3920.

  • Davis, S. M., Koch, G. G., Davis, C., & LaVange, L. M. (2003). Statistical approaches to effectiveness measurement and outcome-driven re-randomizations in the clinical antipsychotic trials of intervention effectiveness (CATIE) studies. Schizophrenia Bulletin, 29(1), 73.

  • DeCoster, T., Willis, M., Marsh, J., Williams, T., Nepola, J., Dirschl, D., & Hurwitz, S. (1999). Rank order analysis of tibial plafond fractures: Does injury or reduction predict outcome? Foot & Ankle International, 20(1), 44–49.

  • Dmitrienko, A., D’Agostino, R. B., & Huque, M. F. (2013). Key multiplicity issues in clinical drug development. Statistics in Medicine, 32(7), 1079–1111.

  • Fay, M. P., & Proschan, M. A. (2010). Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules. Statistics Surveys, 4, 1–39.

  • Feldman, A., Baughman, K., Lee, W., Gottlieb, S., Weiss, J., Becker, L., & Strobeck, J. (1991). Usefulness of OPC-8212, a quinolinone derivative, for chronic congestive heart failure in patients with ischemic heart disease or idiopathic dilated cardiomyopathy. The American Journal of Cardiology, 68(11), 1203–1210.

  • Felker, G., Anstrom, K., & Rogers, J. (2008). A global ranking approach to end points in trials of mechanical circulatory support devices. Journal of Cardiac Failure, 14(5), 368–372.

  • Felker, G. M., & Maisel, A. S. (2010). A global rank end point for clinical trials in acute heart failure. Circulation: Heart Failure, 3(5), 643–646.

  • Ferreira-Gonzalez, I., Permanyer-Miralda, G., Busse, J., Devereaux, P., Guyatt, G., Alonso-Coello, P., et al. (2009). Composite outcomes can distort the nature and magnitude of treatment benefits in clinical trials. Annals of Internal Medicine, 150(8), 566.

  • Ferreira-González, I., Permanyer-Miralda, G., Busse, J. W., Bryant, D. M., Montori, V. M., Alonso-Coello, P., et al. (2007a). Methodologic discussions for using and interpreting composite endpoints are limited, but still identify major concerns. Journal of Clinical Epidemiology, 60(7), 651–657.

  • Ferreira-González, I., Permanyer-Miralda, G., Domingo-Salvany, A., Busse, J., Heels-Ansdell, D., Montori, V., et al. (2007b). Problems with use of composite end points in cardiovascular trials: Systematic review of randomised controlled trials. The BMJ, 334(7597), 786.

  • Finkelstein, D., & Schoenfeld, D. (1999). Combining mortality and longitudinal measures in clinical trials. Statistics in Medicine, 18(11), 1341–1354.

  • Fisher, L. D. (1998). Self-designing clinical trials. Statistics in Medicine, 17(14), 1551–1562.

  • Fitzmaurice, G. (2001). A conundrum in the analysis of change. Nutrition, 17(4), 360–361.

  • Follmann, D., Duerr, A., Tabet, S., Gilbert, P., Moodie, Z., Fast, P., et al. (2007). Endpoints and regulatory issues in HIV vaccine clinical trials: Lessons from a workshop. Journal of Acquired Immune Deficiency Syndromes (1999), 44(1), 49.

  • Follmann, D., Wittes, J., & Cutler, J. A. (1992). The use of subjective rankings in clinical trials with an application to cardiovascular disease. Statistics in Medicine, 11(4), 427–437.

  • Freemantle, N., Calvert, M., Wood, J., Eastaugh, J., & Griffin, C. (2003). Composite outcomes in randomized trials: Greater precision but with greater uncertainty? JAMA, 289(19), 2554.

  • Gail, M. H., Mark, S. D., Carroll, R. J., Green, S. B., & Pee, D. (1996). On design considerations and randomization-based inference for community intervention trials. Statistics in Medicine, 15(11), 1069–1092.

  • Gehan, E. A. (1965). A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika, 52(1–2), 203–223.

  • Gómez, G., & Lagakos, S. W. (2013). Statistical considerations when using a composite endpoint for comparing treatment groups. Statistics in Medicine, 32(5), 719–738.

  • Gould, A. (1980). A new approach to the analysis of clinical drug trials with withdrawals. Biometrics, 36(4), 721–727.

  • Grech, E., & Ramsdale, D. (2003). Acute coronary syndrome: Unstable angina and non-st segment elevation myocardial infarction. The BMJ, 326(7401), 1259.

  • Hallstrom, A., Litwin, P., & Douglas Weaver, W. (1992). A method of assigning scores to the components of a composite outcome: An example from the MITI trial. Controlled Clinical Trials, 13(2), 148–155.

  • Hanley, J. A., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143(1), 29–36.

  • Hariharan, S., McBride, M. A., & Cohen, E. P. (2003). Evolution of endpoints for renal transplant outcome. American Journal of Transplantation, 3(8), 933–941.

  • Heddle, N. M., & Cook, R. J. (2011). Composite outcomes in clinical trials: What are they and when should they be used? Transfusion, 51(1), 11–13.

  • Huang, P., Woolson, R. F., & O’Brien, P. C. (2008). A rank-based sample size method for multiple outcomes in clinical trials. Statistics in Medicine, 27(16), 3084–3104.

  • Huque, M. F., Alosh, M., & Bhore, R. (2011). Addressing multiplicity issues of a composite endpoint and its components in clinical trials. Journal of Biopharmaceutical Statistics, 21(4), 610–634.

  • Kaufman, K. D., Olsen, E. A., Whiting, D., Savin, R., DeVillez, R., Bergfeld, W., et al. (1998). Finasteride in the treatment of men with androgenetic alopecia. Journal of the American Academy of Dermatology, 39(4), 578–589.

  • Kawaguchi, A., Koch, G. G., & Wang, X. (2011). Stratified multivariate Mann–Whitney estimators for the comparison of two treatments with randomization based covariance adjustment. Statistics in Biopharmaceutical Research, 3(2), 217–231.

  • Lachin, J. (1999). Worst-rank score analysis with informatively missing observations in clinical trials. Controlled Clinical Trials, 20(5), 408–422.

  • Lachin, J. M., & Bebu, I. (2015). Application of the Wei–Lachin multivariate one-directional test to multiple event-time outcomes. Clinical Trials, 12(6), 627–633. https://doi.org/10.1177/1740774515601027.

  • Li, D., Zhao, G., Paty, D., University of British Columbia MS/MRI Analysis Research Group, The SPECTRIMS Study Group. (2001). Randomized controlled trial of interferon-beta-1a in secondary progressive MS MRI results. Neurology, 56(11), 1505–1513.

  • Lisa, A. B., & James, S. H. (1997). Rule-based ranking schemes for antiretroviral trials. Statistics in Medicine, 16, 1175–1191.

  • Logan, B., & Tamhane, A. (2008). Superiority inferences on individual endpoints following noninferiority testing in clinical trials. Biometrical Journal, 50(5), 693–703.

  • Lubsen, J., Just, H., Hjalmarsson, A., La Framboise, D., Remme, W., Heinrich-Nols, J., et al. (1996). Effect of pimobendan on exercise capacity in patients with heart failure: Main results from the Pimobendan in Congestive Heart Failure (PICO) trial. Heart, 76(3), 223.

  • Lubsen, J., & Kirwan, B.-A. (2002). Combined endpoints: Can we use them? Statistics in Medicine, 21(19), 2959–2970.

  • Luo, X., Qiu, J., Bai, S., & Tian, H. (2017). Weighted win loss approach for analyzing prioritized outcomes. Statistics in Medicine, 36(15), 2452–2465.

  • Manja, V., AlBashir, S., & Guyatt, G. (2017). Criteria for use of composite end points for competing risks–a systematic survey of the literature with recommendations. Journal of Clinical Epidemiology, 82, 4–11.

  • Mascha, E. J., & Turan, A. (2012). Joint hypothesis testing and gatekeeping procedures for studies with multiple endpoints. Anesthesia & Analgesia, 114(6), 1304–1317.

  • Matsouaka, R. A., & Betensky, R. A. (2015). Power and sample size calculations for the Wilcoxon–Mann–Whitney test in the presence of death-censored observations. Statistics in Medicine, 34(3), 406–431.

  • Matsouaka, R. A., Singhal, A. B., & Betensky, R. A. (2016). An optimal Wilcoxon–Mann–Whitney test of mortality and a continuous outcome. Statistical Methods in Medical Research, 27(8), 2384–2400. https://doi.org/10.1177/0962280216680524

  • Minas, G., Rigat, F., Nichols, T. E., Aston, J. A., & Stallard, N. (2012). A hybrid procedure for detecting global treatment effects in multivariate clinical trials: Theory and applications to fMRI studies. Statistics in Medicine, 31(3), 253–268.

  • Moyé, L. (2013). Multiple analyses in clinical trials: Fundamentals for investigators. Berlin: Springer.

  • Moyé, L., Davis, B., & Hawkins, C. (1992). Analysis of a clinical trial involving a combined mortality and adherence dependent interval censored endpoint. Statistics in Medicine, 11(13), 1705–1717.

  • National Asthma Education and Prevention Program (National Heart, Lung, and Blood Institute). (2007). Third expert panel on the management of asthma. Expert panel report 3: Guidelines for the diagnosis and management of asthma. US Department of Health and Human Services, National Institutes of Health, National Heart, Lung, and Blood Institute.

  • Neaton, J., Gray, G., Zuckerman, B., & Konstam, M. (2005). Key issues in end point selection for heart failure trials: Composite end points. Journal of Cardiac Failure, 11(8), 567–575.

  • Neaton, J. D., Wentworth, D. N., Rhame, F., Hogan, C., Abrams, D. I., & Deyton, L. (1994). Considerations in choice of a clinical endpoint for aids clinical trials. Statistics in Medicine, 13(19–20), 2107–2125.

  • Newcombe, R. G. (2006). Confidence intervals for an effect size measure based on the Mann–Whitney statistic. part 2: Asymptotic methods and evaluation. Statistics in Medicine, 25(4), 559–573.

  • Oakes, J. M., & Feldman, H. A. (2001). Statistical power for nonequivalent pretest-posttest designs the impact of change-score versus ANCOVA models. Evaluation Review, 25(1), 3–28.

  • O’Brien, P. C. (1984). Procedures for comparing samples with multiple endpoints. Biometrics, 40, 1079–1087.

  • Parsons, M., Spratt, N., Bivard, A., Campbell, B., Chung, K., Miteff, F., et al. (2012). A randomized trial of tenecteplase versus alteplase for acute ischemic stroke. New England Journal of Medicine, 366(12), 1099–1107.

  • Pearl, J. (2014). Lord’s paradox revisited–(oh lord! kumbaya!). Tech. rep., Citeseer.

  • Pocock, S. J., Ariti, C. A., Collier, T. J., & Wang, D. (2011). The win ratio: A new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal, 33(2), 176–182.

  • Pratt, J. W. (1964). Robustness of some procedures for the two-sample location problem. Journal of the American Statistical Association, 59, 665–680.

  • Prieto-Merino, D., Smeeth, L., van Staa, T. P., & Roberts, I. (2013). Dangers of non-specific composite outcome measures in clinical trials. The BMJ, 347, f6782.

  • Ramchandani, R., Schoenfeld, D. A., & Finkelstein, D. M. (2016). Global rank tests for multiple, possibly censored, outcomes. Biometrics, 72, 926–935.

  • Röhmel, J., Gerlinger, C., Benda, N., & Läuter, J. (2006). On testing simultaneously non-inferiority in two multiple primary endpoints and superiority in at least one of them. Biometrical Journal, 48(6), 916–933.

  • Rosenbaum, P. R. (2006). Comment: The place of death in the quality of life. Statistical Science, 21(3), 313–316.

  • Rosner, B. (2015). Fundamentals of biostatistics. Toronto: Nelson Education.

  • Ross, S. (2007). Composite outcomes in randomized clinical trials: Arguments for and against. American Journal of Obstetrics and Gynecology, 196(2), 119–e1.

  • Rowan, J. A., Hague, W. M., Gao, W., Battin, M. R., & Moore, M. P. (2008). Metformin versus insulin for the treatment of gestational diabetes. New England Journal of Medicine, 358(19), 2003–2015.

  • Rubin, D. B. (2006). Rejoinder: Causal inference through potential outcomes and principal stratification: Application to studies with “censoring” due to death. Statistical Science, 21(3), 319–321.

  • Sampson, U. K., Metcalfe, C., Pfeffer, M. A., Solomon, S. D., & Zou, K. H. (2010). Composite outcomes: Weighting component events according to severity assisted interpretation but reduced statistical power. Journal of Clinical Epidemiology, 63(10), 1156–1158.

  • Samson, K. (2013). News from the AAN annual meeting: Why a trial of normobaric oxygen in acute ischemic stroke was halted early. Neurology Today, 13(10), 34–35.

  • Sankoh, A. J., Li, H., & D’Agostino, R. B. (2014). Use of composite endpoints in clinical trials. Statistics in Medicine, 33(27), 4709–4714.

  • Senn, S. (2006). Change from baseline and analysis of covariance revisited. Statistics in Medicine, 25(24), 4334–4344.

  • Shahar, E., & Shahar, D. J. (2012). Causal diagrams and change variables. Journal of Evaluation in Clinical Practice, 18(1), 143–148.

  • Singhal, A., Benner, T., Roccatagliata, L., Koroshetz, W., Schaefer, P., Lo, E., et al. (2005). A pilot study of normobaric oxygen therapy in acute ischemic stroke. Stroke, 36(4), 797.

  • Singhal, A. B. (2006). Normobaric oxygen therapy in acute ischemic stroke trial. ClinicalTrials.gov Database. http://clinicaltrials.gov/ct2/show/NCT00414726

  • Singhal, A. B. (2007). A review of oxygen therapy in ischemic stroke. Neurological Research, 29(2), 173–183.

  • Spencer, S., Mayer, B., Bendall, K. L., & Bateman, E. D. (2007). Validation of a guideline-based composite outcome assessment tool for asthma control. Respiratory Research, 8(1), 26.

  • Subherwal, S., Anstrom, K. J., Jones, W. S., Felker, M. G., Misra, S., Conte, M. S., et al. (2012). Use of alternative methodologies for evaluation of composite end points in trials of therapies for critical limb ischemia. American Heart Journal, 164(3), 277.

  • Sun, H., Davison, B. A., Cotter, G., Pencina, M. J., & Koch, G. G. (2012). Evaluating treatment efficacy by multiple end points in phase ii acute heart failure clinical trials analyzing data using a global method. Circulation: Heart Failure, 5(6), 742–749.

  • Tomlinson, G., & Detsky, A. S. (2010). Composite end points in randomized trials: There is no free lunch. JAMA, 303(3), 267–268.

  • Tyler, K. M., Normand, S.-L. T., & Horton, N. J. (2011). The use and abuse of multiple outcomes in randomized controlled depression trials. Contemporary Clinical Trials, 32(2), 299–304.

  • van Breukelen, G. J. (2013). ANCOVA versus CHANGE from baseline in nonrandomized studies: The difference. Multivariate Behavioral Research, 48(6), 895–922.

  • Van Elteren, P. (1960). On the combination of independent two-sample tests of Wilcoxon. Bulletin of the International Statistical Institute, 37, 351–361.

  • Wen, X., Hartzema, A., Delaney, J. A., Brumback, B., Liu, X., Egerman, R., et al. (2017). Combining adverse pregnancy and perinatal outcomes for women exposed to antiepileptic drugs during pregnancy, using a latent trait model. BMC Pregnancy and Childbirth, 17(1), 10.

  • Willett, J. B. (1988). Questions and answers in the measurement of change. Review of Research in Education, 15, 345–422.

  • Wilson, R. F., & Berger, A. K. (2011). Are all end points created equal? The case for weighting. Journal of the American College of Cardiology, 57(5), 546–548.

  • Young, F. B., Weir, C. J., Lees, K. R., & GAIN International Trial Steering Committee and Investigators. (2005). Comparison of the national institutes of health stroke scale with disability outcome measures in acute stroke trials. Stroke, 36(10), 2187–2192.

  • Zhang, J., Quan, H., Ng, J., & Stepanavage, M. E. (1997). Some statistical methods for multiple endpoints in clinical trials. Controlled Clinical Trials, 18(3), 204–221.

  • Zhao, Y. (2006). Sample size estimation for the van Elteren test—a stratified Wilcoxon–Mann–Whitney test. Statistics in Medicine, 25(15), 2675–2687.


Acknowledgements

This work was supported by grants P50-NS051343, R01-CA075971, T32 NS048005, 1RO1HL118336-01, and UL1TR001117 awarded by the National Institutes of Health. The content of this paper is solely the responsibility of the authors and does not necessarily represent the official view of the National Institutes of Health.

Conflict of Interest: None declared.

Author information

Corresponding author

Correspondence to Roland A. Matsouaka.

Appendices

Appendix 1: Proof of Theorem 1.1

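We consider the worst-rank composite outcome \(\widetilde {X}_{ij}=\delta _{ij}(\eta +T_{ij})+(1-\delta _{ij})X_{ij},~ i=1, 2~\mbox{and} ~j=1,\ldots , N_i\). For subject \(k\) in group 1 and subject \(l\) in group 2, the pairwise comparison indicators decompose as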

$$\displaystyle \begin{aligned} \begin{array}{rcl} I(\widetilde{X}_{1k}<\widetilde{X}_{2l})&\displaystyle =&\displaystyle \delta_{1k}\delta_{2l}I({T}_{1k}<{T}_{2l}|T_{1k}\leq T_{max}, ~T_{2l}\leq T_{max})+\delta_{1k}(1-\delta_{2l})\\ &\displaystyle &\displaystyle + (1-\delta_{1k})(1-\delta_{2l})I({X}_{1k}<{X}_{2l}),\\ I(\widetilde{X}_{1k}=\widetilde{X}_{2l})&\displaystyle =&\displaystyle \delta_{1k}\delta_{2l}I({T}_{1k}={T}_{2l}|T_{1k}\leq T_{max}, ~T_{2l}\leq T_{max})\\ &\displaystyle &\displaystyle + (1-\delta_{1k})(1-\delta_{2l})I({X}_{1k}={X}_{2l}). \end{array} \end{aligned} $$
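The decomposition above can be checked numerically. The following Python sketch is illustrative only: the simulated distributions, the horizon `T_MAX`, the helper `simulate_arm`, and the choice \(\eta = \min X - 1 - T_{max}\) (which places every death below every observed outcome) are assumptions made for this example and are not taken from the chapter. It compares the direct indicator \(I(\widetilde{X}_{1k}<\widetilde{X}_{2l})\) with the three-term right-hand side for all pairs.

```python
import numpy as np

rng = np.random.default_rng(0)
T_MAX = 3.0  # assumed follow-up horizon (hypothetical value)


def simulate_arm(n, death_rate, mean_x):
    """Simulate one arm: death indicator, survival time, continuous outcome."""
    t = rng.exponential(1.0 / death_rate, size=n)  # latent survival time
    delta = (t <= T_MAX).astype(int)               # 1 if death occurs by T_MAX
    x = rng.normal(mean_x, 1.0, size=n)            # nonfatal continuous outcome
    return delta, t, x


d1, t1, x1 = simulate_arm(40, 0.3, 0.0)
d2, t2, x2 = simulate_arm(50, 0.2, 0.5)

# Assumption: eta pushes every death below every observed outcome value.
eta = min(x1.min(), x2.min()) - 1.0 - T_MAX
xt1 = np.where(d1 == 1, eta + t1, x1)
xt2 = np.where(d2 == 1, eta + t2, x2)

# Left-hand side: direct comparison of composite outcomes for all pairs (k, l).
lhs = (xt1[:, None] < xt2[None, :]).astype(int)

# Right-hand side: the three-term decomposition of the indicator.
D1, D2 = d1[:, None], d2[None, :]
rhs = (D1 * D2 * (t1[:, None] < t2[None, :])
       + D1 * (1 - D2)
       + (1 - D1) * (1 - D2) * (x1[:, None] < x2[None, :]))

print("decomposition holds for all pairs:", np.array_equal(lhs, rhs))
```

The check relies on deaths always ranking below survivors, which is what the assumed shift \(\eta\) guarantees; with a different convention for \(\eta\) the middle term \(\delta_{1k}(1-\delta_{2l})\) would need to change accordingly.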

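For \(q_i = 1 - p_i\), with \(p_i = P(\delta_{ij}=1)\), taking expectations gives

$$\displaystyle \begin{aligned} \pi_{U1}\equiv E\left[I(\widetilde{X}_{1k}<\widetilde{X}_{2l})+\frac{1}{2}I(\widetilde{X}_{1k}=\widetilde{X}_{2l})\right]=p_1p_2\pi_{t1}+p_1q_2+q_1q_2\pi_{x1}, \end{aligned} $$

where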

$$\displaystyle \begin{aligned} \begin{array}{rcl} \pi_{t1}&\displaystyle =&\displaystyle P({T}_{1k}<{T}_{2l}|T_{1k}\leq T_{max}, ~T_{2l}\leq T_{max})\\ &\displaystyle &\displaystyle +\frac{1}{2}P({T}_{1k}={T}_{2l}|T_{1k}\leq T_{max}, ~T_{2l}\leq T_{max})\\ \pi_{x1}&\displaystyle =&\displaystyle P({X}_{1k}<{X}_{2l})+\frac{1}{2}P({X}_{1k}={X}_{2l}). \end{array} \end{aligned} $$

Define \(U_{kl}=I(\widetilde {X}_{1k}<\widetilde {X}_{2l})+\frac {1}{2}I(\widetilde {X}_{1k}=\widetilde {X}_{2l}),\) for \(k = 1, \ldots, N_1\) and \(l = 1, \ldots, N_2\), and let \(U=(N_1N_2)^{-1}\sum_{k=1}^{N_1}\sum_{l=1}^{N_2}U_{kl}\). When there are no ties, \(U_{kl}=I(\widetilde {X}_{1k}<\widetilde {X}_{2l})\) is binary and follows a Bernoulli distribution with probability \(\pi_{U1}\). Its mean and variance are, respectively, \(E(U_{kl}) = \pi_{U1}\) and \(Var(U_{kl})=E(U_{kl})\left [1-E(U_{kl})\right ]=\pi _{U1}(1-\pi _{U1})\). We can use these results to derive the variance of \(U\) with the following formula:

$$\displaystyle \begin{aligned} Var(U)&=(N_1N_2)^{-2}\left[\sum_{k=1}^{N_1}\sum_{l=1}^{N_2}Var(U_{kl})+\mathop{\sum_{k=1}^{N_1}\sum_{l=1}^{N_2}\sum_{k'=1}^{N_1}\sum_{l'=1}^{N_2}}_{(k,l)\neq (k',l')} Cov(U_{kl}, U_{k'l'})\right]\\ &= (N_1N_2)^{-1}\left[Var(U_{kl})+(N_1-1)Cov(U_{kl}, U_{k'l})\right.\\ &\qquad \left.+(N_2-1)Cov(U_{kl}, U_{kl'})\right]. \end{aligned} $$

Note that when \(k\neq k'\) and \(l\neq l'\), the terms \(U_{kl}\) and \(U_{k'l'}\) involve four distinct subjects, so the covariance

$$\displaystyle \begin{aligned}Cov(U_{kl}, U_{k'l'})=E(U_{kl} U_{k'l'})-E(U_{kl})E (U_{k'l'})=0.\end{aligned}$$

When exactly one index differs, that is, \(k\neq k'\) with \(l=l'\) or \(l\neq l'\) with \(k=k'\), we have

$$\displaystyle \begin{aligned} Cov(U_{kl}, U_{k'l})&=E(U_{kl} U_{k'l})-E(U_{kl})E (U_{k'l})= \pi_{U2}-\pi_{U1}^2,\\ Cov(U_{kl}, U_{kl'})&=E(U_{kl} U_{kl'} )-E(U_{kl})E (U_{kl'}) =\pi_{U3}-\pi_{U1}^2, \end{aligned} $$

where \(\pi _{U2}=E(U_{kl} U_{k'l})\) and \(\pi _{U3}=E(U_{kl} U_{kl'}).\)

Therefore,

$$\displaystyle \begin{aligned} Var(U) &= (N_1N_2)^{-1}\left[\pi_{U1}\left(1-\pi_{U1}\right)+(N_1-1)(\pi_{U2}-\pi_{U1}^2)\right.\\ &\qquad \qquad \qquad \left.+(N_2-1)(\pi_{U3}-\pi_{U1}^2)\right] \end{aligned} $$
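
As a quick sanity check of this variance expression, the sketch below evaluates it with exact rational arithmetic. The function name `var_u` and the sample sizes are hypothetical, and the inputs \(\pi_{U1}=1/2\), \(\pi_{U2}=\pi_{U3}=1/3\) are assumed here as the familiar values for continuous outcomes under the null hypothesis; with those inputs the expression should reduce to the classical Wilcoxon–Mann–Whitney null variance \((N_1+N_2+1)/(12N_1N_2)\).

```python
from fractions import Fraction


def var_u(n1, n2, pi_u1, pi_u2, pi_u3):
    """Evaluate Var(U) from the three pairwise probabilities."""
    return (pi_u1 * (1 - pi_u1)
            + (n1 - 1) * (pi_u2 - pi_u1 ** 2)
            + (n2 - 1) * (pi_u3 - pi_u1 ** 2)) / (n1 * n2)


n1, n2 = 30, 45  # hypothetical group sizes
v = var_u(n1, n2, Fraction(1, 2), Fraction(1, 3), Fraction(1, 3))
classical = Fraction(n1 + n2 + 1, 12 * n1 * n2)
print(v, classical, v == classical)
```

Using `Fraction` avoids floating-point rounding, so the comparison with the classical form is exact rather than approximate.
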
  1. No ties:

    When there are no ties, \(I(\widetilde {X}_{1k}=\widetilde {X}_{2l})=0\), in which case \(U_{kl}=I(\widetilde {X}_{1k}<\widetilde {X}_{2l})= \delta _{1k}\delta _{2l}I({T}_{1k}<{T}_{2l}|T_{1k}\leq T_{max}, ~T_{2l}\leq T_{max})+\delta _{1k}(1-\delta _{2l}) + (1-\delta _{1k})(1-\delta _{2l})I({X}_{1k}<{X}_{2l})\), for \(k = 1, \ldots, N_1\) and \(l = 1, \ldots, N_2\). We have

    $$\displaystyle \begin{aligned} E(U_{kl}U_{k'l})&= P(T_{1k}<T_{2l}, T_{1k'}<T_{2l}|\delta_{1k}\delta_{1k'}\delta_{2l}=1)P(\delta_{1k}\delta_{1k'}\delta_{2l}=1)\\ &+ P(X_{1k'}<X_{2l})P(\delta_{1k}=1, ~\delta_{1k'}=\delta_{2l}=0)\\ &+P(X_{1k}<X_{2l})P(\delta_{1k}=\delta_{2l}=0)P(\delta_{1k'}=1)\\ &+ P(X_{1k}<X_{2l}, X_{1k'}<X_{2l})P(\delta_{1k}=\delta_{1k'}=\delta_{2l}=0)\\& +P(\delta_{1k}\delta_{1k'}=1)P(\delta_{2l}=0)\\ &= p_1^2p_2\pi_{t2}+2p_1q_1q_2\pi_{x1}+q_1^2q_2\pi_{x2}+p_1^2q_2\equiv \pi_{U2},\\ E(U_{kl}U_{kl'})&= P(T_{1k}<T_{2l},~ T_{1k}<T_{2l'}|\delta_{1k}\delta_{2l}\delta_{2l'}=1)P(\delta_{1k}\delta_{2l}\delta_{2l'}=1)\\ &+ P(T_{1k}<T_{2l}|\delta_{1k}\delta_{2l}=1, ~\delta_{2l'}=0)P(\delta_{1k}\delta_{2l}=1)P(\delta_{2l'}=0)\\ &+ P(T_{1k}<T_{2l'}|\delta_{1k}=1, ~\delta_{2l}=0, ~~\delta_{2l'}=1)P(\delta_{1k}\delta_{2l'}=1)P(\delta_{2l}=0)\\ &+ P(X_{1k}<X_{2l}, X_{1k}<X_{2l'})P(\delta_{1k}=\delta_{2l}=\delta_{2l'}=0)\\ &+ P(\delta_{1k}=1)P(\delta_{2l}=\delta_{2l'}=0)\\ &= p_1p_2^2\pi_{t3}+2p_1p_2q_2\pi_{t1}+q_1q_2^2\pi_{x3}+p_1q_2^2\equiv \pi_{U3},\\ \text{with} ~~~ \pi_{t2}&= P(T_{1k}<T_{2l}, T_{1k'}<T_{2l}|T_{1k}\leq T_{max},~ T_{1k'}\leq T_{max}, ~T_{2l}\leq T_{max}),\\ \pi_{x2}&= P(X_{1k}<X_{2l}, X_{1k'}<X_{2l}), \\~~ \pi_{t3} &= P(T_{1k}<T_{2l}, T_{1k}<T_{2l'}| T_{1k}\leq T_{max},~ T_{2l}\leq T_{max}, ~T_{2l'}\leq T_{max}), \\~~ \pi_{x3}&= P(X_{1k}<X_{2l}, X_{1k}<X_{2l'}) . \end{aligned} $$

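    Under the null hypothesis of no difference between the two groups with respect to survival and the nonfatal outcome, we have \(F_1 = F_2 = F\), \(G_1 = G_2 = G\), \(p_1 = p_2 = p\), and \(q_1 = q_2 = q\), where \(F_i\) and \(G_i\) denote the distribution functions of the survival time and of the nonfatal outcome in group \(i\). This implies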

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \pi_{t1}&\displaystyle =&\displaystyle P(T_{1k}<T_{2l}|T_{1k}\leq T_{max}, T_{2l}\leq T_{max})\\ &\displaystyle =&\displaystyle \frac{1}{2p^2}\left[F(T_{max})^2-F(0)^2\right]=\frac{1}{2}\\ \pi_{t2}&\displaystyle =&\displaystyle P(T_{1k}<T_{2l}, T_{1k'}<T_{2l}|T_{1k}\leq T_{max},T_{1k'}\leq T_{max},T_{2l}\leq T_{max})\\ &\displaystyle =&\displaystyle \frac{1}{p^3}\int_0^{T_{max}}F(t)^2dF(t)\\ &\displaystyle =&\displaystyle \frac{1}{3p^3}\left[F(T_{max})^3-F(0)^3\right]=\frac{1}{3}\\ \pi_{t3} &\displaystyle =&\displaystyle P(T_{1k}<T_{2l}, T_{1k}<T_{2l'}|T_{1k}\leq T_{max}, T_{2l}\leq T_{max}, T_{2l'}\leq T_{max})\\ &\displaystyle =&\displaystyle \frac{1}{p^3}\int_0^{T_{max}}\left[F(T_{max})-F(t)\right]^2dF(t)\\ &\displaystyle =&\displaystyle \frac{1}{3p^3}\left[F(T_{max})-F(0)\right]^3=\frac{1}{3}\\ \pi_{x1}&\displaystyle =&\displaystyle P(X_{1k}<X_{2l})=\int_{-\infty}^{\infty}G(x)dG(x)=\frac{1}{2}\left[G(x)^2\right]_{-\infty}^{\infty}=\frac{1}{2}\\ ~~\pi_{x2}&\displaystyle =&\displaystyle P(X_{1k}<X_{2l}, X_{1k'}<X_{2l})=\int_{-\infty}^{\infty}G(x)^2dG(x)=\frac{1}{3}\left[G(x)^3\right]_{-\infty}^{\infty}=\frac{1}{3}\\ \pi_{x3}&\displaystyle =&\displaystyle P( X_{1k}<X_{2l}, X_{1k}<X_{2l'})=\int_{-\infty}^{\infty}[1-G(x)]^2dG(x)\\ &\displaystyle =&\displaystyle -\frac{1}{3}\left\{[1-G(x)]^3\right\}_{-\infty}^{\infty}=\frac{1}{3}. \end{array} \end{aligned} $$

    Here we used \(p=F(T_{max})\) and \(F(0)=0\). For \(\pi_{t3}\), the conditional survival function given \(T\leq T_{max}\) is \(\left[F(T_{max})-F(t)\right]/p\).

    Therefore,

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \pi_{U1}&\displaystyle =&\displaystyle p_1p_2\pi_{t1}+p_1q_2+q_1q_2\pi_{x1}\\&\displaystyle =&\displaystyle \frac{1}{2}p^2+pq+\frac{1}{2}q^2=\frac{1}{2}(p+q)^2=\frac{1}{2}\\ \pi_{U2}&\displaystyle =&\displaystyle p_1^2q_2+p_1^2p_2\pi_{t2}+2p_1q_1q_2\pi_{x1}+q_1^2q_2\pi_{x2}\\ &\displaystyle =&\displaystyle p^2q+\frac{1}{3}p^3+pq^2+\frac{1}{3}q^3=\frac{1}{3}(p+q)^3=\frac{1}{3}\\ \pi_{U3}&\displaystyle =&\displaystyle p_1q_2^2+p_1p_2^2\pi_{t3}+2p_1p_2q_2\pi_{t1}+q_1q_2^2\pi_{x3}\\ &\displaystyle =&\displaystyle pq^2+\frac{1}{3}p^3+p^2q+\frac{1}{3}q^3=\frac{1}{3}(p+q)^3=\frac{1}{3}. \end{array} \end{aligned} $$

    The mean and variance become

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \mu_0&\displaystyle =&\displaystyle E_0(U) =\pi_{U1}=\frac{1}{2};\\ \sigma^2_0&\displaystyle =&\displaystyle Var_0(U)\\ &\displaystyle =&\displaystyle (N_1N_2)^{-1}\left[\pi_{U1}\left(1-\pi_{U1}\right)+(N_1-1)\left(\pi_{U2}-\pi_{U1}^2\right)\right.\\ &\displaystyle &\displaystyle \qquad \qquad \left.+(N_2-1)\left(\pi_{U3}-\pi_{U1}^2\right)\right] \\ &\displaystyle =&\displaystyle (N_1N_2)^{-1}\left[\frac{1}{2}\left(1-\frac{1}{2}\right)+(N_1-1)\left(\frac{1}{3}-\left(\frac{1}{2}\right)^2\right)\right.\\ &\displaystyle &\displaystyle \qquad \qquad \left.+(N_2-1)\left(\frac{1}{3}-\left(\frac{1}{2}\right)^2\right)\right] \\ &\displaystyle =&\displaystyle (N_1N_2)^{-1}\left[\frac{1}{4}+\frac{1}{12}(N_1-1)+\frac{1}{12}(N_2-1)\right] =\frac{N_1+N_2+1}{12N_1N_2}. \end{array} \end{aligned} $$
  2.

    Ties are present: More generally, we can approximate the probabilities \(\pi _{U2}=E(U_{kl} U_{k'l})\) and \(\pi _{U3}=E(U_{kl} U_{kl'})\) using their unbiased estimators.

    Following Hanley and McNeil (1982), we can show that the variance Var(U) can be estimated by:

    $$\displaystyle \begin{aligned}(N_1N_2)^{-1}\left[\widehat\pi_{U1}\left(1-\widehat\pi_{U1}\right)+(N_1-1)(\widehat\pi_{U2}-\widehat\pi_{U1}^2)+(N_2-1)(\widehat\pi_{U3}-\widehat\pi_{U1}^2)\right]\end{aligned}$$

    where \(\widehat \pi _{U1}=\displaystyle (N_1N_2)^{-1}\sum _{k=1}^{N_1} \sum _{l=1}^{N_2} U_{kl}, ~\widehat \pi _{U2}=\displaystyle (N_1^2N_2)^{-1}\sum _{l=1}^{N_2} U_{\bullet l}^2,~\) and \(~\widehat \pi _{U3}=\displaystyle (N_1N_2^2)^{-1}\sum _{k=1}^{N_1} U_{k\bullet }^2,\) with \(U_{\bullet l}=\sum _{k=1}^{N_1}U_{kl}\) and \(U_{k\bullet }=\sum _{l=1}^{N_2}U_{kl}.\) In the absence of ties, \(\widehat \pi _{U2}\) and \(\widehat \pi _{U3}\) are, respectively, estimates of \(\pi _{U2}\) and \(\pi _{U3}\).

    One can also consider other possible approximations of the variance of U using the exposition provided by Newcombe (2006).

    As we know,

    $$\displaystyle \begin{aligned}P(\widetilde{X}_{1k}<\widetilde{X}_{2l})+P(\widetilde{X}_{1k}>\widetilde{X}_{2l})+P(\widetilde{X}_{1k}=\widetilde{X}_{2l})=1.\end{aligned}$$

    Under the null hypothesis, i.e., when \(\widetilde {X}_{1k}\) and \(\widetilde {X}_{2l}\) are identically distributed, we have \(P(\widetilde {X}_{1k}<\widetilde {X}_{2l})=P(\widetilde {X}_{1k}>\widetilde {X}_{2l}),\) which implies \(P(\widetilde {X}_{1k}<\widetilde {X}_{2l})+\frac {1}{2}P(\widetilde {X}_{1k}=\widetilde {X}_{2l})=\frac {1}{2}.\) Therefore,

    $$\displaystyle \begin{aligned}E(U) =E(U_{kl})=P(\widetilde{X}_{1k}<\widetilde{X}_{2l})+\frac{1}{2}P(\widetilde{X}_{1k}=\widetilde{X}_{2l})=\frac{1}{2}.\end{aligned}$$

    The variance reduces to:

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \sigma^2_0&\displaystyle =&\displaystyle Var_0(U)=\frac{1}{12N_1N_2}\left( N_1+N_2+1-\frac{\displaystyle\sum_{\nu=1}^{g}t_{\nu}(t_{\nu}^2-1)}{(N_1+N_2)(N_1+N_2-1)}\right) \end{array} \end{aligned} $$

    where \(t_{\nu}\) is the number of observations in the ν-th block of tied observations (those sharing the same value) and g is the number of such blocks (see, for instance, Rosner 2015).
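The three variance expressions of this appendix can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the inputs are assumed to be lists of composite (worst-rank) scores, and the second moments are paired so that the factor \(N_1-1\) multiplies the moment for two group-1 subjects sharing one group-2 subject, \(E(U_{kl}U_{k'l})\), as in the definitions of \(\pi_{U2}\) and \(\pi_{U3}\) above.

```python
from collections import Counter
from fractions import Fraction


def null_variance_no_ties(n1, n2):
    """Three-term expression for Var_0(U) with pi_U1 = 1/2, pi_U2 = pi_U3 = 1/3."""
    pi1, pi2, pi3 = Fraction(1, 2), Fraction(1, 3), Fraction(1, 3)
    bracket = (pi1 * (1 - pi1)
               + (n1 - 1) * (pi2 - pi1 ** 2)
               + (n2 - 1) * (pi3 - pi1 ** 2))
    return bracket / (n1 * n2)


def estimated_variance(x1, x2):
    """Plug-in estimate of Var(U) from two lists of composite scores (assumed input)."""
    n1, n2 = len(x1), len(x2)
    # U_kl = 1 if x1[k] < x2[l], 1/2 if tied, 0 otherwise
    u = [[1.0 if a < b else 0.5 if a == b else 0.0 for b in x2] for a in x1]
    row_sums = [sum(row) for row in u]                        # sum over l, for each k
    col_sums = [sum(row[l] for row in u) for l in range(n2)]  # sum over k, for each l
    pi1 = sum(row_sums) / (n1 * n2)
    # Second moments: pairs sharing a group-2 subject (pi_U2) or a group-1 subject (pi_U3)
    pi2 = sum(s * s for s in col_sums) / (n1 * n1 * n2)
    pi3 = sum(s * s for s in row_sums) / (n1 * n2 * n2)
    var = (pi1 * (1 - pi1)
           + (n1 - 1) * (pi2 - pi1 ** 2)
           + (n2 - 1) * (pi3 - pi1 ** 2)) / (n1 * n2)
    return pi1, var


def null_variance_with_ties(x1, x2):
    """Tie-corrected Var_0(U); equals the no-tie formula when all values differ."""
    n1, n2 = len(x1), len(x2)
    n = n1 + n2
    tie_sizes = Counter(list(x1) + list(x2)).values()
    tie_term = sum(t * (t * t - 1) for t in tie_sizes)  # blocks of size 1 contribute 0
    return (n + 1 - tie_term / (n * (n - 1))) / (12 * n1 * n2)
```

Exact rational arithmetic is used in the first function so that the identity with \((N_1+N_2+1)/(12N_1N_2)\) can be confirmed without rounding; the other two functions work in floating point.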

Appendix 2: Mean and Variance of the Weighted U-Statistic

Given the weights \(\mathbf {w}=(w_1, w_2)\), we define the vector \(\mathbf {c}'=(c_1, c_2, c_3)=\left (w_1^2, w_1w_2, w_2^2\right )\). Let \(\displaystyle \widetilde {X}_{1k}=w_1\delta _{1k}(\eta +t_{1k})+w_2(1-\delta _{1k}) X_{1k},\) for \(k = 1, \ldots , N_1\), and \(\widetilde {X}_{2l}=w_1\delta _{2l}(\eta +t_{2l})+w_2(1-\delta _{2l})X_{2l},\) for \(l = 1, \ldots , N_2\). We define the weighted WMW U-statistic by \(\mathbf {c}'\mathbf {U}=c_1U_t+c_2U_{tx}+c_3U_x\), where \(\mathbf {U}'=(U_t, U_{tx}, U_x)\) and

$$\displaystyle \begin{aligned} \begin{array}{rcl} U_t&\displaystyle =&\displaystyle (N_1N_2)^{-1}\sum_{k=1}^{N_1}\sum_{l=1}^{N_2}\delta_{1k}\delta_{2l}\left[ I({T}_{1k}<{T}_{2l})+\frac{1}{2}I({T}_{1k}={T}_{2l})\right], ~\text{with} \\ &\displaystyle &\displaystyle T_{1k}\leq T_{max}, ~T_{2l}\leq T_{max} \\ U_{tx}&\displaystyle =&\displaystyle (N_1N_2)^{-1}\sum_{k=1}^{N_1}\sum_{l=1}^{N_2}\delta_{1k}(1-\delta_{2l})\\ U_x&\displaystyle =&\displaystyle (N_1N_2)^{-1}\sum_{k=1}^{N_1}\sum_{l=1}^{N_2} (1-\delta_{1k})(1-\delta_{2l})\left[ I({X}_{1k}<{X}_{2l})+\frac{1}{2}I({X}_{1k}={X}_{2l})\right] \end{array} \end{aligned} $$
$$\displaystyle \begin{aligned} E(\mathbf{U})&=(E(U_t), E(U_{tx}), E(U_x))' \\ &=\Big( E(\delta_{1k}=1)E(\delta_{2l}=1)\bigg[ P({T}_{1k}<{T}_{2l}|T_{1k}\leq T_{max}, ~T_{2l}\leq T_{max})\\ &\left.+\frac{1}{2}P({T}_{1k}={T}_{2l}|T_{1k}\leq T_{max}, ~T_{2l}\leq T_{max})\right] , ~E(\delta_{1k}=1)E(\delta_{2l}=0), \\ &\left. ~E(\delta_{1k}=0)E(\delta_{2l}=0)\left[ P(X_{1k}<X_{2l})+ \frac{1}{2}P(X_{1k}=X_{2l})\right] \right)'\\ &=\left(p_1p_2\pi_{t1}, p_1q_2, q_1q_2\pi_{x1}\right)' \end{aligned} $$
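The three components can be computed directly from subject-level data. The sketch below is illustrative only and rests on assumptions not in the original: each subject is encoded as a hypothetical tuple `(delta, t, x)`, with `delta = 1` for a death at time `t` (no later than \(T_{max}\)) and `delta = 0` for a survivor with nonfatal outcome `x`; ties score one half in both \(U_t\) and \(U_x\), in line with the expectation \(E(\mathbf{U})\) above; and the weighted statistic is read as the inner product \(c_1U_t+c_2U_{tx}+c_3U_x\).

```python
def u_components(group1, group2):
    """Return (U_t, U_tx, U_x) for two lists of (delta, t, x) tuples (assumed encoding)."""

    def half_score(a, b):
        # 1 if a < b, 1/2 if tied, 0 otherwise
        return 1.0 if a < b else 0.5 if a == b else 0.0

    u_t = u_tx = u_x = 0.0
    for d1, t1, x1 in group1:
        for d2, t2, x2 in group2:
            if d1 == 1 and d2 == 1:        # both died: compare death times
                u_t += half_score(t1, t2)
            elif d1 == 1 and d2 == 0:      # group-1 death vs group-2 survivor
                u_tx += 1.0
            elif d1 == 0 and d2 == 0:      # both survived: compare nonfatal outcomes
                u_x += half_score(x1, x2)
            # a group-1 survivor vs a group-2 death contributes 0
    n_pairs = len(group1) * len(group2)
    return u_t / n_pairs, u_tx / n_pairs, u_x / n_pairs


def weighted_u(c, u):
    """Inner product c'U for c = (c1, c2, c3) and U = (U_t, U_tx, U_x)."""
    return sum(ci * ui for ci, ui in zip(c, u))
```

For example, with equal weights \(w_1=w_2=1/2\) one would pass `c = (0.25, 0.25, 0.25)`, which satisfies \(c_1+2c_2+c_3=1\).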

In the absence of ties, the variance \(Var(\mathbf {U})={\varSigma } = (N_1N_2)^{-1}\left (\varSigma _{ij}\right )_{\substack {1\leq i, j\leq 3}}\) is a 3 × 3 matrix such that

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varSigma_{11}&\displaystyle =&\displaystyle E[(U_t-p_1p_2\pi_{t1})(U_t-p_1p_2\pi_{t1})]\\ &\displaystyle =&\displaystyle p_1p_2\left[\pi_{t1}(1-p_1p_2\pi_{t1})+p_1(N_1-1)(\pi_{t2} -p_2\pi_{t1}^2)+p_2(N_2-1)(\pi_{t3} -p_1\pi_{t1}^2)\right],\\ \varSigma_{12}&\displaystyle =&\displaystyle \varSigma_{21}=E[(U_t-p_1p_2\pi_{t1})(U_{tx}-p_1q_2)]=\pi_{t1}p_1p_2q_2\left[(N_2-1)q_1-N_1p_1\right] ,\\ \varSigma_{13}&\displaystyle =&\displaystyle \varSigma_{31}=E[(U_t-p_1p_2\pi_{t1})(U_{x}-q_1q_2\pi_{x1})]= -\pi_{t1}\pi_{x1}(N_1+N_2-1)p_1q_1p_2q_2,\\ \varSigma_{22}&\displaystyle =&\displaystyle E[(U_{tx}-p_1q_2)(U_{tx}-p_1q_2)]= p_1q_2\left[(1-p_1q_2)+(N_1-1)p_1p_2+(N_2-1)q_1q_2\right],\\ \varSigma_{23}&\displaystyle =&\displaystyle \varSigma_{32}=E[(U_{tx}-p_1q_2)(U_{x}-q_1q_2\pi_{x1})]=\pi_{x1}p_1q_1q_2\left[(N_1-1)p_2 -N_2q_2\right],\\ \varSigma_{33}&\displaystyle =&\displaystyle E[(U_{x}-q_1q_2\pi_{x1})(U_{x}-q_1q_2\pi_{x1})]\\ &\displaystyle =&\displaystyle q_1q_2\left[ \pi_{x1}(1-q_1q_2\pi_{x1})+q_1(N_1-1)(\pi_{x2} -q_2\pi_{x1}^2)+q_2(N_2-1)(\pi_{x3} -q_1\pi_{x1}^2)\right]. \end{array} \end{aligned} $$

Therefore,

$$\displaystyle \begin{aligned}Var(\mathbf{c}'\mathbf{U})=\mathbf{c}'{\varSigma}\mathbf{c}.\end{aligned}$$

Under the null hypothesis of no difference between the two groups, with respect to both survival and nonfatal outcome, we have \(p_1 = p_2 = p\), \(q_1 = q_2 = q = 1-p\), \(\pi_{t1} = \pi_{x1} = 1/2\), and \(\pi_{t2} = \pi_{x2} = \pi_{t3} = \pi_{x3} = 1/3\). Thus,

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} E_0(\mathbf{U})= \frac{1}{2}\left(p^2, 2pq, q^2\right)'~~~\mbox{and}~~~Var_0(\mathbf{U})={\varSigma_0}, \end{array} \end{aligned} $$
(1.17)

where \({\varSigma _0}=(N_1N_2)^{-1}\left (\varSigma _{0ij}\right )_{\substack {1\leq i, j\leq 3}}\) is a symmetric matrix with

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varSigma_{011}&\displaystyle =&\displaystyle \frac{p^2}{12}A(p), ~\varSigma_{012}=\frac{p^2q}{2}\left[ (N_2{-}1)q{-}N_1p\right], ~~\varSigma_{013}={-}\frac{p^2q^2}{4}(N_2+N_1{-}1)\\ \varSigma_{022}&\displaystyle =&\displaystyle pq\left[ 1{-}pq+(N_2{-}1)q^2+(N_1{-}1)p^2\right], ~~ \varSigma_{023}= \frac{pq^2}{2}\left((N_1{-}1)p{-}N_2q\right), \\ \varSigma_{033}&\displaystyle =&\displaystyle \frac{q^2}{12}A(q),~~\text{where}~A(x)=6+4(N_2+N_1{-}2)x{-}3(N_2+N_1{-}1)x^{2}. \end{array} \end{aligned} $$

Moreover, since \(Var_0(\mathbf {c}'\mathbf {U}) = \mathbf {c}'\varSigma _0\mathbf {c} \geq 0\) by definition, the matrix \(\varSigma _0\) is positive semi-definite. In practice, p is estimated by the pooled sample proportion \(\hat p=(N_1\widehat p_1+N_2\widehat p_2)/(N_1+N_2)\), and both \(E_0(\mathbf {U})\) and \(Var_0(\mathbf {U})\) are calculated accordingly.

Finally, when ties are present, the foregoing formulas can be modified easily as we did in the non-weighted case to account for the ties in the variance estimations.
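The null covariance matrix in Eq. (1.17) is straightforward to assemble. The sketch below, which assumes NumPy and uses hypothetical function names, builds \(\varSigma_0\) from \(p\), \(N_1\), and \(N_2\) and evaluates \(Var_0(\mathbf{c}'\mathbf{U})=\mathbf{c}'\varSigma_0\mathbf{c}\). A useful sanity check is that \(\mathbf{c}=(1,1,1)'\) gives \(U_t+U_{tx}+U_x\), the unweighted statistic, so the quadratic form should then return \((N_1+N_2+1)/(12N_1N_2)\).

```python
import numpy as np


def sigma0(p, n1, n2):
    """Null covariance matrix of U = (U_t, U_tx, U_x), no ties assumed."""
    q = 1.0 - p
    n = n1 + n2

    def a(x):
        # A(x) = 6 + 4(N1 + N2 - 2)x - 3(N1 + N2 - 1)x^2
        return 6 + 4 * (n - 2) * x - 3 * (n - 1) * x ** 2

    s11 = p ** 2 / 12 * a(p)
    s12 = p ** 2 * q / 2 * ((n2 - 1) * q - n1 * p)
    s13 = -(p ** 2) * q ** 2 / 4 * (n - 1)
    s22 = p * q * (1 - p * q + (n2 - 1) * q ** 2 + (n1 - 1) * p ** 2)
    s23 = p * q ** 2 / 2 * ((n1 - 1) * p - n2 * q)
    s33 = q ** 2 / 12 * a(q)
    mat = np.array([[s11, s12, s13],
                    [s12, s22, s23],
                    [s13, s23, s33]])
    return mat / (n1 * n2)


def null_variance_weighted(c, p, n1, n2):
    """Var_0(c'U) = c' Sigma_0 c."""
    c = np.asarray(c, dtype=float)
    return float(c @ sigma0(p, n1, n2) @ c)
```

In an application, `p` would be replaced by the pooled estimate \(\hat p\) described above.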

Appendix 3: Optimal Weights

From Eq. (1.15), we have

$$\displaystyle \begin{aligned}\displaystyle \mu_{1w}-\mu_{0w}=c_1\left(\pi_{t1}p_1p_2-\frac{1}{2}p^2\right)+c_2\left(p_1q_2-pq\right)+c_3\left(\pi_{x1}q_1q_2-\frac{1}{2}q^2\right)= \mathbf{c}'\boldsymbol{\mu}\end{aligned}$$

where \(\boldsymbol {\mu }'=\left (\pi _{t1}p_1p_2-\frac {1}{2}p^2, p_1q_2-pq , \pi _{x1}q_1q_2-\frac {1}{2}q^2\right ), \mathbf {c}'=(c_1, c_2, c_3)\) with c 1 + 2c 2 + c 3 = 1.

We assume that det(Σ 0) > 0, i.e., Σ 0 is positive definite. Maximizing \(\displaystyle \frac {|\mu _{1w}-\mu _{0w}|}{\sigma _{0w}},\) subject to c 1 + 2c 2 + c 3 = 1, with respect to c corresponds to maximizing the Lagrange function:

$$\displaystyle \begin{aligned}O( \mathbf{c}, \lambda)= \displaystyle \left| \mathbf{c}'\mu\right|\left( \mathbf{c}'\varSigma_0 \mathbf{c}\right)^{-\frac{1}{2}}-\lambda( \mathbf{c}'\mathbf{b}-1)\end{aligned}$$

with respect to the vector \(\mathbf {c}\) and λ, where λ is the Lagrange multiplier and \(\mathbf {b}'=(1, 2, 1)\). Letting \(K(\mathbf {c})=sign(\mathbf {c}'\mu )[( \mathbf {c}'\varSigma _0 \mathbf {c})^{-\frac {3}{2}}]\), we have

$$\displaystyle \begin{aligned} \frac{\partial}{\partial \mathbf{c}} O( \mathbf{c}, \lambda)&= K(\mathbf{c})\left[ ( \mathbf{c}'\varSigma_0 \mathbf{c})\mu-(\varSigma_0 \mathbf{c})( \mathbf{c}'\mu)\right]-\lambda \mathbf{b}=0{} \end{aligned} $$
(1.18)
$$\displaystyle \begin{aligned} \frac{\partial}{\partial \lambda} O( \mathbf{c}, \lambda)&=\mathbf{c}'\mathbf{b}-1=0 {} \end{aligned} $$
(1.19)

From (1.18) and (1.19), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} 0&\displaystyle =&\displaystyle \mathbf{c}'\left\{K(\mathbf{c})\left[ ( \mathbf{c}'\varSigma_0 \mathbf{c})\mu-(\varSigma_0 \mathbf{c})( \mathbf{c}'\mu)\right]-\lambda \mathbf{b}\right\}\\ &\displaystyle =&\displaystyle K(\mathbf{c})\left[ ( \mathbf{c}'\varSigma_0 \mathbf{c}) \mathbf{c}'\mu-( \mathbf{c}'\varSigma_0 \mathbf{c})( \mathbf{c}'\mu)\right]-\lambda \mathbf{c}'\mathbf{b}=-\lambda, \end{array} \end{aligned} $$

because both \((\mathbf {c}'\varSigma _0\mathbf {c})\) and \((\mathbf {c}'\mu )\) are scalars and \(\mathbf {c}'\mathbf {b} = c_1 + 2c_2 + c_3 = 1\). Hence \(\lambda = 0\).

Then, Eq. (1.18) implies (c ′Σ 0 c)μ = (Σ 0 c)(c ′μ), i.e., \(\displaystyle \mu =(\varSigma _0 \mathbf {c})\frac {( \mathbf {c}'\mu )}{( \mathbf {c}'\varSigma _0 \mathbf {c})}=\varSigma _0\frac {( \mathbf {c}'\mu )}{( \mathbf {c}'\varSigma _0 \mathbf {c})}\mathbf {c}.\) Since we assume that the matrix \(\varSigma _0^{-1}\) exists, this implies

$$\displaystyle \begin{aligned} \displaystyle\varSigma_0^{-1}\mu=\frac{( \mathbf{c}'\mu)}{( \mathbf{c}'\varSigma_0 \mathbf{c})}\mathbf{c} \end{aligned} $$
(1.20)

and thus, \(\displaystyle \mathbf {b}'\varSigma _0^{-1}\mu =\frac {( \mathbf {c}'\mu )}{( \mathbf {c}'\varSigma _0 \mathbf {c})}\mathbf {b}'\mathbf {c}=\frac {( \mathbf {c}'\mu )}{( \mathbf {c}'\varSigma _0 \mathbf {c})}\).

Replacing \(\displaystyle \frac {( \mathbf {c}'\mu )}{( \mathbf {c}'\varSigma _0 \mathbf {c})}\) by \(\displaystyle \mathbf {b}'\varSigma _0^{-1}\mu \) in Eq. (1.20) yields \(\displaystyle \varSigma _0^{-1}\mu =\displaystyle (\mathbf {b}'\varSigma _0^{-1}\mu )\mathbf {c}.\) Therefore, the optimal weight-vector is

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} {\mathbf{c}}_{opt}=\frac{\varSigma_0^{-1}\mu}{\mathbf{b}'\varSigma_0^{-1}\boldsymbol{\mu}}, \end{array} \end{aligned} $$
(1.21)

as long as \(\mathbf {b}'\varSigma _0^{-1}\boldsymbol {\mu }\neq 0\).

In addition, since \(\varSigma _0\) is positive definite, we can show that the border-preserving principal minors of order \(k > 2\) have sign \((-1)^k\). Therefore, \( \displaystyle {\mathbf {c}}_{opt}=\frac {\varSigma _0^{-1}\mu }{\mathbf {b}'\varSigma _0^{-1}\boldsymbol {\mu }}\) maximizes \(O(\mathbf {c})\).

Let us define two vectors \(\mathbf {d_1}'=(1, 1, 0)\) and \(\mathbf {d_2}'=\mathbf {b}'-\mathbf {d_1}'=(0, 1, 1)\). To calculate \(w_1\) and \(w_2\), we just need to consider the relationships \(\mathbf {c}'=(w_1^2, w_1w_2, w_2^2)\) and \(w_1 + w_2 = 1\). We have \(\mathbf {d_1}'\mathbf {c}=w_1^2+w_1(1-w_1)=w_1.\) Therefore, using the result given in Eq. (1.21), we can deduce \(\displaystyle w_1=\mathbf {d_1}'\mathbf {c}=\frac {\mathbf {d_1}'\varSigma _0^{-1}\mu }{\mathbf {b}'\varSigma _0^{-1}\boldsymbol {\mu }}\) and \(\displaystyle w_2=1-\mathbf {d_1}'\mathbf {c}=\frac {(\mathbf {b}'-\mathbf {d_1}')\varSigma _0^{-1}\mu }{\mathbf {b}'\varSigma _0^{-1}\boldsymbol {\mu }}=\frac {\mathbf {d_2}'\varSigma _0^{-1}\mu }{\mathbf {b}'\varSigma _0^{-1}\boldsymbol {\mu }}.\)
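Equation (1.21) and the expressions for \(w_1\) and \(w_2\) translate into a few lines of linear algebra. The following is a sketch under stated assumptions rather than a reference implementation: it relies on NumPy, the function names are hypothetical, and \(\varSigma_0\) and \(\boldsymbol{\mu}\) are taken as given inputs (any numbers passed in for illustration are made up). The helper `signal_to_noise` evaluates the objective \(|\mathbf{c}'\boldsymbol{\mu}|/\sqrt{\mathbf{c}'\varSigma_0\mathbf{c}}\) so that the optimum can be compared against other feasible weight vectors.

```python
import numpy as np


def optimal_weights(sigma0, mu):
    """Return (c_opt, w1, w2) from Eq. (1.21); sigma0 must be positive definite."""
    b = np.array([1.0, 2.0, 1.0])
    d1 = np.array([1.0, 1.0, 0.0])
    z = np.linalg.solve(sigma0, mu)      # Sigma_0^{-1} mu, without forming the inverse
    denom = b @ z                        # b' Sigma_0^{-1} mu
    if np.isclose(denom, 0.0):
        raise ValueError("b' Sigma_0^{-1} mu is zero; Eq. (1.21) is undefined")
    c_opt = z / denom
    w1 = float(d1 @ c_opt)               # w1 = d1' c_opt
    return c_opt, w1, 1.0 - w1           # w2 = 1 - w1 = d2' c_opt


def signal_to_noise(c, sigma0, mu):
    """Objective |c'mu| / sqrt(c' Sigma_0 c) maximized by c_opt."""
    c = np.asarray(c, dtype=float)
    return abs(c @ mu) / np.sqrt(c @ sigma0 @ c)
```

Solving the linear system instead of inverting \(\varSigma_0\) explicitly is a numerical convenience and does not change the result.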

Appendix 4: Conditional Probabilities

1.1 Exponential Distribution

Suppose that the death times \(T_1, T_2\) follow exponential distributions with hazards \(\lambda _1, \lambda _2\), respectively, and denote \(\displaystyle \theta =\frac {\lambda _1}{\lambda _2}\) and \(q_2=e^{-\lambda _2T_{max}},\) so that \(q_1=e^{-\lambda _1T_{max}}=q_2^{\theta }.\) Given that \(P(\delta _{1k} = 1) = p_1\) and \(P(\delta _{2l} = 1) = p_2\), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} \pi_{t1}&\displaystyle =&\displaystyle P(T_{1k}<T_{2l}|\delta_{1k}=\delta_{2l}=1)= (p_1p_2)^{-1}\int_0^{T_{max}} (1-e^{-\lambda_1u})\lambda_2e^{-\lambda_2u}du\\ &\displaystyle =&\displaystyle \frac{1}{(1-q_2^{\theta})}\left[1- \frac{1-{q_2^{(1+\theta)}}}{(1+\theta)(1-q_2)}\right];\\ \pi_{t2}&\displaystyle =&\displaystyle P(T_{1k}<T_{2l}, T_{1k'}<T_{2l}|\delta_{1k}=\delta_{1k'}=\delta_{2l}=1)\\ &\displaystyle =&\displaystyle p_1^{-2}p_2^{-1}\int_0^{T_{max}} (1-e^{-\lambda_1u})^2\lambda_2e^{-\lambda_2u}du\\ &\displaystyle = &\displaystyle (1-q_2^{\theta})^{-2}\left\{1+\frac{1}{(1-q_2)}\left[\frac{1-q_2^{(1+2\theta)}}{1+2\theta}-\frac{2(1-q_2^{(1+\theta)})}{1+\theta}\right]\right\};\\ \pi_{t3}&\displaystyle =&\displaystyle P(T_{1k}<T_{2l}, T_{1k}<T_{2l'}|\delta_{1k}=\delta_{2l}=\delta_{2l'}=1)\\ &\displaystyle =&\displaystyle p_1^{-1}p_2^{-2}\int_0^{T_{max}} (e^{-\lambda_{2}T_{max}}-e^{-\lambda_2u})^2\lambda_1e^{-\lambda_1u}du\\ &\displaystyle =&\displaystyle \left(\frac{q_2}{1-q_2}\right)^2\left[1 +\frac{\theta(1-q_2^{(2+\theta)})}{(2+\theta)(1-q_2^{\theta})q_2^2}-\frac{2\theta(1-q_2^{(1+\theta)})}{(1+\theta)(1-q_2^{\theta})q_2}\right]. \end{array} \end{aligned} $$
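The closed forms above are easy to mistype, so a small numerical check may help. The sketch below uses only the Python standard library; the function name and argument names are hypothetical. When \(\lambda_1=\lambda_2\) (so \(\theta=1\)), the three probabilities should reduce to the null values \(1/2\), \(1/3\), and \(1/3\) derived in Appendix 1.

```python
import math


def exponential_pis(lam1, lam2, t_max):
    """Closed-form pi_t1, pi_t2, pi_t3 for exponential death times."""
    theta = lam1 / lam2
    q2 = math.exp(-lam2 * t_max)   # P(group-2 subject survives past t_max)
    p2 = 1.0 - q2
    p1 = 1.0 - q2 ** theta         # 1 - q1, with q1 = q2**theta

    pi_t1 = (1.0 - (1.0 - q2 ** (1 + theta)) / ((1 + theta) * p2)) / p1

    inner = ((1.0 - q2 ** (1 + 2 * theta)) / (1 + 2 * theta)
             - 2.0 * (1.0 - q2 ** (1 + theta)) / (1 + theta))
    pi_t2 = (1.0 + inner / p2) / p1 ** 2

    bracket = (1.0
               + theta * (1.0 - q2 ** (2 + theta)) / ((2 + theta) * p1 * q2 ** 2)
               - 2.0 * theta * (1.0 - q2 ** (1 + theta)) / ((1 + theta) * p1 * q2))
    pi_t3 = (q2 / p2) ** 2 * bracket

    return pi_t1, pi_t2, pi_t3
```

For instance, `exponential_pis(0.2, 0.1, 5.0)` corresponds to a hazard ratio \(\theta=2\), in which case \(\pi_{t1}\) exceeds one half because deaths occur earlier in group 1.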

1.2 Normal Distribution

Suppose that the nonfatal outcomes X 1, X 2 follow normal distributions \(N(\mu _{x_1}, \sigma _{x_1})\) and \(N(\mu _{x_2}, \sigma _{x_2})\), respectively.

Consider \(\displaystyle \varDelta _{x}{=}\frac {\mu _{x_2}-\mu _{x_1} }{\sqrt {\sigma _{x_1}^2+\sigma _{x_2}^2}}~\), \(\displaystyle \rho _{x_j}{=}\frac {\sigma _{x_j}^2}{\sigma _{x_1}^2+\sigma _{x_2}^2} \), and \(\displaystyle Z_{kl}= \frac {X_{1k}-X_{2l}-(\mu _{x_1}-\mu _{x_2})}{\sqrt {\sigma _{x_1}^2+\sigma _{x_2}^2}}\).

We can show that

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \pi_{x1}=P(X_{1k}<X_{2l})=\varPhi(\varDelta_{x}),\\ &\displaystyle &\displaystyle \pi_{x2}=P(X_{1k}<X_{2l}, X_{1k'}<X_{2l})=P(Z_{kl}<\varDelta_{x}, ~ Z_{k'l}<\varDelta_{x}),\\ &\displaystyle &\displaystyle \pi_{x3}=P(X_{1k}<X_{2l}, X_{1k}<X_{2l'})=P(Z_{kl}<\varDelta_{x}, ~Z_{kl'}<\varDelta_{x}), \end{array} \end{aligned} $$

where \( (Z_{kl},Z_{k'l})~\sim N\left ( \left (\begin {array}{l} 0\\ 0 \end {array}\right ) , \left (\begin {array}{lr} 1&\rho _{x_2}\\ \rho _{x_2}&1 \end {array}\right )\right ) \) and \( (Z_{kl},Z_{kl'})~\sim N\left ( \left (\begin {array}{l} 0\\ 0 \end {array}\right ) , \left (\begin {array}{lr} 1&\rho _{x_1}\\ \rho _{x_1}&1 \end {array}\right )\right ). \)
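These probabilities can be evaluated without a bivariate normal routine. The sketch below swaps in a different but equivalent computation: rather than calling a bivariate normal CDF, it conditions on the shared observation and integrates numerically with a midpoint rule, using \(\pi_{x2}=E[\varPhi((X_{2l}-\mu_{x_1})/\sigma_{x_1})^2]\) and \(\pi_{x3}=E[\{1-\varPhi((X_{1k}-\mu_{x_2})/\sigma_{x_2})\}^2]\). The function names, the grid size, and the integration width are assumptions of this illustration, and the second argument of each normal distribution is treated as a standard deviation.

```python
import math


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_pis(mu1, sd1, mu2, sd2, steps=4000, width=10.0):
    """Return (pi_x1, pi_x2, pi_x3) for X1 ~ N(mu1, sd1^2), X2 ~ N(mu2, sd2^2)."""
    delta = (mu2 - mu1) / math.sqrt(sd1 ** 2 + sd2 ** 2)
    pi_x1 = std_normal_cdf(delta)

    def expect(g, mu, sd):
        # Midpoint-rule approximation of E[g(X)] for X ~ N(mu, sd^2)
        h = 2.0 * width * sd / steps
        total = 0.0
        for i in range(steps):
            x = mu - width * sd + (i + 0.5) * h
            total += g(x) * std_normal_pdf((x - mu) / sd) / sd
        return total * h

    # Two group-1 subjects compared with the same group-2 subject
    pi_x2 = expect(lambda x: std_normal_cdf((x - mu1) / sd1) ** 2, mu2, sd2)
    # One group-1 subject compared with two group-2 subjects
    pi_x3 = expect(lambda x: (1.0 - std_normal_cdf((x - mu2) / sd2)) ** 2, mu1, sd1)
    return pi_x1, pi_x2, pi_x3
```

When the two means coincide (\(\varDelta_x=0\)), the results can be compared with the standard orthant probability \(1/4+\arcsin(\rho)/(2\pi)\) for a bivariate normal with correlation \(\rho\).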

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Matsouaka, R.A., Singhal, A.B., Betensky, R.A. (2018). Optimal Weighted Wilcoxon–Mann–Whitney Test for Prioritized Outcomes. In: Zhao, Y., Chen, DG. (eds) New Frontiers of Biostatistics and Bioinformatics. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-99389-8_1
