Optimal Weighted Wilcoxon–Mann–Whitney Test for Prioritized Outcomes

  • Roland A. Matsouaka
  • Aneesh B. Singhal
  • Rebecca A. Betensky
Part of the ICSA Book Series in Statistics book series (ICSABSS)


We consider a two-group randomized clinical trial of prioritized endpoints, where mortality affects the assessment of a follow-up continuous outcome. With the continuous outcome as the principal outcome, we combine it with mortality via the worst-rank paradigm into a single composite endpoint. Then, we develop a weighted Wilcoxon–Mann–Whitney test statistic to analyze the data. We determine the optimal weights for the Wilcoxon–Mann–Whitney test statistic that maximize its power. We provide the rationale for the weights and their implications in the application of the method. In addition, we derive a formula for its power and demonstrate its accuracy in simulations. Finally, we apply the method to data from an acute ischemic stroke clinical trial of normobaric oxygen therapy.
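As a rough illustration of the worst-rank idea described above, the following Python sketch compares every treated patient with every control patient and splits each comparison into a mortality part and a continuous-outcome part, which are then combined with two weights. It is a minimal sketch under stated assumptions, not the authors' estimator: the function names, the tuple layout `(death_time, outcome)`, the convention that a larger outcome is better, and the weights `w_mortality` and `w_outcome` are all hypothetical, and the optimal weights derived in the chapter are not reproduced here.

```python
from itertools import product


def sign(a, b):
    """Return +1 if a > b, -1 if a < b, and 0 if the two values are tied."""
    return (a > b) - (a < b)


def pair_components(p, q):
    """Compare patient p (treatment arm) with patient q (control arm).

    Each patient is a tuple (death_time, outcome). death_time is None for a
    survivor, and outcome is None for a patient who died before follow-up.
    Assumption: a later death and a larger outcome are both "better".
    Returns (mortality_part, outcome_part), each in {-1, 0, +1}; +1 favours p.
    """
    tp, xp = p
    tq, xq = q
    if tp is None and tq is None:   # both survived: compare the outcome
        return 0, sign(xp, xq)
    if tp is None:                  # only q died: the survivor p ranks higher
        return 1, 0
    if tq is None:                  # only p died: the survivor q ranks higher
        return -1, 0
    return sign(tp, tq), 0          # both died: the later death ranks higher


def weighted_wmw(treated, control, w_mortality=1.0, w_outcome=1.0):
    """Weighted average of the pairwise comparisons over all (p, q) pairs."""
    total = 0.0
    for p, q in product(treated, control):
        m, x = pair_components(p, q)
        total += w_mortality * m + w_outcome * x
    return total / (len(treated) * len(control))


# Hypothetical toy data: (death_time, outcome)
treated = [(None, 5.0), (None, 3.0), (10.0, None)]
control = [(None, 4.0), (4.0, None), (7.0, None)]
print(weighted_wmw(treated, control))
```

With equal weights the sketch reduces to an ordinary worst-rank comparison in which death always ranks below survival. Changing the two weights shifts the emphasis between the mortality and the continuous-outcome components, which is the degree of freedom the weighted test exploits.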



This work was supported by grants P50-NS051343, R01-CA075971, T32 NS048005, 1RO1HL118336-01, and UL1TR001117 awarded by the National Institutes of Health. The content of this paper is solely the responsibility of the authors and does not necessarily represent the official view of the National Institutes of Health.

Conflict of Interest: None declared.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Roland A. Matsouaka (1, 2)
  • Aneesh B. Singhal (3)
  • Rebecca A. Betensky (4)
  1. Department of Biostatistics and Bioinformatics, Duke University, Durham, USA
  2. Program for Comparative Effectiveness Methodology, Duke Clinical Research Institute, Duke University, Durham, USA
  3. Department of Neurology, Massachusetts General Hospital, Boston, USA
  4. Department of Biostatistics, Harvard T.H. Chan School of Public Health, and Harvard NeuroDiscovery Center, Harvard Medical School, Boston, USA
