Behavior Research Methods, Volume 51, Issue 1, pp 258–279

Probability of bivariate superiority: A non-parametric common-language statistic for detecting bivariate relationships

  • Johnson Ching-Hong Li
  • Rory M. Waisman


Researchers often focus on the bivariate normal correlation (r) to evaluate bivariate relationships. However, this approach assumes linearity and depends on parametric assumptions. We propose a new nonparametric statistical model that can be more intuitively understood than the conventional r: probability of bivariate superiority (PBS). Our development of Bp, the estimator of a PBS relationship, extends Dunlap’s (1994) common-language transformation of r (CLr) by providing a method to directly estimate PBS, that is, the probability that when x is above (or below) the mean of all X, its paired y score will also be above (or below) the mean of all Y. Probability of superiority is an important form of bivariate relationship that until now could only be accurately estimated when data met the parametric assumptions for r. We specify the copula that forms the theoretical basis for PBS, provide an algorithm for estimating PBS from a sample, and describe the results of a Monte Carlo experiment that evaluated our algorithm across 448 data conditions. The PBS estimate, Bp, is robust to violations of parametric assumptions and offers a useful method for evaluating the significance of probability-of-superiority relationships in bivariate data. It is critical to note that Bp estimates a different form of bivariate relationship than does r. Our working examples show that a PBS effect can be significant in the absence of a significant correlation, and vice versa. In addition to utilizing the PBS model in future research, we suggest that this new statistical procedure be used to find theoretically important but previously overlooked effects from past studies.
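The quantity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Bp algorithm, which the abstract does not spell out. The function names `concordance_about_means` and `dunlap_cl_r` are hypothetical, the sample proportion shown is only a naive estimate of the stated probability, and counting ties with a mean as non-concordant is an assumption made here for simplicity. The second function uses the commonly cited form of Dunlap's (1994) transformation, CLr = arcsin(r)/π + 0.5, which applies under bivariate normality.

```python
import math
from statistics import mean


def concordance_about_means(x, y):
    """Naive sample proportion of pairs in which x and y lie on the same
    side of their respective sample means (both above or both below).

    Assumption: a score exactly equal to its mean counts as non-concordant.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x, mean_y = mean(x), mean(y)
    same_side = sum(
        1
        for xi, yi in zip(x, y)
        if (xi > mean_x and yi > mean_y) or (xi < mean_x and yi < mean_y)
    )
    return same_side / len(x)


def dunlap_cl_r(r):
    """Common-language transformation of a correlation r under bivariate
    normality: arcsin(r) / pi + 0.5."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    return math.asin(r) / math.pi + 0.5


# Made-up illustrative data, not taken from the article.
x_scores = [1, 2, 3, 4, 5, 6]
y_scores = [2, 1, 4, 3, 6, 5]
print(concordance_about_means(x_scores, y_scores))
print(dunlap_cl_r(0.5))
```

Comparing the two printed values on real data gives a rough sense of how far a direct, assumption-free proportion can sit from the value implied by r under normality; interval estimation and significance testing are outside this sketch.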


Keywords

Bivariate relationships; Correlation; Probability of superiority; Common language; Effect size

Supplementary material

ESM 1: 13428_2018_1089_MOESM1_ESM.docx (DOCX, 19 kb)


References

  1. Blomqvist, N. (1950). On a measure of dependence between two random variables. Annals of Mathematical Statistics, 21, 593–600.
  2. Botev, Z. I. (2017). The normal law under linear restrictions: Simulation and estimation via minimax tilting. Journal of the Royal Statistical Society: Series B, Statistical Methodology, 79, 125–148.
  3. Bradley, J. (1982). The insidious L-shaped distribution. Bulletin of the Psychonomic Society, 20, 85–88.
  4. Brooks, M. E., Dalal, D. K., & Nolan, K. P. (2014). Are common language effect sizes easier to understand than traditional effect sizes? Journal of Applied Psychology, 99, 332–340.
  5. Canty, A., & Ripley, B. (2016). boot: Bootstrap R (S-Plus) functions (R package version 1.3-18). Retrieved from
  6. Chan, W., & Chan, W.-L. (2004). Bootstrap standard error and confidence intervals for the correlation corrected for range restriction: A simulation study. Psychological Methods, 9, 369–385.
  7. Cliff, N. (1993). Dominance statistics: Ordinal analyses to answer ordinal questions. Psychological Bulletin, 114, 494–509.
  8. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
  9. Dunlap, W. P. (1994). Generalizing the common language effect size indicator to bivariate normal correlations. Psychological Bulletin, 116, 509–511.
  10. Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York, NY: Chapman & Hall.
  11. Gal, I. (2002). Adults’ statistical literacy: Meanings, components, responsibilities. International Statistical Review, 70, 1–51.
  12. Grissom, R. (1994). Probability of the superior outcome of one treatment over another. Journal of Applied Psychology, 79, 314–316.
  13. Hogg, R., & Craig, A. (1971). Introduction to mathematical statistics (4th ed.). New York, NY: Macmillan.
  14. Howell, D. C. (2013). Statistical methods for psychology (8th ed.). Belmont, CA: Wadsworth.
  15. Huberty, C. J., & Lowman, L. L. (2000). Group overlap as a basis for effect size. Educational and Psychological Measurement, 60, 543–563.
  16. Jaworski, P., Durante, F., Härdle, W. K., & Rychlik, T. (Eds.). (2010). Copula theory and its applications: Proceedings of the workshop held in Warsaw, 25–26 September 2009. Berlin, Germany: Springer.
  17. Pearson, K. (1895). VII. Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society, 58, 240–242.
  18. Kendall, M. (1938). A new measure of rank correlation. Biometrika, 30, 81–89.
  19. Kendall, M., & Stuart, A. (1977). The advanced theory of statistics (4th ed.). New York, NY: Macmillan.
  20. Kendall, M. G., & Gibbons, J. D. (1990). Rank correlation methods (5th ed.). London, UK: Edward Arnold.
  21. Lai, C., & Balakrishnan, N. (2009). Continuous bivariate distributions. New York, NY: Springer.
  22. Leech, N. L., & Onwuegbuzie, A. J. (2002). A call for greater use of nonparametric statistics. Retrieved from /ED471346.pdf
  23. Li, J. C.-H. (2015). Effect size measures in a two independent-samples case with non-normal and non-homogeneous data. Behavior Research Methods, 48, 1560–1574.
  24. Li, J. C.-H., Chan, W., & Cui, Y. (2011). Bootstrap standard error and confidence intervals for the correlations corrected for indirect range restriction. British Journal of Mathematical and Statistical Psychology, 64, 367–387.
  25. Ling, Y., & Nelson, P. I. (2014). Effect sizes for comparing two or more normal distributions based on maximal contrasts in outcomes. Statistical Methods & Applications, 23, 381–399.
  26. May, H. (2004). Making statistics more meaningful for policy research and program evaluation. American Journal of Evaluation, 25, 525–540.
  27. McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111, 361–365.
  28. Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105, 156–166.
  29. Mychasiuk, R. (2017). Behavioral and pathophysiological outcomes associated with caffeine consumption and repetitive mild traumatic brain injury (RmTBI) in adolescent rats (Scholars Portal Dataverse, V1). doi:10.5683/SP/8RODEV
  30. Nelsen, R. B. (2006). An introduction to copulas (2nd ed.). New York, NY: Springer.
  31. Onwuegbuzie, A. J., & Daniel, L. G. (2002). Uses and misuses of the correlation coefficient. Research in the Schools, 9, 73–90.
  32. R Core Team. (2016). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from
  33. Reshef, D. N., Reshef, Y. A., Finucane, H. K., Grossman, S. R., McVean, G., Turnbaugh, P. J., … Sabeti, P. C. (2011). Detecting novel associations in large data sets. Science, 334, 1518–1524.
  34. Rodgers, J. L., & Nicewander, W. A. (1988). Thirteen ways to look at the correlation coefficient. American Statistician, 42, 59–66. Retrieved from
  35. RStudio Team. (2016). RStudio: Integrated development for R (website). Boston, MA: RStudio, Inc. Retrieved from
  36. Ruscio, J. (2008). A probability-based measure of effect size: Robustness to base rates and other factors. Psychological Methods, 13, 19–30.
  37. Siegel, S. (1956). Nonparametric statistics for the behavioral sciences. New York, NY: McGraw-Hill.
  38. Tomitaka, S., Kawasaki, Y., Ide, K., Yamada, H., Miyake, H., & Furukawa, T. A. (2016). Distribution of total depressive symptoms scores and each depressive symptom item in a sample of Japanese employees. PLoS ONE, 11, e0147577.
  39. United Nations Economic Commission for Europe. (2009). Making data meaningful. Retrieved from Making_Data_Meaningful_Part_4_for_Web.pdf
  40. Vargha, A., & Delaney, H. D. (2000). A critique and improvement of the CL common language effect size statistics of McGraw and Wong. Journal of Educational and Behavioral Statistics, 25, 101–132.
  41. Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S (4th ed.). New York, NY: Springer.
  42. Wilcox, R. R. (2012). Introduction to robust estimation and hypothesis testing (3rd ed.). Amsterdam, The Netherlands: Elsevier.
  43. Wilcox, R. R., Granger, D. A., Szanton, S., & Clark, F. (2014). Cortisol diurnal patterns, associations with depressive symptoms, and the impact of intervention in older adults: Results using modern robust methods aimed at dealing with low power due to violations of standard assumptions. Hormones and Behavior, 65, 219–225.
  44. Wolfe, D. A., & Hogg, R. V. (1971). On constructing statistics and reporting data. American Statistician, 25, 27–30.
  45. Wunch, D., Arrowsmith, C., & Heerah, S. (2017). GTA bike surveys June 28–July 19, 2017 (Scholars Portal Dataverse, V1).

Copyright information

© Psychonomic Society, Inc. 2018

Authors and Affiliations

  1. Lab for Research in Quantitative and Applied Statistical Psychology (LIQAS), Department of Psychology, University of Manitoba, Winnipeg, Canada
  2. School of Business, University of Alberta, Edmonton, Canada
