Abstract
Testing for no association between two binary variables is an important inferential problem. Tests based on the sample odds ratio are commonly used. We introduce a competing test based on the Pearson correlation coefficient. Notably, the odds ratio does not extend to higher-order contingency tables, whereas the Pearson correlation does, so it is important to understand how the Pearson correlation stacks up against the odds ratio in 2 x 2 contingency tables. Another measure of association is the canonical correlation. In this paper, we examine how competitive the Pearson correlation is relative to the odds ratio in terms of power in the binary context, contrasting both further with the Wald Z and Rao score tests. We generated an extensive collection of joint distributions of the binary variables and estimated the power of the tests under each joint alternative distribution from random samples. The consensus is that none of the tests dominates the others.
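As a concrete illustration of the quantities being compared, the sketch below computes the sample odds ratio, the phi (Pearson) correlation, and the corresponding large-sample test statistics for a hypothetical 2 x 2 table. The cell counts are invented for illustration and do not come from the paper's simulations.

```python
# Hypothetical 2 x 2 table of counts (illustrative values only).
import math

n11, n12, n21, n22 = 30, 20, 15, 35
n = n11 + n12 + n21 + n22

# Sample odds ratio.
odds_ratio = (n11 * n22) / (n12 * n21)

# Phi (Pearson) correlation for a 2 x 2 table.
phi = (n11 * n22 - n12 * n21) / math.sqrt(
    (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
)

# Wald Z statistic for the log odds ratio (standard large-sample form).
se_log_or = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
z_or = math.log(odds_ratio) / se_log_or

# Test based on phi: under independence, sqrt(n) * phi is
# asymptotically standard normal (see Step 14 of the Appendix).
z_phi = math.sqrt(n) * phi

print(odds_ratio, phi, z_or, z_phi)
```

Both statistics are referred to the standard normal distribution; the power comparison in the paper asks which rejects more often under a given alternative.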
Ethics declarations
Conflicts of interest
The corresponding author states that, on behalf of all authors, there is no conflict of interest.
This article is part of the topical collection "Celebrating the Centenary of Professor C. R. Rao" guest edited by Ravi Khattree, Sreenivasa Rao Jammalamadaka, and M. B. Rao.
Appendix
Asymptotic variance of the maximum likelihood estimator of the Pearson correlation \(\phi \). Steps:
1. Joint distribution of X and Y:
$$Q = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
2. Pearson correlation:
$$\rho = \frac{ad - bc}{\sqrt{(a+b)(a+c)(c+d)(b+d)}} = \phi = UV^{-1/2}, \quad \text{where } U = ad - bc \text{ and } V = (a+b)(a+c)(c+d)(b+d)$$
3. Generate data:
$$D = \begin{pmatrix} n_{11} & n_{12} \\ n_{21} & n_{22} \end{pmatrix}$$
4. Estimator of Q:
$$\widehat{Q} = \begin{pmatrix} \frac{n_{11}}{n} & \frac{n_{12}}{n} \\ \frac{n_{21}}{n} & \frac{n_{22}}{n} \end{pmatrix}$$
For ease in the description of the asymptotic formula, use a simple notation for the entries of \(\widehat{Q}\):
$$\widehat{Q} = \begin{pmatrix} j & k \\ l & m \end{pmatrix}$$
5. Estimate of \(\rho \):
$$\widehat{\rho } = \frac{jm - lk}{\sqrt{(j+k)(j+l)(l+m)(k+m)}} = f(j,k,l,m) = xy^{-1/2}, \quad \text{where } x = jm - lk \text{ and } y = (j+k)(j+l)(l+m)(k+m)$$
6. Asymptotic variance of \(\widehat{\rho }\) by the delta method, with the partial derivatives evaluated at the expectations \(j = E(j),\, k = E(k),\, l = E(l),\, m = E(m)\):
$$\begin{aligned} \text{Asymptotic variance}&= \left( \frac{\partial f}{\partial j} \right)^{2} \text{var}(j) + \left( \frac{\partial f}{\partial k} \right)^{2} \text{var}(k) + \left( \frac{\partial f}{\partial l} \right)^{2} \text{var}(l) + \left( \frac{\partial f}{\partial m} \right)^{2} \text{var}(m) \\&\quad + 2 \frac{\partial f}{\partial j} \frac{\partial f}{\partial k} \text{cov}(j,k) + 2 \frac{\partial f}{\partial j} \frac{\partial f}{\partial l} \text{cov}(j,l) + 2 \frac{\partial f}{\partial j} \frac{\partial f}{\partial m} \text{cov}(j,m) \\&\quad + 2 \frac{\partial f}{\partial k} \frac{\partial f}{\partial l} \text{cov}(k,l) + 2 \frac{\partial f}{\partial k} \frac{\partial f}{\partial m} \text{cov}(k,m) + 2 \frac{\partial f}{\partial l} \frac{\partial f}{\partial m} \text{cov}(l,m) \end{aligned}$$
7. Calculate the variances and covariances (multinomial sampling):
$$\begin{aligned} \text{var}(j)&= \frac{a(1-a)}{n}; \quad \text{var}(k) = \frac{b(1-b)}{n}; \quad \text{var}(l) = \frac{c(1-c)}{n}; \quad \text{var}(m) = \frac{d(1-d)}{n} \\ \text{cov}(j,k)&= -\frac{ab}{n}; \quad \text{cov}(j,l) = -\frac{ac}{n}; \quad \text{cov}(j,m) = -\frac{ad}{n} \\ \text{cov}(k,l)&= -\frac{bc}{n}; \quad \text{cov}(k,m) = -\frac{bd}{n}; \quad \text{cov}(l,m) = -\frac{cd}{n} \end{aligned}$$
8.
$$\begin{aligned} \frac{\partial f}{\partial j}&= x \frac{\partial y^{-1/2}}{\partial j} + y^{-1/2} \frac{\partial x}{\partial j} \\&= -\frac{1}{2} x y^{-3/2} \frac{\partial y}{\partial j} + y^{-1/2} \frac{\partial x}{\partial j} \\&= -\frac{1}{2} x y^{-1/2} y^{-1} (2j+k+l)(l+m)(k+m) + y^{-1/2} m \end{aligned}$$
9.
$$\begin{aligned} \left( \frac{\partial f}{\partial j} \right)_{j=E(j),\, k=E(k),\, l=E(l),\, m=E(m)}&= -\frac{1}{2} U V^{-1/2} V^{-1} (2a+b+c)(c+d)(b+d) + V^{-1/2} d \\&= -\frac{1}{2} \rho V^{-1} (2a+b+c)(c+d)(b+d) + V^{-1/2} d \end{aligned}$$
10.
$$\left( \frac{\partial f}{\partial k} \right)_{j=E(j),\, k=E(k),\, l=E(l),\, m=E(m)} = -\frac{1}{2} \rho V^{-1} (2b+a+d)(a+c)(c+d) - V^{-1/2} c$$
11.
$$\left( \frac{\partial f}{\partial l} \right)_{j=E(j),\, k=E(k),\, l=E(l),\, m=E(m)} = -\frac{1}{2} \rho V^{-1} (2c+a+d)(a+b)(b+d) - V^{-1/2} b$$
12.
$$\left( \frac{\partial f}{\partial m} \right)_{j=E(j),\, k=E(k),\, l=E(l),\, m=E(m)} = -\frac{1}{2} \rho V^{-1} (2d+b+c)(a+b)(a+c) + V^{-1/2} a$$
13. The expressions derived in Steps 7 through 12 are plugged into the asymptotic variance formula in Step 6.
14. If \(\rho = 0\), the asymptotic variance of \(\widehat{\rho }\) reduces to \(\frac{1}{n}\).
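The steps above can be checked numerically. The sketch below implements the delta-method variance with a finite-difference gradient standing in for the closed-form partials of Steps 8 through 12, and verifies the Step 14 result that the asymptotic variance of \(\widehat{\rho }\) is \(1/n\) under independence. The cell probabilities and sample size are illustrative choices, not values from the paper.

```python
# Delta-method asymptotic variance of the phi coefficient (Steps 1-14).
import numpy as np

def phi(p):
    # Phi coefficient from cell probabilities p = (a, b, c, d).
    a, b, c, d = p
    return (a * d - b * c) / np.sqrt((a + b) * (a + c) * (c + d) * (b + d))

def asymptotic_variance(p, n, eps=1e-6):
    # Gradient of phi by central finite differences (numerical stand-in
    # for the closed-form partial derivatives of Steps 8-12).
    grad = np.zeros(4)
    for i in range(4):
        up, dn = p.copy(), p.copy()
        up[i] += eps
        dn[i] -= eps
        grad[i] = (phi(up) - phi(dn)) / (2 * eps)
    # Multinomial covariance of the cell proportions (Step 7):
    # var = p_i (1 - p_i) / n on the diagonal, cov = -p_i p_j / n off it.
    cov = (np.diag(p) - np.outer(p, p)) / n
    return grad @ cov @ grad

# Independent margins (ad = bc), so rho = 0 and the variance should be 1/n.
p0 = np.array([0.12, 0.28, 0.18, 0.42])  # a, b, c, d
n = 500
print(phi(p0))                         # ~0
print(asymptotic_variance(p0, n) * n)  # ~1, i.e. variance ~ 1/n
```

The chosen probabilities factor as products of their margins (e.g. \(a = 0.4 \times 0.3\)), so the table is exactly independent and the Step 14 simplification applies.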
About this article
Cite this article
Bhuiyan, M.A.N., Wathen, M. & Rao, M. Power Comparisons in Contingency Tables. J Stat Theory Pract 15, 64 (2021). https://doi.org/10.1007/s42519-021-00199-8