Common Influencing Factors Are No Evidence of Association: A Comment on Callander, Newman, and Holt (2015)
Keywords: Public Health; Linear Model; Statistical Method; Explanatory Variable; Sexual Behavior
In their article, Callander, Newman, and Holt (2015) claim to have found that “sexual racism” and “generic racist attitudes” are closely related. Their conclusion was based on the following procedure: In a survey among gay and bisexual men, “generic racist attitudes” were measured by the Quick Discrimination Index (QDI), and “sexual racism” was measured by an acceptance toward Online Sexual Racism (OSR) scale that Callander et al. introduced. Then, separate multivariate linear models for both measures were used to identify influencing factors among a set of variables (many of them demographic). Callander et al. reported that “Almost every identified factor associated with men’s racist attitudes was also related to their attitudes toward sexual racism…,” meaning that most explanatory variables were either significant in both linear models or in neither of them. From this, Callander et al. drew the following conclusion: “These similarities suggest that sexual racism is closely related to more general patterns of racism, although there are some distinctions to consider.”
In my opinion, these statistical methods are fundamentally unsuitable for supporting the hypothesis of an association or relation between the two variables. In general, two (or more) variables can share a common set of influencing factors while being uncorrelated or even independent (in the probabilistic sense). This is possible because, after controlling for the explanatory variables, the remaining partial correlation might be negative and offset the correlation induced by the shared factors. Below, I will illustrate this effect with a simple example (only one explanatory variable). Because of the unsuitable statistical analysis, the article by Callander et al. does not provide any evidence that there is an association between QDI and OSR.
Let me now give a mathematical example of two variables that share a common influencing factor but are uncorrelated with each other. Let $Z_1$, $Z_2$, $Z_3$, and $Z_4$ be four uncorrelated and standardized random variables, meaning that $\mathrm{Var}(Z_i) = 1$ for $i = 1,\dots,4$ and $\mathrm{Cov}(Z_i, Z_j) = 0$ for $i \neq j$. We observe $Y_1 = Z_1 + Z_2$ and $Y_2 = Z_3 + Z_4$. If we use $X = Z_2 + Z_3$ as an explanatory variable, we find the following: By the bilinearity of the covariance (see, for example, Theorem 5.7 in Klenke) and the fact that $Z_1,\dots,Z_4$ are uncorrelated, we get $\mathrm{Cov}(Y_1, X) = \mathrm{Cov}(Z_1 + Z_2, Z_2 + Z_3) = \mathrm{Cov}(Z_1, Z_2) + \mathrm{Cov}(Z_1, Z_3) + \mathrm{Cov}(Z_2, Z_2) + \mathrm{Cov}(Z_2, Z_3) = \mathrm{Var}(Z_2) = 1$, and by the same arguments $\mathrm{Cov}(Y_2, X) = 1$. So, $X$ is positively correlated with both $Y_1$ and $Y_2$. However, $\mathrm{Cov}(Y_1, Y_2) = \mathrm{Cov}(Z_1 + Z_2, Z_3 + Z_4) = \mathrm{Cov}(Z_1, Z_3) + \mathrm{Cov}(Z_1, Z_4) + \mathrm{Cov}(Z_2, Z_3) + \mathrm{Cov}(Z_2, Z_4) = 0$, so $Y_1$ and $Y_2$ are uncorrelated, that is, there is no linear association between them.
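The covariances in this example can also be checked numerically. The following Python sketch is an illustration added here, not part of the original comment. It assumes, as in the example, that the four Z variables are uncorrelated with unit variance, so that the covariance of two linear combinations is simply the dot product of their coefficient vectors. The helper names `cov`, `corr`, and `partial_corr` are hypothetical, and the partial correlation uses the standard first-order formula.

```python
import math

# Coefficients of each variable on (Z1, Z2, Z3, Z4). Because the Z_i are
# assumed uncorrelated with unit variance, Cov(A, B) reduces to the dot
# product of the coefficient vectors of A and B.
Y1 = (1, 1, 0, 0)  # Y1 = Z1 + Z2
Y2 = (0, 0, 1, 1)  # Y2 = Z3 + Z4
X = (0, 1, 1, 0)   # X  = Z2 + Z3


def cov(a, b):
    """Covariance of two linear combinations of the standardized Z_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


def corr(a, b):
    """Pearson correlation computed from the covariances."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))


def partial_corr(a, b, c):
    """First-order partial correlation of a and b, controlling for c."""
    r_ab, r_ac, r_bc = corr(a, b), corr(a, c), corr(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))


print("Cov(Y1, X)  =", cov(Y1, X))
print("Cov(Y2, X)  =", cov(Y2, X))
print("Cov(Y1, Y2) =", cov(Y1, Y2))
print("partial corr(Y1, Y2 | X) =", partial_corr(Y1, Y2, X))
```

Under these assumptions, the script reproduces Cov(Y1, X) = Cov(Y2, X) = 1 and Cov(Y1, Y2) = 0. The partial correlation of Y1 and Y2 given X comes out as -1/3, which is the negative partial correlation that cancels the dependence induced by the common factor X.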