Probing curvilinear-by-linear interactions when the predictors are randomly sampled

Liu, Yu

doi:10.3758/s13428-019-01276-4

Probing curvilinear-by-linear interactions when the predictors are randomly sampled

Published: 03 September 2019

Volume 52, pages 773–798, (2020)
Cite this article

Download PDF

Behavior Research Methods Aims and scope Submit manuscript

Probing curvilinear-by-linear interactions when the predictors are randomly sampled

Download PDF

Yu Liu¹

2967 Accesses
1 Citation
1 Altmetric
Explore all metrics

Abstract

Curvilinear effects and curvilinear-by-linear interactions are hypothesized, tested, and probed in various fields of the social and behavioral sciences. When such effects are found, researchers traditionally have tested the simple slope, and more recently have utilized the Johnson–Neyman technique, often at the values of the sample statistics of the predictor(s). Both approaches implicitly assume that values on the predictors have been sampled according to a fixed sampling plan. More typically in social and behavioral research, however, values on the predictors can be more appropriately considered as randomly sampled from a multivariate population distribution. I show analytically and through simulation that for regression models containing a curvilinear-by-linear interaction, fixed and random regression models produce the same estimates but different standard errors of the simple slope at values of the sample statistics of the predictors. When values on the predictors are randomly sampled, treating them as fixed when testing the simple slope or generating conditional Johnson–Neyman confidence bands at values of the sample statistics of the predictors can lead to inflated Type I error rates and inaccurate coverage rates. Recommendations for researchers are provided, and directions for future research are discussed.

A new criterion for assessing discriminant validity in variance-based structural equation modeling

Article Open access 22 August 2014

RMSEA, CFI, and TLI in structural equation modeling with ordered categorical data: The story they tell depends on the estimation methods

Article 04 June 2018

Mixed methods research: what it is and what it could be

Article Open access 29 March 2019

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Curvilinear effects and curvilinear-by-linear interactions are commonly hypothesized, tested, and probed in various fields of social and behavioral sciences. Examples include studies of job performance, job satisfaction, life satisfaction and turnover intention (Janssen, 2001), learning and performance in multidisciplinary teams (van der Vegt & Bunderson, 2005), creativity (Baer & Oldham, 2006), and voluntary turnover rates, organizational human resource management investment in employees, and workforce productivity and organizational financial performance (Shaw, Park, & Kim, 2013). When a significant interaction effect is found, researchers have traditionally followed-up the significant interaction effect by testing the simple slope of the effect of the focal predictor on the outcome at the values of the sample statistics of moderators, typically at the sample mean and one sample standard deviation above and below the sample mean (see Aiken & West, 1991). The probing of significant interactions more recently has been conducted using the Johnson–Neyman technique, which has been extended to models with curvilinear effects (J. W. Miller, Stromeyer, & Schwieterman, 2013; B. O. Muthén, Muthén, & Asparouhov, 2017). This study focuses on tests of the simple slope, followed by a discussion of the Johnson–Neyman approach applied to regression models with a curvilinear-by-linear interaction.

Simple slope analysis (and conditional Johnson–Neyman confidence bands) at sample statistics of the moderator typically use the asymptotic variance estimates of the regression coefficients to generate symmetric confidence intervals of the simple slope at values of the sample statistics of the predictors. In this process, researchers make the implicit assumption that the predictor values have been sampled according to a fixed sampling plan. Fixed sampling plans produce distributions of values on the predictors in which the means and variances of each sample are identical. In social and behavioral sciences, participants are rarely selected according to a fixed sampling plan. More typically, random samples or convenience samples are selected so that the distributions of the values on the predictors will vary from sample to sample and should theoretically be treated as if they were random. Although testing the simple slopes at a priori specified fixed constant values of the random predictors using the fixed regression model is still expected to lead to appropriate statistical tests, often researchers are interested in testing the simple slope at predictor values that reflect the relative standing in the population (e.g., low, mean, high), rather than some fixed constant values of the predictors. Moreover, it is not uncommon in social and behavioral research that the randomly sampled predictors do not have substantively meaningful fixed constant values of interest, especially for predictors created as an aggregation of information across several subdomains of a construct.

Liu, West, Levy, and Aiken (2017) recently showed analytically and through simulation that for regression models with a linear-by-linear interaction, fixed and random regression models produce the same estimates, but different standard errors of the simple slopes at sample values of the statistics of the moderator variable. Liu et al. discussed some factors that can influence the seriousness of the issue associated with treating random predictors as fixed in probing linear-by-linear interactions at sample values of the statistics of the moderator. This study extends these findings analytically and through simulation studies to the case of probing curvilinear-by-linear interactions. In this case, the complex set of product terms in the regression equation lead to more complicated differences in the variance expressions of the simple slope at particular sample values of the statistics of predictors in the random versus fixed sampling frameworks. These more complicated differences in the variance expressions lead to more complicated predictions regarding factors that can influence the seriousness of the issue associated with treating random predictors as fixed in testing simple slopes at particular sample values of the statistics of the predictors. This study is also the first to examine the influence of covariates on these tests of the simple slope.

The organization of this article is as follows. First, I present a brief overview of the procedures for testing of regression models involving linear-by-linear interactions, quadratic effects, or curvilinear-by-linear interactions followed by tests of the simple slope. Second, I provide a brief review of current practice in the recent applied literature on testing and probing curvilinear-by-linear interactions using simple slope tests and the Johnson–Neyman confidence bands. Third, I present the results of a mathematical derivation showing that in probing a curvilinear-by-linear interaction, the variance expression of the simple slope at sample values of the statistics of the predictors are different under the random effects model versus the fixed effects model. On the basis of the derivation, predictions are made regarding factors that can influence the seriousness of the problem associated with ignoring the sampling variability of the sample statistics of random predictors when testing the simple slope at sample values of the statistics of the predictors in probing a curvilinear-by-linear interaction. Some of these factors can also influence the performance of the conditional Johnson–Neyman confidence bands to probe a curvilinear-by-linear interaction created using sample values of the statistics of the randomly sampled moderator. Fourth, a brief introduction to the fully Bayesian approach is presented. Fifth, a set of small-scale Monte Carlo simulations is presented to evaluate the predictions regarding factors that can influence the seriousness of the problem associated with ignoring the sampling variability of the statistics of random predictors when testing the simple slope at sample values of the statistics of the predictors in probing a curvilinear-by-linear interaction. A random regression percentile bootstrapping approach, a fully Bayesian approach, and a fixed regression approach that tests the simple slope at fixed constant values of the predictors were also evaluated as potential methods to address this problem. Next, I extend the Johnson–Neyman technique to random regression models with a curvilinear-by-linear interaction probed at sample values of the statistics of the moderator. I show that conditional Johnson–Neyman confidence bands created under the fixed regression framework using sample values of the statistics of the randomly sampled moderator also reflect incorrect standard errors. In contrast, random regression with percentile bootstrapping and the fully Bayesian approach can create revised conditional Johnson–Neyman confidence/ credibility bands that can account for the sampling variability of the sample statistics of the randomly sampled moderator. Finally, I provide a brief discussion, recommendations, an overview of limitations, and directions for future research.

Regression models with interaction terms

Although tests of interactions have been popular, researchers have primarily tested regression models only involving a linear-by-linear interaction,

$$ Y={\beta}_0+{\beta}_1X+{\beta}_2Z+{\beta}_3 XZ+e, $$

(1)

where X and Z are the two predictors, e is the residual assumed to follow a normal distribution, $ e\sim N\left(0,{\upsigma}_e^2\right) $, $ {\upsigma}_e^2 $ is the population variance of the residuals, and β₀, β_1, β₂, β₃ are the regression coefficients in the population, with β₃ representing the linear-by-linear interaction effect. When b₃, the sample estimate of β₃, is statistically significant, researchers may be interested in testing the simple slope of the effect of the focal predictor (say X) on the outcome Y at specific values of the moderator Z. This simple slope of the effect of X on Y is calculated as the first partial derivative of Eq. 1 with respect to X (e.g., Aiken & West, 1991):

$$ \frac{d\hat{Y}}{dX}={\beta}_1+{\beta}_3Z, $$

(2)

with the corresponding sample estimate being (b₁ + b₃Z).

However, in social and behavioral research, sometimes more complex forms of association are hypothesized and tested. One such form is the simple quadratic (curvilinear) association between the predictor of interest X and the outcome Y, described by

$$ Y={\beta}_0+{\beta}_1X+{\beta}_2{X}^2+e. $$

(3)

In Eq. 3, X can be viewed as a moderator of its own association with Y. When b₂, the sample estimate of β₂, is statistically significant, researchers may want to test the simple slope of the effect of X on Y at specific values of X. This simple slope is calculated as the first partial derivative of Eq. 3 with respect to X (e.g., Aiken & West, 1991; J. W. Miller et al., 2013):

$$ \frac{d\hat{Y}}{dX}={\beta}_1+2{\beta}_2X, $$

(4)

with the corresponding sample estimate being (b₁ + 2b₂X). This expression provides the slope of the tangent line to the curve defined by Eq. 3, and is a function of X.

One can see that the simple slope expression given by Eq. 4 follows the same functional form as the simple slope expression given by Eq. 2. Therefore, the probing of a significant X² quadratic term in the regression model described by Eq. 3 at sample values of the statistics of X has similar properties as the probing of a significant XZ interaction term in the regression model described by Eq. 1 at sample values of the statistics of Z, which is discussed at length in Liu et al. (2017) and will not be repeated further here.

On the other hand, when there is a quadratic component X² in the regression equation, the relationship between X and Y may also be modified by one or more moderators, resulting in even more complex equations for the relationships. Aiken and West (1991), Ganzach (1997), and J. W. Miller et al. (2013) have all described tests of various higher-order interactions involving quadratic components. In this study I focus on the test of the quadratic by linear interaction. The population multiple regression model with a quadratic-by-linear interaction (Aiken & West, 1991; see also J. W. Miller et al., 2013) is given by

$$ Y={\beta}_0+{\beta}_1X+{\beta}_2{X}^2+{\beta}_3Z+{\beta}_4 XZ+{\beta}_5{X}^2Z+e. $$

(5)

When b₅, the estimate of the population regression coefficient β₅ for the curvilinear-by-linear interaction term, is statistically significant, researchers may want to test the simple slope of the effect of the focal predictor on the outcome at specific values of the predictors. If Z is the focal predictor and X the moderator, then the simple slope is calculated as the first partial derivative of the regression equation with respect to Z (e.g., Aiken & West, 1991; J. W. Miller et al., 2013):

$$ \frac{d\hat{Y}}{dZ}={\beta}_3+{\beta}_4X+{\beta}_5{X}^2, $$

(6)

with the corresponding sample estimate being (b₃ + b₄X + b₅X²).

X may also be considered to be the focal predictor and Z the moderator—the distinction between the focal predictor and the moderator stems from the researchers’ conceptual perspective. In this case the simple slope is calculated as the first partial derivative of the regression equation with respect to X (e.g., Aiken & West, 1991; J. W. Miller et al., 2013):

$$ \frac{d\hat{Y}}{dX}={\beta}_1+2{\beta}_2X+{\beta}_4Z+2{\beta}_5 XZ, $$

(7)

with the corresponding sample estimate being (b₁ + 2b₂X + b₄Z + 2b₅XZ).

It is worth mentioning that sometimes researchers are interested in testing whether there is evidence of any curvilinear or linear relationship between X and Y at a particular value of Z (e.g., Aiken & West, 1991, pp. 84–86; Dawson, 2014). In a regression model defined by Eq. 5, this can be done by rearranging the regression equation as follows:

$$ Y={\beta}_0+{\beta}_3Z+\left({\beta}_1+{\beta}_4Z\right)X+\left({\beta}_2+{\beta}_5Z\right){X}^2+e. $$

(8)

In Eq. 8, (β₁+ β₄Z) describes the overall linear trend of the simple curve representing the regression of Y on X at a value of Z, and (β₂+ β₅Z) describes the nature of the curvilinearity of the simple curve at a value of Z. To test whether there is any relationship between X and Y at a particular value of Z, one can center Z at the chosen value and test the increase in R² due to the inclusion of the X and X² terms, which is described in Dawson (2014). To test for evidence of any curvilinear relationship between X and Y at a particular value of Z, researchers would examine (b₂+ b₅Z), the sample estimate of (β₂+ β₅Z). This expression of the test of the existence of any curvilinear relationship between X and Y follows the same functional form as the expression of the simple slope of the effect of X on Y when probing a linear-by-linear interaction XZ given in Eq. 2. Therefore, the test of the existence of any curvilinear relationship between X and Y at sample values of the statistics of Z in a model containing a curvilinear-by-linear interaction has similar properties to the probing of a significant linear-by-linear XZ interaction at sample values of the statistics of Z, which is discussed at length in Liu et al. (2017).

Tests of the simple slope from the fixed regression framework

Testing the simple slope in Eqs. 2, 4, 6, or 7 involves estimating the regression coefficients that appear in the simple slope expression, picking a value of the predictor(s) that appear in the simple slope expression (i.e., X or Z or both), calculating the simple slope estimate, and testing whether the simple slope estimate is significantly different from zero. Typically researchers choose a few sample-based values for the predictor(s) that appear in the simple slope expression, often the values of the sample statistics M − SD, M, and M + SD, but other sample-based values such as the 25th and 75th sample percentiles may also be used.

The approach for testing the simple slope of the effect of the focal predictor on the outcome at sample statistics of the predictor(s) outlined above typically uses the asymptotic variance estimate of the regression coefficients to generate symmetric confidence intervals of the simple slope at sample values of the statistics of the predictors. If researchers try to make inferences from the results of such analyses to the population simple slope at the corresponding population statistics of the predictors (e.g., the significance of the simple slope of the effect of Z on Y described in Eq. 6 at the mean of X), instead of making inferences to the population simple slope at a priori fixed constant values of the moderator (e.g., the significance of the simple slope of the effect of Z on Y at X = 3), then an implicit assumption is made in this process that values on the predictors have been sampled according to a fixed sampling plan. As an example of a fixed sampling plan, in a study of the effect of pay disparity (defined as total CEO compensation divided by average total compensation of the top management team; Ridge, Aime, & White, 2015) on firm performance, certain numbers of firms with pay disparity of 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, and 4.5 would be selected to reflect the population frequencies of different values on pay disparity. A replication of this study would select a sample in which the numbers of firms with pay disparity of 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, and 4.5 are exactly the same as the first study. The fixed sampling plan implies that each sample will have identical distribution for the values of the predictor(s) with the same sample statistics (i.e., they do not vary from sample to sample).

Brief review of current practice in the applied literature

To gain a better understanding of current practice, I conducted a small review of applied literature from 2000 to the present that involved testing and probing curvilinear-by-linear interactions. I used two search criteria to identify 20 articles. I first examined journal articles that (a) were cited in J. W. Miller et al. (2013) or cited J. W. Miller et al., and (b) had found a significant curvilinear-by-linear interaction. This led to the first nine empirical articles summarized in Table 1. Then I entered “curvilinear interaction” in Google Scholar, defined the time range to be 2000 and on, and found the first 11 empirical articles that had revealed a significant curvilinear-by-linear interaction but were not cited in J. W. Miller et al. and did not cite J. W. Miller et al., resulting in a total of 20 articles (see Table 1). Eighteen out of the 20 articles reviewed used some form of convenience sampling, random sampling, or other sampling plan unrelated to the focal predictor or moderator in the analyses (e.g., sampling based on demographics), rather than fixed sampling of the predictors of interest.

Table 1 Summary of empirical articles after the year 2000 detecting significant curvilinear-by-linear interactions

Full size table

On the basis of the literature review, it seems that the use of a fixed sampling plan of predictors is rarely the case in social and behavioral research. Sometimes it is impractical to conduct fixed sampling of predictors but the predictors have meaningful constant values of interest. An example is the number of existing medical conditions of patients (Rast, Rush, Piccinin, & Hofer, 2014). It is also often the case that a fixed sampling plan is impractical and the predictors may not have substantively meaningful fixed constant values of interest. This applies to predictors created using any of the following three procedures.

(1)
Taking the averages of item scores of a Likert-scale measure whose response categories do not correspond to some fixed quantity (e.g., “often” encounters discrimination rather than encountering discrimination more than 10 times in the past 3 months).
(2)
Standardizing scores within each subdomain of the construct (using the mean and standard deviation of the sample), then taking the average of these z scores. An example is human resource management investments by the company that consists of training, pay level, benefit level, job security, procedural justice, and selective staffing (Shaw et al., 2013).
(3)
Using other means of calculation to aggregate information across several subdomains of the construct. Examples include diversity among team members in expertise, gender, or nationality calculated using Blau’s (1977) formula (van der Vegt & Bunderson, 2005).

When the predictors are sampled randomly rather than according to a fixed sampling plan, sample statistics of the predictors are estimates of the corresponding population statistics rather than fixed values. The uncertainty associated with the sample estimates of the population statistics is not considered in the estimation of the standard error of the test of the simple slope at sample values of the statistics of the predictors under the fixed regression framework. This can potentially lead to inappropriate significance test results and 95% confidence intervals with low coverage rates. This issue is discussed further in the next two sections.

Fixed sampling versus random sampling

The fixed-effects model assumes that the distributions of values on the predictor terms do not vary from sample to sample, but that values on the outcome Y conditional on the predictor scores are random. An important implication of the fixed-effects model is that sample values of the statistics of the predictors (such as M_X, M_Z, SD_X, SD_Z, and the 25th percentiles on X and Z) in each sample will be identical. In many social and behavioral studies, however, the predictors can be more appropriately considered as randomly sampled from a multivariate population distribution, and ideally an alternative and more realistic random effects regression model should be used.

Sampson (1974) showed that for a linear regression model that did not include any higher-order terms such as interaction or quadratic effects—for example, $ \hat{Y}={\beta}_0+{\beta}_1X+{\beta}_2Z $—the estimates of the regression coefficients, the Type I error rates of their statistical tests, and the Type I error rates of the statistical tests of general linear constraints on the regression coefficients were unbiased and did not differ between the fixed and random regression frameworks, assuming multivariate normality of the predictor terms. Importantly, Sampson’s proof was restricted to tests of a linear combination of the regression coefficients. For instance, when testing the simple slope of the effect of Z on Y when probing a significant curvilinear-by-linear interaction X²Z at an a priori fixed constant value of X = 2, the null hypothesis for this simple slope test is H₀: β₃ + β₄(2) + β₅(2)² = 0, which represents a linear combination of the regression coefficients. Following this, tests of the simple slope of the effect of the focal predictor on the outcome using the fixed-effects approach will be appropriate when either (a) a fixed sampling plan is used for the focal variable and moderator or (b) fixed constant values of the predictors are chosen at which to evaluate the simple slope. An example of case (b) would occur if substantively meaningful fixed constant values on a predictor (e.g., values of 2, 3, and 4 on the number of medical conditions of patients; Rast et al., 2014) were chosen a priori and used in the test of the simple slope (for a similar recommendation, see Dawson, 2014). Put differently, for moderators with substantively meaningful fixed constant values of interest, even when being randomly sampled, testing the simple slope of the effect of the focal predictor on the outcome at the substantively meaningful fixed constant values of the predictors using the fixed regression model is still expected to lead to appropriate statistical tests.

However, researchers often do not have meaningful fixed constant values on the predictor(s) at which to test the simple slope. Instead, they often wish to choose predictor values that reflect the participants’ relative standing in the population (e.g., low, mean, high). Furthermore, sometimes the predictors do not have meaningful fixed constant values of interest, particularly for predictors created by aggregating across subdomains of a construct. If values on the predictors are randomly sampled, then the sample statistics of the predictors such as sample mean of X, M_X, are likely to change from one sample to the next and should be considered as random variables with distributions. When values on the predictors are randomly sampled, the null hypothesis for the test of the simple slope of the effect of Z on Y when probing a significant curvilinear-by-linear interaction X²Z at X = M_X is H₀: β₃ + β₄(M_X) + β₅(M_X)² = 0. Because M_X is a random variable under the random sampling framework rather than a constant, the null hypothesis of this test no longer represents a linear combination of the regression coefficients, but rather represents a function that includes a nonlinear product of two random variables (β₄ and M_X) and a nonlinear product of three random variables (β₅, M_X, and M_X).

Variance estimates of the simple slope at sample-estimated conditional values of randomly sampled predictors

In the Appendix, for a regression model that contains a curvilinear-by-linear interaction as described by Eq. 5, I derive expressions for the variance (square of the standard error) of the simple slope of the effect of Z on Y given in Eq. 6 at sample values of the statistics of X (e.g., M_X– SD_X, M_X, M_X+ SD_X, the 25th and 75th sample percentiles, etc.), and of the simple slope of the effect of X on Y given in Eq. 7 at sample values of the statistics of X and Z when values on X and Z are randomly sampled from a bivariate normal distribution. The derivations in the Appendix show that the variance expression of the simple slope at sample values of the statistics of the predictors under the random-effects model includes additional terms, as compared to the variance expression arising in the fixed-effects model that is given in Aiken and West (1991) and in J. W. Miller et al. (2013). When the relative magnitudes of these additional terms in the variance expression under the random effects model are large, treating random predictors as fixed in testing the simple slope at sample values of the statistics of the predictors can lead to inaccurate significance tests with biased Type I error rates and unacceptable coverage rates for 95% confidence intervals. When the relative magnitudes of these additional terms are small, treating random predictors as fixed in testing the simple slope at sample values of the statistics of the predictors will have only a minimal influence on the significance test results. I will refer to the issue associated with treating the random predictors as fixed in testing the simple slope at sample values of the statistics of the predictors as the fixed-versus-random issue for the rest of the article. Based on the derivations in the Appendix, predictions can be made regarding which factors could and which factors would not influence the extent of the fixed-versus-random issue.

Liu et al. (2017) discussed some of these factors in probing linear-by-linear interactions. I expect these conclusions from probing linear-by-linear interactions to hold in probing curvilinear-by-linear interactions. Table 2 presents a summary of some factors expected to have similar influences on the extent of the fixed-versus-random issue when probing linear-by-linear interactions and curvilinear-by-linear interactions. For instance, Liu et al. pointed out that, holding everything else the same, the greater the magnitude of the variance(s) of the first-order predictor(s) that appear in the simple slope expression, the greater the extent of the fixed-versus-random issue. Similarly, holding everything else the same, the greater the magnitude of the regression coefficient for the highest-order term, the greater the extent of the fixed-versus-random issue. In addition, Liu et al. pointed out that sample size should not influence the extent of the fixed-versus-random issue, as sample size influences the sampling variability of regression coefficients (which appear in the terms in the simple slope variance expression under the fixed regression framework) and the sampling variability of the sample statistics of the random predictors (which appear in the additional terms in the simple slope variance expression under the random regression framework) to a similar degree. Of importance, a few additional factors can influence the extent of the fixed-versus-random issue, both when probing linear-by-linear interactions and when probing curvilinear-by-linear interactions, but were not discussed by Liu et al.; one such factor is the magnitude of the residual variance of the outcome in the model.

Table 2 Factors expected to have similar influences on the extent of the fixed-versus-random issue when probing linear-by-linear interactions and curvilinear-by-linear interactions

Full size table

Residual variance of the outcome in the regression model

Holding everything else constant, if the residual variance of the outcome decreases, then estimates of the sampling variance of the regression coefficients will also decrease. The sampling variances of the regression coefficients appear in the variance expression of the simple slope at sample values of the statistics of the predictor(s) under the fixed regression framework. However, the magnitude of the additional terms in the variance expression under the random regression framework will remain the same. Thus, the relative magnitude of the additional terms in the variance expression of the simple slope at sample values of the statistics of the predictor(s) under the random sampling framework would become greater relative to terms in the variance expression under the fixed sampling framework. I hypothesize that this factor can influence the extent of the fixed-versus-random issue when probing both curvilinear-by-linear interactions and linear-by-linear interactions, though this factor was not discussed in Liu et al. (2017).

Decreased residual variance of the outcome can occur in two manners. First, holding the total amount of explained variance in the outcome constant, if the total variance of the outcome decreases (e.g., due to obtaining a more homogeneous sample), then the magnitude of residual variance of the outcome in the regression model will also decrease. Second, holding constant the total variance of the outcome and the proportion of variance in the outcome accounted for by the terms involving the focal predictor and moderator (i.e., X, X², Z, XZ, and X²Z in Eq. 5), if covariates are added to the regression model to help explain additional variability in the outcome, then the residual variance of the outcome in the regression model can also decrease. In either situation, one can expect a larger influence of ignoring the sampling variability of the sample statistics of random predictors in the test of the simple slope at sample values of the statistics of the predictor(s). The later situation is particularly important given that studies in applied research that examine and probe curvilinear-by-linear interaction effects often do not have a large sample size (see Table 1). In such cases, if the ΔR² associated with the highest-order term is of a small magnitude, the use of covariates to reduce the residual variance of the outcome may be needed to yield sufficient statistical power to detect the effect of the highest-order term. However, the increased statistical power can be accompanied by a problematic greater influence of ignoring the sampling variability of the sample statistics of random predictors in the test of the simple slope at sample values of the statistics of the predictor(s).

When probing curvilinear-by-linear interactions, the differences in the variance expressions of the simple slope at sample values of the statistics of the predictors under the random-versus-fixed sampling frameworks are more complicated than when probing linear-by-linear interactions. As a result, new predictions can be made on the basis of derivations in the curvilinear-by-linear interaction case, which do not apply to the probing of the linear-by-linear interactions. I provide below a detailed discussion of some of these factors.

The combination of the signs and magnitudes of (1) the regression coefficient of the highest-order predictor term X²Z and (2) the expected values of the sample statistics of the predictors at which the simple slope is probed

When probing curvilinear-by-linear interactions, the variance expression of the simple slope is more complicated than the probing of the simpler linear-by-linear interactions. When probing curvilinear-by-linear interactions, the additional terms in the variance expression of the simple slope at sample values of the statistics of the random predictor(s) contain complex terms that are the products of four components: The variance of the sample statistic of a predictor (which is nonnegative), the expected value of the sample statistic of a predictor, the expected value of the regression coefficient of the highest-order predictor term, and the expected value of another regression coefficient. When Z is the focal predictor and X the moderator, one additional term in the variance expression of the simple slope at the sample statistic T_X of the moderator X is 4E[b₄]E[b₅]E[T_X]V[T_X]. When the focal predictor is X and the moderator is Z, the additional terms in the variance expression of the simple slope of the effect of X on Y at a sample statistic of X (T_X) and a sample statistic of Z (Q_Z) include 8E[b₂]E[b₅]E[Q_Z]V[T_X] and 4E[b₄]E[b₅]E[T_X]V[Q_Z]. In these additional terms that are the products of four components, the influence of any one component depends on the signs and magnitudes of other components. Moreover, when the focal predictor is X and the moderator is Z, the influence of the sign and magnitude of E[Q_Z] will also depend on those of E[T_X] as both appear in the additional terms in the variance expression of the simple slope.

An implication is that in the same sample, the extent of the fixed-versus-random issue in probing a curvilinear-by-linear interaction may be different when testing the simple slope at sample statistics of the same random predictor with the same sampling variability and the same absolute value but different signs. To illustrate, let T_X represent a sample statistic of X, and Q_Z a sample statistic of Z. Testing the simple slope of the effect of Z on Y at T_X ≠ 0 versus $ {T}_X^{\prime }=-{T}_X $ may reflect different extents of the fixed-versus-random issue. The same applies to testing the simple slope of the effect of X on Y at T_X ≠ 0 versus $ {T}_X^{\prime }=-{T}_X $ or at Q_Z ≠ 0 versus $ {Q}_Z^{\prime }=-{Q}_Z $.

In addition, consider a situation in which the regression coefficient for the highest-order term changes in sign, but not in absolute magnitude. The pattern of the relative seriousness of the fixed-versus-random issue when testing the simple slope of the effect of Z on Y at T_X ≠ 0 versus $ {T}_X^{\prime }=-{T}_X $ may be reversed. The same applies to testing the simple slope of the effect of X on Y at T_X ≠ 0 versus $ {T}_X^{\prime }=-{T}_X $ or at Q_Z ≠ 0 versus $ {Q}_Z^{\prime }=-{Q}_Z $.

Further, the absolute magnitude of E[Q_Z] or E[T_X] at which the simple slope is probed can have an influence beyond the influence of the sign of E[Q_Z] or E[T_X]. In the same sample, for two sample statistics of the same predictor that have expected values of the same sign and have the same sampling variance (e.g., $ {Q}_Z^{(1)}={M}_Z-{SD}_Z $ and $ {Q}_Z^{(2)}={M}_Z+{SD}_Z $ for normally distributed Z),^{Footnote 1} the magnitude of the expected values of the sample statistics (i.e., $ E\left[{Q}_Z^{(1)}\right] $ vs. $ E\left[{Q}_Z^{(2)}\right] $) will influence the extent of the fixed-versus-random issue.

A fully Bayesian approach to account for sampling variability of statistics of random predictors

In a multiple regression model with a curvilinear-by-linear interaction X²Z and a covariate A, the population regression equation is given by

$$ Y={\beta}_0+{\beta}_1X+{\beta}_2{X}^2+{\beta}_3Z+{\beta}_4 XZ+{\beta}_5{X}^2Z+{\beta}_AA+e. $$

(9)

A fully Bayesian model of regression analysis includes a distribution for the outcome Y and a distribution for the predictors (Gelman et al., 2013). Let Y denote the full collection of values on the outcome Y, and let (X, Z, A) denote the full collection of values on the predictors X and Z and the covariate A. Furthermore, let Ω denote the parameters that govern the joint distribution of (X, Z, A). In Eq. 9, because all other predictor terms are product terms of X and/or Z, Ω in effect governs the joint distribution of (X, Z, X², XZ, X²Z, A). Thus, the model parameters of a fully Bayesian model of Eq. 9 include not only (β, $ {\upsigma}_e^2 $), which are the regression coefficients and the residual variance of the regression equation, but also Ω, which governs the joint distribution of all the predictor and covariate terms.

Bayesian estimation views model parameters as random variables with distributions and make statistical inferences about the model parameters through their posterior distributions. Using the Bayes’ theorem, prior distributions of the model parameters are combined with information from the data to produce posterior distributions of the model parameters. Assuming prior independence of (β, $ {\upsigma}_e^2 $) and Ω, the prior distribution of the model parameters of a fully Bayesian model of Eq. 9 can be factored as the product of the prior distribution of (β, $ {\upsigma}_e^2 $) and the prior distribution of Ω. Then the posterior distribution of the model parameters can also be factored as the product of the posterior distribution of (β, $ {\upsigma}_e^2 $), given the data on (X, Z, A) and Y, and the posterior distribution of Ω given the data on (X, Z, A) (Gelman et al., 2013).

Following this, if a significant curvilinear-by-linear interaction is found, then when evaluating the simple slope described by Eq. 6 or 7 at some chosen statistics of the predictors, the statistics of the predictors such as μ_X, σ_X, μ_Z, and σ_Z are viewed as random variables with distributions under the fully Bayesian framework. Thus, the posterior distribution of the simple slope is determined by not only the posterior distributions of the regression coefficients involved, but also the posterior distributions of the statistics of the predictors at which the simple slope is evaluated. Therefore, the fully Bayesian approach to testing simple slopes can account for not only the uncertainty in the regression coefficients, but also the uncertainty of the chosen statistics of random predictors.

This study used conditionally conjugate priors for the model parameters (Gelman et al., 2013), which yield full conditional distributions that are easy to sample from when using MCMC to approximate the posterior distribution. Noninformative (diffuse) priors were used to maximize the comparability of the results of the different approaches considered. The noninformative conditionally conjugate priors used here include normal prior distributions with zero mean and infinite variance for the regression coefficients, an inverse Gamma prior for the residual variance$ {\upsigma}_e^2 $, normal priors with zero mean and infinite variance for the means of (X, Z, A), and an inverse Wishart prior for the variance–covariance matrix of (X, Z, A).

With conditionally conjugate priors, the fully Bayesian regression model can be viewed as a single random effects regression model using a larger dataset that includes not only the observed data, but also “additional data” whose information is summarized in the prior distributions (Gelman et al., 2013). The use of noninformative conjugate priors is akin to having little or no such “additional data” to be included in the analysis. On the other hand, informative priors can be constructed from relevant meta-analyses, existing previous research or pilot studies (e.g., conjugate priors as in the Educational Outcomes example in chap. 5 of Gill, 2014; power priors introduced by Ibrahim & Chen, 2000), or expert judgment (elicited priors). For a comprehensive review of different prior distributions, see chapter 4 of Gill (2014). Meanwhile, if conclusions from existing studies (e.g., regarding the magnitudes of regression coefficients, the average level of variables, or the variability of variables) have limited applicability to the research scenario of the present study, then one might want to down-weight the information from these “additional data” relative to the observed data (e.g., by using a less informative prior, or by using the power prior models, see chap. 1 of Congdon, 2010). Moreover, a sensitivity analysis can be conducted to gauge how sensitive the analysis results are to the choice of priors, by repeating the analysis over a range of alternative priors (Congdon, 2010).

Simulation study

A small-scale Monte Carlo simulation was designed loosely on the basis of the simulated dataset in chapter 5 of Aiken and West (1991; also examined in J. W. Miller et al., 2013) to examine the influence of ignoring the sampling variability of the sample statistics of random predictors in probing a curvilinear-by-linear interaction at the sample values of statistics of the predictor(s) using fixed regression. Two additional approaches that can account for the sampling variability of the sample statistics of the random predictors were also examined: (1) A random regression percentile bootstrapping approach, which treats the predictors as random and calculates the corresponding sample statistics in each bootstrap sample, and (2) a fully Bayesian approach with noninformative priors that treats the statistics of the predictors and the regression coefficients as random variables with distributions. In addition, I examined a fourth approach that used fixed regression and probed the curvilinear-by-linear interaction at fixed constant values of the predictors, which is expected to have acceptable performance (see Liu et al., 2017).

The simulation conditions were selected to reflect representative situations in applied research, based on the literature review summarized in Table 1. The sample sizes in these studies have a wide range, from slightly over 50 to over 2,000; most studies had a sample size of about 100 to a few hundreds. The total R² of the final regression equation in these studies ranges from .06 to .59, and ΔR² for adding the significant curvilinear-by-linear interaction can range from .01 to over .15. Most of these studies also include one or more covariates in the regression equations, which accounted for from 3% to over 30% of the total variance of the outcome.

Method

Data generation

The population model used to generate the data for the small-scale simulation was given by:

$$ Y=3.5+\left(-2\right)X+(3){X}^2+(2)Z+(3) XZ+(2){X}^2Z+{\beta}_AA+e, $$

(10)

where the error e ~ N (0,$ {\upsigma}_e^2 $), and P($ \left[\genfrac{}{}{0pt}{}{X}{Z}\right] $) = N ($ \left[\genfrac{}{}{0pt}{}{0}{0}\right] $,$ \left[\begin{array}{cc}1& .6\\ {}.6& 4\end{array}\right] $). The values were chosen to be close to those in the simulated dataset in chapter 5 of Aiken and West (1991) to provide readers with a familiar example, but with integer (or half-integer) values for regression coefficients and variable standard deviations or residual standard deviations for ease of interpretation. The model also included A, a covariate that is related to the outcome Y but not to the other predictor terms. The covariate A followed a normal distribution with a population mean of zero and a population variance of 25 (i.e., population standard deviation of 5). The design was a 3 (amount of variance in Y unexplained by predictor terms involving X and Z) × 2 (sample size) factorial design. Data were generated in R (R Core Team, 2018) with 3,000 replications for each condition.

Amount of variance in Y not explained by the predictor terms involving X and Z

The magnitudes of β_A and $ {\upsigma}_e^2 $ were varied (see Table 3) such that the amount of variance in Y not explained by the predictor terms X, X², Z, XZ, and X²Z (i.e., the sum of the amount of variance in Y explained by A and $ {\upsigma}_e^2 $) were roughly 625 (= 25², which was close to the residual standard deviation in chap. 5 of Aiken & West, 1991), 400 (= 20²), and 225 (= 15²). The correlation between A and Y was kept approximately constant in a narrow band near .41 (obtained from the analysis of a very large sample of N = 1,000,000 generated from the same population model for each condition), so including A versus not including A in the analysis model can increase R² by about 17%, regardless of the amount of variance in Y unexplained by the predictor terms involving X and Z. These values lead to a realistic range of model R² values and ΔR² values for X²Z based on the brief review of articles in applied research summarized in Table 1.

Table 3 Population values of β_A and $ {\upsigma}_{\mathrm{e}}^2 $ in simulations

Full size table

Sample size

Two sample sizes were compared: 400, which was the sample size of the simulated dataset in chapter 5 of Aiken and West (1991), and 100, which was a smaller but realistic sample size used in applied research examining curvilinear-by-linear interactions (see Table 1). On the basis of the derivations in the Appendix, it is to be expected that sample size would not influence the extent of the fixed-versus-random issue. Sample size was varied in the simulation to confirm this conclusion from the derivations.

Analysis models

Four approaches were used to analyze each generated dataset: (1) The currently widely used fixed regression that probes the curvilinear-by-linear interaction at numeric values of sample statistics of the predictors (M_X − SD_X, M_X, and M_X + SD_X on X; M_Z − SD_Z, M_Z, and M_Z + SD_Z on Z) by means of recentering; (2) a random regression percentile bootstrapping approach, which treats the predictors as random and calculates the corresponding sample statistics in each bootstrap sample; (3) a fully Bayesian approach with noninformative priors, which treats the statistics of the predictors and the regression coefficients as random variables with distributions; and (4) a fixed regression approach that probes the curvilinear-by-linear interaction at fixed constant values of the predictors. The fixed regression approaches were implemented using SAS 9.4. The random regression percentile bootstrapping approach was implemented using the R package boot (Ripley, 2017), with 5,000 bootstrap samples per replication. The fully Bayesian approach was implemented in Mplus 8 (L. K. Muthén & Muthén, 1998–2018), using the default non-informative conditionally conjugate priors in Mplus, a thinning rate of 10, and 100,000 draws from the posterior distribution to estimate the highest posterior density (HPD) credible interval of the simple slopes.^{Footnote 2} The approaches were compared in terms of the coverage rates of the 95% confidence intervals/ HPD credible intervals of the simple slope of the effect of X on Y defined in Eq. 7 at M_X − SD_X, M_X, and M_X + SD_X on X and M_Z − SD_Z, M_Z, and M_Z + SD_Z on Z, and the coverage rates of the 95% confidence intervals/ HPD credible intervals of the simple slope of the effect of Z on Y defined in Eq. 6 at M_X − SD_X, M_X, and M_X + SD_X on X. Coverage rate was defined as the proportion of replications in which the population parameter fell within the 95% confidence intervals/HPD credible intervals. Coverage rate outside of the range of [.925, .975] was considered unacceptable (Bradley, 1978). Using each approach, the generated datasets were analyzed both without versus with the covariate A in the regression equation. The statistical power to detect the curvilinear-by-linear interaction term using the traditional fixed regression approach was also examined.

To evaluate the empirical Type I error rate of these approaches, for N = 400, I examined the coverage rates of the 95% confidence intervals/HPD credible intervals of the simple slope of the effect of X on Y defined in Eq. 7 at M-0.25SD of X and M + 0.875 SD of Z. Given the data generation model, the simple slope of the effect of X on Y at μ – 0.25σ of X and μ + 0.875σ of Z is zero in the population, so one minus the coverage rate reflects the empirical Type I error rate.^{Footnote 3}

Results

Table 4 summarizes the coverage rates of the 95% confidence intervals/HPD credible intervals of the simple slope of the effect of X on Y at M_X − SD_X, M_X, and M_X + SD_X on X and M_Z − SD_Z, M_Z, and M_Z + SD_Z on Z, and those of the simple slope of the effect of Z on Y at M_X − SD_X, M_X, and M_X + SD_X on X, for conditions with N = 400. The patterns of results were virtually identical for N = 400 and N = 100, in line with the prediction based on the mathematical derivation; hence, the N = 100 results are included in Supplemental Material 1. The fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors frequently produced 95% confidence intervals for the simple slope of the effect of X on Y and the effect of Z on Y whose coverage rates fell below .925, and sometimes below the less stringent criterion of .90 proposed by Collins, Schafer, and Kam (2001). In contrast, random regression with percentile bootstrapping and the fully Bayesian approach yielded 95% confidence intervals/HPD credible intervals that consistently produced coverage rates within Bradley’s (1978) criterion of .925 to .975 for acceptable coverage. The fixed regression approach that probed the curvilinear-by-linear interaction at fixed constant values of the predictors also had acceptable performance.

Table 4 Summary of simulation results for N = 400

Full size table

The coverage rates produced by the fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors decreased as the amount of variance in Y unexplained by terms involving X or Z decreased, and as the covariate A was included in the analysis model. This pattern is in line with the prediction based on the derivations.

The fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors tended to produce unacceptable coverage rates of 95% confidence intervals for the simple slope of the effect of X on Y at M or M+ 1SD of X and at M or M+ 1SD of Z, and for the simple slope of the effect of Z on Y at M+ 1SD of X, but acceptable coverage rates for the simple slope of the effect of X on Y tested at M − 1SD of X or at M− 1SD of Z, and for the simple slope of the effect of Z on Y at M− 1SD of X. This is in line with the prediction based on the derivations that in the same sample, the extent of the fixed-versus-random issue may be different when testing the simple slope at sample statistics of the same predictor with the same sampling variability and the same absolute value but different signs.

At N = 100, although maintaining adequate overall coverage, at some tested sample statistics of the random predictors, the proportions of the random regression 95% percentile bootstrap confidence intervals of the simple slope completely below versus completely above the population value were slightly asymmetric, with differences up to .024. The asymmetry was smaller when the covariate A was omitted from the model and when the amount of variance in Y unexplained by terms involving X or Z was larger. At N = 400, the asymmetry was negligible. This pattern was not observed for the fully Bayesian approach.

Table 5 summarizes the empirical Type I error rates of these approaches at N = 400 by investigating the coverage rates of 95% confidence intervals of the simple slope of the effect of X on Y at M – 0.25SD of X and M + 0.875SD of Z. Random regression with percentile bootstrapping, the fully Bayesian approach, and the fixed regression approach that probed the curvilinear-by-linear interaction at fixed constant values of the predictors produced 95% confidence intervals/HPD credible intervals of the simple slope of the effect of X on Y at M – 0.25SD of X and M + 0.875 SD of Z with coverage rates consistently within Bradley’s (1978) criterion of .925 to .975 for acceptable coverage of 95% confidence intervals, corresponding to empirical Type I error rates always within the range [.025, .075]. On the other hand, the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors frequently produced coverage rates of 95% confidence intervals below .925 and sometimes below .90; these rates correspond to empirical Type I error rates greater than .075 or even .10. The inflation of the Type I error rates increased as the amount of variance in Y unexplained by terms involving X or Z increased, and as the covariate A was included in the analysis model.

Table 5 Coverage rate of 95% confidence intervals (CIs) in which the population simple slope of Y on X is zero (N = 400)

Full size table

Additional simulations

Two additional simulations were conducted. Additional Simulation 1 aimed to investigate the prediction based on the mathematical derivation that, if the regression coefficient for the highest-order term X²Z changes in sign but not in magnitude, then the pattern of the relative seriousness of the fixed-versus-random issue when testing the simple slope of the effect of Z on Y at T_X versus $ {T}_X^{\prime }=-1\ast {T}_X $in the same sample may be reversed. Similarly, the pattern of the relative seriousness of the fixed-versus-random issue when testing the simple slope of the effect of X on Y at T_X versus $ {T}_X^{\prime }=-1\ast {T}_X $ or at Q_Z versus $ {Q}_Z^{\prime }=-1\ast {Q}_Z $in the same sample may be reversed.

In Additional Simulation 1, the population parameter value for the regression coefficient of the highest order term X²Z was changed from + 2 to – 2. Everything else was kept identical to the original simulation condition in which N = 400, and the amount of variance in Y unexplained by the predictor terms X, X², Z, XZ, and X²Z was 225 (= 15²). The analysis model in Additional Simulation 1 included the covariate A. The results are summarized in Table 6. In the original simulation (upper panel of Table 6), the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors tended to produce unacceptable coverage rates of 95% confidence intervals for the simple slope of the effect of X on Y tested at M or M+ 1SD of X, and at M or M+ 1SD of Z, and for the simple slope of the effect of Z on Y at M+ 1SD of X, but acceptable coverage rates for the simple slope of the effect of X on Y tested at M – 1SD of X or at M– 1SD of Z and for the simple slope of the effect of Z on Y at M– 1SD of X. In contrast, when the regression coefficient for the highest-order term X²Z changed in sign but not in magnitude (lower panel of Table 6), the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors tended to produce acceptable coverage rates of 95% confidence intervals for the simple slope of the effect of X on Y tested at M+ 1SD of X or at M+ 1SD of Z and for the simple slope of the effect of Z on Y at M+ 1SD of X, but unacceptable coverage rates for the simple slope of the effect of X on Y tested at M-1SD of X or at M-1SD of Z and for the simple slope of the effect of Z on Y at M– 1SD of X. This pattern of results is in line with the prediction based on the derivations.

Table 6 Summary of results for Additional Simulation 1, with N = 400, and the amount of variance in Y unexplained by terms involving X or Z = 225

Full size table

Additional Simulation 2 aimed to investigate the prediction based on the mathematical derivation that in the same sample, for two sample statistics of the same predictor that have expected values of the same sign and have the same sampling variance (e.g., $ {Q}_Z^{(1)}={M}_Z-{SD}_Z $ and $ {Q}_Z^{(2)}={M}_Z+{SD}_Z $ for normally distributed Z), the absolute magnitude of the expected values of the sample statistics (i.e., $ E\left[{Q}_Z^{(1)}\right] $ vs. $ E\left[{Q}_Z^{(2)}\right] $) will influence the extent of the fixed-versus-random issue. In Additional Simulation 2, the population mean value of Z was increased from 0 to 3, such that μ_Z − σ_Z = 1 and μ_Z + σ_Z = 3 are now of the same sign. Everything else was kept the same as the original simulation condition in which N = 400, and the amount of variance in Y unexplained by the predictor terms X, X², Z, XZ, and X²Z was 625 (= 25²). All analysis models included the covariate A in the model. The results are summarized in Table 7. Although in Additional Simulation 2 μ_Z − σ_Z = 1 and μ_Z + σ_Z = 3 are now of the same sign (they were of opposite signs but had equal absolute values in the original simulation), the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors still tended to produce unacceptable coverage rates of 95% confidence intervals for the simple slope of the effect of X on Y tested at M or M+ 1SD of Z, but acceptable coverage rates for the simple slope of the effect of X on Y tested at M– 1SD of Z. This result is in line with the prediction based on the mathematical derivation that the absolute magnitude of the expected value of the sample statistics of the predictors at which the simple slope is probed can have an influence beyond the influence of the sign of the expected value of the sample statistics.

Table 7 Summary of results for Additional Simulation 2, with N = 400, and the amount of variance in Y unexplained by terms involving X or Z = 625

Full size table

Johnson–Neyman confidence bands

The Johnson–Neyman technique applied to a linear-by-linear interaction identifies the range of values on a moderator at which the simple slope of the effect of the focal predictor on the outcome is significant (Preacher, Curran, & Bauer, 2006). Recent developments (e.g., J. W. Miller et al., 2013; B. O. Muthén et al., 2017) have extended the Johnson–Neyman technique to linear models with curvilinear effects. For a model with a linear-by-linear interaction or a simple quadratic effect as described by Eqs 1 or 3, the Johnson–Neyman confidence band of simple slope bypasses the need to explicitly select a conditional value on the moderator at which to test the simple slope—the researcher just examines the Johnson–Neyman confidence band of the simple slope across the range of the moderator to see if it includes zero at different fixed constant values of the moderator. However, with more complex forms of interactions such as those involving a quadratic-by-linear interaction, the use of the Johnson–Neyman technique involves selecting a conditional value on the moderator at which to examine the conditional Johnson–Neyman confidence band. Issues associated with the use of the conditional Johnson–Neyman confidence band at sample values of the statistics of the random moderator closely parallel those associated with the test of the simple slope at sample values of the statistics of the random predictors. When probing a significant quadratic-by-linear interaction term by plotting the conditional Johnson–Neyman confidence band of, say, the simple slope of the effect of X on Y across the range of X at a certain value of the random moderator Z, the typical practice of supplying a numeric value on the moderator to the computation of the Johnson–Neyman confidence band will be appropriate when either (a) a fixed sampling plan is used for the moderator or (b) fixed values of the moderator (e.g., Z = – 2, 0, and 2) are chosen. However, when values on the moderator are randomly sampled and when the researcher wishes to obtain conditional Johnson–Neyman confidence bands at certain statistics of the moderator (e.g., the mean and one standard deviation above and below the mean), treating the moderator as fixed (and using sample values of the statistics) versus random can result in conditional Johnson–Neyman confidence bands of different widths. Treating the moderator as fixed ignores the sampling variability of the sample statistics of the moderator used to establish the conditional Johnson–Neyman confidence bands.

To illustrate, one replication from the small-scale simulation described above was used to compute conditional Johnson–Neyman confidence bands of the simple slope of the effect of X on Y described by Eq. 7 across a range of X values, at M – 1SD of Z, M of Z, and M + 1SD of Z, treating the random moderator Z as fixed versus random. These approaches were implemented in Mplus 8 (L. K. Muthén & Muthén, 1998–2018) using loop plots, then the loop plot data were exported from Mplus and combined in R, to plot overlaid conditional Johnson–Neyman confidence bands (from fixed regression, random regression with percentile bootstrapping, and the fully Bayesian approach) using the R package ggplot2. The dataset chosen was the first replication in the simulation condition with N = 400 and the amount of variance in Y unexplained by predictor terms involving X or Z was 225. The covariate A was always included in the model. When the random moderator was properly treated as random, the Johnson–Neyman confidence bands were obtained from (1) a random regression model using percentile bootstrapping (with 10,000 bootstrap samples) and (2) a fully Bayesian approach using the default noninformative priors in Mplus (with 100,000 draws from the posterior distribution).^{Footnote 4}

Figure 1 displays the scatterplot of X on Z for this dataset. As can be seen from Fig. 1, at – 2, 0, and 2 on Z (corresponding to μ_Z − 1σ_Z, μ_Z, and μ_Z + 1σ_Z), X always had data in the range of [– 1, 1], so the Johnson–Neyman confidence bands of the simple slope of the effect of X on Y at M – 1SD of Z, M of Z, and M + 1SD of Z were plotted across the range of – 1 to 1 on X.

Figure 2 contains the conditional Johnson–Neyman confidence bands of the simple slope of the effect of X on Y across the range of – 1 to 1 on X at M + 1SD of Z, from (1) the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, (2) random regression with percentile bootstrapping, and (3) the fully Bayesian approach. Figure 2 shows that above the value zero on X, the Johnson–Neyman confidence bands from random regression with percentile bootstrapping and from the fully Bayesian approach were almost indistinguishable from one another and always wider than that from the fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, especially for higher X values. This reflects a pattern similar to the one in the bottom panel of Table 4. However, below zero on X, the Johnson–Neyman confidence band from the fully Bayesian approach was almost indistinguishable from that generated by the fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, and both were wider than that from random regression with percentile bootstrapping. The pattern of results for the Johnson–Neyman confidence bands of the simple slope of the effect of X on Y at M and M – 1SD of Z, respectively, are similar to those in Fig. 2, but with smaller discrepancies across methods, and are included in Supplemental Materials 2 and 3.

Figure 2 shows that the fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors can produce conditional Johnson–Neyman confidence bands with different shapes than does random regression with bootstrapping or the fully Bayesian approach in the dataset from this one replication. A key issue is whether these results from this one replication (model R² = .584 and average ΔR² = .051 for X²Z from the fixed regression model) are generalizable. Thus, I examined for all 3,000 replications in this simulation condition the 95% CIs of the simple slopes of the effect of X on Y at fixed constant values of X = – 1, 0, 1, and sample statistics of Z = M_Z – SD_Z, M_Z, and M_Z + SD_Z from the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, random regression with percentile bootstrapping, and the fully Bayesian approach.

Table 8 summarizes the coverage rates of the 95% CIs, and Fig. 3 provides boxplots of the widths of 95% CIs, of these simple slopes of the effect of X on Y from the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, random regression with percentile bootstrapping, and the fully Bayesian approach, at constant values of X and sample values of the statistics of Z. It can be seen in Fig. 3 that the average widths of the 95% CIs are similar across methods at X = – 1, regardless of the sample statistic of Z at which the simple slope was probed. In such cases, the coverage rates of the 95% CIs are always close to 95% for all three approaches (see the top panel of Table 8). At X = 0, and particularly X = 1, however, the average width of the 95% CIs generated by the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors tends to be smaller than those generated by random regression with percentile bootstrapping and the fully Bayesian approach, with the latter two always similar to one another. At X = 1, the difference between the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors and the other two methods is the greatest for the simple slope probed at Z = M_Z + SD_Z, followed by Z = M_Z, followed by Z = M_Z – SD_Z. Consistent with this pattern, the coverage rate of the 95% CI of simple slopes from the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors fell below 0.925 at X = 1 and Z = M_Z or M_Z + SD_Z (see the bottom panel of Table 8). Comparing Table 8 against the bottom panel of Table 4 reveals that probing simple slopes in the face of a curvilinear-by-linear interaction at fixed constants of one random predictor and sample statistics of the other in the fixed regression framework can result in better coverage rates of 95% CIs than does probing simple slopes at sample statistics of both random predictors in the fixed regression framework. Figure 3 also reveals that the width of the 95% CI of simple slopes from random regression with percentile bootstrapping tends to have slightly greater variability than the widths from fixed regression and from the fully Bayesian approach. Nonetheless, random regression with percentile bootstrapping consistently produced 95% confidence intervals of the simple slope with acceptable coverage rates, as did the fully Bayesian approach. On the basis of the results in Fig. 3 and Table 8, when the revised conditional Johnson–Neyman confidence band from random regression with bootstrapping or the fully Bayesian approach is wider than the Johnson–Neyman confidence band from the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, one of the former two approaches should be used.

Table 8 Summary of simple slope test results at constants of X and sample statistics of Z, for N = 400, and the amount of variance in Y unexplained by terms involving X or Z = 225

Full size table

Discussion

Curvilinear effects and curvilinear-by-linear interactions are often hypothesized, tested, and probed in social and behavioral research. When such effects are found, researchers have traditionally tested the simple slope of the effect of the focal predictor, and more recently utilized the Johnson–Neyman technique, often at sample values of the statistics of the predictor(s). Both approaches implicitly assume values on the predictors have been sampled according to a fixed sampling plan, whereas in many social and behavioral studies values on the predictors are more appropriately considered to be randomly sampled. As was pointed out by Dawson (2014) and Liu et al. (2017), when substantively meaningful constant values are available on the predictors, researchers should ideally specify what fixed constant values would be of substantively interest, and test simple slopes at these fixed constant values of the predictors. However, researchers often want to test the simple slope at values that reflect the relative standing of the predictor in the population. Moreover, it is not uncommon in social and behavioral research that the randomly sampled predictors do not have substantively meaningful fixed constant values of interest, particularly for predictors created by aggregating across subdomains of a construct.

When researchers test the simple slope at values that reflect the relative standing of the predictors in the population, it is important to consider random regression in the test of simple slopes at sample statistics of the predictors. This manuscript showed analytically and through simulation that fixed and random regression models produce the same estimates, but different standard errors of the simple slopes at sample values of the statistics of the moderator, for the case of regression models containing a curvilinear-by-linear interaction. When the predictor values are randomly sampled, treating the predictors as fixed when conducting simple slope tests at sample values of the statistics of the predictors can lead to inflated Type I error rates and inaccurate 95% confidence intervals for the simple slopes, depending on a variety of factors. Extending Liu et al.’s (2017) consideration of linear-by-linear interactions, I show the influence of some factors that can affect the probing of both linear-by-linear interactions and curvilinear-by-linear interactions, but were not discussed in Liu et al. Such factors include (1) the total variance of the outcome (e.g., which can be influenced by obtaining a more homogeneous/ heterogeneous sample) while holding constant the amount of variance explained by the predictor terms, and (2) whether a covariate that can explain unique variability in the outcome is included in the analysis model. I also identified additional factors unique to the probing of curvilinear-by-linear interactions. These factors include the interplay of the signs of the expected values of the sample statistics of the random predictors at which the simple slope is probed and the sign of the regression coefficient for the highest-order term; and the absolute magnitudes of the expected values (of the same sign) of the sample statistics of the random predictors at which the simple slope is probed. Some of these factors discussed in this manuscript, particularly the inclusion of a covariate, are not obvious and were only surfaced by the mathematical derivation.

This manuscript also considered the use of conditional Johnson–Neyman confidence bands in probing curvilinear-by-linear interactions when the predictor values are randomly sampled, which still involves selecting a conditional value on the moderator at which to examine the Johnson–Neyman confidence band. Standard conditional Johnson–Neyman confidence bands at sample values of the statistics of the randomly sampled moderator created within the fixed regression framework can reflect incorrect standard errors. In contrast, revised conditional Johnson–Neyman confidence bands created by random regression with percentile bootstrapping or the fully Bayesian approach can account for the sampling variability of the sample statistics of the random moderator. Thus when using conditional Johnson–Neyman confidence bands to probe a significant curvilinear-by-linear interaction, the researcher should choose either a priori fixed constant values of the random moderator, or some sample statistics of the random moderator but use random regression with bootstrapping or the fully Bayesian approach to account for the uncertainty associated with these sample statistics. Researchers using the Johnson–Neyman technique should note that the estimated cut points of the region of significance are highly dependent on the sample size used in the study (Dawson, 2014), and cannot be expected to replicate if the replication study uses a different sample size.

Of the methods used in this manuscript, bootstrapping has the advantage of being simpler to implement as bootstrapping routines are now available in most standard statistical packages.^{Footnote 5} Also, with a little modification of the R code, bootstrapping would allow easy examination of the simple slope at other sample statistics of interest (e.g., the 25th and 75th percentiles) on the randomly sampled predictors. A caution is warranted: In the simulations, at the small sample size of 100 for some combinations of sample statistics, the proportion of the replications in which the 95% percentile bootstrap confidence interval did not include the population value of the corresponding simple slope fell slightly asymmetrically on the two sides of the population value. It is likely due to the fact that only variable values observed in the original sample will appear in the bootstrap samples, and hence bootstrapping can potentially result in underestimated density of the underlying distribution at locations with few observed data points (Efron & Tibshirani, 1994), which is more of an issue at smaller sample sizes.

Another way to account for the sampling variability of random predictors in probing significant curvilinear-by-linear interactions examined in the present article is the fully Bayesian approach, which views statistics of the predictors (e.g., means, variances, percentiles) as random variables with distributions. The simulations and the illustration relied on the default noninformative conditionally conjugate priors implemented in Mplus to maximize comparability across methods. These noninformative conditionally conjugate priors always led to acceptable coverage rates of the 95% HPD credible intervals in this study. Very large numbers of draws from the posterior distributions of the parameters were used to ensure accurate estimation of the limits of the credible intervals. Although specifying prior distributions is straightforward for the means and variances of random predictors that are assumed to follow a multivariate normal distribution, it is less so for, say, the 25th and 75th percentiles of the random predictors. Previous work on nonparametric Bayesian inference on quantiles (e.g., Bornn, Shephard, & Solgi, 2016) may provide some insights, but further study is needed.

It is worth mentioning that the present article has focused on continuous-outcome variables. If the range of the outcome variable is limited, models other than the regular multiple regression model could be used. For instance, if the outcome is an ordered categorical variable with four or fewer categories, then logistic or multinomial logistic regression models should be used. If the outcome is a count variable, then appropriate models for the count outcome (e.g., Poisson regression, negative binomial regression, zero-inflated or zero-truncated negative binomial regression, etc.; see Coxe, West, & Aiken, 2009) should be used. I expect the factors that can influence the extent of the fixed-versus-random issue to be similar whether the outcome variable is continuous or bounded, if the appropriate form of the model for the outcome (e.g., model for continuous outcome, model for ordinal outcome, model for count outcome) is selected.

Notes

$ {Q}_Z^{(1)}={M}_Z-{SD}_Z $ and $ {Q}_Z^{(2)}={M}_Z+{SD}_Z $ for normally distributed Z have the same sampling variance because the mean and variance of a normal distribution have independent sampling distributions (Lukas, 1942).
A 95% HPD credible interval of the posterior distribution is the 95% credible interval that gives the highest posterior density among all 95% credible intervals (Gelman et al., 2013).
Although researchers seldom examine the simple slope at μ – 0.25σ or μ + 0.875σ of the predictors, these simulation conditions illustrate the potential influence of ignoring the sampling variability of the random predictors on the Type I error rate of the test of the simple slope at sample values of the statistics of the predictor(s).
The conditional Johnson–Neyman confidence bands of the fully Bayesian approach were based on the 95% credible intervals with equal tail percentages, but not the 95% HPD credible intervals, of the simple slope of the effect of X on Y across a range of X values, at M – 1SD of Z, M of Z, and M + 1SD of Z. Conditional Johnson–Neyman confidence bands based on the 95% HPD credible intervals can potentially be unsmooth in shape, because the limits of the 95% HPD credible intervals of the simple slope of the effect of X on Y at adjacent X values and the same value on Z could represent different percentiles of the posterior distributions.
To make the random regression with percentile bootstrapping and the fully Bayesian approaches more accessible to readers, I provide in the supplemental materials Mplus and R scripts that can be easily adapted by researchers to test their hypotheses about the simple slope in regression models with curvilinear-by-linear interactions.

References

Adams, R. E., & Laursen, B. (2007). The correlates of conflict: Disagreement is not necessarily detrimental. Journal of Family Psychology, 21, 445–458. https://doi.org/10.1037/0893-3200.21.3.445
Article PubMed PubMed Central Google Scholar
Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Google Scholar
Baer, M., & Oldham, G. R. (2006). The curvilinear relation between experienced creative time pressure and creativity: Moderating effects of openness to experience and support for creativity. Journal of Applied Psychology, 91, 963–970. https://doi.org/10.1037/0021-9010.91.4.963
Article PubMed Google Scholar
Baer, M., Oldham, G. R., Jacobsohn, G. C., & Hollingshead, A. B. (2008). The personality composition of teams and creativity: The moderating role of team creative confidence. Journal of Creative Behavior, 42, 255–282.
Article Google Scholar
Barber, J. P., Gallop, R., Crits-Christoph, P., Barrett, M. S., Klostermann, S., McCarthy, K. S., . . . Sharpless, B. A. (2008). The role of the alliance and techniques in predicting outcome of supportive–expressive dynamic therapy for cocaine dependence. Psychoanalytic Psychology, 25, 461–482. https://doi.org/10.1037/0736-9735.25.3.461
Article Google Scholar
Blau, P. M. (1977). Inequality and heterogeneity. New York, NY: Free Press.
Google Scholar
Bohrnstedt, G. W., & Goldberger, A. S. (1969). On the exact covariance of products of random variables. Journal of the American Statistical Association, 64, 1439–1442.
Article Google Scholar
Bornn, L., Shephard, N., & Solgi, R. (2016). Nonparametric hierarchical Bayesian quantiles. arXiv preprint. arXiv:1605.03471
Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31, 144–152. https://doi.org/10.1111/j.2044-8317.1978.tb00581.x
Article Google Scholar
Chung-Yan, G. A. (2010). The nonlinear effects of job complexity and autonomy on job satisfaction, turnover, and psychological well-being. Journal of Occupational Health Psychology, 15, 237.
Article Google Scholar
Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6, 330–351. https://doi.org/10.1037/1082-989X.6.4.330
Article PubMed Google Scholar
Congdon, P. D. (2010). Applied Bayesian hierarchical methods. Boca Raton, FL: Chapman & Hall/CRC.
Book Google Scholar
Coxe, S., West, S. G., & Aiken, L. S. (2009). The analysis of count data: A gentle introduction to Poisson regression and its alternatives. Journal of Personality Assessment, 91, 121–136. https://doi.org/10.1080/00223890802634175
Article PubMed Google Scholar
Dawson, J. F. (2014). Moderation in management research: What, why, when and how. Journal of Business and Psychology, 29, 1–19.
Article Google Scholar
Dayan, M., & Di Benedetto, C. A. (2011). Team intuition as a continuum construct and new product creativity: The role of environmental turbulence, team experience, and stress. Research Policy, 40, 276–286.
Article Google Scholar
Efron, B., & Tibshirani, R. J. (1994). An introduction to the bootstrap. Boca Raton, FL: CRC Press.
Google Scholar
Ganzach, Y. (1997). Misleading interaction and curvilinear terms. Psychological Methods, 2, 235–247. https://doi.org/10.1037/1082-989X.2.3.235
Article Google Scholar
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis (3rd ed.). Boca Raton, FL: Chapman & Hall/CRC Press.
Book Google Scholar
Gill, J. (2014). Bayesian methods: A social and behavioral sciences approach (3rd ed.). Boca Raton, FL: Chapman & Hall/CRC Press.
Book Google Scholar
Harris-Britt, A., Valrie, C. R., Kurtz-Costes, B., & Rowley, S. J. (2007). Perceived racial discrimination and self-esteem in African American youth: Racial socialization as a protective factor. Journal of Research on Adolescence, 17, 669–682.
Article Google Scholar
Ibrahim, J. G., & Chen, M. H. (2000). Power prior distributions for regression models. Statistical Science, 15, 46–60.
Article Google Scholar
Janssen, O. (2001). Fairness perceptions as a moderator in the curvilinear relationships between job demands, and job performance and job dissatisfaction. Academy of Management Journal, 44, 1039–1050.
Google Scholar
Liu, Y., West, S. G., Levy, R., & Aiken, L. S. (2017). Tests of simple slopes in multiple regression models with an interaction: Comparison of four approaches. Multivariate Behavioral Research, 52, 445–464. https://doi.org/10.1080/00273171.2017.1309261
Article PubMed Google Scholar
Lukas, E. (1942). A characterization of the normal distribution. Annals of Mathematical Statistics, 13, 91–93.
Article Google Scholar
Mellat-Parast, M., Golmohammadi, D., McFadden, K. L., & Miller, J. W. (2015). Linking business strategy to service failures and financial performance: Empirical evidence from the US domestic airline industry. Journal of Operations Management, 38, 14–24.
Article Google Scholar
Miller, J. G., Kahle, S., & Hastings, P. D. (2017). Moderate baseline vagal tone predicts greater prosociality in children. Developmental Psychology, 53, 274–289. https://doi.org/10.1037/dev0000238
Article PubMed Google Scholar
Miller, J. W., Stromeyer, W. R., & Schwieterman, M. A. (2013). Extensions of the Johnson–Neyman technique to linear models with curvilinear effects: Derivations and analytical tools. Multivariate Behavioral Research, 48, 267–300.
Article Google Scholar
Muthén, B. O., Muthén, L. K., & Asparouhov, T. (2017). Regression and mediation analysis using Mplus. Los Angeles, CA: Muthén & Muthén.
Google Scholar
Muthén, L. K., & Muthén, B. O. (1998–2018). Mplus user’s guide (8th ed.). Los Angeles, CA: Muthén & Muthén.
Google Scholar
Preacher, K. J., Curran, P. J., & Bauer, D. J. (2006). Computational tools for probing interactions in multiple regression, multilevel modeling, and latent curve analysis. Journal of Educational and Behavioral Statistics, 31, 437–448.
Article Google Scholar
R Core Team. (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Retrieved from https://www.R-project.org/.
Google Scholar
Rapp, A. A., Bachrach, D. G., & Rapp, T. L. (2013). The influence of time management skill on the curvilinear relationship between organizational citizenship behavior and task performance. Journal of Applied Psychology, 98, 668–677. https://doi.org/10.1037/a0031733
Article PubMed Google Scholar
Rast, P., Rush, J., Piccinin, A., & Hofer, S. M. (2014). The identification of regions of significance in the effect of multimorbidity on depressive symptoms using longitudinal data: An application of the Johnson–Neyman technique. Gerontology, 60, 274–281.
Article Google Scholar
Ridge, J. W., Aime, F., & White, M. A. (2015). When much more of a difference makes a difference: Social comparison and tournaments in the CEO’s top team. Strategic Management Journal, 36, 618–636.
Article Google Scholar
Sampson, A. R. (1974). A tale of two regressions. Journal of the American Statistical Association, 69, 682–689.
Article Google Scholar
Shaw, J. D., Park, T. Y., & Kim, E. (2013). A resource-based perspective on human capital losses, HRM investments, and organizational performance. Strategic Management Journal, 34, 572–589. https://doi.org/10.1002/smj.2025
Article Google Scholar
Ting, D. H. (2004). Service quality and satisfaction perceptions: Curvilinear and interaction effect. International Journal of Bank Marketing, 22, 407–420.
Article Google Scholar
Van de Vliert, E., Huang, X., & Parker, P. M. (2004). Do colder and hotter climates make richer societies more, but poorer societies less, happy and altruistic? Journal of Environmental Psychology, 24, 17–30.
Article Google Scholar
van der Vegt, G. S., & Bunderson, J. S. (2005). Learning and performance in multidisciplinary teams: The importance of collective team identification. Academy of Management Journal, 48, 532–547.
Article Google Scholar
Vandewater, E. A., Shim, M. S., & Caplovitz, A. G. (2004). Linking obesity and activity level with children’s television and video game use. Journal of Adolescence, 27, 71–85.
Article Google Scholar
Villena, V. H., Choi, T. Y., & Revilla, E. (2019). Revisiting interorganizational trust: Is more always better or could more be worse? Journal of Management, 45, 752–785. https://doi.org/10.1177/0149206316680031
Article Google Scholar
Wagner, S. M. (2011). Supplier development and the relationship life-cycle. International Journal of Production Economics, 129, 277–283.
Article Google Scholar
Zhou, J., Shin, S. J., Brass, D. J., Choi, J., & Zhang, Z. (2009). Social networks, personal values, and creativity: Evidence for curvilinear and interaction effects. Journal of Applied Psychology, 94, 1544–1552.
Article Google Scholar

Download references

Open Practices Statement

The article is based on computer simulations, and the population models for the simulations are clearly described in the article, to make replications possible. Analysis codes in Mplus and R are provided in the supplemental materials. None of the Monte Carlo simulations was preregistered.

Author information

Authors and Affiliations

Department of Psychological, Health, and Learning Sciences, University of Houston, Houston, TX, USA
Yu Liu

Authors

Yu Liu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yu Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

ESM 1

(DOCX 186 kb)

Appendix: Mathematical derivations for the variance of the simple slope at sample-estimated values of random predictors in multiple regression with curvilinear-by-linear interaction

This appendix contains the mathematical derivations for the variance expression of the simple slope at values of the sample statistics (e.g., sample mean, one sample standard deviation above and below the sample mean, the 25th and 75th sample percentiles, etc.) of random predictors in multiple regression models with curvilinear-by-linear interaction, when the goal is to make inferences to the population simple slope at corresponding population statistics of the predictors. The derivations are based on a multiple regression model given in Eq. 11:

$$ Y={\upbeta}_0+{\upbeta}_1X+{\upbeta}_2{X}^2+{\upbeta}_3Z+{\upbeta}_4 XZ+{\upbeta}_5{X}^2Z+e. $$

(11)

Derivations of the variance expression of the simple slope traditionally have assumed that values on the predictors X and Z are sampled according to a fixed sampling plan (e.g., Aiken & West, 1991; J. W. Miller et al., 2013). The derivations below show analytically that the variance expression of the simple slope at sample values of the statistics of the predictors in a random regression model differs from that in the fixed regression model, when the researcher’s goal is to make reference to the population simple slope at the corresponding population statistics.

Two cases in multiple regression are considered:

(a)
Fixed regression model in which values on the predictor variables X and Z are sampled on the basis of a fixed sampling plan and have identical distributions across samples;
(b)
Random regression model, in which values on the predictor variables X and Z are randomly drawn from a population in which X and Z have a bivariate normal distribution.

I show that in case (a) the variance expression of the simple slope at sample values of the statistics of the predictors equals the equations in Table 5.1 of Aiken and West (1991, p. 64).

In case (b), when the values of the population statistics on the predictors are unknown and the numeric values of their sample estimates are used in the test of simple slope, the variance expression of the simple slope at numeric values of sample statistics of X and Z includes not only all the terms in the expression for the fixed regression model, but also additional terms leading to a different variance expression of the simple slope than that derived by Aiken and West (1991).

In Eq. 11 when b₅, the estimate of the population regression coefficient β₅ for the highest-ordered interaction term, the curvilinear-by-linear interaction term X²Z, is statistically significant, researchers may be interested in probing simple slopes of the effect of the focal predictor on the outcome.

When the focal predictor is Z and the moderator is X

If the focal predictor is Z and the moderator is X, then simple slope would be calculated as the first partial derivative of the regression equation with respect to Z, and the resulting simple slope (denoted S) is (e.g., Aiken & West, 1991; J. W. Miller et al., 2013):

$$ {S}_Z=\frac{d\hat{Y}}{dZ}={\upbeta}_3+{\upbeta}_4X+{\upbeta}_5{X}^2. $$

(12)

In a random sample, the estimate of the simple slope of the outcome Y on the predictor Z at a sample statistic of X (T_X) is written as

$$ {\hat{S}}_{Z\mid {T}_X}={b}_3+{b}_4{T}_X+{b}_5{\left({T}_X\right)}^2, $$

(13)

where b₃, b₄, and b₅ are the sample estimates of β₃, β₄, and β₅, respectively. The variance of this simple slope estimate is

$$ V\left[{\hat{S}}_{Z\mid {T}_X}\right]=V\left[{b}_3+{b}_4{T}_X+{b}_5{T}_X^2\right]. $$

(14)

Equation 14 can be written as

$$ {\displaystyle \begin{array}{l}V\left[{\hat{S}}_{Z\mid {T}_X}\right]=V\left[{b}_3\right]+V\left[{b}_4{T}_X\right]+V\left[{b}_5{T}_X^2\right]\\ {}+2C\left[{b}_3,{b}_4{T}_X\right]+2C\left[{b}_3,{b}_5{T}_X^2\right]+2C\left[{b}_4{T}_X,{b}_5{T}_X^2\right].\end{array}} $$

(15)

On the basis of the formula for the variance of the product of two random variables (Eq. 5 in Bohrnstedt & Goldberger, 1969), the second term in Eq. 15, the variance of the product of b₄ and T_X can be written as

$$ {\displaystyle \begin{array}{c}V\left[{b}_4{T}_X\right]={E}^2\left[{b}_4\right]V\left[{T}_X\right]+{E}^2\left[{T}_X\right]V\left[{b}_4\right]+E\left[{\left({b}_4-E\left[{b}_4\right]\right)}^2{\left({T}_X-E\left[{T}_X\right]\right)}^2\right]\\ {}+2E\left[{b}_4\right]E\left[\left({b}_4-E\left[{b}_4\right]\right){\left({T}_X-E\left[{T}_X\right]\right)}^2\right]\\ {}+2E\left[{T}_X\right]E\left[{\left({b}_4-E\left[{b}_4\right]\right)}^2\left({T}_X-E\left[{T}_X\right]\right)\right]\\ {}+2E\left[{b}_4\right]E\left[{T}_X\right]C\left[{b}_4,{T}_X\right]-{C}^2\left[{b}_4,{T}_X\right].\end{array}} $$

(16)

Similarly, the third term in Eq. 15, the variance of the product of b₅ and $ {T}_X^2 $ can be written as

$$ {\displaystyle \begin{array}{c}V\left[{b}_5{T}_X^2\right]={E}^2\left[{b}_5\right]V\left[{T}_X^2\right]+{E}^2\left[{T}_X^2\right]V\left[{b}_5\right]+E\left[{\left({b}_5-E\left[{b}_5\right]\right)}^2{\left({T}_X^2-E\left[{T}_X^2\right]\right)}^2\right]\\ {}+2E\left[{b}_5\right]E\left[\left({b}_5-E\left[{b}_5\right]\right){\left({T}_X^2-E\left[{T}_X^2\right]\right)}^2\right]\\ {}+2E\left[{T}_X^2\right]E\left[{\left({b}_5-E\left[{b}_5\right]\right)}^2\left({T}_X^2-E\left[{T}_X^2\right]\right)\right]\\ {}+2E\left[{b}_5\right]E\left[{T}_X^2\right]C\left[{b}_5,{T}_X^2\right]-{C}^2\left[{b}_5,{T}_X^2\right].\end{array}} $$

(17)

On the basis of the formula for the covariance between the product of two random variable and a third random variable (Eq. 12 in Bohrnstedt & Goldberger, 1969), C[b₃, b₄T_X] can be written as

$$ {\displaystyle \begin{array}{c}C\left[{b}_3,{b}_4{T}_X\right]=C\left[{b}_4{T}_X,{b}_3\right]\\ {}=E\left[{b}_4\right]C\left[{T}_X,{b}_3\right]+E\left[{T}_X\right]C\left[{b}_4,{b}_3\right]\\ {}+E\left[\left({b}_4-E\left[{b}_4\right]\right)\left({T}_X-E\left[{T}_X\right]\right)\left({b}_3-E\left[{b}_3\right]\right)\right]\\ {}=E\left[{b}_4\right]C\left[{b}_3,{T}_X\right]+E\left[{T}_X\right]C\left[{b}_3,{b}_4\right]\\ {}+E\left[\left({b}_3-E\left[{b}_3\right]\right)\left({b}_4-E\left[{b}_4\right]\right)\left({T}_X-E\left[{T}_X\right]\right)\right].\end{array}} $$

(18)

Similarly, $ C\left[{b}_3,{b}_5{T}_X^2\right] $ can be written as

$$ {\displaystyle \begin{array}{c}C\left[{b}_3,{b}_5{T}_X^2\right]=C\left[{b}_5{T}_X^2,{b}_3\right]\\ {}=E\left[{b}_5\right]C\left[{T}_X^2,{b}_3\right]+E\left[{T}_X^2\right]C\left[{b}_5,{b}_3\right]\\ {}+E\left[\left({b}_5-E\left[{b}_5\right]\right)\left({T}_X^2-E\left[{T}_X^2\right]\right)\left({b}_3-E\left[{b}_3\right]\right)\right]\\ {}=E\left[{b}_5\right]C\left[{b}_3,{T}_X^2\right]+E\left[{T}_X^2\right]C\left[{b}_3,{b}_5\right]\\ {}+E\left[\left({b}_3-E\left[{b}_3\right]\right)\left({b}_5-E\left[{b}_5\right]\right)\left({T}_X^2-E\left[{T}_X^2\right]\right)\right].\end{array}} $$

(19)

On the basis of the formula for the covariance between the product of two random variable and the product of two other random variables (Eq. 11 in Bohrnstedt & Goldberger, 1969), $ C\left[{b}_4{T}_X,{b}_5{T}_X^2\right] $can be written as

$$ {\displaystyle \begin{array}{c}C\left[{b}_4{T}_X,{b}_5{T}_X^2\right]=E\left[{b}_4\right]E\left[{b}_5\right]C\left[{T}_X,{T}_X^2\right]+E\left[{b}_4\right]E\left[{T}_X^2\right]C\left[{T}_X,{b}_5\right]\\ {}+E\left[{T}_X\right]E\left[{b}_5\right]C\left[{b}_4,{T}_X^2\right]+E\left[{T}_X\right]E\left[{T}_X^2\right]C\left[{b}_4,{b}_5\right]\\ {}+E\left[\left({b}_4-E\left[{b}_4\right]\right)\left({T}_X-E\left[{T}_X\right]\right)\left({b}_5-E\left[{b}_5\right]\right)\left({T}_X^2-E\left[{T}_X^2\right]\right)\right]\\ {}+E\left[{b}_4\right]E\left[\left({T}_X-E\left[{T}_X\right]\right)\left({b}_5-E\left[{b}_5\right]\right)\left({T}_X^2-E\left[{T}_X^2\right]\right)\right]\\ {}+E\left[{T}_X\right]E\left[\left({b}_4-E\left[{b}_4\right]\right)\left({b}_5-E\left[{b}_5\right]\right)\left({T}_X^2-E\left[{T}_X^2\right]\right)\right]\\ {}+E\left[{b}_5\right]E\left[\left({b}_4-E\left[{b}_4\right]\right)\left({T}_X-E\left[{T}_X\right]\right)\left({T}_X^2-E\left[{T}_X^2\right]\right)\right]\\ {}+E\left[{T}_X^2\right]E\left[\left({b}_4-E\left[{b}_4\right]\right)\left({T}_X-E\left[{T}_X\right]\right)\left({b}_5-E\left[{b}_5\right]\right)\right]\\ {}-C\left[{b}_4,{T}_X\right]C\left[{b}_5,{T}_X^2\right].\end{array}} $$

(20)

Substituting Eqs. 16–20 into Eq. 15, the variance of the simple slope estimate at a sample statistic of X (T_X) can be written as

(21)

Note that the terms with single underline in Eq. 21 constitute the variance expression of the simple slope under the fixed regression model (Aiken & West, 1991, p. 64, Table 5.1, case 4b). All additional terms contain the variance of T_X (i.e., V[T_X]), the variance of $ {T}_X^2 $ (i.e., $ V\left[{T}_X^2\right] $), the covariance of T_X or $ {T}_X^2 $ with other random variables, and the third or fourth central moments involving T_X or $ {T}_X^2 $.

When values on the predictors are sampled according to a fixed sampling plan, T_X is a fixed value rather than a random variable, as is $ {T}_X^2 $. In this case, the variance of T_X (i.e., V[T_X]), the variance of $ {T}_X^2 $ (i.e., $ V\left[{T}_X^2\right] $), the covariance of T_X or $ {T}_X^2 $ with other random variables, and the third or fourth central moments involving T_X or $ {T}_X^2 $ would all be equal to zero. Thus, when values on the predictors are sampled according to a fixed sampling plan, all terms other than those with single underline in Eq. 21 would be equal to zero, and Eq. 21 reduces to the variance expression of the simple slope under the fixed regression model (Aiken & West, 1991, p. 64, Table 5.1, case 4b).

When values on the predictors are randomly sampled, all these additional terms in Eq. 21 suggest that the sampling variability of the simple slope of Y on Z at the value of a sample statistic of the moderator X will differ from the fixed sampling case. In such situations, treating the random predictors as fixed can result in a biased estimate of the sampling variability of the simple slope at values of the sample statistics of the moderator. The extent of the issue depends on the relative magnitude of the additional terms as compared to the terms with single underline in Eq. 21.

Among these additional terms, consider the three terms with double underlines more closely. E²[b₄]V[T_X] is always non-negative, and with a significant b₅ term that would lead to the probing of the simple slope, $ {E}^2\left[{b}_5\right]V\left[{T}_X^2\right] $ is always positive. $ 2E\left[{b}_4\right]E\left[{b}_5\right]C\left[{T}_X,{T}_X^2\right] $ can be positive or negative, depending on the signs of E[b₄] = β₄, E[b₅] = β₅, and $ C\left[{T}_X,{T}_X^2\right] $. Based on the formula for the covariance between the product of two random variables and a third random variable (Eq. 12 in Bohrnstedt & Goldberger, 1969), $ C\left[{T}_X,{T}_X^2\right] $ can be rewritten as

$$ {\displaystyle \begin{array}{c}C\left[{T}_X,{T}_X^2\right]=E\left[{T}_X\right]C\left[{T}_X,{T}_X\right]+E\left[{T}_X\right]C\left[{T}_X,{T}_X\right]+E\left[{\left({T}_X-E\left[{T}_X\right]\right)}^3\right]\\ {}=2E\left[{T}_X\right]V\left[{T}_X\right]+E\left[{\left({T}_X-E\left[{T}_X\right]\right)}^3\right],\end{array}} $$

(22)

and $ 2E\left[{b}_4\right]E\left[{b}_5\right]C\left[{T}_X,{T}_X^2\right] $ can be rewritten as

$$ {\displaystyle \begin{array}{c}2E\left[{b}_4\right]E\left[{b}_5\right]C\left[{T}_X,{T}_X^2\right]=2E\left[{b}_4\right]E\left[{b}_5\right]\left\{2E\left[{T}_X\right]V\left[{T}_X\right]+E\left[{\left({T}_X-E\left[{T}_X\right]\right)}^3\right]\right\}\\ {}=4E\left[{b}_4\right]E\left[{b}_5\right]E\left[{T}_X\right]V\left[{T}_X\right]+2E\left[{b}_4\right]E\left[{b}_5\right]E\left[{\left({T}_X-E\left[{T}_X\right]\right)}^3\right].\end{array}} $$

(23)

Thus, among other things, the extent of the issue associated with treating the random predictors as fixed in probing a significant curvilinear-by-linear interaction X²Z at the sample statistic T_X of the random moderator X depends on the relative magnitude of V[T_X] and $ V\left[{T}_X^2\right] $, and the signs and relative magnitudes of E[b₄] = β₄, E[b₅] = β₅, and E[T_X], as compared to the magnitude of terms in the variance expression of the simple slope of Y on Z at a sample statistic T_X of X under the fixed regression model: V[b₃], E²[T_X]V[b₄], $ {E}^2\left[{T}_X^2\right]V\left[{b}_5\right] $, 2E[T_X]C[b₃, b₄], $ 2E\left[{T}_X^2\right]C\left[{b}_3,{b}_5\right] $, and $ 2E\left[{T}_X\right]E\left[{T}_X^2\right]C\left[{b}_4,{b}_5\right] $.

When the focal predictor is X and the moderator is Z

If the focal predictor is X and the moderator is Z, then the simple slope would be calculated as the first partial derivative of the regression equation with respect to X, and the resulting simple slope (denoted S) is (e.g., Aiken & West, 1991; J. W. Miller et al., 2013):

$$ {S}_X=\frac{d\hat{Y}}{dX}={\upbeta}_1+2{\upbeta}_2X+{\upbeta}_4Z+2{\upbeta}_5 XZ. $$

(24)

In a random sample, the estimate of the simple slope of the outcome Y on the predictor X at a sample statistic of X (T_X) and a sample statistic of Z (Q_Z) is written as

$$ {\hat{S}}_{X\mid {T}_X,{Q}_Z}={b}_1+2{b}_2{T}_X+{b}_4{Q}_Z+2{b}_5{T}_X{Q}_Z, $$

(25)

where b₁, b₂, b₄, and b₅ are the sample estimates of β₁, β₂, β₄, and β₅, respectively. The variance of this simple slope estimate is

$$ V\left[{\hat{S}}_{X\mid {T}_X,{Q}_Z}\right]=V\left[{b}_1+2{b}_2{T}_X+{b}_4{Q}_Z+2{b}_5{T}_X{Q}_Z\right]. $$

(26)

Let A = 2b₂T_X + b₄Q_Z, and B = 2b₅T_XQ_Z, then Eq. 26 can be written as

$$ {\displaystyle \begin{array}{c}V\left[{\hat{S}}_{X\mid {T}_X,{Q}_Z}\right]=V\left[{b}_1+A+B\right]\\ {}=V\left[{b}_1\right]+V\left[A\right]+V\left[B\right]+2C\left[{b}_1,A\right]+2C\left[{b}_1,B\right]+2C\left[A,B\right].\end{array}} $$

(27)

For A = 2b₂T_X + b₄Q_Z,

$$ {\displaystyle \begin{array}{c}V\left[A\right]=V\left[2{b}_2{T}_X+{b}_4{Q}_Z\right]\\ {}=V\left[2{b}_2{T}_X\right]+V\left[{b}_4{Q}_Z\right]+2C\left[2{b}_2{T}_X,{b}_4{Q}_Z\right]\\ {}=4V\left[{b}_2{T}_X\right]+V\left[{b}_4{Q}_Z\right]+4C\left[{b}_2{T}_X,{b}_4{Q}_Z\right],\end{array}} $$

(28)

and

$$ {\displaystyle \begin{array}{c}2C\left[{b}_1,A\right]=2C\left[{b}_1,2{b}_2{T}_X+{b}_4{Q}_Z\right]\\ {}=2C\left[{b}_1,2{b}_2{T}_X\right]+2C\left[{b}_1,{b}_4{Q}_Z\right]\\ {}=4C\left[{b}_1,{b}_2{T}_X\right]+2C\left[{b}_1,{b}_4{Q}_Z\right].\end{array}} $$

(29)

For B = 2b₅T_XQ_Z,

$$ V\left[B\right]=V\left[2{b}_5{T}_X{Q}_Z\right]=4V\left[{b}_5{T}_X{Q}_Z\right], $$

(30)

and

$$ 2C\left[{b}_1,B\right]=2C\left[{b}_1,2{b}_5{T}_X{Q}_Z\right]=4C\left[{b}_1,{b}_5{T}_X{Q}_Z\right], $$

(31)

and

$$ {\displaystyle \begin{array}{c}2C\left[A,B\right]=2C\left[2{b}_2{T}_X+{b}_4{Q}_Z,2{b}_5{T}_X{Q}_Z\right]\\ {}=2C\left[2{b}_2{T}_X,2{b}_5{T}_X{Q}_Z\right]+2C\left[{b}_4{Q}_Z,2{b}_5{T}_X{Q}_Z\right]\\ {}=8C\left[{b}_2{T}_X,{b}_5{T}_X{Q}_Z\right]+4C\left[{b}_4{Q}_Z,{b}_5{T}_X{Q}_Z\right].\end{array}} $$

(32)

Full expansion of terms like the variance of the product of three random variables or the covariance involving the product of three random variables in Eqs. 30–32 is beyond the scope of this article. However, by expansion of the terms in Eq. 28, as well as partial expansion of the terms in Eq. 32, one finds some additional terms in the variance expression of the simple slope of Y on X at sample statistics of X and Z in the random regression case as compared to the fixed regression case.

On the basis of the formula for the variance of the product of two random variables (Eq. 5 in Bohrnstedt & Goldberger, 1969), the first term in Eq. 28, 4V[b₂T_X], can be written as

(33)

Similarly, the second term in Eq. 28, V[b₄Q_Z], can be written as

(34)

On the basis of the formula for the covariance between the product of two random variable and the product of two other random variables (Eq. 11 in Bohrnstedt & Goldberger, 1969), the last term in Eq. 28), 4C[b₂T_X, b₄Q_Z], can be written as

(35)

In Eqs. 33–35, only the terms with single underline appear in the variance expression of the simple slope under the fixed regression model (Aiken & West, 1991, p. 64, Table 5.1, case 4a). All the other terms are additional terms in the random regression case. In particular, examining those additional terms with double underlines in Eqs. 33–35, one can see that among other things, the extent of the issue associated with treating the random predictors as fixed in probing a significant curvilinear-by-linear interaction X²Z at the sample statistics of the random predictors X and Z depends on the relative magnitude of V[T_X] andV[Q_Z], and the signs and relative magnitudes of E[b₂] = β₂,E[b₄] = β₄, and C[T_X, Q_Z], as compared to the magnitude of terms in the variance expression of the simple slope of Y on X at sample statistics of X and Z under the fixed regression model.

The first term in Eq. 32, 8C[b₂T_X, b₅T_XQ_Z], can be rewritten as

(36)

The first term in Eq. 36, 8E[b₂]E[b₅Q_Z]C[T_X, T_X], does not appear in the variance expression of the simple slope under the fixed regression model (Aiken & West, 1991, p. 64, Table 5.1, case 4a)—it is an additional term for the random regression case. Given that E[b₅Q_Z] = C[b₅, Q_Z] + E[b₅]E[Q_Z], it can be further rewritten as

(37)

Thus, the second term in Eq. 37, 8E[b₂]E[b₅]E[Q_Z]V[T_X], is an additional term in the variance expression of the simple slope in the random regression model as compared to the fixed regression model. Similarly, the second term in Eq. 32, 4C[b₄Q_Z, b₅T_XQ_Z], can be rewritten as

(38)

The first term in Eq. 38, 4E[b₄]E[b₅T_X]C[Q_Z, Q_Z], does not appear in the variance expression of the simple slope under the fixed regression model (Aiken & West, 1991, p. 64, Table 5.1, case 4a)—it is an additional term for the random regression case. Given that E[b₅T_X] = C[b₅, T_X] + E[b₅]E[T_X], it can be further rewritten as

(39)

Thus, the second term in Eq. 39, 4E[b₄]E[b₅]E[T_X]V[Q_Z], is another additional term in the variance expression of the simple slope in the random regression model as compared to the fixed regression model.

From the partial expansion of Eq. 32 in Eqs. 36–39, one can see that among other things, the extent of the issue associated with treating the random predictors as fixed in probing a significant curvilinear-by-linear interaction X²Z at the sample statistics of the random predictors X and Z depends on the relative magnitude of V[T_X] and V[Q_Z], and the signs and relative magnitudes of E[b₂] = β₂,E[b₄] = β₄, E[b₅] = β₅, and E[Q_Z] and E[T_X], as compared to the magnitude of terms in the variance expression of the simple slope under the fixed regression model.

To sum up, when the focal predictor is X and the moderator is Z, the variance expression of the simple slope of Y on X is more complicated (see Eqs. 33–39). In this situation, the variance expression of the simple slope of Y on X at the numeric value of a sample statistic of X (T_X) and the numeric value of a sample statistic of Z (Q_Z) under the random regression framework as compared to the fixed regression framework includes additional terms such as 4E²[b₂]V[T_X], E²[b₄]V[Q_Z], 4E[b₂]E[b₄]C[T_X, Q_Z], 8E[b₂]E[b₅]E[Q_Z]V[T_X], and 4E[b₄]E[b₅]E[T_X]V[Q_Z] (and others). The extent of the issue associated with treating the random predictors as fixed in probing a significant curvilinear-by-linear interaction X²Z at the sample statistics T_X and Q_Z of the random predictors X and Z will depend on the relative magnitude of V[T_X] and V[Q_Z], and the signs and relative magnitudes of E[b₂] = β₂,E[b₄] = β₄, E[b₅] = β₅, E[Q_Z] and E[T_X], and C[T_X, Q_Z], as compared to the magnitude of terms in the variance expression of the simple slope under the fixed regression model. Because several of these additional terms in the variance expression of the simple slope in the random regression framework include products of three components whose signs can be positive or negative and sometimes another non-negative component, the influence of any one component depends on the signs and magnitudes of other components of the additional terms in the variance expression of the simple slope in the random regression framework.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Liu, Y. Probing curvilinear-by-linear interactions when the predictors are randomly sampled. Behav Res 52, 773–798 (2020). https://doi.org/10.3758/s13428-019-01276-4

Download citation

Published: 03 September 2019
Issue Date: April 2020
DOI: https://doi.org/10.3758/s13428-019-01276-4

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Probing curvilinear-by-linear interactions when the predictors are randomly sampled

Abstract

Similar content being viewed by others

A new criterion for assessing discriminant validity in variance-based structural equation modeling

RMSEA, CFI, and TLI in structural equation modeling with ordered categorical data: The story they tell depends on the estimation methods

Mixed methods research: what it is and what it could be

Regression models with interaction terms

Tests of the simple slope from the fixed regression framework

Brief review of current practice in the applied literature

Fixed sampling versus random sampling

Variance estimates of the simple slope at sample-estimated conditional values of randomly sampled predictors

Residual variance of the outcome in the regression model

The combination of the signs and magnitudes of (1) the regression coefficient of the highest-order predictor term X2Z and (2) the expected values of the sample statistics of the predictors at which the simple slope is probed

A fully Bayesian approach to account for sampling variability of statistics of random predictors

Simulation study

Method

Data generation

Amount of variance in Y not explained by the predictor terms involving X and Z

Sample size

Analysis models

Results

Additional simulations

Johnson–Neyman confidence bands

Discussion

Notes

References

Open Practices Statement

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s note

Electronic supplementary material

ESM 1

Appendix: Mathematical derivations for the variance of the simple slope at sample-estimated values of random predictors in multiple regression with curvilinear-by-linear interaction

Appendix: Mathematical derivations for the variance of the simple slope at sample-estimated values of random predictors in multiple regression with curvilinear-by-linear interaction

When the focal predictor is Z and the moderator is X

When the focal predictor is X and the moderator is Z

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation

The combination of the signs and magnitudes of (1) the regression coefficient of the highest-order predictor term X²Z and (2) the expected values of the sample statistics of the predictors at which the simple slope is probed