Curvilinear effects and curvilinear-by-linear interactions are commonly hypothesized, tested, and probed in the social and behavioral sciences. Examples include studies of job performance, job satisfaction, life satisfaction, and turnover intention (Janssen, 2001); learning and performance in multidisciplinary teams (van der Vegt & Bunderson, 2005); creativity (Baer & Oldham, 2006); and voluntary turnover rates, organizational human resource management investment in employees, workforce productivity, and organizational financial performance (Shaw, Park, & Kim, 2013). When a significant interaction effect is found, researchers have traditionally followed up by testing the simple slope of the effect of the focal predictor on the outcome at sample statistics of the moderator, typically at the sample mean and at one sample standard deviation above and below the sample mean (see Aiken & West, 1991). More recently, significant interactions have been probed using the Johnson–Neyman technique, which has been extended to models with curvilinear effects (J. W. Miller, Stromeyer, & Schwieterman, 2013; B. O. Muthén, Muthén, & Asparouhov, 2017). This study focuses on tests of the simple slope, followed by a discussion of the Johnson–Neyman approach applied to regression models with a curvilinear-by-linear interaction.

Simple slope analyses (and conditional Johnson–Neyman confidence bands) at sample statistics of the moderator typically use the asymptotic variance estimates of the regression coefficients to generate symmetric confidence intervals for the simple slope at values of the sample statistics of the predictors. In this process, researchers implicitly assume that the predictor values have been sampled according to a fixed sampling plan. Fixed sampling plans produce distributions of values on the predictors in which the means and variances are identical across samples. In the social and behavioral sciences, participants are rarely selected according to a fixed sampling plan. More typically, random or convenience samples are selected, so the distributions of values on the predictors will vary from sample to sample and should theoretically be treated as random. Although testing the simple slopes at a priori specified fixed constant values of the random predictors using the fixed regression model is still expected to lead to appropriate statistical tests, researchers are often interested in testing the simple slope at predictor values that reflect relative standing in the population (e.g., low, mean, high) rather than at fixed constant values. Moreover, it is not uncommon in social and behavioral research that randomly sampled predictors have no substantively meaningful fixed constant values of interest, especially predictors created by aggregating information across several subdomains of a construct.

Liu, West, Levy, and Aiken (2017) recently showed analytically and through simulation that for regression models with a linear-by-linear interaction, fixed and random regression models produce the same estimates, but different standard errors, of the simple slopes at sample values of the statistics of the moderator. Liu et al. discussed factors that can influence the seriousness of the issue associated with treating random predictors as fixed in probing linear-by-linear interactions at sample values of the statistics of the moderator. This study extends those findings, analytically and through simulation, to the probing of curvilinear-by-linear interactions. In this case, the more complex set of product terms in the regression equation leads to more complicated differences between the variance expressions of the simple slope at particular sample values of the statistics of the predictors under the random versus fixed sampling frameworks. These more complicated differences, in turn, lead to more complicated predictions regarding the factors that can influence the seriousness of treating random predictors as fixed when testing simple slopes at particular sample values of the statistics of the predictors. This study is also the first to examine the influence of covariates on these tests of the simple slope.

The organization of this article is as follows. First, I present a brief overview of the procedures for testing regression models involving linear-by-linear interactions, quadratic effects, or curvilinear-by-linear interactions, followed by tests of the simple slope. Second, I briefly review current practice in the recent applied literature on testing and probing curvilinear-by-linear interactions using simple slope tests and Johnson–Neyman confidence bands. Third, I present a mathematical derivation showing that, in probing a curvilinear-by-linear interaction, the variance expression of the simple slope at sample values of the statistics of the predictors differs under the random-effects versus the fixed-effects model. On the basis of the derivation, predictions are made regarding the factors that can influence the seriousness of the problem of ignoring the sampling variability of the sample statistics of random predictors when testing the simple slope at sample values of the statistics of the predictors. Some of these factors can also influence the performance of conditional Johnson–Neyman confidence bands created using sample values of the statistics of the randomly sampled moderator. Fourth, a brief introduction to the fully Bayesian approach is presented. Fifth, a set of small-scale Monte Carlo simulations is presented to evaluate these predictions. A random regression percentile bootstrapping approach, a fully Bayesian approach, and a fixed regression approach that tests the simple slope at fixed constant values of the predictors are also evaluated as potential remedies. Next, I extend the Johnson–Neyman technique to random regression models with a curvilinear-by-linear interaction probed at sample values of the statistics of the moderator. I show that conditional Johnson–Neyman confidence bands created under the fixed regression framework using sample values of the statistics of the randomly sampled moderator also reflect incorrect standard errors; in contrast, random regression with percentile bootstrapping and the fully Bayesian approach can create revised conditional Johnson–Neyman confidence/credibility bands that account for the sampling variability of the sample statistics of the randomly sampled moderator. Finally, I provide a brief discussion, recommendations, an overview of limitations, and directions for future research.

Regression models with interaction terms

Although tests of interactions have been popular, researchers have primarily tested regression models involving only a linear-by-linear interaction,

$$ Y={\beta}_0+{\beta}_1X+{\beta}_2Z+{\beta}_3 XZ+e, $$
(1)

where X and Z are the two predictors, e is the residual assumed to follow a normal distribution, \( e\sim N\left(0,{\upsigma}_e^2\right) \), \( {\upsigma}_e^2 \) is the population variance of the residuals, and β0, β1, β2, β3 are the regression coefficients in the population, with β3 representing the linear-by-linear interaction effect. When b3, the sample estimate of β3, is statistically significant, researchers may be interested in testing the simple slope of the effect of the focal predictor (say X) on the outcome Y at specific values of the moderator Z. This simple slope of the effect of X on Y is calculated as the first partial derivative of Eq. 1 with respect to X (e.g., Aiken & West, 1991):

$$ \frac{d\hat{Y}}{dX}={\beta}_1+{\beta}_3Z, $$
(2)

with the corresponding sample estimate being (b1 + b3Z).
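To make this concrete, the simple slope at a chosen value of Z can be estimated with standard software via recentering (Aiken & West, 1991). The following is a minimal R sketch; the data frame dat with columns Y, X, and Z is illustrative only:

```r
# Recentering approach: center Z at the chosen value z0, so that the
# coefficient on X in the refit model equals the simple slope b1 + b3*z0.
simple_slope_at <- function(dat, z0) {
  dat$Zc <- dat$Z - z0
  fit <- lm(Y ~ X * Zc, data = dat)
  summary(fit)$coefficients["X", ]  # estimate, SE, t value, p value
}

# Probe at the sample statistic M_Z + 1 SD_Z (a sample-based value)
simple_slope_at(dat, mean(dat$Z) + sd(dat$Z))
```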

However, in social and behavioral research, sometimes more complex forms of association are hypothesized and tested. One such form is the simple quadratic (curvilinear) association between the predictor of interest X and the outcome Y, described by

$$ Y={\beta}_0+{\beta}_1X+{\beta}_2{X}^2+e. $$
(3)

In Eq. 3, X can be viewed as a moderator of its own association with Y. When b2, the sample estimate of β2, is statistically significant, researchers may want to test the simple slope of the effect of X on Y at specific values of X. This simple slope is calculated as the first partial derivative of Eq. 3 with respect to X (e.g., Aiken & West, 1991; J. W. Miller et al., 2013):

$$ \frac{d\hat{Y}}{dX}={\beta}_1+2{\beta}_2X, $$
(4)

with the corresponding sample estimate being (b1 + 2b2X). This expression provides the slope of the tangent line to the curve defined by Eq. 3, and is a function of X.

One can see that the simple slope expression given by Eq. 4 follows the same functional form as the simple slope expression given by Eq. 2. Therefore, probing a significant X² quadratic term in the regression model described by Eq. 3 at sample values of the statistics of X has properties similar to probing a significant XZ interaction term in the regression model described by Eq. 1 at sample values of the statistics of Z, which is discussed at length in Liu et al. (2017) and will not be repeated here.

On the other hand, when there is a quadratic component X² in the regression equation, the relationship between X and Y may also be modified by one or more moderators, resulting in still more complex equations. Aiken and West (1991), Ganzach (1997), and J. W. Miller et al. (2013) have all described tests of various higher-order interactions involving quadratic components. In this study I focus on the test of the quadratic-by-linear interaction. The population multiple regression model with a quadratic-by-linear interaction (Aiken & West, 1991; see also J. W. Miller et al., 2013) is given by

$$ Y={\beta}_0+{\beta}_1X+{\beta}_2{X}^2+{\beta}_3Z+{\beta}_4 XZ+{\beta}_5{X}^2Z+e. $$
(5)

When b5, the estimate of the population regression coefficient β5 for the curvilinear-by-linear interaction term, is statistically significant, researchers may want to test the simple slope of the effect of the focal predictor on the outcome at specific values of the predictors. If Z is the focal predictor and X the moderator, then the simple slope is calculated as the first partial derivative of the regression equation with respect to Z (e.g., Aiken & West, 1991; J. W. Miller et al., 2013):

$$ \frac{d\hat{Y}}{dZ}={\beta}_3+{\beta}_4X+{\beta}_5{X}^2, $$
(6)

with the corresponding sample estimate being (b3 + b4X + b5X²).

X may also be considered to be the focal predictor and Z the moderator—the distinction between the focal predictor and the moderator stems from the researchers’ conceptual perspective. In this case the simple slope is calculated as the first partial derivative of the regression equation with respect to X (e.g., Aiken & West, 1991; J. W. Miller et al., 2013):

$$ \frac{d\hat{Y}}{dX}={\beta}_1+2{\beta}_2X+{\beta}_4Z+2{\beta}_5 XZ, $$
(7)

with the corresponding sample estimate being (b1 + 2b2X + b4Z + 2b5XZ).
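As an illustration of how these simple slopes are computed under the fixed regression framework, the sketch below fits Eq. 5 and evaluates Eq. 6 at a chosen value x0, with the standard error obtained from the coefficient covariance matrix. This is a hedged sketch assuming the same illustrative data frame dat as above:

```r
# Fit the curvilinear-by-linear model of Eq. 5; the design columns are
# (Intercept), X, I(X^2), Z, X:Z, I(X^2):Z, matching b0 ... b5.
fit <- lm(Y ~ X + I(X^2) + Z + X:Z + I(X^2):Z, data = dat)
b <- coef(fit)
V <- vcov(fit)

# Eq. 6: simple slope of Z on Y at X = x0 is b3 + b4*x0 + b5*x0^2,
# a linear combination c'b with variance c'Vc under fixed regression.
x0 <- mean(dat$X) + sd(dat$X)       # e.g., probe at M_X + 1 SD_X
cvec <- c(0, 0, 0, 1, x0, x0^2)
slope <- sum(cvec * b)
se <- sqrt(drop(t(cvec) %*% V %*% cvec))
c(slope = slope, se = se, t = slope / se)
```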

It is worth mentioning that sometimes researchers are interested in testing whether there is evidence of any curvilinear or linear relationship between X and Y at a particular value of Z (e.g., Aiken & West, 1991, pp. 84–86; Dawson, 2014). In a regression model defined by Eq. 5, this can be done by rearranging the regression equation as follows:

$$ Y={\beta}_0+{\beta}_3Z+\left({\beta}_1+{\beta}_4Z\right)X+\left({\beta}_2+{\beta}_5Z\right){X}^2+e. $$
(8)

In Eq. 8, (β1 + β4Z) describes the overall linear trend of the simple curve representing the regression of Y on X at a given value of Z, and (β2 + β5Z) describes the nature of the curvilinearity of the simple curve at that value of Z. To test whether there is any relationship between X and Y at a particular value of Z, one can center Z at the chosen value and test the increase in R² due to the inclusion of the X and X² terms, as described in Dawson (2014); a sketch of this test follows below. To test for evidence of any curvilinear relationship between X and Y at a particular value of Z, researchers would examine (b2 + b5Z), the sample estimate of (β2 + β5Z). This test of the existence of any curvilinear relationship between X and Y follows the same functional form as the simple slope of the effect of X on Y when probing a linear-by-linear interaction XZ, given in Eq. 2. Therefore, the test of the existence of any curvilinear relationship between X and Y at sample values of the statistics of Z in a model containing a curvilinear-by-linear interaction has properties similar to the probing of a significant linear-by-linear XZ interaction at sample values of the statistics of Z, which is discussed at length in Liu et al. (2017).
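One way to carry out this 2-df test in R is sketched below, assuming the illustrative data frame dat and a chosen conditional value z0; the exact implementation in Dawson (2014) may differ in detail:

```r
# Center Z at the chosen conditional value, then test the increase in R^2
# from adding the X and X^2 terms: an F test of whether there is any
# (linear or curvilinear) X-Y relationship at Z = z0.
dat$Zc <- dat$Z - z0
reduced <- lm(Y ~ Zc + X:Zc + I(X^2):Zc, data = dat)
full    <- lm(Y ~ Zc + X + I(X^2) + X:Zc + I(X^2):Zc, data = dat)
anova(reduced, full)
```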

Tests of the simple slope from the fixed regression framework

Testing the simple slope in Eqs. 2, 4, 6, or 7 involves estimating the regression coefficients that appear in the simple slope expression, picking a value of the predictor(s) that appear in the expression (i.e., X, Z, or both), calculating the simple slope estimate, and testing whether it differs significantly from zero. Typically, researchers choose a few sample-based values for the predictor(s), often the sample statistics M − SD, M, and M + SD, although other sample-based values such as the 25th and 75th sample percentiles may also be used.

The approach for testing the simple slope of the effect of the focal predictor on the outcome at sample statistics of the predictor(s) outlined above typically uses the asymptotic variance estimates of the regression coefficients to generate symmetric confidence intervals of the simple slope at sample values of the statistics of the predictors. If researchers try to draw inferences from the results of such analyses about the population simple slope at the corresponding population statistics of the predictors (e.g., the significance of the simple slope of the effect of Z on Y described in Eq. 6 at the mean of X), instead of about the population simple slope at a priori fixed constant values of the moderator (e.g., the significance of the simple slope of the effect of Z on Y at X = 3), then they implicitly assume that values on the predictors have been sampled according to a fixed sampling plan. As an example of a fixed sampling plan, in a study of the effect of pay disparity (defined as total CEO compensation divided by the average total compensation of the top management team; Ridge, Aime, & White, 2015) on firm performance, predetermined numbers of firms with pay disparities of 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, and 4.5 would be selected to reflect the population frequencies of the different values of pay disparity. A replication of this study would select a sample in which the numbers of firms at each of these pay disparity values are exactly the same as in the first study. The fixed sampling plan implies that each sample will have an identical distribution of values on the predictor(s), with the same sample statistics (i.e., the statistics do not vary from sample to sample).

Brief review of current practice in the applied literature

To gain a better understanding of current practice, I conducted a small review of the applied literature from 2000 to the present that involved testing and probing curvilinear-by-linear interactions. Two search criteria were used to identify 20 articles. I first examined journal articles that (a) were cited in J. W. Miller et al. (2013) or cited J. W. Miller et al. and (b) found a significant curvilinear-by-linear interaction. This yielded the first nine empirical articles summarized in Table 1. I then entered "curvilinear interaction" in Google Scholar, restricted the time range to 2000 onward, and identified the first 11 empirical articles that reported a significant curvilinear-by-linear interaction but neither were cited in nor cited J. W. Miller et al., for a total of 20 articles (see Table 1). Eighteen of the 20 articles reviewed used some form of convenience sampling, random sampling, or another sampling plan unrelated to the focal predictor or moderator in the analyses (e.g., sampling based on demographics), rather than fixed sampling of the predictors of interest.

Table 1 Summary of empirical articles after the year 2000 detecting significant curvilinear-by-linear interactions

On the basis of the literature review, fixed sampling plans for predictors appear to be rare in social and behavioral research. Sometimes fixed sampling of predictors is impractical but the predictors still have meaningful constant values of interest; an example is the number of existing medical conditions of patients (Rast, Rush, Piccinin, & Hofer, 2014). It is also often the case that a fixed sampling plan is impractical and the predictors have no substantively meaningful fixed constant values of interest. This applies to predictors created using any of the following three procedures.

1. Taking the averages of item scores of a Likert-scale measure whose response categories do not correspond to some fixed quantity (e.g., "often" encounters discrimination, rather than encountering discrimination more than 10 times in the past 3 months).

2. Standardizing scores within each subdomain of the construct (using the mean and standard deviation of the sample), then taking the average of these z scores. An example is human resource management investment by the company, consisting of training, pay level, benefit level, job security, procedural justice, and selective staffing (Shaw et al., 2013).

3. Using other means of calculation to aggregate information across several subdomains of the construct. Examples include diversity among team members in expertise, gender, or nationality, calculated using Blau's (1977) formula (van der Vegt & Bunderson, 2005).

When the predictors are sampled randomly rather than according to a fixed sampling plan, sample statistics of the predictors are estimates of the corresponding population statistics rather than fixed values. The uncertainty associated with the sample estimates of the population statistics is not considered in the estimation of the standard error of the test of the simple slope at sample values of the statistics of the predictors under the fixed regression framework. This can potentially lead to inappropriate significance test results and 95% confidence intervals with low coverage rates. This issue is discussed further in the next two sections.

Fixed sampling versus random sampling

The fixed-effects model assumes that the distributions of values on the predictor terms do not vary from sample to sample, but that values on the outcome Y conditional on the predictor scores are random. An important implication of the fixed-effects model is that sample values of the statistics of the predictors (such as MX, MZ, SDX, SDZ, and the 25th percentiles on X and Z) in each sample will be identical. In many social and behavioral studies, however, the predictors can be more appropriately considered as randomly sampled from a multivariate population distribution, and ideally an alternative and more realistic random effects regression model should be used.

Sampson (1974) showed that for a linear regression model without higher-order terms such as interactions or quadratic effects (e.g., \( \hat{Y}={\beta}_0+{\beta}_1X+{\beta}_2Z \)), the estimates of the regression coefficients were unbiased, and the Type I error rates of their statistical tests and of statistical tests of general linear constraints on the regression coefficients did not differ between the fixed and random regression frameworks, assuming multivariate normality of the predictor terms. Importantly, Sampson's proof was restricted to tests of linear combinations of the regression coefficients. For instance, when testing the simple slope of the effect of Z on Y in probing a significant curvilinear-by-linear interaction X²Z at an a priori fixed constant value of X = 2, the null hypothesis is H0: β3 + β4(2) + β5(2)² = 0, which represents a linear combination of the regression coefficients. It follows that tests of the simple slope of the effect of the focal predictor on the outcome under the fixed-effects approach will be appropriate when either (a) a fixed sampling plan is used for the focal predictor and moderator or (b) fixed constant values of the predictors are chosen at which to evaluate the simple slope. An example of case (b) would occur if substantively meaningful fixed constant values of a predictor (e.g., values of 2, 3, and 4 for the number of medical conditions of patients; Rast et al., 2014) were chosen a priori and used in the test of the simple slope (for a similar recommendation, see Dawson, 2014). Put differently, for moderators with substantively meaningful fixed constant values of interest, even when the moderators are randomly sampled, testing the simple slope of the effect of the focal predictor on the outcome at those fixed constant values using the fixed regression model is still expected to lead to appropriate statistical tests.

However, researchers often do not have meaningful fixed constant values of the predictor(s) at which to test the simple slope. Instead, they often wish to choose predictor values that reflect participants' relative standing in the population (e.g., low, mean, high). Furthermore, sometimes the predictors have no meaningful fixed constant values of interest at all, particularly predictors created by aggregating across subdomains of a construct. If values on the predictors are randomly sampled, then sample statistics of the predictors, such as the sample mean MX of X, are likely to change from one sample to the next and should be considered random variables with distributions. When values on the predictors are randomly sampled, the null hypothesis for the test of the simple slope of the effect of Z on Y in probing a significant curvilinear-by-linear interaction X²Z at X = MX is H0: β3 + β4(MX) + β5(MX)² = 0. Because MX is a random variable under the random sampling framework rather than a constant, the tested quantity no longer represents a linear combination of the regression coefficients; its sample estimate, b3 + b4(MX) + b5(MX)², involves a nonlinear product of two random variables (b4 and MX) and a nonlinear product of three random variables (b5, MX, and MX).
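The sampling variability of MX that drives this distinction is easy to see in a brief R illustration; the numbers here are arbitrary:

```r
# The sample mean of a randomly sampled predictor varies across samples:
set.seed(1)
m_x <- replicate(2000, mean(rnorm(100, mean = 0, sd = 1)))
sd(m_x)  # close to 1/sqrt(100) = .10, the theoretical SD of M_X
```

Under a fixed sampling plan this standard deviation would be exactly zero, and the null hypothesis above would again reduce to a linear combination of the regression coefficients.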

Variance estimates of the simple slope at sample-estimated conditional values of randomly sampled predictors

In the Appendix, for a regression model containing a curvilinear-by-linear interaction as described by Eq. 5, I derive expressions for the variance (the square of the standard error) of the simple slope of the effect of Z on Y given in Eq. 6 at sample values of the statistics of X (e.g., MX − SDX, MX, MX + SDX, the 25th and 75th sample percentiles, etc.), and of the simple slope of the effect of X on Y given in Eq. 7 at sample values of the statistics of X and Z, when values on X and Z are randomly sampled from a bivariate normal distribution. The derivations in the Appendix show that the variance expression of the simple slope at sample values of the statistics of the predictors under the random-effects model includes additional terms relative to the variance expression arising in the fixed-effects model given in Aiken and West (1991) and in J. W. Miller et al. (2013). When the relative magnitudes of these additional terms are large, treating random predictors as fixed in testing the simple slope at sample values of the statistics of the predictors can lead to inaccurate significance tests with biased Type I error rates and unacceptable coverage rates for 95% confidence intervals. When the relative magnitudes of these additional terms are small, treating random predictors as fixed will have only a minimal influence on the significance test results. For the remainder of the article, I refer to the issue associated with treating random predictors as fixed in testing the simple slope at sample values of the statistics of the predictors as the fixed-versus-random issue. On the basis of the derivations in the Appendix, predictions can be made regarding which factors could, and which factors should not, influence the extent of the fixed-versus-random issue.

Liu et al. (2017) discussed some of these factors in probing linear-by-linear interactions, and I expect their conclusions to hold in probing curvilinear-by-linear interactions. Table 2 presents a summary of factors expected to have similar influences on the extent of the fixed-versus-random issue when probing linear-by-linear and curvilinear-by-linear interactions. For instance, Liu et al. pointed out that, holding everything else constant, the greater the magnitude of the variance(s) of the first-order predictor(s) that appear in the simple slope expression, the greater the extent of the fixed-versus-random issue. Similarly, holding everything else constant, the greater the magnitude of the regression coefficient of the highest-order term, the greater the extent of the fixed-versus-random issue. In addition, Liu et al. pointed out that sample size should not influence the extent of the fixed-versus-random issue, because sample size influences to a similar degree the sampling variability of the regression coefficients (which appear in the terms of the simple slope variance expression under the fixed regression framework) and the sampling variability of the sample statistics of the random predictors (which appear in the additional terms of the variance expression under the random regression framework). Of importance, a few additional factors can influence the extent of the fixed-versus-random issue when probing both linear-by-linear and curvilinear-by-linear interactions but were not discussed by Liu et al.; one such factor is the magnitude of the residual variance of the outcome in the model.

Table 2 Factors expected to have similar influences on the extent of the fixed-versus-random issue when probing linear-by-linear interactions and curvilinear-by-linear interactions

Residual variance of the outcome in the regression model

Holding everything else constant, if the residual variance of the outcome decreases, then the estimates of the sampling variances of the regression coefficients will also decrease. These sampling variances appear in the variance expression of the simple slope at sample values of the statistics of the predictor(s) under the fixed regression framework, whereas the magnitudes of the additional terms in the variance expression under the random regression framework remain the same. Thus, the additional terms in the variance expression under the random sampling framework become greater relative to the terms shared with the fixed sampling framework. I hypothesize that this factor can influence the extent of the fixed-versus-random issue when probing both curvilinear-by-linear and linear-by-linear interactions, even though it was not discussed in Liu et al. (2017).

Decreased residual variance of the outcome can arise in two ways. First, holding the total amount of explained variance in the outcome constant, if the total variance of the outcome decreases (e.g., because a more homogeneous sample is obtained), the residual variance of the outcome in the regression model will also decrease. Second, holding constant the total variance of the outcome and the proportion of variance accounted for by the terms involving the focal predictor and moderator (i.e., X, X², Z, XZ, and X²Z in Eq. 5), if covariates are added to the regression model that explain additional variability in the outcome, the residual variance of the outcome will also decrease. In either situation, one can expect a larger influence of ignoring the sampling variability of the sample statistics of random predictors on the test of the simple slope at sample values of the statistics of the predictor(s). The latter situation is particularly important, given that applied studies that examine and probe curvilinear-by-linear interaction effects often do not have large sample sizes (see Table 1). In such cases, if the ΔR² associated with the highest-order term is small, covariates may be needed to reduce the residual variance of the outcome and yield sufficient statistical power to detect the effect of the highest-order term. However, the increased statistical power can be accompanied by a problematically greater influence of ignoring the sampling variability of the sample statistics of random predictors on the test of the simple slope at sample values of the statistics of the predictor(s).

When probing curvilinear-by-linear interactions, the differences in the variance expressions of the simple slope at sample values of the statistics of the predictors under the random-versus-fixed sampling frameworks are more complicated than when probing linear-by-linear interactions. As a result, new predictions can be made on the basis of derivations in the curvilinear-by-linear interaction case, which do not apply to the probing of the linear-by-linear interactions. I provide below a detailed discussion of some of these factors.

The combination of the signs and magnitudes of (1) the regression coefficient of the highest-order predictor term X²Z and (2) the expected values of the sample statistics of the predictors at which the simple slope is probed

When probing curvilinear-by-linear interactions, the variance expression of the simple slope is more complicated than when probing the simpler linear-by-linear interactions. The additional terms in the variance expression of the simple slope at sample values of the statistics of the random predictor(s) include complex terms that are products of four components: the variance of the sample statistic of a predictor (which is nonnegative), the expected value of the sample statistic of a predictor, the expected value of the regression coefficient of the highest-order predictor term, and the expected value of another regression coefficient. When Z is the focal predictor and X the moderator, one such additional term in the variance expression of the simple slope at the sample statistic TX of the moderator X is 4E[b4]E[b5]E[TX]V[TX]. When the focal predictor is X and the moderator is Z, the additional terms in the variance expression of the simple slope of the effect of X on Y at a sample statistic of X (TX) and a sample statistic of Z (QZ) include 8E[b2]E[b5]E[QZ]V[TX] and 4E[b4]E[b5]E[TX]V[QZ]. In these four-component products, the influence of any one component depends on the signs and magnitudes of the others. Moreover, when the focal predictor is X and the moderator is Z, the influence of the sign and magnitude of E[QZ] also depends on those of E[TX], because both appear in the additional terms in the variance expression of the simple slope.

An implication is that in the same sample, the extent of the fixed-versus-random issue in probing a curvilinear-by-linear interaction may be different when testing the simple slope at sample statistics of the same random predictor with the same sampling variability and the same absolute value but different signs. To illustrate, let TX represent a sample statistic of X, and QZ a sample statistic of Z. Testing the simple slope of the effect of Z on Y at TX ≠ 0 versus \( {T}_X^{\prime }=-{T}_X \) may reflect different extents of the fixed-versus-random issue. The same applies to testing the simple slope of the effect of X on Y at TX ≠ 0 versus \( {T}_X^{\prime }=-{T}_X \) or at QZ ≠ 0 versus \( {Q}_Z^{\prime }=-{Q}_Z \).

In addition, consider a situation in which the regression coefficient for the highest-order term changes in sign, but not in absolute magnitude. The pattern of the relative seriousness of the fixed-versus-random issue when testing the simple slope of the effect of Z on Y at TX ≠ 0 versus \( {T}_X^{\prime }=-{T}_X \) may be reversed. The same applies to testing the simple slope of the effect of X on Y at TX ≠ 0 versus \( {T}_X^{\prime }=-{T}_X \) or at QZ ≠ 0 versus \( {Q}_Z^{\prime }=-{Q}_Z \).

Further, the absolute magnitude of E[QZ] or E[TX] at which the simple slope is probed can have an influence beyond the influence of its sign. In the same sample, for two sample statistics of the same predictor that have expected values of the same sign and the same sampling variance (e.g., \( {Q}_Z^{(1)}={M}_Z-{SD}_Z \) and \( {Q}_Z^{(2)}={M}_Z+{SD}_Z \) for normally distributed Z), the magnitude of the expected values of the sample statistics (i.e., \( E\left[{Q}_Z^{(1)}\right] \) vs. \( E\left[{Q}_Z^{(2)}\right] \)) will influence the extent of the fixed-versus-random issue.

A fully Bayesian approach to account for sampling variability of statistics of random predictors

In a multiple regression model with a curvilinear-by-linear interaction X2Z and a covariate A, the population regression equation is given by

$$ Y={\beta}_0+{\beta}_1X+{\beta}_2{X}^2+{\beta}_3Z+{\beta}_4 XZ+{\beta}_5{X}^2Z+{\beta}_AA+e. $$
(9)

A fully Bayesian model of regression analysis includes a distribution for the outcome Y and a distribution for the predictors (Gelman et al., 2013). Let Y denote the full collection of values on the outcome Y, and let (X, Z, A) denote the full collection of values on the predictors X and Z and the covariate A. Furthermore, let Ω denote the parameters that govern the joint distribution of (X, Z, A). In Eq. 9, because all other predictor terms are product terms of X and/or Z, Ω in effect governs the joint distribution of (X, Z, X2, XZ, X2Z, A). Thus, the model parameters of a fully Bayesian model of Eq. 9 include not only (β, \( {\upsigma}_e^2 \)), which are the regression coefficients and the residual variance of the regression equation, but also Ω, which governs the joint distribution of all the predictor and covariate terms.

Bayesian estimation views model parameters as random variables with distributions and makes statistical inferences about the model parameters through their posterior distributions. Using Bayes' theorem, prior distributions of the model parameters are combined with information from the data to produce posterior distributions of the model parameters. Assuming prior independence of (β, \( {\upsigma}_e^2 \)) and Ω, the prior distribution of the model parameters of a fully Bayesian model of Eq. 9 can be factored as the product of the prior distribution of (β, \( {\upsigma}_e^2 \)) and the prior distribution of Ω. The posterior distribution of the model parameters can then be factored as the product of the posterior distribution of (β, \( {\upsigma}_e^2 \)) given the data on (X, Z, A) and Y, and the posterior distribution of Ω given the data on (X, Z, A) (Gelman et al., 2013).

It follows that if a significant curvilinear-by-linear interaction is found, then when the simple slope described by Eq. 6 or 7 is evaluated at chosen statistics of the predictors, those statistics (e.g., μX, σX, μZ, and σZ) are viewed as random variables with distributions under the fully Bayesian framework. The posterior distribution of the simple slope is thus determined not only by the posterior distributions of the regression coefficients involved, but also by the posterior distributions of the statistics of the predictors at which the simple slope is evaluated. The fully Bayesian approach to testing simple slopes can therefore account not only for the uncertainty in the regression coefficients, but also for the uncertainty in the chosen statistics of the random predictors.
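Computationally, this amounts to forming the simple slope draw-by-draw from the joint posterior. The sketch below shows the idea for Eq. 6 evaluated at μX, assuming a matrix draws of MCMC output whose columns hold posterior draws of b3, b4, b5, and the mean of X; the column names are illustrative, not Mplus output labels:

```r
# Posterior of the simple slope of Z on Y at mu_X (Eq. 6): combine the
# posterior draws of the coefficients with the posterior draws of mu_X.
slope_draws <- draws[, "b3"] +
  draws[, "b4"] * draws[, "mu_x"] +
  draws[, "b5"] * draws[, "mu_x"]^2

quantile(slope_draws, c(.025, .975))          # equal-tail 95% credible interval
coda::HPDinterval(coda::as.mcmc(slope_draws)) # 95% HPD interval, as used here
```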

This study used conditionally conjugate priors for the model parameters (Gelman et al., 2013), which yield full conditional distributions that are easy to sample from when using MCMC to approximate the posterior distribution. Noninformative (diffuse) priors were used to maximize the comparability of the results of the different approaches considered. The noninformative conditionally conjugate priors used here include normal prior distributions with zero mean and infinite variance for the regression coefficients, an inverse gamma prior for the residual variance \( {\upsigma}_e^2 \), normal priors with zero mean and infinite variance for the means of (X, Z, A), and an inverse Wishart prior for the variance–covariance matrix of (X, Z, A).

With conditionally conjugate priors, the fully Bayesian regression model can be viewed as a single random-effects regression model fit to a larger dataset that includes not only the observed data but also "additional data" whose information is summarized in the prior distributions (Gelman et al., 2013). The use of noninformative conjugate priors is akin to having little or no such "additional data" in the analysis. Informative priors, in contrast, can be constructed from relevant meta-analyses, previous research, or pilot studies (e.g., conjugate priors as in the Educational Outcomes example in chap. 5 of Gill, 2014; power priors introduced by Ibrahim & Chen, 2000), or from expert judgment (elicited priors). For a comprehensive review of different prior distributions, see chapter 4 of Gill (2014). If conclusions from existing studies (e.g., regarding the magnitudes of regression coefficients, the average level of variables, or the variability of variables) have limited applicability to the research scenario at hand, one might down-weight the information from these "additional data" relative to the observed data (e.g., by using a less informative prior or a power prior model; see chap. 1 of Congdon, 2010). Moreover, a sensitivity analysis can be conducted to gauge how sensitive the results are to the choice of priors by repeating the analysis over a range of alternative priors (Congdon, 2010).

Simulation study

A small-scale Monte Carlo simulation, designed loosely on the basis of the simulated dataset in chapter 5 of Aiken and West (1991; also examined in J. W. Miller et al., 2013), examined the influence of ignoring the sampling variability of the sample statistics of random predictors when probing a curvilinear-by-linear interaction at sample values of the statistics of the predictor(s) using fixed regression. Two additional approaches that can account for this sampling variability were also examined: (1) a random regression percentile bootstrapping approach, which treats the predictors as random and recalculates the corresponding sample statistics in each bootstrap sample, and (2) a fully Bayesian approach with noninformative priors, which treats the statistics of the predictors and the regression coefficients as random variables with distributions. In addition, I examined a fourth approach that used fixed regression and probed the curvilinear-by-linear interaction at fixed constant values of the predictors, which is expected to perform acceptably (see Liu et al., 2017).

The simulation conditions were selected to reflect representative situations in applied research, based on the literature review summarized in Table 1. The sample sizes in these studies ranged widely, from slightly over 50 to over 2,000; most had a sample size between about 100 and a few hundred. The total R² of the final regression equation in these studies ranged from .06 to .59, and the ΔR² for adding the significant curvilinear-by-linear interaction ranged from .01 to over .15. Most of these studies also included one or more covariates in the regression equations, which accounted for 3% to over 30% of the total variance of the outcome.

Method

Data generation

The population model used to generate the data for the small-scale simulation was given by:

$$ Y=3.5-2X+3{X}^2+2Z+3XZ+2{X}^2Z+{\beta}_AA+e, $$
(10)

where the error \( e\sim N\left(0,{\upsigma}_e^2\right) \) and \( \left[\begin{array}{c}X\\ {}Z\end{array}\right]\sim N\left(\left[\begin{array}{c}0\\ {}0\end{array}\right],\left[\begin{array}{cc}1& .6\\ {}.6& 4\end{array}\right]\right) \). The values were chosen to be close to those in the simulated dataset in chapter 5 of Aiken and West (1991), to provide readers with a familiar example, but with integer (or half-integer) values for the regression coefficients and for the variable and residual standard deviations, for ease of interpretation. The model also included A, a covariate related to the outcome Y but not to the other predictor terms. The covariate A followed a normal distribution with a population mean of zero and a population variance of 25 (i.e., a population standard deviation of 5). The design was a 3 (amount of variance in Y unexplained by predictor terms involving X and Z) × 2 (sample size) factorial. Data were generated in R (R Core Team, 2018), with 3,000 replications per condition.
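A hedged R sketch of this data-generating model follows; beta_a and sigma_e are placeholders for the condition-specific values in Table 3, and the values shown in the last line are illustrative only:

```r
library(MASS)  # for mvrnorm

gen_data <- function(n, beta_a, sigma_e) {
  xz <- mvrnorm(n, mu = c(0, 0),
                Sigma = matrix(c(1, .6, .6, 4), nrow = 2))  # (X, Z)
  X <- xz[, 1]; Z <- xz[, 2]
  A <- rnorm(n, mean = 0, sd = 5)       # covariate, unrelated to X and Z
  e <- rnorm(n, mean = 0, sd = sigma_e)
  Y <- 3.5 - 2*X + 3*X^2 + 2*Z + 3*X*Z + 2*X^2*Z + beta_a*A + e
  data.frame(Y, X, Z, A)
}

dat <- gen_data(n = 400, beta_a = 1, sigma_e = 15)  # placeholder values
```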

Amount of variance in Y not explained by the predictor terms involving X and Z

The magnitudes of βA and \( {\upsigma}_e^2 \) were varied (see Table 3) such that the amount of variance in Y not explained by the predictor terms X, X², Z, XZ, and X²Z (i.e., the sum of the amount of variance in Y explained by A and \( {\upsigma}_e^2 \)) was roughly 625 (= 25²; 25 was close to the residual standard deviation in chap. 5 of Aiken & West, 1991), 400 (= 20²), or 225 (= 15²). The correlation between A and Y was kept approximately constant in a narrow band near .41 (obtained from the analysis of a very large sample of N = 1,000,000 generated from the same population model for each condition), so that including versus not including A in the analysis model increases R² by about 17%, regardless of the amount of variance in Y unexplained by the predictor terms involving X and Z. These values lead to a realistic range of model R² values and ΔR² values for X²Z, based on the brief review of applied articles summarized in Table 1.

Table 3 Population values of βA and \( {\upsigma}_{\mathrm{e}}^2 \) in simulations

Sample size

Two sample sizes were compared: 400, the sample size of the simulated dataset in chapter 5 of Aiken and West (1991), and 100, a smaller but realistic sample size for applied research examining curvilinear-by-linear interactions (see Table 1). On the basis of the derivations in the Appendix, sample size is not expected to influence the extent of the fixed-versus-random issue; it was varied in the simulation to confirm this conclusion from the derivations.

Analysis models

Four approaches were used to analyze each generated dataset: (1) the currently widely used fixed regression approach, which probes the curvilinear-by-linear interaction at numeric values of sample statistics of the predictors (MX − SDX, MX, and MX + SDX on X; MZ − SDZ, MZ, and MZ + SDZ on Z) by means of recentering; (2) a random regression percentile bootstrapping approach, which treats the predictors as random and recalculates the corresponding sample statistics in each bootstrap sample; (3) a fully Bayesian approach with noninformative priors, which treats the statistics of the predictors and the regression coefficients as random variables with distributions; and (4) a fixed regression approach that probes the curvilinear-by-linear interaction at fixed constant values of the predictors. The fixed regression approaches were implemented in SAS 9.4. The random regression percentile bootstrapping approach was implemented with the R package boot (Ripley, 2017), using 5,000 bootstrap samples per replication. The fully Bayesian approach was implemented in Mplus 8 (L. K. Muthén & Muthén, 1998–2018), using the default noninformative conditionally conjugate priors in Mplus, a thinning rate of 10, and 100,000 draws from the posterior distribution to estimate the highest posterior density (HPD) credible interval of the simple slopes. The approaches were compared in terms of the coverage rates of the 95% confidence intervals/HPD credible intervals of the simple slope of the effect of X on Y defined in Eq. 7 at MX − SDX, MX, and MX + SDX on X and MZ − SDZ, MZ, and MZ + SDZ on Z, and of the simple slope of the effect of Z on Y defined in Eq. 6 at MX − SDX, MX, and MX + SDX on X. The coverage rate was defined as the proportion of replications in which the population parameter fell within the 95% confidence interval/HPD credible interval. Coverage rates outside the range [.925, .975] were considered unacceptable (Bradley, 1978). With each approach, the generated datasets were analyzed both with and without the covariate A in the regression equation. The statistical power to detect the curvilinear-by-linear interaction term using the traditional fixed regression approach was also examined.
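A minimal R sketch of approach (2), the random regression percentile bootstrap, illustrates the feature that distinguishes it from fixed regression: the sample statistic defining the evaluation point is recomputed within each bootstrap sample (variable names continue the illustrative example above):

```r
library(boot)

# Simple slope of Z on Y (Eq. 6) at M_X + 1 SD_X, with M_X and SD_X
# recomputed from each bootstrap sample rather than held fixed.
slope_stat <- function(data, idx) {
  d <- data[idx, ]
  b <- coef(lm(Y ~ X + I(X^2) + Z + X:Z + I(X^2):Z, data = d))
  x0 <- mean(d$X) + sd(d$X)
  unname(b["Z"] + b["X:Z"] * x0 + b["I(X^2):Z"] * x0^2)
}

bt <- boot(dat, slope_stat, R = 5000)
boot.ci(bt, type = "perc")  # percentile 95% CI
```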

To evaluate the empirical Type I error rates of these approaches, for N = 400, I examined the coverage rates of the 95% confidence intervals/HPD credible intervals of the simple slope of the effect of X on Y defined in Eq. 7 at M − 0.25SD of X and M + 0.875SD of Z. Given the data generation model, the simple slope of the effect of X on Y at μ − 0.25σ of X and μ + 0.875σ of Z is zero in the population, so one minus the coverage rate reflects the empirical Type I error rate.
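To verify that this population simple slope is zero, substitute the Eq. 10 coefficients and the conditional values μX − 0.25σX = −0.25 (with μX = 0, σX = 1) and μZ + 0.875σZ = 1.75 (with μZ = 0, σZ = 2) into Eq. 7:

$$ \frac{d\hat{Y}}{dX}=-2+2(3)\left(-0.25\right)+3(1.75)+2(2)\left(-0.25\right)(1.75)=-2-1.5+5.25-1.75=0. $$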

Results

Table 4 summarizes the coverage rates of the 95% confidence intervals/HPD credible intervals of the simple slope of the effect of X on Y at MX − SDX, MX, and MX + SDX on X and MZ − SDZ, MZ, and MZ + SDZ on Z, and those of the simple slope of the effect of Z on Y at MX − SDX, MX, and MX + SDX on X, for conditions with N = 400. The patterns of results were virtually identical for N = 400 and N = 100, in line with the prediction based on the mathematical derivation; hence, the N = 100 results are included in Supplemental Material 1. The fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors frequently produced 95% confidence intervals for the simple slope of the effect of X on Y and the effect of Z on Y whose coverage rates fell below .925, and sometimes below the less stringent criterion of .90 proposed by Collins, Schafer, and Kam (2001). In contrast, random regression with percentile bootstrapping and the fully Bayesian approach yielded 95% confidence intervals/HPD credible intervals that consistently produced coverage rates within Bradley’s (1978) criterion of .925 to .975 for acceptable coverage. The fixed regression approach that probed the curvilinear-by-linear interaction at fixed constant values of the predictors also had acceptable performance.

Table 4 Summary of simulation results for N = 400

The coverage rates produced by the fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors decreased as the amount of variance in Y unexplained by terms involving X or Z decreased, and when the covariate A was included in the analysis model. This pattern is in line with the prediction based on the derivations.

The fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors tended to produce unacceptable coverage rates of 95% confidence intervals for the simple slope of the effect of X on Y at M or M + 1SD of X and at M or M + 1SD of Z, and for the simple slope of the effect of Z on Y at M + 1SD of X, but acceptable coverage rates for the simple slope of the effect of X on Y tested at M − 1SD of X or at M − 1SD of Z, and for the simple slope of the effect of Z on Y at M − 1SD of X. This is in line with the prediction based on the derivations that, in the same sample, the extent of the fixed-versus-random issue may be different when testing the simple slope at sample statistics of the same predictor with the same sampling variability and the same absolute value but different signs.

At N = 100, although overall coverage remained adequate, the proportions of the random regression 95% percentile bootstrap confidence intervals of the simple slope falling completely below versus completely above the population value were slightly asymmetric at some of the tested sample statistics of the random predictors, with differences of up to .024. The asymmetry was smaller when the covariate A was omitted from the model and when the amount of variance in Y unexplained by terms involving X or Z was larger; at N = 400, the asymmetry was negligible. This pattern was not observed for the fully Bayesian approach.

Table 5 summarizes the empirical Type I error rates of these approaches at N = 400, obtained by investigating the coverage rates of 95% confidence intervals of the simple slope of the effect of X on Y at M − 0.25SD of X and M + 0.875SD of Z. Random regression with percentile bootstrapping, the fully Bayesian approach, and the fixed regression approach that probed the curvilinear-by-linear interaction at fixed constant values of the predictors produced 95% confidence intervals/HPD credible intervals of this simple slope with coverage rates consistently within Bradley's (1978) criterion of .925 to .975 for acceptable coverage, corresponding to empirical Type I error rates always within the range [.025, .075]. In contrast, the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors frequently produced coverage rates of 95% confidence intervals below .925 and sometimes below .90; these rates correspond to empirical Type I error rates greater than .075 or even .10. The inflation of the Type I error rates increased as the amount of variance in Y unexplained by terms involving X or Z decreased, and when the covariate A was included in the analysis model.

Table 5 Coverage rate of 95% confidence intervals (CIs) in which the population simple slope of Y on X is zero (N = 400)

Additional simulations

Two additional simulations were conducted. Additional Simulation 1 investigated the prediction, based on the mathematical derivation, that if the regression coefficient for the highest-order term X²Z changes in sign but not in magnitude, the pattern of the relative seriousness of the fixed-versus-random issue when testing the simple slope of the effect of Z on Y at TX versus \( {T}_X^{\prime }=-{T}_X \) in the same sample may be reversed. Similarly, the pattern of the relative seriousness of the fixed-versus-random issue when testing the simple slope of the effect of X on Y at TX versus \( {T}_X^{\prime }=-{T}_X \) or at QZ versus \( {Q}_Z^{\prime }=-{Q}_Z \) in the same sample may be reversed.

In Additional Simulation 1, the population parameter value for the regression coefficient of the highest-order term X²Z was changed from +2 to −2. Everything else was kept identical to the original simulation condition in which N = 400 and the amount of variance in Y unexplained by the predictor terms X, X², Z, XZ, and X²Z was 225 (= 15²). The analysis model in Additional Simulation 1 included the covariate A. The results are summarized in Table 6. In the original simulation (upper panel of Table 6), the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors tended to produce unacceptable coverage rates of 95% confidence intervals for the simple slope of the effect of X on Y tested at M or M + 1SD of X and at M or M + 1SD of Z, and for the simple slope of the effect of Z on Y at M + 1SD of X, but acceptable coverage rates for the simple slope of the effect of X on Y tested at M − 1SD of X or at M − 1SD of Z, and for the simple slope of the effect of Z on Y at M − 1SD of X. In contrast, when the regression coefficient for the highest-order term X²Z changed in sign but not in magnitude (lower panel of Table 6), this approach tended to produce acceptable coverage rates of 95% confidence intervals for the simple slope of the effect of X on Y tested at M + 1SD of X or at M + 1SD of Z and for the simple slope of the effect of Z on Y at M + 1SD of X, but unacceptable coverage rates for the simple slope of the effect of X on Y tested at M − 1SD of X or at M − 1SD of Z and for the simple slope of the effect of Z on Y at M − 1SD of X. This pattern of results is in line with the prediction based on the derivations.

Table 6 Summary of results for Additional Simulation 1, with N = 400, and the amount of variance in Y unexplained by terms involving X or Z = 225

Additional Simulation 2 investigated the prediction, based on the mathematical derivation, that in the same sample, for two sample statistics of the same predictor that have expected values of the same sign and the same sampling variance (e.g., \( {Q}_Z^{(1)}={M}_Z-{SD}_Z \) and \( {Q}_Z^{(2)}={M}_Z+{SD}_Z \) for normally distributed Z), the absolute magnitude of the expected values of the sample statistics (i.e., \( E\left[{Q}_Z^{(1)}\right] \) vs. \( E\left[{Q}_Z^{(2)}\right] \)) will influence the extent of the fixed-versus-random issue. In Additional Simulation 2, the population mean of Z was increased from 0 to 3, so that μZ − σZ = 1 and μZ + σZ = 5 are now of the same sign. Everything else was kept the same as in the original simulation condition in which N = 400 and the amount of variance in Y unexplained by the predictor terms X, X², Z, XZ, and X²Z was 625 (= 25²). All analysis models included the covariate A. The results are summarized in Table 7. Although in Additional Simulation 2 μZ − σZ and μZ + σZ are now of the same sign (they were of opposite signs but equal absolute values in the original simulation), the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors still tended to produce unacceptable coverage rates of 95% confidence intervals for the simple slope of the effect of X on Y tested at M or M + 1SD of Z, but acceptable coverage rates for the simple slope of the effect of X on Y tested at M − 1SD of Z. This result is in line with the prediction based on the mathematical derivation that the absolute magnitude of the expected value of the sample statistic of the predictor at which the simple slope is probed can have an influence beyond the influence of its sign.

Table 7 Summary of results for Additional Simulation 2, with N = 400, and the amount of variance in Y unexplained by terms involving X or Z = 625

Johnson–Neyman confidence bands

The Johnson–Neyman technique applied to a linear-by-linear interaction identifies the range of values on a moderator at which the simple slope of the effect of the focal predictor on the outcome is significant (Preacher, Curran, & Bauer, 2006). Recent developments (e.g., J. W. Miller et al., 2013; B. O. Muthén et al., 2017) have extended the Johnson–Neyman technique to linear models with curvilinear effects. For a model with a linear-by-linear interaction or a simple quadratic effect, as described by Eq. 1 or 3, the Johnson–Neyman confidence band of the simple slope bypasses the need to explicitly select a conditional value of the moderator at which to test the simple slope: the researcher simply examines the Johnson–Neyman confidence band of the simple slope across the range of the moderator to see whether it includes zero at different fixed constant values of the moderator. However, with more complex interactions such as a quadratic-by-linear interaction, the use of the Johnson–Neyman technique requires selecting a conditional value of the moderator at which to examine the conditional Johnson–Neyman confidence band. Issues associated with the use of the conditional Johnson–Neyman confidence band at sample values of the statistics of the random moderator closely parallel those associated with the test of the simple slope at sample values of the statistics of the random predictors. When probing a significant quadratic-by-linear interaction term by plotting the conditional Johnson–Neyman confidence band of, say, the simple slope of the effect of X on Y across the range of X at a certain value of the random moderator Z, the typical practice of supplying a numeric value of the moderator to the computation of the Johnson–Neyman confidence band will be appropriate when either (a) a fixed sampling plan is used for the moderator or (b) fixed constant values of the moderator (e.g., Z = −2, 0, and 2) are chosen. However, when values on the moderator are randomly sampled and the researcher wishes to obtain conditional Johnson–Neyman confidence bands at certain statistics of the moderator (e.g., the mean and one standard deviation above and below the mean), treating the moderator as fixed (and using sample values of the statistics) versus random can result in conditional Johnson–Neyman confidence bands of different widths. Treating the moderator as fixed ignores the sampling variability of the sample statistics of the moderator used to establish the conditional Johnson–Neyman confidence bands.
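To make the fixed-framework computation concrete, the sketch below traces a conditional Johnson–Neyman band for the simple slope of X on Y (Eq. 7) at a supplied conditional value z0, reusing the illustrative model fit from the earlier sketches; setting z0 to a sample statistic such as MZ + SDZ reproduces exactly the practice critiqued here:

```r
# Pointwise confidence band for the simple slope of X on Y (Eq. 7) across a
# grid of X values, at a conditional value z0, under fixed regression.
jn_band <- function(fit, z0, x_grid, level = .95) {
  b <- coef(fit); V <- vcov(fit)
  tcrit <- qt(1 - (1 - level) / 2, df = df.residual(fit))
  t(sapply(x_grid, function(x) {
    cvec  <- c(0, 1, 2 * x, 0, z0, 2 * x * z0)  # weights on b0 ... b5
    slope <- sum(cvec * b)
    se    <- sqrt(drop(t(cvec) %*% V %*% cvec))
    c(x = x, slope = slope,
      lower = slope - tcrit * se, upper = slope + tcrit * se)
  }))
}

# Regions of X where the band excludes zero are where the simple slope is
# significant at the chosen z0 (here the sample statistic M_Z + 1 SD_Z).
band <- jn_band(fit, z0 = mean(dat$Z) + sd(dat$Z),
                x_grid = seq(-1, 1, by = .05))
```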

To illustrate, one replication from the small-scale simulation described above was used to compute conditional Johnson–Neyman confidence bands of the simple slope of the effect of X on Y described by Eq. 7 across a range of X values, at M – 1SD of Z, M of Z, and M + 1SD of Z, treating the random moderator Z as fixed versus random. These approaches were implemented in Mplus 8 (L. K. Muthén & Muthén, 1998–2018) using loop plots; the loop plot data were then exported from Mplus and combined in R to plot overlaid conditional Johnson–Neyman confidence bands (from fixed regression, random regression with percentile bootstrapping, and the fully Bayesian approach) using the R package ggplot2. The dataset chosen was the first replication in the simulation condition with N = 400 in which the amount of variance in Y unexplained by predictor terms involving X or Z was 225. The covariate A was always included in the model. When the random moderator was properly treated as random, the Johnson–Neyman confidence bands were obtained from (1) a random regression model using percentile bootstrapping (with 10,000 bootstrap samples) and (2) a fully Bayesian approach using the default noninformative priors in Mplus (with 100,000 draws from the posterior distribution).
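
A minimal sketch of the corresponding percentile-bootstrap band (again using the hypothetical objects from the sketches above, and with fewer bootstrap samples than the 10,000 used in the actual illustration) recomputes the mean and standard deviation of Z within each bootstrap sample, so that their sampling variability propagates into the band:

## Revised conditional band: percentile bootstrap treating Z as random
set.seed(11)
B <- 2000  # 10,000 bootstrap samples were used in the actual illustration
boot_slopes <- replicate(B, {
  db <- d[sample(nrow(d), replace = TRUE), ]
  fb <- lm(Y ~ X + I(X^2) + Z + X:Z + I(X^2):Z + A, data = db)
  b  <- coef(fb)
  z0 <- mean(db$Z) + sd(db$Z)  # re-estimated within each bootstrap sample
  b["X"] + 2 * b["I(X^2)"] * x_grid + b["X:Z"] * z0 +
    2 * b["I(X^2):Z"] * x_grid * z0
})
## Pointwise 2.5th and 97.5th percentiles across bootstrap samples
band_boot <- apply(boot_slopes, 1, quantile, probs = c(0.025, 0.975))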

Figure 1 displays the scatterplot of X on Z for this dataset. As can be seen in Fig. 1, at Z = – 2, 0, and 2 (corresponding to μZ − 1σZ, μZ, and μZ + 1σZ), X always had data spanning the range [– 1, 1], so the Johnson–Neyman confidence bands of the simple slope of the effect of X on Y at M – 1SD of Z, M of Z, and M + 1SD of Z were plotted across the range of – 1 to 1 on X.

Fig. 1 Scatterplot of X on Z, for Replication 1 from the simulation condition with N = 400 and the amount of variance in Y unexplained by predictor terms involving X or Z = 225

Figure 2 contains the conditional Johnson–Neyman confidence bands of the simple slope of the effect of X on Y across the range of – 1 to 1 on X at M + 1SD of Z, from (1) the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, (2) random regression with percentile bootstrapping, and (3) the fully Bayesian approach. Above the value of zero on X, the confidence bands from random regression with percentile bootstrapping and from the fully Bayesian approach were almost indistinguishable from one another and were always wider than the band from the fixed regression approach, especially at higher X values; this reflects a pattern similar to the one in the bottom panel of Table 4. Below zero on X, however, the confidence band from the fully Bayesian approach was almost indistinguishable from that generated by the fixed regression approach, and both were wider than the band from random regression with percentile bootstrapping. The patterns of results for the Johnson–Neyman confidence bands of the simple slope of the effect of X on Y at M and M – 1SD of Z are similar to those in Fig. 2, but with smaller discrepancies across methods; the corresponding plots are included in Supplemental Materials 2 and 3.

Fig. 2 Johnson–Neyman plot of confidence bands of the simple slope of Y on X at mean + 1 standard deviation of Z, from – 1.0 to 1.0 on X, for Replication 1 from the simulation condition with N = 400 and the amount of variance in Y unexplained by predictor terms involving X or Z = 225; the covariate A was included in the analysis model

Figure 2 shows that the fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors can produce conditional Johnson–Neyman confidence bands with shapes different from those of random regression with bootstrapping or the fully Bayesian approach in the dataset from this one replication. A key issue is whether the results from this one replication (model \(R^2\) = .584 and average \(\Delta R^2\) = .051 for \(X^2Z\) from the fixed regression model) are generalizable. Thus, for all 3,000 replications in this simulation condition, I examined the 95% CIs of the simple slopes of the effect of X on Y at fixed constant values of X = – 1, 0, and 1 and at sample statistics of Z = MZ – SDZ, MZ, and MZ + SDZ, from the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, from random regression with percentile bootstrapping, and from the fully Bayesian approach.

Table 8 summarizes the coverage rates of the 95% CIs of these simple slopes of the effect of X on Y, and Fig. 3 provides boxplots of the widths of the 95% CIs, from the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, random regression with percentile bootstrapping, and the fully Bayesian approach, at constant values of X and sample values of the statistics of Z. As can be seen in Fig. 3, the average widths of the 95% CIs are similar across methods at X = – 1, regardless of the sample statistic of Z at which the simple slope was probed; in these cases, the coverage rates of the 95% CIs are close to 95% for all three approaches (see the top panel of Table 8). At X = 0, and particularly at X = 1, however, the average width of the 95% CIs generated by the traditional fixed regression approach tends to be smaller than those generated by random regression with percentile bootstrapping and the fully Bayesian approach, with the latter two always similar to one another. At X = 1, the difference between the traditional fixed regression approach and the other two methods is greatest for the simple slope probed at Z = MZ + SDZ, followed by Z = MZ, and then Z = MZ – SDZ. Consistent with this pattern, the coverage rates of the 95% CIs of the simple slopes from the traditional fixed regression approach fell below .925 at X = 1 and Z = MZ or MZ + SDZ (see the bottom panel of Table 8).

Comparing Table 8 against the bottom panel of Table 4 reveals that, within the fixed regression framework, probing the simple slopes of a curvilinear-by-linear interaction at fixed constants of one random predictor and sample statistics of the other can result in better coverage rates of the 95% CIs than probing the simple slopes at sample statistics of both random predictors. Figure 3 also reveals that the widths of the 95% CIs of the simple slopes from random regression with percentile bootstrapping tend to have slightly greater variability than the widths from fixed regression and from the fully Bayesian approach. Nonetheless, random regression with percentile bootstrapping consistently produced 95% confidence intervals of the simple slope with acceptable coverage rates, as did the fully Bayesian approach. On the basis of the results in Fig. 3 and Table 8, when the revised conditional Johnson–Neyman confidence band from random regression with bootstrapping or from the fully Bayesian approach is wider than the band from the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, one of the former two approaches should be used.
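
As a toy illustration of how such coverage rates can be tallied (reusing the hypothetical simple_slope_fixed() from above; the generating model and values are illustrative, not those of the actual study), one can simulate replications, build the CI in each, and record whether it captures the population simple slope evaluated at the population values of the statistics:

## Toy coverage check at the fixed constant X = 1 and Z = M_Z + 1SD_Z
set.seed(2)
x0 <- 1
reps <- 500
covered <- logical(reps)
for (r in seq_len(reps)) {
  dr <- data.frame(X = rnorm(400), Z = rnorm(400), A = rnorm(400))
  dr$Y <- 1 + dr$X + 0.5 * dr$X^2 + dr$Z + 0.3 * dr$X * dr$Z +
    0.2 * dr$X^2 * dr$Z + dr$A + rnorm(400, sd = 15)
  fit_r <- lm(Y ~ X + I(X^2) + Z + X:Z + I(X^2):Z + A, data = dr)
  ci <- simple_slope_fixed(fit_r, x0 = x0, z0 = mean(dr$Z) + sd(dr$Z))
  ## population simple slope at mu_Z + sigma_Z = 0 + 1 = 1
  truth <- 1 + 2 * 0.5 * x0 + 0.3 * 1 + 2 * 0.2 * x0 * 1
  covered[r] <- ci["lower"] <= truth && truth <= ci["upper"]
}
mean(covered)  # proportion of the 95% CIs containing the population value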

Table 8 Summary of simple slope test results at constants of X and sample statistics of Z, for N = 400, and the amount of variance in Y unexplained by terms involving X or Z = 225

Fig. 3 Boxplot of the widths of 95% confidence/credible intervals of the simple slope of the effect of X on Y from the traditional fixed regression approach that probes the interaction at numeric values of sample statistics of the predictors, random regression with percentile bootstrapping, and the fully Bayesian approach, at constant values of X and sample statistics of Z, from the simulation condition with N = 400 and the amount of variance in Y unexplained by predictor terms involving X or Z = 225; the covariate A was included in the analysis model

Discussion

Curvilinear effects and curvilinear-by-linear interactions are often hypothesized, tested, and probed in social and behavioral research. When such effects are found, researchers have traditionally tested the simple slope of the effect of the focal predictor and, more recently, have utilized the Johnson–Neyman technique, often at sample values of the statistics of the predictor(s). Both approaches implicitly assume that values on the predictors have been sampled according to a fixed sampling plan, whereas in many social and behavioral studies values on the predictors are more appropriately considered randomly sampled. As was pointed out by Dawson (2014) and Liu et al. (2017), when substantively meaningful constant values are available on the predictors, researchers should ideally specify which fixed constant values would be of substantive interest and test simple slopes at those fixed constant values of the predictors. However, researchers often want to test the simple slope at values that reflect the relative standing of the predictor in the population. Moreover, it is not uncommon in social and behavioral research that the randomly sampled predictors have no substantively meaningful fixed constant values of interest, particularly for predictors created by aggregating across subdomains of a construct.

When researchers test the simple slope at values that reflect the relative standing of the predictors in the population, it is important to consider random regression. This manuscript showed analytically and through simulation that, for regression models containing a curvilinear-by-linear interaction, fixed and random regression models produce the same estimates but different standard errors of the simple slopes at sample values of the statistics of the moderator. When the predictor values are randomly sampled, treating the predictors as fixed in simple slope tests at sample values of the statistics of the predictors can lead to inflated Type I error rates and inaccurate 95% confidence intervals for the simple slopes, depending on a variety of factors. Extending Liu et al.'s (2017) consideration of linear-by-linear interactions, I showed the influence of some factors that can affect the probing of both linear-by-linear and curvilinear-by-linear interactions but were not discussed in Liu et al. These factors include (1) the total variance of the outcome (which can be influenced, for example, by obtaining a more homogeneous or heterogeneous sample) while holding constant the amount of variance explained by the predictor terms, and (2) whether a covariate that explains unique variability in the outcome is included in the analysis model. I also identified additional factors unique to the probing of curvilinear-by-linear interactions: the interplay between the signs of the expected values of the sample statistics of the random predictors at which the simple slope is probed and the sign of the regression coefficient for the highest-order term, and the absolute magnitudes of the expected values (of the same sign) of those sample statistics. Some of the factors discussed in this manuscript, particularly the inclusion of a covariate, are not obvious and emerged only from the mathematical derivation.

This manuscript also considered the use of conditional Johnson–Neyman confidence bands in probing curvilinear-by-linear interactions when the predictor values are randomly sampled, a use that still involves selecting a conditional value on the moderator at which to examine the band. Standard conditional Johnson–Neyman confidence bands created within the fixed regression framework at sample values of the statistics of the randomly sampled moderator can reflect incorrect standard errors. In contrast, revised conditional Johnson–Neyman confidence bands created by random regression with percentile bootstrapping or by the fully Bayesian approach can account for the sampling variability of the sample statistics of the random moderator. Thus, when using conditional Johnson–Neyman confidence bands to probe a significant curvilinear-by-linear interaction, the researcher should either choose a priori fixed constant values of the random moderator, or choose sample statistics of the random moderator but use random regression with bootstrapping or the fully Bayesian approach to account for the uncertainty associated with these sample statistics. Researchers using the Johnson–Neyman technique should also note that the estimated cut points of the region of significance are highly dependent on the sample size used in the study (Dawson, 2014) and cannot be expected to replicate if a replication study uses a different sample size.

Of the methods used in this manuscript, bootstrapping has the advantage of being simpler to implement, as bootstrapping routines are now available in most standard statistical packages. Also, with a little modification of the R code, bootstrapping allows easy examination of the simple slope at other sample statistics of interest (e.g., the 25th and 75th percentiles) of the randomly sampled predictors, as sketched below. A caution is warranted: In the simulations, at the small sample size of 100, for some combinations of sample statistics the proportion of replications in which the 95% percentile bootstrap confidence interval did not include the population value of the corresponding simple slope fell slightly asymmetrically on the two sides of the population value. This is likely because only values observed in the original sample can appear in the bootstrap samples, so bootstrapping can underestimate the density of the underlying distribution at locations with few observed data points (Efron & Tibshirani, 1994), an issue that is more pronounced at smaller sample sizes.
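
A sketch of the modification mentioned above (hypothetical, building on the bootstrap snippet shown earlier): swap the statistic that is re-estimated within each bootstrap sample, here the 75th percentiles of X and Z in place of the mean plus one standard deviation.

## Hypothetical modification: re-estimate the 75th percentiles of X and Z
## (rather than M + 1SD) within each bootstrap sample
set.seed(21)
slope_q75 <- replicate(2000, {
  db <- d[sample(nrow(d), replace = TRUE), ]
  fb <- lm(Y ~ X + I(X^2) + Z + X:Z + I(X^2):Z + A, data = db)
  b  <- coef(fb)
  x0 <- quantile(db$X, 0.75)
  z0 <- quantile(db$Z, 0.75)
  unname(b["X"] + 2 * b["I(X^2)"] * x0 + b["X:Z"] * z0 +
           2 * b["I(X^2):Z"] * x0 * z0)
})
quantile(slope_q75, c(0.025, 0.975))  # percentile bootstrap 95% CI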

Another way to account for the sampling variability of random predictors in probing significant curvilinear-by-linear interactions examined in the present article is the fully Bayesian approach, which views statistics of the predictors (e.g., means, variances, percentiles) as random variables with distributions. The simulations and the illustration relied on the default noninformative conditionally conjugate priors implemented in Mplus to maximize comparability across methods. These noninformative conditionally conjugate priors always led to acceptable coverage rates of the 95% HPD credible intervals in this study. Very large numbers of draws from the posterior distributions of the parameters were used to ensure accurate estimation of the limits of the credible intervals. Although specifying prior distributions is straightforward for the means and variances of random predictors that are assumed to follow a multivariate normal distribution, it is less so for, say, the 25th and 75th percentiles of the random predictors. Previous work on nonparametric Bayesian inference on quantiles (e.g., Bornn, Shephard, & Solgi, 2016) may provide some insights, but further study is needed.
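
The post-processing step of the fully Bayesian approach can be sketched as follows: each joint posterior draw of the regression coefficients and of the moderator's mean and standard deviation yields one draw of the simple slope, and the HPD interval is computed from those draws. The draws below are placeholders standing in for output from a fitted Bayesian model; all names and values are hypothetical.

## HPD credible interval for a simple slope from (placeholder) posterior draws
library(coda)
set.seed(31)
S <- 10000
draws <- data.frame(b1 = rnorm(S, 1.0, 0.10), b2 = rnorm(S, 0.5, 0.10),
                    b4 = rnorm(S, 0.3, 0.10), b5 = rnorm(S, 0.2, 0.10),
                    muZ = rnorm(S, 2.0, 0.05), sdZ = rnorm(S, 1.0, 0.04))
x0 <- 1
slope_draws <- with(draws,
  b1 + 2 * b2 * x0 + (b4 + 2 * b5 * x0) * (muZ + sdZ))
HPDinterval(as.mcmc(slope_draws), prob = 0.95)  # 95% HPD credible interval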

It is worth mentioning that the present article has focused on continuous outcome variables. If the range of the outcome variable is limited, models other than the regular multiple regression model should be used. For instance, if the outcome is an ordered categorical variable with four or fewer categories, logistic or multinomial logistic regression models should be used. If the outcome is a count variable, then appropriate models for the count outcome (e.g., Poisson regression, negative binomial regression, or zero-inflated or zero-truncated negative binomial regression; see Coxe, West, & Aiken, 2009) should be used. I expect the factors that influence the extent of the fixed-versus-random issue to be similar whether the outcome variable is continuous or bounded, provided the appropriate form of the model for the outcome (e.g., a model for a continuous, ordinal, or count outcome) is selected.
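
As one hypothetical illustration, a count outcome can carry the same curvilinear-by-linear specification inside a Poisson model, with simple slopes then defined on the scale of the linear predictor:

## Hypothetical count-outcome analogue of the same specification via Poisson
set.seed(41)
n <- 400
dc <- data.frame(X = rnorm(n), Z = rnorm(n), A = rnorm(n))
eta <- with(dc, 0.2 + 0.3 * X + 0.1 * X^2 + 0.1 * Z +
              0.05 * X * Z + 0.05 * X^2 * Z + 0.1 * A)
dc$Ycnt <- rpois(n, lambda = exp(eta))
fit_pois <- glm(Ycnt ~ X + I(X^2) + Z + X:Z + I(X^2):Z + A,
                family = poisson, data = dc)
## On the log-mean scale the simple slope of X at (x0, z0) keeps the same
## algebraic form: b1 + 2*b2*x0 + b4*z0 + 2*b5*x0*z0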