Abstract
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
Notes
The LS estimates of γ and the population values of σ are used as starting values. Convergence is declared when, for every parameter, the difference between two consecutive iterations is smaller than 0.0001, and this must be reached within 300 iterations. As we shall see, nonconvergence happens mostly with smaller sample sizes together with stochastic predictors and/or nonnormally distributed errors.
The web folder http://www3.nd.edu/~kyuan/moderation/ also contains a SAS IML program (NML.sas), which performs essentially the same function as the R package.
References
Aguinis, H. (2004). Regression analysis for categorical moderators. New York: Guilford.
Aguinis, H., Petersen, S.A., & Pierce, C.A. (1999). Appraisal of the homogeneity of error variance assumption and alternatives for multiple regression for estimating moderated effects of categorical variables. Organizational Research Methods, 2, 315–339.
Aiken, L.S., & West, S.G. (1991). Multiple regression: testing and interpreting interactions. Thousand Oaks: Sage.
Baron, R.M., & Kenny, D.A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
Bast, J., & Reitsma, P. (1998). Analyzing the development of individual differences in terms of Matthew effects in reading: results from a Dutch longitudinal study. Developmental Psychology, 34, 1373–1399.
Carroll, R.J., & Ruppert, D. (1988). Transformation and weighting in regression. New York: Chapman & Hall/CRC.
Casella, G., & Berger, R.L. (2002). Statistical inference (2nd ed.). Pacific Grove: Duxbury Press.
Chaplin, W.F. (2007). Moderator and mediator models in personality research: a basic introduction. In R.W. Robins, R.C. Fraley, & R.F. Krueger (Eds.), Handbook of research methods in personality psychology (pp. 602–632). New York: Guilford.
Cohen, J. (1978). Partialed products are interactions; partialed powers are curve components. Psychological Bulletin, 85, 858–866.
Cribari-Neto, F. (2004). Asymptotic inference under heteroskedasticity of unknown form. Computational Statistics & Data Analysis, 45, 215–233.
Darlington, R.B. (1990). Regression and linear models. New York: McGraw-Hill.
Davidson, R., & MacKinnon, J.G. (1993). Estimation and inference in econometrics. Oxford: Oxford University Press.
Davison, M.L., Kwak, N., Seo, Y.S., & Choi, J. (2002). Using hierarchical linear models to examine moderator effects: person-by-organization interactions. Organizational Research Methods, 5, 231–254.
Dent, W.T., & Hildreth, C. (1977). Maximum likelihood estimation in random coefficient models. Journal of the American Statistical Association, 72, 69–72.
DeShon, R.P., & Alexander, R.A. (1996). Alternative procedures for testing regression slope homogeneity when group error variances are unequal. Psychological Methods, 1, 261–277.
Dretzke, B.J., Levin, J.R., & Serlin, R.C. (1982). Testing for regression homogeneity under variance heterogeneity. Psychological Bulletin, 91, 376–383.
Efron, B., & Tibshirani, R.J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.
Fisicaro, S.A., & Tisak, J. (1994). A theoretical note on the stochastics of moderated multiple regression. Educational and Psychological Measurement, 54, 32–41.
Froehlich, B.R. (1973). Some estimators for a random coefficient regression model. Journal of the American Statistical Association, 68, 329–335.
Hayes, A.F., & Cai, L. (2007). Using heteroscedasticity-consistent standard error estimators in OLS regression: an introduction and software implementation. Behavior Research Methods, 39, 709–722.
Hinkley, D.V. (1977). Jackknifing in unbalanced situations. Technometrics, 19, 285–292.
Hildreth, C., & Houck, J. (1968). Some estimators for a linear model with random coefficients. Journal of the American Statistical Association, 63, 584–595.
Holmbeck, G.N. (1997). Toward terminological, conceptual and statistical clarity in the study of mediators and moderators: examples from the child-clinical and pediatric psychology literatures. Journal of Consulting and Clinical Psychology, 65, 599–610.
Kenny, D., & Judd, C.M. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin, 96, 201–210.
Littell, R., Milliken, G., Stroup, W., Wolfinger, R., & Schabenberger, O. (2006). SAS for mixed models (2nd ed.). Cary: SAS Institute.
Long, J.S., & Ervin, L.H. (2000). Using heteroscedasticity consistent standard errors in the linear regression model. American Statistician, 54, 217–224.
MacKinnon, J.G., & White, H. (1985). Some heteroskedasticity consistent covariance matrix estimators with improved finite sample properties. Journal of Econometrics, 29, 305–325.
Marsh, H.W., Wen, Z., & Hau, K.-T. (2004). Structural equation models of latent interactions: evaluation of alternative estimation strategies and indicator construction. Psychological Methods, 9, 275–300.
Nelson, E.A., & Dannefer, D. (1992). Aged heterogeneity: fact or fiction? The fate of diversity in gerontological research. The Gerontologist, 32, 17–23.
Newsom, J.T., Prigerson, H.G., Schulz, R., & Reynolds, C.F. (2003). Investigating moderator hypotheses in aging research: statistical, methodological, and conceptual difficulties with comparing separate regressions. The International Journal of Aging & Human Development, 57, 119–150.
Ng, M., & Wilcox, R.R. (2010). Comparing the regression slopes of independent groups. British Journal of Mathematical & Statistical Psychology, 63, 319–340.
Overton, R.C. (2001). Moderated multiple regression for interactions involving categorical variables: a statistical control for heterogeneous variance across two groups. Psychological Methods, 6, 218–233.
Ping, R.A. (1996). Latent variable interaction and quadratic effect estimation: a two-step technique using structural equation analysis. Psychological Bulletin, 119, 166–175.
Preacher, K.J., & Merkle, E.C. (2012). The problem of model selection uncertainty in structural equation modeling. Psychological Methods, 17, 1–14.
Shieh, G. (2009). Detection of interactions between a dichotomous moderator and a continuous predictor in moderated multiple regression with heterogeneous error variance. Behavior Research Methods, 41, 61–74.
Singh, B., Nagar, A.L., Choudhry, N.K., & Raj, B. (1976). On the estimation of structural change: a generalization of the random coefficients regression model. International Economic Review, 17, 340–361.
Tang, W., Yu, Q., Crits-Christoph, P., & Tu, X.M. (2009). A new analytic framework for moderation analysis–moving beyond analytic interactions. Journal of Data Science, 7, 313–329.
Weisberg, S. (1980). Applied linear regression. New York: Wiley.
White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48, 817–838.
Wooldridge, J.M. (2010). Econometric analysis of cross section and panel data (2nd ed.). Cambridge: MIT Press.
Wu, C.F.J. (1986). Jackknife, bootstrap and other resampling methods in regression analysis. The Annals of Statistics, 14, 1261–1295.
Yuan, K.-H., & Bentler, P.M. (1997). Improving parameter tests in covariance structure analysis. Computational Statistics & Data Analysis, 26, 177–198.
Yuan, K.-H., & Bentler, P.M. (2010). Finite normal mixture SEM analysis by fitting multiple conventional SEM models. Sociological Methodology, 40, 191–245.
Acknowledgement
The research of Ke-Hai Yuan was partially supported by a grant from the National Natural Science Foundation of China (31271116).
Appendices
Appendix A. Iteratively Reweighted Least Squares (IRLS) Algorithm for NMLEs
This Appendix contains the development of the IRLS algorithm that maximizes the likelihood function l(θ) in (8). Setting the partial derivatives of l(θ) with respect to γ and σ to zero yields the normal estimating equations
and
We need to solve (A.1) and (A.2) for the NMLEs of γ and σ. If the \(\tau_{i}^{2}\)s are known, then the solution to (A.1) is
and that to (A.2) is
Although the \(\tau_{i}^{2}\)s are unknown in practice, they can be estimated as \({\bf h}_{i}'\hat{\boldsymbol{\sigma}}\) once a \(\hat{\boldsymbol{\sigma}}\) is available. Thus, Equations (A.1) and (A.2) can be solved by the following IRLS algorithm:

(S1) With an initial value of σ, obtain each \(\tau_{i}^{2}={\bf h}_{i}'\boldsymbol{\sigma}\), i=1,2,…,n.
(S2) Obtain \(\hat{\boldsymbol{\gamma}}\) by (A.3) and \(\hat{\boldsymbol{\sigma}}\) by (A.4).
(S3) Update the initial σ by \(\hat{\boldsymbol{\sigma}}\) in (S2) and go back to (S1).
(S4) Continue (S1) to (S3) until \(\hat{\boldsymbol{\gamma}}\) and \(\hat{\boldsymbol{\sigma}}\) stabilize.
The converged solutions are the NMLEs of γ and σ.
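Steps (S1) to (S4) can be sketched in R as follows. This is only an illustration, not the code in NML.R: the function name irls_nml and the matrices W (rows \({\bf w}_{i}'\), the regressors including product terms) and H (rows \({\bf h}_{i}'\)) are hypothetical, and the two updates are the usual weighted least-squares forms, assumed here to match (A.3) and (A.4): \(\hat{\boldsymbol{\gamma}}\) regresses \(y_{i}\) on \({\bf w}_{i}\) with weights \(1/\tau_{i}^{2}\), and \(\hat{\boldsymbol{\sigma}}\) regresses the squared residuals on \({\bf h}_{i}\) with weights \(1/\tau_{i}^{4}\). The tolerance and iteration limit follow the convergence criterion stated in the Notes.

```r
# Hypothetical helper (not part of NML.R): IRLS for y_i = w_i' gamma + delta_i,
# where Var(delta_i) = tau_i^2 = h_i' sigma.
irls_nml <- function(y, W, H, sigma0, tol = 1e-4, max_iter = 300) {
  sigma <- sigma0
  gamma <- rep(0, ncol(W))
  for (iter in seq_len(max_iter)) {
    tau2 <- as.vector(H %*% sigma)                       # (S1) current variances
    if (any(tau2 <= 0)) stop("non-positive variance; try other starting values")
    # (S2) weighted LS for gamma, weights 1 / tau_i^2 (assumed form of A.3)
    gamma_new <- solve(crossprod(W, W / tau2), crossprod(W, y / tau2))
    # (S2) weighted LS of squared residuals on h_i, weights 1 / tau_i^4 (assumed form of A.4)
    r2 <- as.vector(y - W %*% gamma_new)^2
    sigma_new <- solve(crossprod(H, H / tau2^2), crossprod(H, r2 / tau2^2))
    change <- max(abs(c(gamma_new - gamma, sigma_new - sigma)))
    gamma <- as.vector(gamma_new)                        # (S3) update and repeat
    sigma <- as.vector(sigma_new)
    if (change < tol) {                                  # (S4) estimates have stabilized
      return(list(gamma = gamma, sigma = sigma, iter = iter))
    }
  }
  stop("no convergence within max_iter iterations")
}

# Simulated illustration with one predictor x and one moderator u
set.seed(1)
n <- 2000
x <- rnorm(n)
u <- rnorm(n)
W <- cbind(1, u, x, x * u)
H <- cbind(1, 2 * x, x^2)          # sigma = (sigma_0e^2, sigma_01, sigma_11)
sigma_true <- c(1, 0.2, 0.5)
y <- as.vector(W %*% c(1, 0.5, 1, 0.3)) +
  rnorm(n, sd = sqrt(as.vector(H %*% sigma_true)))
fit <- irls_nml(y, W, H, sigma0 = c(var(y), 0, 0))
```

The starting value c(var(y), 0, 0) makes the first pass an ordinary LS fit, which is one simple way to begin from the LS estimates of γ.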
Appendix B. Two-Level Regression with p Predictors and m Moderators
This Appendix contains the details of extending the simple two-level regression model in (4) to cases with p predictors and m moderators. With p predictors, the counterpart of the first regression equation in (4) is
where \(\mathbf{c}_{i}'=(1,x_{i1},x_{i2},\ldots,x_{ip})\) and \(\boldsymbol{\beta}_{i}=(\beta_{i0},\beta_{i1},\beta_{i2},\ldots,\beta_{ip})'\). The counterparts of the second and third regression equations in (4) are
Let \({\bf u}_{i}=(1,u_{i1},u_{i2},\ldots,u_{im})'\), \(\boldsymbol{\varepsilon}_{i}=(\varepsilon_{i0},\varepsilon_{i1},\varepsilon_{i2},\ldots,\varepsilon_{ip})'\),
We can rewrite (B.2) in matrix form as
The counterpart of Equation (5) is obtained by putting (B.3) into (B.1),
where \(\delta_{i}=\mathbf {c}_{i}'\boldsymbol {\varepsilon }_{i}+e_{i}\). Let
Then
Similarly, the parameters \(\sigma_{00}\) and \(\sigma_{e}^{2}\) are not distinguishable in (B.5) because the data do not have a nested structure.
As the counterpart of (16),
is the estimate of the percentage of variance of \(\beta_{ij}\) accounted for by the m moderators, where \(\hat{\boldsymbol{\gamma}}_{j}=(\hat{\gamma}_{j1},\hat{\gamma}_{j2},\ldots,\hat{\gamma}_{jm})'\), j=1,2,…,p; and \({\bf S}_{uu}\) is the sample covariance matrix of \({\bf u}_{i}=(u_{i1},u_{i2},\ldots,u_{im})'\), i=1,2,…,n.
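As an illustration of this quantity, the sketch below assumes the estimate takes the form \(\hat{\boldsymbol{\gamma}}_{j}'{\bf S}_{uu}\hat{\boldsymbol{\gamma}}_{j}/(\hat{\boldsymbol{\gamma}}_{j}'{\bf S}_{uu}\hat{\boldsymbol{\gamma}}_{j}+\hat{\sigma}_{jj})\), that is, the variance of \(\beta_{ij}\) predicted by the moderators divided by that variance plus the residual variance \(\hat{\sigma}_{jj}\). This form is an assumption inferred from the definitions above, the helper r2_coef is hypothetical, and the numbers are made up for the example.

```r
# Hypothetical helper: share of Var(beta_ij) attributed to the moderators,
# under the assumed form explained / (explained + residual).
r2_coef <- function(gamma_j, S_uu, sigma_jj) {
  explained <- as.numeric(t(gamma_j) %*% S_uu %*% gamma_j)
  explained / (explained + sigma_jj)
}

# Illustrative numbers only (not estimates from the paper)
S_uu <- matrix(c(1.0, 0.2,
                 0.2, 1.0), nrow = 2, byrow = TRUE)
gamma_1 <- c(0.3, 0.4)   # slopes of beta_i1 on u_i1 and u_i2, intercept excluded
r2_1 <- r2_coef(gamma_1, S_uu, sigma_jj = 0.5)
```

Note that the intercept is excluded from \(\hat{\boldsymbol{\gamma}}_{j}\) here, matching the definition of \({\bf u}_{i}\) used for \({\bf S}_{uu}\).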
Appendix C. R Package
This Appendix introduces an R package to perform NML estimation of the two-level regression model. The package can be downloaded at http://www3.nd.edu/~kyuan/moderation/NML.R. A simulated data set with 6 variables \((y,x_{1},x_{2},u_{1},u_{2},u_{3})\) and 500 cases is used to illustrate the use of the package, and the data set can be downloaded at http://www3.nd.edu/~kyuan/moderation/simudata.dat. Both of these files are saved in the folder d:/moderation/ in this illustration with names NML.R and simudata.dat, respectively.
The code for running the package and its utilities is documented in Appendix D. The first three lines of the code change the working directory, load the package into the R console, and read the data, respectively. Lines 4 to 10 of Appendix D identify the number of cases, the dependent variable (y), the possible level-1 predictors (x1, x2), and the possible level-2 predictors or moderators (u1, u2, u3). In this package, the level-1 and level-2 intercepts are regarded as the regression coefficients corresponding to the predictors \(x_{i0}=1\) and \(u_{i0}=1\), respectively. These are created in lines 11 and 12, where the labels or column names serve to identify the parameter estimates corresponding to the intercepts.
Lines 16 to 20 specify the level-1 and level-2 regression models. In the example, the level-1 model specified by L1=cbind(x0,x1,x2) has three coefficients, \(\beta_{i0}\), \(\beta_{i1}\), and \(\beta_{i2}\), corresponding to the coefficients of \(x_{i0}=1\), \(x_{i1}\), and \(x_{i2}\), respectively. Lines 17 to 19 specify the level-2 models corresponding to each of the level-1 coefficients. L20=cbind(u0,u1,u2) assigns three predictors to \(\beta_{i0}\): \(u_{i0}=1\), \(u_{i1}\), and \(u_{i2}\); L21=cbind(u0,u2,u3) assigns three predictors to \(\beta_{i1}\): \(u_{i0}=1\), \(u_{i2}\), and \(u_{i3}\); and L22=cbind(u0,u1,u2,u3) assigns four predictors to \(\beta_{i2}\): \(u_{i0}=1\), \(u_{i1}\), \(u_{i2}\), and \(u_{i3}\). Line 20 in Appendix D puts all the level-2 predictors together to pass to the package. Notice that the specification of level-2 predictors must correspond to the level-1 predictors. For example, if you decide to leave out the intercept in level-1, then the specification for level-1 and level-2 predictors becomes
L1=cbind(x1,x2);          # level-1 predictors
L21=cbind(u0,u2,u3);      # level-2 predictors for beta_i1
L22=cbind(u0,u1,u2,u3);   # level-2 predictors for beta_i2
L2=list(L21,L22);         # all level-2 predictors
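To see why the level-2 list must line up with the level-1 columns, the following sketch builds the product-term regressors that such a specification implies, pairing column j of L1 with element j of L2. The helper build_design is hypothetical and is not a function of NML.R. The data are simulated, and the column names follow the xjuk labeling used in the package output.

```r
# Hypothetical helper: product terms x_ij * u_ik implied by a level-1 matrix L1
# and a list L2 holding one level-2 predictor matrix per level-1 coefficient.
build_design <- function(L1, L2) {
  stopifnot(ncol(L1) == length(L2))   # one level-2 model per level-1 predictor
  cols <- lapply(seq_len(ncol(L1)), function(j) {
    Z <- L1[, j] * L2[[j]]            # x_ij times each moderator of beta_ij
    colnames(Z) <- paste0(colnames(L1)[j], colnames(L2[[j]]))
    Z
  })
  do.call(cbind, cols)
}

# Simulated inputs mirroring the no-intercept specification above
set.seed(2)
n <- 5
x1 <- rnorm(n)
x2 <- rnorm(n)
u0 <- rep(1, n)
u1 <- rnorm(n)
u2 <- rnorm(n)
u3 <- rnorm(n)
L1 <- cbind(x1, x2)
L2 <- list(cbind(u0, u2, u3), cbind(u0, u1, u2, u3))
W <- build_design(L1, L2)
colnames(W)
```

If the lengths do not match, the stopifnot call fails immediately, which is the kind of mismatch the text warns about.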
Lines 24 to 26 in Appendix D set up an H matrix, corresponding to \({\bf h}_{i}\) in Equation (7), that contains the predictors for the variance parameters in σ. Line 24 requests that six variance parameters be estimated in the order \(\sigma_{11}=\mathrm{Var}(\varepsilon_{i1})\), \(\sigma_{22}=\mathrm{Var}(\varepsilon_{i2})\), \(\sigma_{12}=\mathrm{Cov}(\varepsilon_{i1},\varepsilon_{i2})\), \(\sigma_{01}=\mathrm{Cov}(\varepsilon_{i0},\varepsilon_{i1})\), \(\sigma_{02}=\mathrm{Cov}(\varepsilon_{i0},\varepsilon_{i2})\), and \(\sigma_{0e}^{2}=\mathrm{Var}(\varepsilon_{i0})+\mathrm{Var}(e_{i})\). If one chooses not to let the prediction errors at level-2 covary, then line 24 needs to be set as H_mat=cbind(x1*x1, x2*x2, 1), which typically results in smaller SEs for the variance estimates. Lines 25 and 26 label the variance estimates so that they are correctly identified in the output. For example, if the level-2 prediction errors are not allowed to covary, then line 25 should be set as H_name=cbind("x1x1", "x2x2", "x0e"). In particular, the labels x1x1 and x2x2 are used to identify the proper variance estimates for calculating R-squares in the package. We do not encourage users to change the notation used in labeling.
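The six-parameter case can be illustrated with a small check. The sketch assumes that the variance of the combined error is \(\mathbf{c}_{i}'\boldsymbol{\Sigma}\mathbf{c}_{i}+\sigma_{e}^{2}\) with \(\mathbf{c}_{i}=(1,x_{i1},x_{i2})'\) and \(\boldsymbol{\Sigma}\) the covariance matrix of \((\varepsilon_{i0},\varepsilon_{i1},\varepsilon_{i2})\), so that each covariance enters with a factor of 2. The exact form of line 24 is not shown in this section, so the factor of 2 and the values below are assumptions made for illustration only.

```r
set.seed(3)
n <- 4
x1 <- rnorm(n)
x2 <- rnorm(n)

# Columns ordered as sigma_11, sigma_22, sigma_12, sigma_01, sigma_02, sigma_0e^2
# (the factor 2 on the covariance columns is an assumption of this sketch)
H_mat <- cbind(x1 * x1, x2 * x2, 2 * x1 * x2, 2 * x1, 2 * x2, 1)

# Illustrative covariance matrix of (eps_i0, eps_i1, eps_i2) and error variance
Sigma <- matrix(c(1.0, 0.1, 0.2,
                  0.1, 0.5, 0.1,
                  0.2, 0.1, 0.4), nrow = 3, byrow = TRUE)
sigma_e2 <- 0.8
sigma <- c(Sigma[2, 2], Sigma[3, 3], Sigma[2, 3],
           Sigma[1, 2], Sigma[1, 3], Sigma[1, 1] + sigma_e2)

# tau_i^2 computed from h_i' sigma and directly from c_i' Sigma c_i + sigma_e^2
tau2_from_H <- as.vector(H_mat %*% sigma)
C <- cbind(1, x1, x2)
tau2_direct <- rowSums((C %*% Sigma) * C) + sigma_e2
all.equal(tau2_from_H, tau2_direct)
```

The last column of H_mat is constant, which is why only the sum \(\mathrm{Var}(\varepsilon_{i0})+\mathrm{Var}(e_{i})\) can be estimated rather than its two parts.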
Line 28 runs the package, and the output of running Appendix D is in Appendix E. In addition to the results of NML for the two-level model, the default output also contains the results of LS analysis for the corresponding MMR model, where xjuk corresponds to the level-1 coefficient of xj predicted by the level-2 predictor uk, e.g., x0u0 corresponds to the intercept \(\gamma_{00}\) or the level-1 coefficient of x0 predicted by the level-2 predictor u0. For the LS estimates, the default output contains \(SE_{ls}\), \(SE_{sw0}\), \(SE_{sw3}\), \(SE_{sw4}\), and the corresponding z-scores. The results corresponding to \(SE_{sw0}\) are not in Appendix E because there is not enough horizontal space. We choose to output these four sets of SEs because the \(SE_{ls}\)s are the default for LS analysis, the \(SE_{sw0}\)s are the most commonly used consistent SEs in software, the \(SE_{sw3}\)s perform the best with smaller sample sizes according to Long and Ervin (2000), and the \(SE_{sw4}\)s are the most reliable when there are high-leverage observations (Cribari-Neto 2004).
The labels for the results of NML are the same as those for the results of LS. The default output of NML also contains \(SE_{sw0}\)s and the corresponding z-scores, which are not included in Appendix E due to space limitations.
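For readers who want to see how the four sets of LS standard errors differ, the sketch below computes them in base R. It assumes that \(SE_{sw0}\), \(SE_{sw3}\), and \(SE_{sw4}\) correspond to the sandwich-type estimators commonly labeled HC0, HC3, and HC4 in the cited literature. The helper ls_with_hc and the simulated data are hypothetical and do not reproduce the package's own code or output.

```r
# Hypothetical helper: LS estimates with conventional and sandwich-type SEs.
ls_with_hc <- function(y, X) {
  n <- nrow(X)
  p <- ncol(X)
  XtX_inv <- solve(crossprod(X))
  b <- as.vector(XtX_inv %*% crossprod(X, y))
  e <- as.vector(y - X %*% b)                  # LS residuals
  h <- rowSums((X %*% XtX_inv) * X)            # leverages h_ii
  sandwich <- function(w) {                    # (X'X)^-1 X' diag(w) X (X'X)^-1
    sqrt(diag(XtX_inv %*% crossprod(X, X * w) %*% XtX_inv))
  }
  d <- pmin(4, n * h / p)                      # HC4 exponent for each case
  list(b = b,
       se_ls  = sqrt(diag(XtX_inv) * sum(e^2) / (n - p)),
       se_hc0 = sandwich(e^2),
       se_hc3 = sandwich(e^2 / (1 - h)^2),
       se_hc4 = sandwich(e^2 / (1 - h)^d))
}

# Simulated MMR example with heteroscedastic errors
set.seed(4)
n <- 200
x <- rnorm(n)
u <- rnorm(n)
X <- cbind(x0u0 = 1, x0u1 = u, x1u0 = x, x1u1 = x * u)
y <- as.vector(X %*% c(1, 0.5, 1, 0.3)) + rnorm(n, sd = sqrt(1 + 0.5 * x^2))
out <- ls_with_hc(y, X)
z_hc3 <- out$b / out$se_hc3                    # z-scores based on HC3
```

HC3 inflates each squared residual by \(1/(1-h_{ii})^{2}\), so its SEs are never smaller than those of HC0 for the same fit.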
Appendix D. R Code for NML Estimation for the Model Described in Appendix B
Appendix E. The Output of Running the Example in Appendix C
Cite this article
Yuan, K.-H., Cheng, Y. & Maxwell, S. Moderation Analysis Using a Two-Level Regression Model. Psychometrika 79, 701–732 (2014). https://doi.org/10.1007/s11336-013-9357-x