Moderation Analysis Using a Two-Level Regression Model

Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

doi:10.1007/s11336-013-9357-x


Published: 12 December 2013

Psychometrika, Volume 79, pages 701–732 (2014)

Ke-Hai Yuan¹,
Ying Cheng¹ &
Scott Maxwell¹


Abstract

Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.


Notes

Another implicit assumption in Equations (1) to (5) is that $x_i$ and $u_i$ do not contain measurement errors. We will discuss measurement errors in predictors in the concluding section.
The LS estimates of γ and the population values of σ are used as starting values. Convergence is declared when the change in every parameter between two consecutive iterations is smaller than 0.0001, and this must happen within 300 iterations. As we shall see, nonconvergence happens mostly with smaller sample sizes together with stochastic predictors and/or nonnormally distributed errors.
The web folder http://www3.nd.edu/~kyuan/moderation/ also contains a SAS IML program (NML.sas), which performs essentially the same function as the R package.

References

Aguinis, H. (2004). Regression analysis for categorical moderators. New York: Guilford.
Aguinis, H., Petersen, S.A., & Pierce, C.A. (1999). Appraisal of the homogeneity of error variance assumption and alternatives for multiple regression for estimating moderated effects of categorical variables. Organizational Research Methods, 2, 315–339.
Aiken, L.S., & West, S.G. (1991). Multiple regression: testing and interpreting interactions. Thousand Oaks: Sage.
Baron, R.M., & Kenny, D.A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
Bast, J., & Reitsma, P. (1998). Analyzing the development of individual differences in terms of Matthew effects in reading: results from a Dutch longitudinal study. Developmental Psychology, 34, 1373–1399.
Carroll, R.J., & Ruppert, D. (1988). Transformation and weighting in regression. New York: Chapman & Hall/CRC.
Casella, G., & Berger, R.L. (2002). Statistical inference (2nd ed.). Pacific Grove: Duxbury Press.
Chaplin, W.F. (2007). Moderator and mediator models in personality research: a basic introduction. In R.W. Robins, R.C. Fraley, & R.F. Krueger (Eds.), Handbook of research methods in personality psychology (pp. 602–632). New York: Guilford.
Cohen, J. (1978). Partialed products are interactions; partialed powers are curve components. Psychological Bulletin, 85, 858–866.
Cribari-Neto, F. (2004). Asymptotic inference under heteroskedasticity of unknown form. Computational Statistics & Data Analysis, 45, 215–233.
Darlington, R.B. (1990). Regression and linear models. New York: McGraw-Hill.
Davidson, R., & MacKinnon, J.G. (1993). Estimation and inference in econometrics. Oxford: Oxford University Press.
Davison, M.L., Kwak, N., Seo, Y.S., & Choi, J. (2002). Using hierarchical linear models to examine moderator effects: person-by-organization interactions. Organizational Research Methods, 5, 231–254.
Dent, W.T., & Hildreth, C. (1977). Maximum likelihood estimation in random coefficient models. Journal of the American Statistical Association, 72, 69–72.
DeShon, R.P., & Alexander, R.A. (1996). Alternative procedures for testing regression slope homogeneity when group error variances are unequal. Psychological Methods, 1, 261–277.
Dretzke, B.J., Levin, J.R., & Serlin, R.C. (1982). Testing for regression homogeneity under variance heterogeneity. Psychological Bulletin, 91, 376–383.
Efron, B., & Tibshirani, R.J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.
Fisicaro, S.A., & Tisak, J. (1994). A theoretical note on the stochastics of moderated multiple regression. Educational and Psychological Measurement, 54, 32–41.
Froehlich, B.R. (1973). Some estimators for a random coefficient regression model. Journal of the American Statistical Association, 68, 329–335.
Hayes, A.F., & Cai, L. (2007). Using heteroscedasticity-consistent standard error estimators in OLS regression: an introduction and software implementation. Behavior Research Methods, 39, 709–722.
Hinkley, D.V. (1977). Jackknifing in unbalanced situations. Technometrics, 19, 285–292.
Hildreth, C., & Houck, J. (1968). Some estimators for a linear model with random coefficients. Journal of the American Statistical Association, 63, 584–595.
Holmbeck, G.N. (1997). Toward terminological, conceptual and statistical clarity in the study of mediators and moderators: examples from the child-clinical and pediatric psychology literatures. Journal of Consulting and Clinical Psychology, 65, 599–610.
Kenny, D., & Judd, C.M. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin, 96, 201–210.
Littell, R., Milliken, G., Stroup, W., Wolfinger, R., & Schabenberger, O. (2006). SAS for mixed models (2nd ed.). Cary: SAS Institute.
Long, J.S., & Ervin, L.H. (2000). Using heteroscedasticity consistent standard errors in the linear regression model. American Statistician, 54, 217–224.
MacKinnon, J.G., & White, H. (1985). Some heteroskedasticity consistent covariance matrix estimators with improved finite sample properties. Journal of Econometrics, 29, 305–325.
Marsh, H.W., Wen, Z., & Hau, K.-T. (2004). Structural equation models of latent interactions: evaluation of alternative estimation strategies and indicator construction. Psychological Methods, 9, 275–300.
Nelson, E.A., & Dannefer, D. (1992). Aged heterogeneity: fact or fiction? The fate of diversity in gerontological research. The Gerontologist, 32, 17–23.
Newsom, J.T., Prigerson, H.G., Schulz, R., & Reynolds, C.F. (2003). Investigating moderator hypotheses in aging research: statistical, methodological, and conceptual difficulties with comparing separate regressions. The International Journal of Aging & Human Development, 57, 119–150.
Ng, M., & Wilcox, R.R. (2010). Comparing the regression slopes of independent groups. British Journal of Mathematical & Statistical Psychology, 63, 319–340.
Overton, R.C. (2001). Moderated multiple regression for interactions involving categorical variables: a statistical control for heterogeneous variance across two groups. Psychological Methods, 6, 218–233.
Ping, R.A. (1996). Latent variable interaction and quadratic effect estimation: a two-step technique using structural equation analysis. Psychological Bulletin, 119, 166–175.
Preacher, K.J., & Merkle, E.C. (2012). The problem of model selection uncertainty in structural equation modeling. Psychological Methods, 17, 1–14.
Shieh, G. (2009). Detection of interactions between a dichotomous moderator and a continuous predictor in moderated multiple regression with heterogeneous error variance. Behavior Research Methods, 41, 61–74.
Singh, B., Nagar, A.L., Choudhry, N.K., & Raj, B. (1976). On the estimation of structural change: a generalization of the random coefficients regression model. International Economic Review, 17, 340–361.
Tang, W., Yu, Q., Crits-Christoph, P., & Tu, X.M. (2009). A new analytic framework for moderation analysis–moving beyond analytic interactions. Journal of Data Science, 7, 313–329.
Weisberg, S. (1980). Applied linear regression. New York: Wiley.
White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48, 817–838.
Wooldridge, J.M. (2010). Econometric analysis of cross section and panel data (2nd ed.). Cambridge: MIT Press.
Wu, C.F.J. (1986). Jackknife, bootstrap and other resampling methods in regression analysis. The Annals of Statistics, 14, 1261–1295.
Yuan, K.-H., & Bentler, P.M. (1997). Improving parameter tests in covariance structure analysis. Computational Statistics & Data Analysis, 26, 177–198.
Yuan, K.-H., & Bentler, P.M. (2010). Finite normal mixture SEM analysis by fitting multiple conventional SEM models. Sociological Methodology, 40, 191–245.

Acknowledgement

The research of Ke-Hai Yuan was partially supported by a grant from the National Natural Science Foundation of China (31271116).

Author information

Authors and Affiliations

Department of Psychology, University of Notre Dame, Notre Dame, IN, 46556, USA
Ke-Hai Yuan, Ying Cheng & Scott Maxwell


Corresponding author

Correspondence to Ke-Hai Yuan.

Appendices

Appendix A. Iteratively Reweighted Least Squares (IRLS) Algorithm for NMLEs

This Appendix contains the development of the IRLS algorithm that maximizes the likelihood function l(θ) in (8). Setting the partial derivatives of l(θ) with respect to γ and σ to zero yields the normal estimating equations

$$ \sum_{i=1}^n \frac{1}{\tau_i^2}\bigl(y_i-\mathbf {c}_i'\boldsymbol{\gamma}\bigr)\mathbf {c}_i=\mathbf {0}, $$

(A.1)

and

$$ \sum_{i=1}^n \frac{1}{2\tau_i^4}\bigl[\bigl(y_i-\mathbf {c}_i' \boldsymbol{\gamma}\bigr)^2-{\bf h}_i' \boldsymbol{\sigma}\bigr]{\bf h}_i=\mathbf {0}. $$

(A.2)

We need to solve (A.1) and (A.2) for NMLEs of γ and σ. If the $\tau_{i}^{2}$s are known, then the solution to (A.1) is

$$ \hat{\boldsymbol{\gamma}}=\Biggl(\sum _{i=1}^n\frac{1}{\tau _i^2}\mathbf {c}_i \mathbf {c}_i'\Biggr)^{-1} \Biggl(\sum _{i=1}^n\frac{1}{\tau_i^2}\mathbf {c}_iy_i \Biggr), $$

(A.3)

and that to (A.2) is

$$ \hat{\boldsymbol{\sigma}}=\Biggl(\sum _{i=1}^n \frac{1}{\tau_i^4}{\bf h}_i{ \bf h}_i'\Biggr)^{-1}\Biggl[ \sum _{i=1}^n \frac{1}{\tau_i^4}{\bf h}_i \bigl(y_i-\mathbf {c}_i'\hat{\boldsymbol{\gamma}}\bigr)^2\Biggr]. $$

(A.4)

Although the $\tau_{i}^{2}$s are unknown in practice, they can be estimated as ${\bf h}_{i}'\hat{\boldsymbol{\sigma}}$ once a $\hat{\boldsymbol{\sigma}}$ is available. Thus, Equations (A.1) and (A.2) can be solved by the following IRLS algorithm:

(S1) With an initial value of σ, obtain each $\tau_{i}^{2}={\bf h}_{i}'\boldsymbol{\sigma}$, i=1,2,…,n.
(S2) Obtain $\hat{\boldsymbol{\gamma}}$ by (A.3) and $\hat{\boldsymbol{\sigma}}$ by (A.4).
(S3) Update the initial σ by $\hat{\boldsymbol{\sigma}}$ in (S2) and go back to (S1).
(S4) Continue (S1) to (S3) until $\hat{\boldsymbol{\gamma}}$ and $\hat{\boldsymbol{\sigma}}$ stabilize.

The converged solutions are the NMLEs of γ and σ.
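Steps (S1) to (S4) can be sketched in base R as follows. This is an illustrative sketch, not the code of the NML.R package: the function name `irls_nml`, its arguments, and the simulated data are assumptions made for the example. The rows of `C` are taken to be the regressors $\mathbf{c}_i$ of (A.1), here the product terms of a model with one predictor and one moderator, and the rows of `H` are the ${\bf h}_i$ of (A.2) with uncorrelated level-2 errors. The starting value for γ and the stopping rule (change below 0.0001 within 300 iterations) follow the Notes.

```r
# Illustrative IRLS sketch for (A.1)-(A.4); names are hypothetical.
irls_nml <- function(y, C, H, sigma_init, tol = 1e-4, max_iter = 300) {
  gamma <- as.vector(solve(crossprod(C), crossprod(C, y)))  # LS start for gamma
  sigma <- sigma_init
  converged <- FALSE
  for (iter in seq_len(max_iter)) {
    tau2 <- as.vector(H %*% sigma)                  # (S1): tau_i^2 = h_i' sigma
    if (any(tau2 <= 0)) stop("a tau_i^2 is not positive; try other starting values")
    # (S2): weighted LS for gamma, equation (A.3)
    gamma_new <- as.vector(solve(crossprod(C, C / tau2), crossprod(C, y / tau2)))
    # (S2): weighted LS of squared residuals on h_i, equation (A.4)
    res2 <- as.vector(y - C %*% gamma_new)^2
    sigma_new <- as.vector(solve(crossprod(H, H / tau2^2),
                                 crossprod(H, res2 / tau2^2)))
    change <- max(abs(c(gamma_new - gamma, sigma_new - sigma)))
    gamma <- gamma_new                              # (S3): update and repeat
    sigma <- sigma_new
    if (change < tol) {                             # (S4): estimates have stabilized
      converged <- TRUE
      break
    }
  }
  list(gamma = gamma, sigma = sigma, iterations = iter, converged = converged)
}

# Simulated example with arbitrary illustrative population values
set.seed(1)
n <- 2000
x <- rnorm(n)
u <- rnorm(n)
beta0 <- 1.0 + 0.5 * u + rnorm(n, sd = sqrt(0.3))   # level-2 model for the intercept
beta1 <- 0.8 + 0.4 * u + rnorm(n, sd = sqrt(0.5))   # level-2 model for the slope
y <- beta0 + beta1 * x + rnorm(n, sd = sqrt(0.7))   # level-1 model

C <- cbind(1, u, x, x * u)   # regressors for (gamma_00, gamma_01, gamma_10, gamma_11)
H <- cbind(x^2, 1)           # tau_i^2 = sigma_11 * x_i^2 + (sigma_00 + sigma_e^2)
fit <- irls_nml(y, C, H, sigma_init = c(1, 1))
```

In this setup the population values are γ = (1.0, 0.5, 0.8, 0.4) and σ = (0.5, 1.0), so `fit$gamma` and `fit$sigma` should land near them. Both updates are ordinary weighted least squares problems, which is why the loop needs only `solve` and `crossprod`.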

Appendix B. Two-Level Regression with p Predictors and m Moderators

This Appendix contains the details of extending the simple two-level regression model in (4) to cases with p predictors and m moderators. With p predictors, the counterpart of the first regression equation in (4) is

$$ y_i=\beta_{i0}+\beta_{i1}x_{i1}+ \beta_{i2}x_{i2}+\cdots+\beta _{ip}x_{ip}+e_i= \mathbf {c}'_i\boldsymbol {\beta }_i+e_i, \quad i=1,2,\ldots,n, $$

(B.1)

where $\mathbf {c}_{i}'=(1,x_{i1},x_{i2},\ldots,x_{ip})$ and $\boldsymbol{\beta}_i=(\beta_{i0},\beta_{i1},\beta_{i2},\ldots,\beta_{ip})'$. The counterparts of the 2nd and 3rd regression equations in (4) are

$$ \beta_{ij}=\gamma_{j0}+\gamma_{j1}u_{i1}+ \gamma_{j2}u_{i2}+\cdots +\gamma _{jm}u_{im}+ \varepsilon_{ij},\quad j=0,1,2,\ldots, p;\ i=1,2,\ldots ,n. $$

(B.2)

Let $\mathbf{u}_i=(1,u_{i1},u_{i2},\ldots,u_{im})'$, $\boldsymbol{\varepsilon}_i=(\varepsilon_{i0},\varepsilon_{i1},\varepsilon_{i2},\ldots,\varepsilon_{ip})'$, and

$$\boldsymbol {\Gamma }=\left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c} \gamma_{00}&\gamma_{01}&\gamma_{02}&\cdots&\gamma_{0m}\\ \gamma_{10}&\gamma_{11}&\gamma_{12}&\cdots&\gamma_{1m}\\ \gamma_{20}&\gamma_{21}&\gamma_{22}&\cdots&\gamma_{2m}\\ \cdots&\cdots&\cdots&\cdots&\cdots\\ \gamma_{p0}&\gamma_{p1}&\gamma_{p2}&\cdots&\gamma_{pm} \end{array} \right ). $$

We can rewrite (B.2) in matrix form as

$$ \boldsymbol {\beta }_i=\boldsymbol {\Gamma }\mathbf {u}_i+ \boldsymbol {\varepsilon }_i,\quad i=1,2,\ldots, n. $$

(B.3)

The counterpart of Equation (5) is obtained by putting (B.3) into (B.1),

$$ y_i=\mathbf {c}_i'\boldsymbol {\Gamma }\mathbf {u}_i+\delta_i, $$

(B.4)

where $\delta_{i}=\mathbf {c}_{i}'\boldsymbol {\varepsilon }_{i}+e_{i}$. Let

$$\boldsymbol {\Sigma }=\mathrm {Var}(\boldsymbol {\varepsilon }_i)= \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c} \sigma_{00}&\sigma_{01}&\sigma_{02}&\cdots&\sigma_{0p}\\ \sigma_{10}&\sigma_{11}&\sigma_{12}&\cdots&\sigma_{1p}\\ \sigma_{20}&\sigma_{21}&\sigma_{22}&\cdots&\sigma_{2p}\\ \cdots&\cdots&\cdots&\cdots&\cdots\\ \sigma_{p0}&\sigma_{p1}&\sigma_{p2}&\cdots&\sigma_{pp} \end{array} \right ). $$

Then

$$ \tau_i^2=\mathrm {Var}(\delta_i)= \mathbf {c}_i'\boldsymbol {\Sigma }\mathbf {c}_i+\sigma_e^2. $$

(B.5)

Similarly, the parameters $\sigma_{00}$ and $\sigma_{e}^{2}$ are not distinguishable in (B.5) because the data do not have a nested structure; only their sum can be estimated.
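Equations (B.4) and (B.5) can be checked numerically for a single case. The sketch below uses made-up values of Γ, Σ and $\sigma_e^2$ with p = m = 2, and the function name `mean_var_case` is hypothetical. It also confirms that $\mathbf{c}_i'\boldsymbol{\Gamma}\mathbf{u}_i$ equals a sum over all product terms $x_{ij}u_{ik}$ weighted by $\gamma_{jk}$, which is the form the regressors take in an MMR model.

```r
# Mean and variance of y_i implied by (B.4) and (B.5) for one case.
mean_var_case <- function(c_i, u_i, Gamma, Sigma, sigma_e2) {
  list(mean = drop(t(c_i) %*% Gamma %*% u_i),              # c_i' Gamma u_i
       tau2 = drop(t(c_i) %*% Sigma %*% c_i) + sigma_e2)   # c_i' Sigma c_i + sigma_e^2
}

# Illustrative values only (not taken from the paper)
Gamma <- matrix(c(1.0, 0.5, 0.0,
                  0.8, 0.0, 0.4,
                  0.2, 0.3, 0.1), nrow = 3, byrow = TRUE)  # rows j = 0..p, columns k = 0..m
Sigma <- matrix(c(0.3, 0.1, 0.0,
                  0.1, 0.5, 0.2,
                  0.0, 0.2, 0.4), nrow = 3, byrow = TRUE)  # Var(eps_i0, eps_i1, eps_i2)
c_i <- c(1, 2, -1)    # (1, x_i1, x_i2)
u_i <- c(1, 0.5, 3)   # (1, u_i1, u_i2)

out <- mean_var_case(c_i, u_i, Gamma, Sigma, sigma_e2 = 0.7)

# The same mean written as a sum over product terms x_ij * u_ik times gamma_jk
mmr_mean <- sum(as.vector(outer(c_i, u_i)) * as.vector(Gamma))

out$mean   # 4.6
out$tau2   # 3.0
```

Because $\sigma_{00}$ and $\sigma_e^2$ enter `tau2` only through their sum (here 0.3 + 0.7), any other split with the same total gives the same variance.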

As the counterpart of (16),

$$R_j^2=\frac{\hat {\boldsymbol {\gamma }}_j'\mathbf {S}_{uu}\hat {\boldsymbol {\gamma }}_j}{\hat {\boldsymbol {\gamma }}_j'\mathbf {S}_{uu}\hat {\boldsymbol {\gamma }}_j+\hat{\sigma}_j^2} $$

is the estimate of the percentage of variance of $\beta_{ij}$ accounted for by the m moderators, where $\hat{\boldsymbol{\gamma}}_{j}=(\hat{\gamma}_{j1},\hat{\gamma}_{j2},\ldots, \hat{\gamma}_{jm})'$, j=1,2,…,p; and $\mathbf{S}_{uu}$ is the sample covariance matrix of $\mathbf{u}_i=(u_{i1},u_{i2},\ldots,u_{im})'$ (with the constant 1 omitted), i=1,2,…,n.
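As a small worked example of the $R_j^2$ formula, the following R sketch evaluates it for made-up estimates. The function name `r2_coef` and the argument `resid_var` (the term written $\hat{\sigma}_j^2$ in the formula) are hypothetical, and the numbers are illustrative only.

```r
# R_j^2: share of the variance of beta_ij explained by the moderators.
r2_coef <- function(gamma_j, S_uu, resid_var) {
  explained <- drop(t(gamma_j) %*% S_uu %*% gamma_j)  # gamma_j' S_uu gamma_j
  explained / (explained + resid_var)
}

gamma_j <- c(0.4, 0.3)                       # slopes on u_1 and u_2, intercept excluded
S_uu <- matrix(c(1.0, 0.2,
                 0.2, 2.0), nrow = 2)        # sample covariance matrix of (u_1, u_2)
r2 <- r2_coef(gamma_j, S_uu, resid_var = 0.5)
r2   # 0.388 / 0.888, about 0.437
```

The explained part is 0.16(1.0) + 2(0.4)(0.3)(0.2) + 0.09(2.0) = 0.388, so about 44% of the variance of this coefficient would be attributed to the moderators.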

Appendix C. R Package

This Appendix introduces an R package to perform NML estimation of the two-level regression model. The package can be downloaded at http://www3.nd.edu/~kyuan/moderation/NML.R. A simulated data set with 6 variables $(y,x_1,x_2,u_1,u_2,u_3)$ and 500 cases is used to illustrate the use of the package, and the data set can be downloaded at http://www3.nd.edu/~kyuan/moderation/simudata.dat. Both of these files are saved in the folder d:/moderation/ in this illustration with names NML.R and simudata.dat, respectively.

The code for running the package and its utilities is documented in Appendix D. The first three lines of the code change the working directory, load the package into the R Console, and read the data, respectively. Lines 4 to 10 of Appendix D identify the number of cases, the dependent variable (y), possible level-1 predictors (x1, x2), and possible level-2 predictors or moderators (u1, u2, u3). In this package, we regard the level-1 and level-2 intercepts as the regression coefficients corresponding to the predictors $x_{i0}=1$ and $u_{i0}=1$, respectively. These are set up by lines 11 and 12, where the labels or column names identify the parameter estimates corresponding to intercepts. Lines 16 to 20 specify the level-1 and level-2 regression models. In the example, the level-1 model specified by L1=cbind(x0,x1,x2) has three coefficients, $\beta_{i0}$, $\beta_{i1}$, and $\beta_{i2}$, corresponding to the coefficients of $x_{i0}=1$, $x_{i1}$ and $x_{i2}$, respectively. Lines 17 to 19 specify the level-2 models corresponding to each of the level-1 coefficients. L20=cbind(u0,u1,u2) assigns three predictors to $\beta_{i0}$: $u_{i0}=1$, $u_{i1}$ and $u_{i2}$; L21=cbind(u0,u2,u3) assigns three predictors to $\beta_{i1}$: $u_{i0}=1$, $u_{i2}$ and $u_{i3}$; and L22=cbind(u0,u1,u2,u3) assigns four predictors to $\beta_{i2}$: $u_{i0}=1$, $u_{i1}$, $u_{i2}$ and $u_{i3}$. The 20th line in Appendix D puts all the level-2 predictors together to pass to the package. Notice that the specification of level-2 predictors must correspond to the level-1 predictors. For example, if you decide to leave out the intercept in level-1, then the specification for level-1 and level-2 predictors becomes

L1=cbind(x1,x2);#level-1 predictors;

L21=cbind(u0,u2,u3);#level-2 predictors for beta_i1;

L22=cbind(u0,u1,u2,u3);#level-2 predictors for beta_i2;

L2=list(L21,L22); #all level-2 predictors;

Lines 24 to 26 in Appendix D set up an H matrix corresponding to ${\bf h}_{i}$ in Equation (7) that contains the predictors for the variance parameters in σ. Line 24 requests that six variance parameters be estimated in the order of $\sigma_{11}=\mathrm{Var}(\varepsilon_{i1})$, $\sigma_{22}=\mathrm{Var}(\varepsilon_{i2})$, $\sigma_{12}=\mathrm{Cov}(\varepsilon_{i1},\varepsilon_{i2})$, $\sigma_{01}=\mathrm{Cov}(\varepsilon_{i0},\varepsilon_{i1})$, $\sigma_{02}=\mathrm{Cov}(\varepsilon_{i0},\varepsilon_{i2})$, and $\sigma_{0e}^{2}=\mathrm {Var}(\varepsilon_{i0})+\mathrm {Var}(e)$. If one chooses not to let the prediction errors at level-2 covary, then line 24 needs to be set as H_mat=cbind(x1*x1, x2*x2, 1), which typically results in smaller SEs for the variance estimates. Lines 25 and 26 label the variance estimates so that they are correctly identified in the output. For example, if not allowing the level-2 prediction errors to covary, then line 25 should be set as H_name=cbind("x1x1", "x2x2", "x0e"). In particular, the labels x1x1 and x2x2 are used to identify the proper variance estimates to calculate R-squares in the package. We do not encourage users to change the notation used in labeling.
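To see where the columns of such an H matrix come from, expand (B.5) for p = 2: $\tau_i^2=\sigma_{11}x_{i1}^2+\sigma_{22}x_{i2}^2+2\sigma_{12}x_{i1}x_{i2}+2\sigma_{01}x_{i1}+2\sigma_{02}x_{i2}+\sigma_{0e}^2$. The R sketch below builds the three-column H_mat quoted above and a six-column version, and checks the latter against (B.5). The six-column layout, including the factors of 2, is an assumption derived from this expansion, since the code of Appendix D is not reproduced here; the package may scale its columns differently. All numerical values are illustrative.

```r
set.seed(2)
n  <- 6
x1 <- rnorm(n)
x2 <- rnorm(n)

# Illustrative population values (not from the paper)
Sigma <- matrix(c(0.3, 0.1, 0.0,
                  0.1, 0.5, 0.2,
                  0.0, 0.2, 0.4), nrow = 3, byrow = TRUE)  # Var(eps_i0, eps_i1, eps_i2)
sigma_e2 <- 0.7

# tau_i^2 computed directly from (B.5): c_i' Sigma c_i + sigma_e^2
Cmat <- cbind(1, x1, x2)
tau2_direct <- rowSums((Cmat %*% Sigma) * Cmat) + sigma_e2

# Assumed six-column layout, ordered as
# sigma_11, sigma_22, sigma_12, sigma_01, sigma_02, sigma_0e^2
H_full <- cbind(x1 * x1, x2 * x2, 2 * x1 * x2, 2 * x1, 2 * x2, 1)
sigma_full <- c(Sigma[2, 2], Sigma[3, 3], Sigma[2, 3],
                Sigma[1, 2], Sigma[1, 3], Sigma[1, 1] + sigma_e2)
tau2_full <- as.vector(H_full %*% sigma_full)

# Three-column version quoted in the text (level-2 errors do not covary)
H_mat  <- cbind(x1 * x1, x2 * x2, 1)
H_name <- cbind("x1x1", "x2x2", "x0e")
colnames(H_mat) <- as.vector(H_name)

all.equal(tau2_direct, tau2_full)   # TRUE
```

The last column of either matrix is a constant 1, whose coefficient is the sum $\sigma_{0e}^{2}$ rather than $\sigma_{00}$ or $\sigma_e^2$ separately.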

Line 28 runs the package, and the output of running Appendix D is in Appendix E. In addition to the results of NML for the two-level model, the default output also contains the results of LS analysis for the corresponding MMR model, where xjuk corresponds to the level-1 coefficient of xj predicted by the level-2 predictor uk, e.g., x0u0 corresponds to the intercept $\gamma_{00}$ or the level-1 coefficient of x0 predicted by the level-2 predictor u0. For the LS estimates, the default output contains SE_ls, SE_sw0, SE_sw3, SE_sw4, and the corresponding z-scores. The results corresponding to SE_sw0 are not in Appendix E because there is not enough horizontal space. We choose to output these four sets of SEs because SE_lss are the default for LS analysis; SE_sw0s are the most commonly used consistent SEs in software; SE_sw3s perform the best with smaller sample sizes according to Long and Ervin (2000); and SE_sw4s are most reliable when there are high-leverage observations (Cribari-Neto 2004).
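For readers who want to see how such sandwich-type SEs are formed, here is a base-R sketch for an LS fit. It assumes that SE_sw0, SE_sw3 and SE_sw4 correspond to the estimators commonly labeled HC0, HC3 and HC4 in the heteroscedasticity-consistent literature cited above; this mapping is inferred from the references, not from the package code. The function name `hc_se`, the column names of its output, and the simulated data are hypothetical.

```r
# LS estimates with conventional and heteroscedasticity-consistent SEs.
hc_se <- function(X, y) {
  n <- nrow(X)
  k <- ncol(X)
  XtX_inv <- solve(crossprod(X))
  b <- as.vector(XtX_inv %*% crossprod(X, y))   # LS estimates
  e <- as.vector(y - X %*% b)                   # residuals
  h <- rowSums((X %*% XtX_inv) * X)             # leverages h_ii
  # Sandwich form: (X'X)^{-1} X' diag(w) X (X'X)^{-1}
  sandwich_se <- function(w) {
    sqrt(diag(XtX_inv %*% crossprod(X, w * X) %*% XtX_inv))
  }
  delta <- pmin(4, n * h / k)                   # HC4 exponent, min(4, h_ii / mean(h))
  cbind(est    = b,
        SE_ls  = sqrt(diag(XtX_inv) * sum(e^2) / (n - k)),
        SE_hc0 = sandwich_se(e^2),
        SE_hc3 = sandwich_se(e^2 / (1 - h)^2),
        SE_hc4 = sandwich_se(e^2 / (1 - h)^delta))
}

# Simulated MMR data with heteroscedastic errors (illustrative values)
set.seed(3)
n <- 200
x <- rnorm(n)
u <- rbinom(n, 1, 0.5)
y <- 1 + 0.3 * u + 0.5 * x + 0.4 * x * u + rnorm(n, sd = 0.5 + abs(x))
X <- cbind(x0u0 = 1, x0u1 = u, x1u0 = x, x1u1 = x * u)  # labels follow the xjuk pattern
tab <- hc_se(X, y)
```

Dividing the `est` column by any of the SE columns gives the corresponding z-scores. Because the HC3 and HC4 weights inflate each squared residual by a function of its leverage, those SEs are never smaller than the HC0 ones.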

The labels for the results of NML are the same as those for the results of LS. The default output of NML also contains SE_sw0s and the corresponding z-scores, which are not included in Appendix E due to space limitation.

Appendix D. R Code for NML Estimation for the Model Described in Appendix B

Appendix E. The Output of Running the Example in Appendix C


About this article

Cite this article

Yuan, KH., Cheng, Y. & Maxwell, S. Moderation Analysis Using a Two-Level Regression Model. Psychometrika 79, 701–732 (2014). https://doi.org/10.1007/s11336-013-9357-x


Received: 10 January 2013
Published: 12 December 2013
Issue Date: October 2014
DOI: https://doi.org/10.1007/s11336-013-9357-x
