
Consistent Partial Least Squares for Nonlinear Structural Equation Models

Psychometrika

Abstract

Partial Least Squares as applied to models with latent variables, measured indirectly by indicators, is well-known to be inconsistent. The linear compounds of indicators that PLS substitutes for the latent variables do not obey the equations that the latter satisfy. We propose simple, non-iterative corrections leading to consistent and asymptotically normal (CAN)-estimators for the loadings and for the correlations between the latent variables. Moreover, we show how to obtain CAN-estimators for the parameters of structural recursive systems of equations, containing linear and interaction terms, without the need to specify a particular joint distribution. If quadratic and higher order terms are included, the approach will produce CAN-estimators as well when predictor variables and error terms are jointly normal. We compare the adjusted PLS, denoted by PLSc, with Latent Moderated Structural Equations (LMS), using Monte Carlo studies and an empirical application.


Notes

  1. In favor of the sign-weights one could argue that in sufficiently large samples \(\mathit{sign}_{ij}\cdot S_{ij}\widehat{w}_{j}\) is approximately equal to \(\lambda_{i}\cdot|\rho_{ij}|\cdot ( \lambda_{j}^{\intercal}\overline{w}_{j} ) \), where the term in brackets measures the (positive) correlation between \(\eta_{j}\) and its proxy; see below for results that help justify this claim. So the tighter the connection between \(\eta_{i}\) and \(\eta_{j}\), and the better \(\eta_{j}\) can be measured, the more important \(\widehat{\eta}_{j}\) is in determining \(\widehat{w}_{i}\).

  2. Quantities such as the sample average of the product \(\widehat{\eta}_{i}\widehat{\eta}_{j}\widehat{\eta}_{k}\) estimate \(E\overline{\eta}_{i}\overline{\eta}_{j}\overline{\eta}_{k}\) consistently, as a little algebra and a routine application of asymptotic theory readily show; the established consistency of \(\widehat{w}\) and of the sample moments of the indicators is sufficient.

  3. The colon indicates stacking, as in Matlab.

  4. \(\zeta_{j.i}\) is the residual that is left after regressing \(\eta_{j}\) on \(\eta_{i}\); it is independent of \(\eta_{i}\) and has mean zero. Also, \(\eta_{k}=\beta_{ki}\eta_{i}+\beta_{kj}\eta_{j}+\zeta_{k.ij}\) is the regression of \(\eta_{k}\) on \(\eta_{i}\) and \(\eta_{j}\); the residual \(\zeta_{k.ij}\) is independent of the regressors and has mean zero.

  5. Only (34) requires some work:

    $$E\eta_{i}^{2}\eta _{j}\eta_{k}=E\eta_{i}^{2}\eta _{j} ( \beta_{ki}\eta_{i}+\beta_{kj}\eta_{j}+\zeta_{k.ij} ) =\beta _{ki}E\eta _{i}^{3}\eta_{j}+\beta_{kj}E\eta_{i}^{2}\eta_{j}^{2}=3\beta _{ki}\rho_{ij}+\beta_{kj} \bigl( 1+2\rho_{ij}^{2} \bigr). $$

    Inserting \(( \rho_{ik}-\rho_{ij}\rho_{jk} ) \div ( 1-\rho_{ij}^{2} ) \) for \(\beta_{ki}\) and an analogous expression for \(\beta_{kj}\) yields (34).

  6. We gratefully acknowledge Wynne W. Chin from the Department of Decision and Information Sciences, Bauer College of Business, University of Houston, Texas, USA, for providing the empirical data.

  7. Matlab code that exemplifies how sensitive the PLS structural coefficients can be to less-than-perfect proxies (in linear models) is available from the first author.

  8. This happens, e.g., in Klein and Muthén (2007); see in particular pp. 660–661.

References

  • Bollen, K.A. (1996). An alternative two stage least squares (2SLS) estimator for latent variable equations. Psychometrika, 61, 109–121.

  • Bollen, K.A., & Paxton, P. (1998). Two-stage least squares estimation of interaction effects. In R.E. Schumacker & G.A. Marcoulides (Eds.), Interaction and nonlinear effects in structural equation modeling. Mahwah: Erlbaum.

  • Chin, W.W., Marcolin, B.L., & Newsted, P.R. (2003). A partial least squares latent variable modeling approach for measuring interaction effects: results from a Monte Carlo simulation study and an electronic-mail emotion/adoption study. Information Systems Research, 14, 189–217.

  • Cramér, H. (1946). Mathematical methods of statistics. Princeton: Princeton University Press.

  • DasGupta, A. (2008). Asymptotic theory of statistics and probability. New York: Springer.

  • Dijkstra, T.K. (1981, 1985). Latent variables in linear stochastic models (PhD thesis 1981, 2nd ed. 1985). Amsterdam, The Netherlands: Sociometric Research Foundation.

  • Dijkstra, T.K. (2009). PLS for path diagrams revisited, and extended. In Proceedings of the 6th international conference on partial least squares, Beijing. Available from http://www.rug.nl/staff/t.k.dijkstra/research.

  • Dijkstra, T.K. (2010). Latent variables and indices. In V. Esposito Vinzi, W.W. Chin, J. Henseler, & H. Wang (Eds.), Handbook of Partial Least Squares: concepts, methods and applications (pp. 23–46). Berlin: Springer.

  • Dijkstra, T.K. (2011). Consistent Partial Least Squares estimators for linear and polynomial factor models. Unpublished manuscript, University of Groningen, The Netherlands. Available from http://www.rug.nl/staff/t.k.dijkstra/research.

  • Dijkstra, T.K. (2013). The simplest possible factor model estimator, and successful suggestions how to complicate it again. Unpublished manuscript, University of Groningen, The Netherlands. Available from http://www.rug.nl/staff/t.k.dijkstra/research.

  • Dijkstra, T.K., & Henseler, J. (2011). Linear indices in nonlinear structural equation models: best fitting proper indices and other composites. Quality and Quantity, 45, 1505–1518.

  • Dijkstra, T.K., & Henseler, J. (2012). Consistent and asymptotically normal PLS-estimators for linear structural equations. In preparation.

  • Esposito Vinzi, V., Chin, W.W., Henseler, J., & Wang, H. (Eds.) (2010). Handbook of Partial Least Squares: concepts, methods and applications. Berlin: Springer.

  • Ferguson, T.S. (1996). A course in large sample theory. London: Chapman & Hall.

  • Fleishman, A.I. (1978). A method for simulating non-normal distributions. Psychometrika, 43, 521–532.

  • Gray, H.L., & Schucany, W.R. (1972). The generalized jackknife statistic. New York: Dekker.

  • Headrick, T.C. (2010). Statistical simulation: power method polynomials and other transformations. Boca Raton: Chapman & Hall/CRC.

  • Henseler, J., & Chin, W.W. (2010). A comparison of approaches for the analysis of interaction effects between latent variables using partial least squares path modeling. Structural Equation Modeling, 17, 82–109.

  • Henseler, J., & Fassott, G. (2010). Testing moderating effects in PLS path models: an illustration of available procedures. In V. Esposito Vinzi, W.W. Chin, J. Henseler, & H. Wang (Eds.), Handbook of Partial Least Squares: concepts, methods and applications (pp. 713–735). Berlin: Springer.

  • Jöreskog, K.G., & Yang, F. (1996). Non-linear structural equation models: the Kenny–Judd model with interaction effects. In G.A. Marcoulides & R.E. Schumacker (Eds.), Advanced structural equation modeling (pp. 57–87). Mahwah: Erlbaum.

  • Kan, R. (2008). From moments of sum to moments of product. Journal of Multivariate Analysis, 99, 542–554.

  • Kenny, D.A., & Judd, C.M. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin, 96, 201–210.

  • Kharab, A., & Guenther, R.B. (2012). An introduction to numerical methods: a MATLAB ® approach. Boca Raton: CRC Press.

  • Klein, A.G., & Moosbrugger, H. (2000). Maximum likelihood estimation of latent interaction effects with the LMS method. Psychometrika, 65, 457–474.

  • Klein, A.G., & Muthén, B.O. (2007). Quasi maximum likelihood estimation of structural equation models with multiple interaction and quadratic effects. Multivariate Behavioral Research, 42, 647–673.

  • Lee, S.-Y., Song, X.-Y., & Poon, W.-Y. (2007a). Comparison of approaches in estimating interaction and quadratic effects of latent variables. Multivariate Behavioral Research, 39, 37–67.

  • Lee, S.-Y., Song, X.-Y., & Tang, N.-S. (2007b). Bayesian methods for analyzing structural equation models with covariates, interaction, and quadratic latent variables. Structural Equation Modeling, 14, 404–434.

  • Little, T.D., Bovaird, J.A., & Widaman, K.F. (2006). On the merits of orthogonalizing powered and product terms: implications for modeling interactions among latent variables. Structural Equation Modeling, 13, 497–519.

  • Liu, J.S. (1994). Siegel’s formula via Stein’s identities. Statistics & Probability Letters, 21, 247–251.

  • Lu, I.R.R., Kwan, E., Thomas, D.R., & Cedzynski, M. (2011). Two new methods for estimating structural equation models: an illustration and a comparison with two established methods. International Journal of Research in Marketing, 28, 258–268.

  • Marsh, H.W., Wen, Z., & Hau, K.-T. (2004). Structural equation models of latent interactions: evaluation of alternative estimation strategies and indicator construction. Psychological Methods, 9, 275–300.

  • McDonald, R.P. (1996). Path analysis with composite variables. Multivariate Behavioral Research, 31, 239–270.

  • Mooijaart, A., & Bentler, P.M. (2010). An alternative approach for nonlinear latent variable models. Structural Equation Modeling, 17, 357–373.

  • Moosbrugger, H., Schermelleh-Engel, K., Kelava, A., & Klein, A.G. (2009). Testing multiple nonlinear effects in structural equation modeling: a comparison of alternative estimation approaches. In T. Teo & M.S. Khine (Eds.), Structural equation modeling in educational research: concepts and applications (pp. 103–136). Rotterdam: Sense Publishers.

  • Schermelleh-Engel, K., Klein, A., & Moosbrugger, H. (1998). Estimating nonlinear effects using a latent moderated structural equations approach. In R.E. Schumacker & G.A. Marcoulides (Eds.), Interaction and nonlinear effects in structural equation modeling (pp. 203–238). Mahwah: Erlbaum.

  • Schermelleh-Engel, K., Werner, C.S., Klein, A.G., & Moosbrugger, H. (2010). Nonlinear structural equation modeling: is partial least squares an alternative? AStA Advances in Statistical Analysis, 94, 167–184.

  • Serfling, R. (1980). Approximation theorems of mathematical statistics. New York: Wiley.

  • Shao, J., & Tu, D. (1995). The jackknife and bootstrap. New York: Springer.

  • Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.-M., & Lauro, C. (2005). PLS path modelling. Computational Statistics & Data Analysis, 48, 159–205.

  • Vale, C.D., & Maurelli, V.A. (1983). Simulating multivariate nonnormal distributions. Psychometrika, 48, 465–471.

  • Wall, M.M., & Amemiya, Y. (2003). A method of moments technique for fitting interaction effects in structural equation models. British Journal of Mathematical and Statistical Psychology, 56, 47–64.

  • Wold, H.O.A. (1966). Nonlinear estimation by iterative least squares procedures. In F.N. David (Ed.), Research papers in statistics: festschrift for J. Neyman (pp. 411–444). New York: Wiley.

  • Wold, H.O.A. (1975). Path models with latent variables: the NIPALS approach. In H.M. Blalock (Ed.), Quantitative sociology (pp. 307–359). New York: Academic Press.

  • Wold, H.O.A. (1982). Soft modelling: the basic design and some extensions. In K.G. Jöreskog & H.O.A. Wold (Eds.), Systems under indirect observation, Part II (pp. 1–55). Amsterdam: North-Holland.


Acknowledgements

The authors wish to acknowledge the constructive comments and suggestions by the editor, an associate editor, and three reviewers, which led to substantial improvements in contents and readability.

Author information

Corresponding author

Correspondence to Theo K. Dijkstra.

Appendices

Appendix A

Here we derive Equation (23). The other moment equations are derived in a similar, usually even simpler, way.

Equation (23) says

$$ E\overline{\eta}_{i}^{2}\overline{\eta }_{j}^{2}=Q_{i}^{2}Q_{j}^{2} \cdot\bigl( E\eta_{i}^{2}\eta_{j}^{2}-1 \bigr) +1. $$

Recall that \(\overline{\eta}_{i}=Q_{i}\eta_{i}+\delta_{i}\), where the δ’s are mutually independent and independent of the η’s. The Q’s are real numbers in [−1,1]. The η’s and \(\overline{\eta}\)’s have zero mean and unit variance. The δ’s have zero mean as well. Note that \(E\delta_{i}^{2}=1-Q_{i}^{2}\). We have

$$\begin{aligned} E\overline{\eta}_{i}^{2}\overline{\eta }_{j}^{2} =&E \bigl[ ( Q_{i}\eta _{i}+\delta_{i} ) ^{2}\cdot( Q_{j} \eta_{j}+\delta_{j} ) ^{2} \bigr] \\ =&E \bigl[ \bigl( Q_{i}^{2}\eta_{i}^{2}+ \delta_{i}^{2}+2Q_{i}\eta_{i}\delta _{i} \bigr) \cdot\bigl( Q_{j}^{2}\eta _{j}^{2}+\delta_{j}^{2}+2Q_{j} \eta_{j}\delta_{j} \bigr) \bigr] . \end{aligned}$$

The expected value consists of three parts. The first part equals

$$\begin{aligned} E \bigl[ Q_{i}^{2}\eta_{i}^{2}\cdot \bigl( Q_{j}^{2}\eta_{j}^{2}+\delta _{j}^{2}+2Q_{j}\eta_{j}\delta _{j} \bigr) \bigr] =&Q_{i}^{2}Q_{j}^{2}E\eta _{i}^{2}\eta_{j}^{2}+Q_{i}^{2}E \eta_{i}^{2}\delta_{j}^{2}+2Q_{i}^{2}Q_{j}E \eta_{i}^{2}\eta_{j}\delta_{j} \\ =&Q_{i}^{2}Q_{j}^{2}E\eta _{i}^{2}\eta_{j}^{2}+Q_{i}^{2} \cdot1\cdot\bigl( 1-Q_{j}^{2} \bigr) +0. \end{aligned}$$

The second part equals

$$\begin{aligned} E \bigl[ \delta_{i}^{2}\cdot\bigl( Q_{j}^{2} \eta_{j}^{2}+\delta_{j}^{2}+2Q_{j} \eta_{j}\delta_{j} \bigr) \bigr] =&Q_{j}^{2}E\delta_{i}^{2}\eta _{j}^{2} +E\delta_{i}^{2}\delta _{j}^{2} +2Q_{j}E\delta_{i}^{2} \eta_{j}\delta_{j} \\ =&Q_{j}^{2} \bigl( 1-Q_{i}^{2} \bigr) \cdot1+ \bigl( 1-Q_{i}^{2} \bigr) \bigl( 1-Q_{j}^{2} \bigr) +0. \end{aligned}$$

And the last part is

$$\begin{aligned} E \bigl[ 2Q_{i}\eta_{i}\delta_{i}\cdot \bigl( Q_{j}^{2}\eta_{j}^{2}+\delta _{j}^{2}+2Q_{j}\eta_{j}\delta _{j} \bigr) \bigr] =&2Q_{i}E \bigl[ \delta_{i}\cdot\bigl( \eta _{i} \bigl( Q_{j}^{2}\eta_{j}^{2}+ \delta_{j}^{2}+2Q_{j}\eta_{j}\delta _{j} \bigr) \bigr) \bigr] =0. \end{aligned}$$

Collecting terms yields

$$\begin{aligned} E\overline{\eta}_{i}^{2}\overline{\eta }_{j}^{2} =&Q_{i}^{2}Q_{j}^{2}E \eta_{i}^{2}\eta_{j}^{2}+ Q_{i}^{2} \bigl( 1-Q_{j}^{2} \bigr) +Q_{j}^{2} \bigl( 1-Q_{i}^{2} \bigr) + \bigl( 1-Q_{i}^{2} \bigr) \bigl( 1-Q_{j}^{2} \bigr) \\ =&Q_{i}^{2}Q_{j}^{2}E\eta _{i}^{2}\eta_{j}^{2}+1-Q_{i}^{2}Q_{j}^{2} \end{aligned}$$

as claimed.
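
Equation (23) is easy to check by simulation. The sketch below is ours, not the paper’s: it assumes bivariate-normal latent variables, normal measurement errors, and illustrative values for \(Q_{i}\), \(Q_{j}\) and the latent correlation, builds proxies according to \(\overline{\eta}_{i}=Q_{i}\eta_{i}+\delta_{i}\), and compares both sides of (23).

    # Minimal Monte Carlo check of Equation (23); the distributional choices and
    # the values of Q_i, Q_j and rho are illustrative assumptions, not the paper's.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000
    rho, Qi, Qj = 0.6, 0.9, 0.8

    # eta_i, eta_j: zero mean, unit variance, correlation rho
    cov = np.array([[1.0, rho], [rho, 1.0]])
    eta = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    eta_i, eta_j = eta[:, 0], eta[:, 1]

    # delta's: independent of the eta's and of each other, variance 1 - Q^2
    delta_i = rng.normal(0.0, np.sqrt(1 - Qi**2), size=n)
    delta_j = rng.normal(0.0, np.sqrt(1 - Qj**2), size=n)

    # proxies: eta_bar = Q * eta + delta, again with zero mean and unit variance
    bar_i = Qi * eta_i + delta_i
    bar_j = Qj * eta_j + delta_j

    lhs = np.mean(bar_i**2 * bar_j**2)                           # E eta_bar_i^2 eta_bar_j^2
    rhs = Qi**2 * Qj**2 * (np.mean(eta_i**2 * eta_j**2) - 1) + 1  # right-hand side of (23)
    print(lhs, rhs)   # the two numbers agree up to Monte Carlo error

The same setup illustrates why moments of the proxies have to be corrected by powers of the Q’s before they can stand in for the corresponding moments of the latent variables.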

Appendix B. Stein’s Identity

The normal distribution has many interesting (and characterizing) properties. One of them is Stein’s identity, which tells us that when X is a p-dimensional normal vector with mean μ and covariance matrix Ω, and f(.) is a real-valued smooth function, then f(X) satisfies (cf. Liu, 1994):

$$ \operatorname{cov} \bigl( X,f ( X ) \bigr) =\varOmega E \bigl[ \nabla f ( X ) \bigr] . $$
(B.1)

Here ∇f(.) is f(.)’s gradient. So when Ω is invertible, the regression of f(X) on X equals

$$ f ( X ) =Ef ( X ) +E \bigl[ \nabla f ( X ) \bigr] ^{\intercal} \cdot( X-\mu) +\text{regression residual}. $$
(B.2)

Consequently, the regression coefficients are the average first partial derivatives. We add here an extension of Stein’s identity that covers the quadratic case:

$$\begin{aligned} f ( X ) =&Ef ( X ) +E \bigl[ \nabla f ( X ) \bigr] ^{\intercal} \cdot( X-\mu) \\ &{}+1/2\cdot \operatorname{trace} \bigl[ E \bigl( H ( X ) \bigr) \cdot\bigl( ( X-\mu) (X-\mu) ^{\intercal}-\varOmega\bigr) \bigr] +\nu \end{aligned}$$
(B.3)

where H(.) is the Hessian of f(.) and ν is the regression residual. So, apart from the factor \({\frac{1}{2}}\), the regression coefficients for the interaction and quadratic terms are average second-order partial derivatives. They are estimated consistently when a quadratic specification is used instead of the true nonlinear function. (There does not appear to be a comparably neat expression for higher-order partial derivatives.)
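
Stein’s identity (B.1) is easy to illustrate numerically. In the sketch below the function f, as well as μ and Ω, are arbitrary choices of ours (not taken from the text); it compares the sample covariances of the components of X with f(X) to Ω times the sample mean of the gradient.

    # Minimal Monte Carlo illustration of Stein's identity (B.1),
    # cov(X, f(X)) = Omega * E[grad f(X)]; f, mu and Omega are illustrative choices.
    import numpy as np

    rng = np.random.default_rng(2)
    n = 1_000_000
    mu = np.array([0.5, -1.0])
    Omega = np.array([[1.0, 0.4], [0.4, 2.0]])

    X = rng.multivariate_normal(mu, Omega, size=n)
    x1, x2 = X[:, 0], X[:, 1]

    f = np.sin(x1) + x1 * x2**2                    # a smooth function of X
    grad = np.column_stack((np.cos(x1) + x2**2,    # df/dx1
                            2 * x1 * x2))          # df/dx2

    lhs = np.array([np.cov(X[:, k], f)[0, 1] for k in range(2)])   # cov(X, f(X))
    rhs = Omega @ grad.mean(axis=0)                                # Omega * E[grad f(X)]
    print(lhs, rhs)   # agree up to Monte Carlo error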

We will prove (B.3), assuming the existence of partial derivatives and boundedness of their expected absolute values.

Write \(X=\mu+\varOmega^{{\frac{1}{2}}}Z\), where Z is p-dimensional standard normal. Consider first, for a real function g(.) from the same class as f(.), the regression of g(Z)−Eg(Z) on Z and the squares and cross-products of its components. The covariance matrix of the regressors is diagonal, with ones everywhere except at the entries corresponding to the squares, where we have 2. So the regression coefficient of \(Z_{i}\) equals \(EZ_{i}g ( Z ) =E\nabla g ( Z ) _{i}\) by Stein’s identity. For twice the coefficient of \(Z_{i}^{2}-1\) we get

$$\begin{aligned} E \bigl( Z_{i}^{2}-1 \bigr) g ( Z ) =&EZ_{i}^{2}g ( Z ) -Eg ( Z ) \end{aligned}$$
(B.4)
$$\begin{aligned} =& EZ_{i} \bigl( Z_{i}g ( Z ) \bigr) -Eg(Z) = E \bigl( g ( Z)+Z_{i}\nabla g ( Z ) _{i} \bigr) -Eg ( Z ) \end{aligned}$$
(B.5)
$$\begin{aligned} =& E \bigl( Z_{i}\nabla g ( Z ) _{i} \bigr) = EH_{ii} ( Z ) . \end{aligned}$$
(B.6)

On the second line we applied Stein’s identity to \(Z_{i}g ( Z ) \) and on the third line to \(\nabla g ( Z ) _{i}\). One obtains similarly for the coefficient of a cross-product \((ij)\):

$$ EZ_{i} \bigl( Z_{j}g ( Z ) \bigr) =E \bigl( Z_{j}\nabla g ( Z ) _{i} \bigr) =EH_{ij} ( Z ) . $$
(B.7)

Collecting terms yields (with H subscripted by g for identification):

$$ g(Z)=Eg(Z)+E\nabla g(Z)^{\intercal}\cdot Z+1/2\cdot\text{trace} \bigl[ EH_{g}(Z)\cdot\bigl( ZZ^{\intercal}-I \bigr) \bigr] +\nu. $$
(B.8)

Finally take a smooth function f(.) of X, and define \(g ( Z ) :=f ( \mu+\varOmega^{{\frac{1}{2}}}Z )\). A substitution of \(Z=\varOmega^{-{\frac{1}{2}}} ( X-\mu ) \), \(\nabla g ( Z ) =\varOmega ^{{\frac{1}{2}}}\nabla f(X)\), and \(H_{g}(Z)=\varOmega^{{\frac {1}{2}}}H_{f}(X)\varOmega^{{\frac{1}{2}}}\) into (B.8) yields the desired expression for general f(X).
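
The result can also be checked numerically. The following sketch, with an arbitrary smooth g of our own choosing (not from the paper), regresses g(Z) by ordinary least squares on \(Z_{1}\), \(Z_{2}\), \(Z_{1}^{2}-1\), \(Z_{2}^{2}-1\) and \(Z_{1}Z_{2}\); by (B.8) the fitted coefficients should reproduce \(Eg(Z)\), the average gradient, half the average own second derivatives, and the average cross derivative, up to Monte Carlo error.

    # Minimal numerical illustration of (B.8); g and the sample size are
    # illustrative choices, not taken from the paper.
    import numpy as np

    rng = np.random.default_rng(3)
    n = 1_000_000
    Z = rng.standard_normal((n, 2))
    z1, z2 = Z[:, 0], Z[:, 1]

    g = np.exp(0.3 * z1) * np.cos(z2)              # an arbitrary smooth g(Z)

    # regressors: intercept, Z_1, Z_2, Z_1^2 - 1, Z_2^2 - 1, Z_1 Z_2
    D = np.column_stack((np.ones(n), z1, z2, z1**2 - 1, z2**2 - 1, z1 * z2))
    coef, *_ = np.linalg.lstsq(D, g, rcond=None)

    # Stein predictions: Eg, the average gradient, and the average Hessian terms
    dg1 = 0.3 * np.exp(0.3 * z1) * np.cos(z2)      # dg/dz1
    dg2 = -np.exp(0.3 * z1) * np.sin(z2)           # dg/dz2
    H11 = 0.09 * np.exp(0.3 * z1) * np.cos(z2)     # d2g/dz1^2
    H22 = -np.exp(0.3 * z1) * np.cos(z2)           # d2g/dz2^2
    H12 = -0.3 * np.exp(0.3 * z1) * np.sin(z2)     # d2g/dz1 dz2

    pred = [g.mean(), dg1.mean(), dg2.mean(),
            0.5 * H11.mean(), 0.5 * H22.mean(), H12.mean()]
    print(np.round(coef, 3))
    print(np.round(pred, 3))   # the two rows agree up to Monte Carlo error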

Cite this article

Dijkstra, T.K., Schermelleh-Engel, K. Consistent Partial Least Squares for Nonlinear Structural Equation Models. Psychometrika 79, 585–604 (2014). https://doi.org/10.1007/s11336-013-9370-0
