Latent Variables and Indices: Herman Wold’s Basic Design and Partial Least Squares
In this chapter it is shown that the PLS algorithms typically converge if the covariance matrix of the indicators satisfies (approximately) the “basic design”, a factor-analysis type of model. The algorithms produce solutions to fixed-point equations; the solutions are smooth functions of the sample covariance matrix of the indicators. If the latter matrix is asymptotically normal, the PLS estimators will share this property. Under the basic design, the probability limits of the PLS estimators for loadings, correlations, multiple R’s, coefficients of structural equations, et cetera, will differ from the true values. But the difference decreases, tending to zero, as the “quality” of the PLS proxies for the latent variables improves. It is indicated how to correct for the discrepancy between the true values and the probability limits. We de-emphasize the normality issue in discussions of PLS versus ML: neither method requires one to subscribe to normality; they are “just” different ways of extracting information from second-order moments.
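The fixed-point character of the algorithms can be illustrated with a minimal sketch of Wold-style alternating least squares for two blocks of indicators (mode A outer estimation). The simulated loadings, sample size, and function names below are illustrative assumptions, not taken from the chapter:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a two-block "basic design": one latent variable per block,
# each indicator = loading * latent + noise (illustrative values).
n = 500
xi = rng.standard_normal(n)
eta = 0.8 * xi + np.sqrt(1 - 0.64) * rng.standard_normal(n)
X = np.outer(xi, [0.9, 0.8, 0.7]) + 0.5 * rng.standard_normal((n, 3))
Y = np.outer(eta, [0.9, 0.8, 0.7]) + 0.5 * rng.standard_normal((n, 3))

def pls_two_blocks(X, Y, tol=1e-10, max_iter=500):
    """Two-block PLS sketch (mode A): iterate between standardized outer
    proxies (weighted sums of indicators) and mode A outer weights
    (covariances of each block's indicators with the other block's proxy)
    until the weight vectors reach a fixed point."""
    wx = np.ones(X.shape[1])
    wy = np.ones(Y.shape[1])
    for _ in range(max_iter):
        # Outer estimation: standardized latent-variable scores.
        sx = X @ wx
        sx /= sx.std()
        sy = Y @ wy
        sy /= sy.std()
        # Mode A weights: sample covariances with the other block's score.
        wx_new = X.T @ sy / len(sy)
        wy_new = Y.T @ sx / len(sx)
        wx_new /= np.linalg.norm(wx_new)
        wy_new /= np.linalg.norm(wy_new)
        converged = max(np.abs(wx_new - wx).max(),
                        np.abs(wy_new - wy).max()) < tol
        wx, wy = wx_new, wy_new
        if converged:
            break
    return wx, wy

wx, wy = pls_two_blocks(X, Y)
print(wx, wy)
```

Because the converged weights solve smooth fixed-point equations in the sample covariances, perturbing the sample perturbs the weights smoothly, which is the mechanism behind the asymptotic-normality claim above.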
We also propose a new “back-to-basics” research program: moving away from factor-analysis models and returning to the original aim of constructing indices that extract information from high-dimensional data in a predictive, useful way. For the generic case we would construct informative linear compounds whose constituent indicators have non-negative weights as well as non-negative loadings, satisfying the constraints implied by the path diagram. Cross-validation could settle the choice between competing specifications. In short: we argue for an upgrade of principal components and canonical variables analysis.
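One simple way to build such an index — an “upgraded” principal component with the sign constraints imposed — is power iteration on the sample covariance matrix with projection onto the non-negative orthant. This is only a sketch of the general idea under assumed illustrative data; the chapter itself does not prescribe this particular algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: 6 positively intercorrelated indicators driven by
# one common factor (loadings and noise level are assumptions).
n, p = 400, 6
common = rng.standard_normal(n)
X = np.outer(common, rng.uniform(0.5, 0.9, p)) + 0.6 * rng.standard_normal((n, p))

def nonneg_index_weights(X, max_iter=200, tol=1e-10):
    """Leading index with non-negative weights: power iteration on the
    sample covariance matrix, clipping negative entries to zero at each
    step and renormalizing, so the resulting linear compound respects
    the sign constraints on the weights."""
    S = np.cov(X, rowvar=False)
    w = np.ones(S.shape[0]) / np.sqrt(S.shape[0])
    for _ in range(max_iter):
        w_new = np.clip(S @ w, 0.0, None)  # project onto the orthant
        w_new /= np.linalg.norm(w_new)
        if np.abs(w_new - w).max() < tol:
            return w_new
        w = w_new
    return w

w = nonneg_index_weights(X)
print(np.round(w, 3))
```

Competing specifications (e.g. different constraint sets or different groupings of indicators) could then be compared by their cross-validated predictive performance, as the text suggests.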
Keywords: Latent Variable, Basic Design, Probability Limit, Sample Covariance Matrix, Path Diagram
- Cox, D. R., & Wermuth, N. (1998). Multivariate dependencies: Models, analysis and interpretation. Boca Raton: Chapman & Hall.
- Dijkstra, T. K. (1981). Latent variables in linear stochastic models. PhD thesis (2nd ed. 1985, Amsterdam: Sociometric Research Foundation).
- Dijkstra, T. K. (1982). Some comments on maximum likelihood and partial least squares methods. Research report, UCLA, Department of Psychology; a shortened version was published in 1983.
- Gantmacher, F. R. (1977). The theory of matrices, Vol. 1. New York: Chelsea.
- Geisser, S. (1993). Predictive inference: An introduction. New York: Chapman & Hall.
- Hastie, T., Tibshirani, R., & Friedman, J. (2002). The elements of statistical learning. New York: Springer.
- Jöreskog, K. G., & Wold, H. O. A. (Eds.). (1982). Systems under indirect observation, Part II. Amsterdam: North-Holland.
- Kaplan, A. (1964). The conduct of inquiry. New York: Chandler.
- Schrijver, A. (2004). A course in combinatorial optimization. Berlin: Springer.
- Stone, M., & Brooks, R. J. (1990). Continuum regression: Cross-validated sequentially constructed prediction embracing ordinary least squares, partial least squares and principal components regression. Journal of the Royal Statistical Society, Series B (Methodological), 52, 237–269.
- Wold, H. O. A. (1966). Nonlinear estimation by iterative least squares procedures. In F. N. David (Ed.), Research papers in statistics: Festschrift for J. Neyman (pp. 411–414). New York: Wiley.
- Wold, H. O. A. (1975). Path models with latent variables: The NIPALS approach. In H. M. Blalock, A. Aganbegian, F. M. Borodkin, R. Boudon, & V. Capecchi (Eds.), Quantitative sociology (pp. 307–359). New York: Academic Press.
- Wold, H. O. A. (1982). Soft modelling: The basic design and some extensions. In K. G. Jöreskog & H. O. A. Wold (Eds.), Systems under indirect observation, Part II (pp. 1–5). Amsterdam: North-Holland.