On Exact Statistical Properties of Multidimensional Indices Based on Principal Components, Factor Analysis, MIMIC and Structural Equation Models

Krishnakumar, Jaya; Nagar, A. L.

doi:10.1007/s11205-007-9181-8

Published: 04 September 2007

Volume 86, pages 481–496 (2008)

Social Indicators Research

Jaya Krishnakumar¹ &
A. L. Nagar²

Abstract

Recent empirical literature has seen many multidimensional indices emerge as well-being or poverty measures, in particular indices derived from principal components and various latent variable models. Though such indices are being increasingly and widely employed, few studies motivate their use or report the standard errors or confidence intervals associated with these estimators. This paper reviews the different underlying models, reaffirms their appropriateness in this context, examines the statistical properties of resulting indices, gives analytical expressions of their variances and establishes certain exact relationships among them.

References

Anderson, T. W. (1984). An introduction to multivariate statistical analysis. New York: John Wiley and Sons.
Balestrino, A., & Sciclone, N. (2000). Should we use functionings instead of income to measure well-being? Theory, and some evidence from Italy. Mimeo, University of Pisa.
Bartholomew, D. J., & Knott, M. (1999). Latent variable models and factor analysis. U.K.: Edward Arnold.
Biswas, B., & Caliendo, F. (2002). A multivariate analysis of the human development index. Indian Economic Journal, 49(4), 96–100.
Bollen, K. A. (1989). Structural equations with latent variables. New York: John Wiley & Sons.
Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83.
Browne, M. W., & Arminger, G. (1995). Specification and estimation of mean- and covariance-structure models. In G. Arminger, C. C. Clogg, & M. E. Sobel (Eds.), Handbook of statistical modelling for the social and behavioral sciences (pp. 311–359). Newbury Park: Plenum Press.
Di Tommaso, M. L. (2006). Measuring the well-being of children using a capability approach: An application to Indian data. Working Paper CHILD No. 05/2006, University of Turin.
Gourieroux, C., Monfort, A., & Trognon, A. (1984). Pseudo-maximum likelihood methods: Theory. Econometrica, 52, 681–700.
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417–441.
Jöreskog, K. (1973). A general method for estimating a linear structural equation system. In A. S. Goldberger & O. D. Duncan (Eds.), Structural equation models in the social sciences. New York: Seminar Press.
Jöreskog, K. (2002). Structural equation modelling with ordinal variables using LISREL. http://www.ssicentral.com/lisrel/ordinal.htm
Jöreskog, K., & Goldberger, A. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70(351), 631–639.
Klasen, S. (2000). Measuring poverty and deprivation in South Africa. Review of Income and Wealth, 46, 33–58.
Krishnakumar, J. (2007). Going beyond functionings to capabilities: An econometric model to explain and estimate capabilities. Journal of Human Development, 7, 39–63.
Krishnakumar, J., & Ballon, P. (2007). Estimating basic capabilities: A structural equation model applied to Bolivia. Working paper under review.
Kuklys, W. (2005). Amartya Sen’s capability approach: Theoretical insights and empirical applications. Berlin: Springer.
Lelli, S. (2001). Factor analysis vs. fuzzy sets theory: Assessing the influence of different techniques on Sen’s functioning approach. Center for Economic Studies, K.U. Leuven.
Maasoumi, E., & Nickelsburg, G. (1988). Multidimensional measures of well-being and an analysis of inequality in the Michigan data. Journal of Business and Economic Statistics, 6(3), 327–334.
McGillivray, M. (2005). Measuring non-economic well-being achievement. Review of Income and Wealth, 51(2), 337–364.
Morris, M. D. (1979). Measuring the condition of the world’s poor: The physical quality of life index. New York: Pergamon.
Muthen, B. (1984). A general structural equation model with dichotomous, ordered categorical and continuous latent indicators. Psychometrika, 49, 115–132.
Muthen, B. (2002). Beyond SEM: General latent variable modelling. Behaviormetrika, 29(1), 81–117.
Muthen, B. O. (1998–2004). Mplus technical appendices. Los Angeles, CA: Muthen & Muthen.
Nagar, A. L., & Basu, S. (2001). Weighting socio-economic indicators of human development (a latent variable approach). New Delhi: National Institute of Public Finance and Policy.
Noorbakhsh, F. (2003). Human development and regional disparities in India. Discussion Paper, Helsinki: UN-WIDER.
Rahman, T., Mittelhammer, R. C., & Wandschneider, P. (2003). Measuring the quality of life across countries: A sensitivity analysis of well-being indices. In WIDER International Conference on Inequality, Poverty and Human Well-being, Helsinki, Finland.
Ram, R. (1982). Composite indices of physical quality of life, basic needs fulfilment, and income: A principal component representation. Journal of Development Economics, 11, 227–247.
Schokkaert, E., & Van Ootegem, L. (1990). Sen’s concept of the living standard applied to the Belgian unemployed. Recherches Economiques de Louvain, 56, 429–450.
Sen, A. K. (1985). Commodities and capabilities. Amsterdam: North-Holland.
Sen, A. K. (1999). Development as freedom. Oxford: Oxford University Press.
Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal, and structural equation models. Boca Raton, U.S.A.: Chapman & Hall/CRC.
Slottje, D. J. (1991). Measuring the quality of life across countries. The Review of Economics and Statistics, 73(4), 684–693.
UNDP (1990). Human Development Report (HDR). U.K.: Oxford University Press.
Wagle, U. (2005). Multidimensional poverty measurement with economic well-being, capability and social inclusion: A case from Kathmandu, Nepal. Journal of Human Development, 6(3), 301–328.
White, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica, 50, 1–26.

Author information

Authors and Affiliations

Department of Econometrics, University of Geneva, 40, Bd. du Pont d’Arve, 1211, Geneva 4, Switzerland
Jaya Krishnakumar
National Institute of Public Finance and Policy, New Delhi, India
A. L. Nagar

Corresponding author

Correspondence to Jaya Krishnakumar.

Appendices

Appendix A

1.1 Minimum Variance Unbiased Estimation of Factor Scores in the FA Model

We are interested in estimators of latent factors $\hat{f}$ such that

$$ E(\hat{f}-f|f)= 0 $$

and

$$ V(\hat{f}-f) \,\, \hbox{is minimal}. $$

Let us denote the estimator as $\hat{f} = C y$. Then $E(\hat{f}-f|f) = E(C(\Uplambda f + \varepsilon)-f|f) = (C\Uplambda - I) f = 0$ for all $f$ implies the following condition:

$$ C\Uplambda = I $$

Thus we need to solve the following program:

Minimise $V(\hat{f}-f)= (C\Uplambda - I) (C\Uplambda - I)^{\prime} + C\Uppsi C^{\prime}$ under the constraint

$$ C\Uplambda = I. $$

The Lagrangian is:

$$ \begin{array}{l} \pounds = tr[(C\Uplambda-I) (C\Uplambda-I)^{\prime} + C\Uppsi C^{\prime}] - \rho^{\prime} \hbox{vec}(C\Uplambda-I)\\ = tr[(C\Uplambda-I) (C\Uplambda-I)^{\prime} + C\Uppsi C^{\prime}] - \rho^{\prime} (\Uplambda^{\prime}\otimes I) \hbox{vec}C + \rho^{\prime} \hbox{vec} I \end{array} $$

Substituting the constraint in the objective function we get

$$ \pounds = tr C\Uppsi C^{\prime} - \rho^{\prime}(\Uplambda^{\prime} \otimes I) \hbox{vec} C + \rho^{\prime}\hbox{vec} I $$

The first order conditions are given by:

$$ (\Uppsi^{\prime}\otimes I) \hbox{vec} C - (\Uplambda \otimes I) \rho = 0 $$

$$ (\Uplambda^{\prime}\otimes I) \hbox{vec} C = \hbox{vec} I $$

where the factor 2 from differentiating $tr\, C\Uppsi C^{\prime}$ has been absorbed into $\rho$. Solving the above system, one obtains:

$$ \begin{array}{l} \rho^{\ast} = \hbox{vec}[(\Uplambda^{\prime}\Uppsi^{-1} \Uplambda)^{-1}]\\ C^{\ast}= (\Uplambda^{\prime}\Uppsi^{-1}\Uplambda)^{-1} \Uplambda^{\prime}\Uppsi^{-1} \end{array} $$

In the special case $\Uppsi = I$, $C^{\ast} = (\Uplambda^{\prime}\Uplambda)^{-1}\Uplambda^{\prime}$ and $\tilde{f} = C^{\ast}x = \Uptheta^{-\frac{1}{2}}A^{\prime}x = \Uptheta^{-\frac{1}{2}} p.$
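The result of this appendix can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: the dimensions, the random loadings `Lam` and the diagonal `Psi` are illustrative assumptions. It builds $C^{\ast}$, confirms the unbiasedness condition $C\Uplambda = I$, and compares its variance $C\Uppsi C^{\prime}$ with that of another matrix satisfying the same constraint, namely $(\Uplambda^{\prime}\Uplambda)^{-1}\Uplambda^{\prime}$.

```python
import numpy as np

rng = np.random.default_rng(0)
p, m = 6, 2                                    # illustrative sizes (assumed)
Lam = rng.normal(size=(p, m))                  # loadings Lambda, p x m (assumed values)
Psi = np.diag(rng.uniform(0.5, 2.0, size=p))   # diagonal error covariance (assumed values)


def mvu_weights(Lam, Psi):
    """Return C* = (Lam' Psi^{-1} Lam)^{-1} Lam' Psi^{-1}."""
    psi_inv_lam = np.linalg.solve(Psi, Lam)    # Psi^{-1} Lam
    return np.linalg.solve(Lam.T @ psi_inv_lam, psi_inv_lam.T)


C_star = mvu_weights(Lam, Psi)
C_alt = np.linalg.solve(Lam.T @ Lam, Lam.T)    # another C with C Lam = I

# Under the constraint C Lam = I, V(f_hat - f) reduces to C Psi C'.
V_star = C_star @ Psi @ C_star.T
V_alt = C_alt @ Psi @ C_alt.T

unbiased = np.allclose(C_star @ Lam, np.eye(m))
gap_eigenvalues = np.linalg.eigvalsh(V_alt - V_star)   # non-negative if C* is optimal
```

If `gap_eigenvalues` are all non-negative (up to rounding), the alternative weights have a variance at least as large as that of $C^{\ast}$ in the matrix sense, which is what the minimisation above asserts.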

Appendix B

2.1 “Unbiased” Principal Components

If we require the first m principal components to be also unbiased estimators of the latent factors that they are supposed to represent then we should find B such that

$$ E(B A^{*^{\prime}}y-f|f)=0 \quad \hbox{i.e.} \,\, E((BA^{*^{\prime}}\Uplambda-I)f|f)=0 \quad \forall f. $$

This implies

$$ BA^{*^{\prime}}\Uplambda-I = 0 $$

or

$$ B A^{*^{\prime}}A^{\ast}\Uptheta^{\ast\frac{1}{2}}=I $$

or

$$ B\Uptheta^{\ast\frac{1}{2}}=I $$

or

$$ B= \Uptheta^{\ast-\frac{1}{2}} $$

Thus the ‘unbiased’ principal component estimator is given by

$$ p^{\ast\ast}=\Uptheta^{\ast-\frac{1}{2}}A^{\prime}x = \Uptheta^{\ast-\frac{1}{2}} p = \tilde{f}^{\ast}. $$
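As an illustration, the rescaling can be sketched in NumPy. This is a hedged example under stated assumptions: the covariance matrix `S` is computed from simulated data, and the loading matrix is taken to be $\Uplambda = A^{\ast}\Uptheta^{\ast\frac{1}{2}}$, which is the substitution used between the displayed equations above. The variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
p, m = 5, 2                                  # illustrative sizes (assumed)
X = rng.normal(size=(200, p)) @ rng.normal(size=(p, p))   # simulated data (assumed)
S = np.cov(X, rowvar=False)                  # sample covariance matrix

eigval, eigvec = np.linalg.eigh(S)           # eigenvalues in ascending order
order = np.argsort(eigval)[::-1][:m]         # indices of the m largest eigenvalues
theta = eigval[order]                        # diagonal of Theta*
A_star = eigvec[:, order]                    # p x m matrix with orthonormal columns

Lam = A_star * np.sqrt(theta)                # Lambda = A* Theta*^{1/2} (assumed PC loadings)
B = np.diag(theta ** -0.5)                   # B = Theta*^{-1/2}

y = X[0] - X.mean(axis=0)                    # one centred observation
pc = A_star.T @ y                            # first m principal components
pc_unbiased = B @ pc                         # rescaled components p**

condition_holds = np.allclose(B @ A_star.T @ Lam, np.eye(m))
```

The check `condition_holds` verifies $BA^{\ast\prime}\Uplambda = I$; the rescaled components are simply the ordinary principal components divided by the square roots of the corresponding eigenvalues.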

Appendix C

3.1 Expression of MIMIC Estimator

Following Jöreskog and Goldberger (1975), the conditional expectation of $f$ given $y$ and $x$ is given by:

$$ \hat{f} = Bx + \Uplambda^{\prime}\Omega^{-1} ( y - \Uplambda B x) $$

where

$$ \Omega = \Uplambda \Uplambda^{\prime}+ \Uppsi $$

Using

$$(\Uplambda \Uplambda^{\prime}+ \Uppsi)^{-1} = \Uppsi^{-1} - \Uppsi^{-1} \Uplambda (I+\Uplambda^{\prime}\Uppsi^{-1} \Uplambda) ^{-1} \Uplambda^{\prime}\Uppsi^{-1} $$

we obtain

$$\hat{f} = [I - \Uplambda^{\prime}\Uppsi^{-1} \Uplambda + \Uplambda^{\prime}\Uppsi^{-1} \Uplambda (I+ \Uplambda^{\prime}\Uppsi^{-1} \Uplambda)^{-1} \Uplambda^{\prime}\Uppsi^{-1} \Uplambda] [Bx+\Uplambda^{\prime}\Uppsi^{-1}y] $$

which can be simplified to

$$ \hat{f} = (I+ \Uplambda^{\prime}\Uppsi^{-1} \Uplambda)^{-1} (Bx + \Uplambda^{\prime}\Uppsi^{-1}y) $$
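The equivalence between the conditional-expectation form and the simplified form can be verified numerically. The sketch below is only a check under assumed, randomly generated values for $\Uplambda$, $\Uppsi$, $B$, $x$ and $y$; it evaluates both expressions and compares them.

```python
import numpy as np

rng = np.random.default_rng(2)
p, m, k = 6, 2, 3                              # indicators, factors, causes (assumed sizes)
Lam = rng.normal(size=(p, m))                  # loadings (assumed values)
Psi = np.diag(rng.uniform(0.5, 2.0, size=p))   # measurement error covariance (assumed)
B = rng.normal(size=(m, k))                    # coefficients of the causes (assumed)
x = rng.normal(size=k)
y = rng.normal(size=p)

# Conditional-expectation form: Bx + Lam' Omega^{-1} (y - Lam B x)
Omega = Lam @ Lam.T + Psi
f_cond = B @ x + Lam.T @ np.linalg.solve(Omega, y - Lam @ B @ x)

# Simplified form: (I + Lam' Psi^{-1} Lam)^{-1} (Bx + Lam' Psi^{-1} y)
G = Lam.T @ np.linalg.solve(Psi, Lam)
f_simple = np.linalg.solve(np.eye(m) + G, B @ x + Lam.T @ np.linalg.solve(Psi, y))

forms_agree = np.allclose(f_cond, f_simple)
```

The simplified form only requires inverting an $m \times m$ matrix and the diagonal $\Uppsi$, rather than the $p \times p$ matrix $\Omega$, which is convenient when there are many indicators and few factors.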

Appendix D

4.1 Latent Factor Estimators and Their Variances in the Linear SEM

As explained in the text, the latent factors are estimated as the expectation of the posterior distribution of these factors given the sample, i.e. given $y$ and $x$. For a pure measurement model (with exogenous variables $w$) written as

$$ \begin{aligned} y &= Dw+\Uplambda \eta+ \varepsilon\\ x &= \eta_x \end{aligned} $$

(14)

the latent factor (Empirical Bayes) estimator is derived in Skrondal and Rabe-Hesketh (2004) as follows:

$$ \hat{\eta} = V(\eta) \Uplambda^{\prime}\left(\Uplambda V(\eta) \Uplambda^{\prime}+\Uppsi \right)^{-1} (y-Dw) $$

Here we take the above formula and adapt it to our case in which we have a SEM for explaining the latent factors. Our model is reproduced below for reference:

$$ \begin{array}{l} Ay^{\ast}+Bx^{\ast}+u=0\\ y = \Uplambda y^{\ast} + \varepsilon \end{array} $$

(15)

with

$$ V(u) = \Upsigma $$

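To make use of the above result we substitute the reduced form of our SEM (with the signs of $B$ and $u$ redefined so that the structural equations read $Ay^{\ast} = Bx^{\ast} + u$, and with $x^{\ast} = x$ since the exogenous variables are directly observed) given by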

$$ y^{\ast}= A^{-1} B x + A^{-1} u $$

into the measurement equation (15) to get

$$ y = \Uplambda A^{-1} B x + \Uplambda A^{-1} u + \varepsilon $$

(16)

Identifying (16) with (14) and η with u one can obtain the ‘estimator’ of u as

$$ \hat{u} = \Upsigma A^{-1^{\prime}} \Uplambda^{\prime}(\Uplambda A^{-1} \Upsigma A^{-1^{\prime}} \Uplambda^{\prime}+\Uppsi)^{-1} (y-\Uplambda A^{-1} B x) $$

The factor estimators are then obtained by substituting $\hat{u}$ for $u$ in the reduced form of the SEM (15):

$$ \hat{y}^{\ast}= A^{-1} B x + A^{-1} \Upsigma A^{-1^{\prime}} \Uplambda^{\prime}(\Uplambda A^{-1}\Upsigma A^{-1^{\prime}} \Uplambda^{\prime}+\Uppsi)^{-1} (y-\Uplambda A^{-1} B x) $$

(17)

which is the equation given in the text.

Finally, the variance of $\hat{y}^*$ is derived by noting that

$$ y-\Uplambda A^{-1} B x= \Uplambda A^{-1} u + \varepsilon $$

and

$$ V(\Uplambda A^{-1} u + \varepsilon) = \Uplambda A^{-1}\Upsigma A^{-1^{\prime}} \Uplambda^{\prime}+\Uppsi $$

and using the above to calculate $V(\hat{y}^{\ast})$ according to (17).
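A minimal sketch of this calculation follows, with $x$ treated as fixed. The closed form used here, $V(\hat{y}^{\ast}) = P\Uplambda^{\prime}\Omega_y^{-1}\Uplambda P$ with $P = A^{-1}\Upsigma A^{-1^{\prime}}$ and $\Omega_y = \Uplambda P \Uplambda^{\prime} + \Uppsi$, is our own reading of the step described above and is not quoted from the text; all parameter values and names (`P`, `K`, `factor_estimate`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
p, m, k = 6, 2, 3                              # assumed sizes
A = np.array([[1.0, -0.4], [0.2, 1.0]])        # structural matrix, invertible (assumed)
B = rng.normal(size=(m, k))                    # coefficients of x (assumed)
Lam = rng.normal(size=(p, m))                  # loadings (assumed)
Psi = np.diag(rng.uniform(0.5, 2.0, size=p))   # V(eps) (assumed)
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])     # V(u), positive definite (assumed)

A_inv = np.linalg.inv(A)
P = A_inv @ Sigma @ A_inv.T                    # V(A^{-1} u)
Omega_y = Lam @ P @ Lam.T + Psi                # V(Lam A^{-1} u + eps)
K = P @ Lam.T @ np.linalg.inv(Omega_y)         # weight applied to y - Lam A^{-1} B x


def factor_estimate(y, x):
    """Empirical Bayes estimate of y* for one observation (y, x)."""
    mean = A_inv @ B @ x
    return mean + K @ (y - Lam @ mean)


# With x fixed, only the residual y - Lam A^{-1} B x is random, so
# V(y*_hat) = K Omega_y K' = P Lam' Omega_y^{-1} Lam P.
V_hat = K @ Omega_y @ K.T
V_closed = P @ Lam.T @ np.linalg.solve(Omega_y, Lam @ P)

# P - V_hat is the variance of the estimation error y* - y*_hat.
error_eigenvalues = np.linalg.eigvalsh(P - V_hat)
```

For example, `factor_estimate(np.ones(p), np.ones(k))` returns a vector of length `m`, and `error_eigenvalues` should be positive, reflecting that the estimator has a smaller variance than the latent factor itself.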

Alternatively, Muthen (1998–2004) gives another expression of the latent factor estimator based on maximisation of the posterior likelihood. The model is written as

$$ v = \nu_v +\Uplambda_v \eta_v + \varepsilon_v $$

$$ A_v \eta_v = \alpha_v + u_v $$

where

$$ v = \left[\begin{array}{l}y \\ x\end{array}\right]; \quad \nu_v = \left[ \begin{array}{l} \nu_y\\ 0\end{array}\right]; \quad \Uplambda_v= \left[\begin{array}{ll} \Uplambda & 0\\ 0 & I\end{array}\right];\quad \eta_v = \left[\begin{array}{l}\eta\\ \eta_x\end{array}\right]; \quad \varepsilon_v = \left[\begin{array}{l} \varepsilon\\ 0\end{array}\right] $$

$$ A_v = \left[\begin{array}{ll} A & -B \\0 & I\end{array}\right]; \quad \alpha_v = \left[\begin{array}{l}\alpha\\0 \end{array}\right]; \quad u_v = \left[\begin{array}{l} u\\u_x\end{array}\right] $$

with

$$ E(\varepsilon) = 0 \quad \quad E(u) = 0 $$

and

$$ V(\varepsilon) = \Uppsi \quad \quad V(u) = \Upsigma $$

Thus the model is in fact

$$ \begin{aligned} y &= \nu_y +\Uplambda \eta + \varepsilon\\ A \eta &= \alpha + B x + u\\ \end{aligned} $$

$$ x = \eta_x $$

The factor score estimator is then:

$$ \hat{\eta_v} = \mu_v + C (v -\nu_v - \Uplambda_v \mu_v) $$

(18)

where

$$ \mu_v = A_v^{-1} \alpha_v $$

$$ C = A_v^{-1} \Upsigma_v A_v^{-1^{\prime}} \Uplambda_v^{\prime} (\Uplambda_v A_v^{-1} \Upsigma_v A_v^{-1^{\prime}} \Uplambda_v^{\prime}+ \Uppsi_v)^{-1} $$

and

$$ \Upsigma_v = \left[\begin{array}{ll}\Upsigma & 0\\ 0 & \Upsigma_{xx}\end{array}\right]; \quad \Uppsi_v = \left[\begin{array}{ll} \Uppsi & 0\\0 & 0\end{array}\right]. $$

Replacing the above partitioned matrices and vectors in (18) and performing all the calculations, one gets:

$$ \hat{\eta} = A^{-1} \alpha + A^{-1} B x + A^{-1} \Upsigma A^{-1^{\prime}} \Uplambda^{\prime} (\Uplambda A^{-1} \Upsigma A^{-1^{\prime}} \Uplambda^{\prime}+ \Uppsi)^{-1} (y- \nu_y - \Uplambda A^{-1} \alpha - \Uplambda A^{-1} B x) $$

and

$$ \hat{\eta}_x = x $$

The last result is expected as we assume that the x’s are directly observed.

Assuming $y$ is centered and regrouping the intercept term $A^{-1}\alpha$ and the ‘exogenous’ elements term $A^{-1}Bx$ into one term, denoted by the same symbol $A^{-1}Bx$ (i.e. assuming $x$ incorporates a constant), one gets

$$\hat{\eta} = A^{-1} B x + A^{-1} \Upsigma A^{-1^{\prime}} \Uplambda^{\prime} (\Uplambda A^{-1} \Upsigma A^{-1^{\prime}} \Uplambda^{\prime}+ \Uppsi)^{-1} (y - \Uplambda A^{-1} B x) $$

Thus we see that it is the same expression as the Empirical Bayes estimator (17) (under our above assumptions) and hence has the same variance.
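The reduction from the stacked formula (18) to the partitioned expression can be checked numerically. The sketch below is illustrative only: all parameter values are assumptions, the lower-right block of $\Upsigma_v$ is taken to be a positive definite $\Upsigma_{xx}$, and $\mu_v$ is computed with the stacked matrix $A_v$ so that the dimensions conform.

```python
import numpy as np

rng = np.random.default_rng(4)
p, m, k = 5, 2, 2                                # assumed sizes
A = np.array([[1.0, -0.4], [0.2, 1.0]])          # invertible structural matrix (assumed)
B = rng.normal(size=(m, k))
Lam = rng.normal(size=(p, m))
alpha = rng.normal(size=m)
nu_y = rng.normal(size=p)                        # intercept of the measurement equation
Psi = np.diag(rng.uniform(0.5, 2.0, size=p))
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])       # V(u) (assumed)
Sigma_xx = np.array([[1.5, -0.2], [-0.2, 0.8]])  # variance block for x (assumed)
y = rng.normal(size=p)
x = rng.normal(size=k)

Z = np.zeros
A_v = np.block([[A, -B], [Z((k, m)), np.eye(k)]])
Lam_v = np.block([[Lam, Z((p, k))], [Z((k, m)), np.eye(k)]])
Sigma_v = np.block([[Sigma, Z((m, k))], [Z((k, m)), Sigma_xx]])
Psi_v = np.block([[Psi, Z((p, k))], [Z((k, p)), Z((k, k))]])
alpha_v = np.concatenate([alpha, np.zeros(k)])
nu_v = np.concatenate([nu_y, np.zeros(k)])
v = np.concatenate([y, x])

# Stacked estimator: eta_v_hat = mu_v + C (v - nu_v - Lam_v mu_v)
A_v_inv = np.linalg.inv(A_v)
mu_v = A_v_inv @ alpha_v
V_eta = A_v_inv @ Sigma_v @ A_v_inv.T
C = V_eta @ Lam_v.T @ np.linalg.inv(Lam_v @ V_eta @ Lam_v.T + Psi_v)
eta_v_hat = mu_v + C @ (v - nu_v - Lam_v @ mu_v)

# Partitioned closed form for the eta block
A_inv = np.linalg.inv(A)
P = A_inv @ Sigma @ A_inv.T
mean = A_inv @ (alpha + B @ x)
resid = y - nu_y - Lam @ mean
eta_hat = mean + P @ Lam.T @ np.linalg.solve(Lam @ P @ Lam.T + Psi, resid)
```

The first `m` entries of `eta_v_hat` should coincide with `eta_hat`, and the last `k` entries should reproduce `x`, in line with $\hat{\eta}_x = x$.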

Appendix E

5.1 Monotonic Transformation and Posterior Distribution

The ordinality of latent factors implies that any monotonic transformation of $y^{\ast}$ will preserve the order in $\hat{y}^{\ast}$. We will show this in the case of a scalar latent factor $y^{\ast}$ with a vector indicator $y$. The proof can be extended to the vector case without any major difficulty.

The posterior distribution of the latent factor $y^{\ast}$ given the indicator $y$ is given by

$$ p(y^{\ast}|y) = \frac{p(y^{\ast}) \pi(y|y^{\ast})}{f(y)} $$

where $p(y^{\ast}|y)$ denotes the posterior density of $y^{\ast}$ given $y$, $p(y^{\ast})$ is the prior density of $y^{\ast}$, $\pi(y|y^{\ast})$ is the distribution of $y$ given $y^{\ast}$ and $f(y)$ denotes the density of $y$.

Let us now transform $y^{\ast}$: $u^{\ast} = g(y^{\ast})$.

Then, using

$$ y^{\ast}= g^{-1}(u^{\ast}), \quad p(u^{\ast}) = p(y^{\ast}) \left(\frac{d g}{d y^{\ast}}\right)^{-1} $$

and

$$ \pi(y|y^{\ast}) = \pi(y|g^{-1}(u^{\ast})) $$

one can write

$$ p(y^{\ast}|y) = \frac{p(u^{\ast}) \left(\frac{dg}{dy^{\ast}}\right) \pi(y|g^{-1}(u^{\ast}))}{f(y)} $$

or

$$ = \left(\frac{dg}{dy^{\ast}}\right) \frac{p(g^{-1}(u^{\ast})) \left(\frac{dg}{dy^{\ast}}\right)^{-1} \pi(y|g^{-1}(u^{\ast}))}{f(y)} $$

The first element of the product is positive if $g(y^{\ast})$ is monotonic increasing, and the second part is the posterior density $p(u^{\ast}|y)$, since $p(g^{-1}(u^{\ast})) \left(\frac{dg}{dy^{\ast}}\right)^{-1} = p(u^{\ast})$.

Hence

$$ p(u^{\ast}|y) = \left(\frac{dg}{dy^{\ast}}\right)^{-1} p(y^{\ast}|y) $$

Therefore if

$$ E(y^{\ast}|y_1) > E(y^{\ast}|y_2) $$

then we have

$$ \int y^{\ast} p(y^{\ast}|y_1) dy^{\ast} > \int y^{\ast} p(y^{\ast}|y_2) dy^{\ast} $$

$$ \int g(y^{\ast}) p(y^{\ast}|y_1) dy^{\ast} > \int g(y^{\ast}) p(y^{\ast}|y_2) dy^{\ast} $$

$$ \int g(y^{\ast}) \left(\frac{dg}{dy^{\ast}}\right) p(u^{\ast}|y_1) dy^{\ast} > \int g(y^{\ast}) \left(\frac{dg}{dy^{\ast}}\right) p(u^{\ast}|y_2) dy^{\ast} $$

$$ \int u^{\ast} \left(\frac{dg}{dy^{\ast}}\right) p(u^{\ast}|y_1) \left(\frac{dg}{dy^{\ast}}\right)^{-1} du^{\ast} > \int u^{\ast} \left(\frac{dg}{dy^{\ast}}\right) p(u^{\ast}|y_2) \left(\frac{dg}{dy^{\ast}}\right)^{-1} du^{\ast} $$

$$ \int u^{\ast} p(u^{\ast}|y_1) du^{\ast} > \int u^{\ast} p(u^{\ast}|y_2) du^{\ast} $$

and finally

$$ E(u^{\ast}|y_1) > E(u^{\ast}|y_2). $$
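The order-preservation property can be illustrated numerically in a simple special case. The sketch below assumes a linear Gaussian one-factor model, $y = \lambda y^{\ast} + \varepsilon$ with $y^{\ast} \sim N(0,1)$ and $\varepsilon \sim N(0, \Uppsi)$, $\Uppsi$ diagonal, and takes $g = \exp$ as the increasing transformation; these choices, the numerical values and the function names are assumptions made for illustration and do not come from the text. In this case the posterior of $y^{\ast}$ is normal with a variance that does not depend on $y$, and the posterior expectation of $g(y^{\ast})$ is computed by Gauss-Hermite quadrature.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

lam = np.array([0.9, 0.7, 0.5])      # loadings of the scalar factor (assumed values)
psi = np.array([0.5, 1.0, 1.5])      # diagonal of Psi (assumed values)

# Posterior of y* given y is N(m(y), s2), with s2 free of y.
s2 = 1.0 / (1.0 + np.sum(lam ** 2 / psi))


def posterior_mean(y):
    """E(y* | y) in the assumed Gaussian one-factor model."""
    return s2 * np.sum(lam * y / psi)


nodes, weights = hermgauss(40)       # quadrature rule for the weight exp(-t^2)


def posterior_expectation(g, y):
    """E(g(y*) | y), approximated by Gauss-Hermite quadrature."""
    points = posterior_mean(y) + np.sqrt(2.0 * s2) * nodes
    return np.sum(weights * g(points)) / np.sqrt(np.pi)


g = np.exp                           # a monotonic increasing transformation
y1 = np.array([1.0, 0.5, 0.2])       # two hypothetical indicator vectors
y2 = np.array([-0.3, 0.4, 0.1])

same_order = (posterior_mean(y1) > posterior_mean(y2)) == (
    posterior_expectation(g, y1) > posterior_expectation(g, y2)
)
```

Here `same_order` compares the ranking of the two observations before and after the transformation. This is an illustration in one convenient model, not a substitute for the argument above.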

About this article

Cite this article

Krishnakumar, J., Nagar, A.L. On Exact Statistical Properties of Multidimensional Indices Based on Principal Components, Factor Analysis, MIMIC and Structural Equation Models. Soc Indic Res 86, 481–496 (2008). https://doi.org/10.1007/s11205-007-9181-8

Received: 24 January 2007
Accepted: 13 August 2007
Published: 04 September 2007
Issue Date: May 2008
DOI: https://doi.org/10.1007/s11205-007-9181-8
