Frequentist Model Averaging in Structural Equation Modelling

Abstract

Model selection from a set of candidate models plays an important role in many structural equation modelling applications. However, traditional model selection methods introduce extra randomness that is not accounted for by post-model-selection inference. In the current study, we propose a model averaging technique within the frequentist statistical framework. Instead of selecting an optimal model, the contributions of all candidate models are acknowledged. Valid confidence intervals and a \(\chi ^2\) test statistic are proposed. A simulation study shows that, compared to model selection, the proposed method produces a robust mean-squared error, a better coverage probability, and a better goodness-of-fit test. The method is an interesting compromise between model selection and the full model.

References

  1. Akaike, H. (1973). Information theory as an extension of the maximum likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second international symposium on information theory. Budapest: Akademiai Kiado.

  2. Ankargren, S., & Jin, S. (2018). On the least squares model averaging interval estimator. Communications in Statistics: Theory and Methods, 47, 118–132.

  3. Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246.

  4. Berk, R., Brown, L., Buja, A., Zhang, K., & Zhao, L. (2013). Valid post-selection inference. Annals of Statistics, 41, 802–837.

  5. Browne, M. W. (1974). Generalized least squares estimators in the analysis of covariance structures. South African Statistical Journal, 8, 1–24. Reprinted in 1977 in D. J. Aigner & A. S. Goldberger (Eds.). Latent variables in socioeconomic models (pp. 205–226). Amsterdam: North Holland.

  6. Browne, M. W. (1984). Asymptotically distribution-free methods in the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83.

  7. Browne, M. W. (1987). Robustness of statistical inference in factor analysis and related models. Biometrika, 74, 375–384.

  8. Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–161). Newbury Park: Sage.

  9. Buckland, S. T., Burnham, K. P., & Augustin, N. H. (1997). Model selection: An integral part of inference. Biometrics, 53, 603–618.

  10. Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference understanding AIC and BIC in model selection. Sociological Methods & Research, 33, 261–304.

  11. Charkhi, A., Claeskens, G., & Hansen, B. E. (2016). Minimum mean square error model averaging in likelihood models. Statistica Sinica, 26, 809–840.

  12. Fletcher, D., & Dillingham, P. W. (2011). Model-averaged confidence intervals for factorial experiments. Computational Statistics & Data Analysis, 55, 3041–3048.

  13. Fletcher, D., & Turek, D. (2011). Model-averaged profile likelihood intervals. Journal of Agricultural, Biological and Environmental Statistics, 17, 38–51.

  14. Hansen, B. E. (2007). Least squares model averaging. Econometrica, 75, 1175–1189.

  15. Hansen, B. E. (2014). Model averaging, asymptotic risk, and regression groups. Quantitative Economics, 5, 495–530.

  16. Hansen, B. E., & Racine, J. S. (2012). Jackknife model averaging. Journal of Econometrics, 167, 38–46.

  17. Hjort, N. L., & Claeskens, G. (2003a). Frequentist model average estimators. Journal of the American Statistical Association, 98, 879–899.

  18. Hjort, N. L., & Claeskens, G. (2003b). Rejoinder. Journal of the American Statistical Association, 98, 938–945.

  19. Hoeting, J. A., Madigan, D., Raftery, A. E., & Volinsky, C. T. (1999). Bayesian model averaging: A tutorial (with discussion). Statistical Science, 14, 382–417.

  20. Ishwaran, H., & Rao, J. S. (2003). Discussion. Journal of the American Statistical Association, 98, 922–925.

  21. Kabaila, P. (1995). The effect of model selection on confidence regions and prediction regions. Econometric Theory, 11, 537–549.

  22. Kabaila, P., & Leeb, H. (2006). On the large-sample minimal coverage probability of confidence intervals after model selection. Journal of the American Statistical Association, 101, 619–629.

  23. Kabaila, P., Welsh, A. H., & Abeysekera, W. (2016). Model-averaged confidence intervals. Scandinavian Journal of Statistics, 43, 35–48.

  24. Karatzoglou, A., Smola, A., Hornik, K., & Zeileis, A. (2004). kernlab—An S4 package for kernel methods in R. Journal of Statistical Software, 11, 1–20.

  25. Kember, D., & Leung, D. Y. P. (2009). Development of a questionnaire for assessing students’ perceptions of the teaching and learning environment and its use in quality assurance. Learning Environments Research, 12, 15–29.

  26. Kember, D., & Leung, D. Y. P. (2011). Disciplinary differences in student rating of teaching quality. Research in Higher Education, 52, 278–299.

  27. Kim, J., & Pollard, D. (1990). Cube root asymptotics. The Annals of Statistics, 18, 191–219.

  28. Knight, K., & Fu, W. (2000). Asymptotics for lasso-type estimators. The Annals of Statistics, 28, 1356–1378.

  29. Lee, W. W. S., Leung, D. Y. P., & Lo, K. C. H. (2013). Development of generic capabilities in teaching and learning environment at associate degree level. In M. S. Khine (Ed.), Application of structural equation modeling in educational research and practice (pp. 169–184). Rotterdam: Sense Publishers.

  30. Leeb, H., & Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory, 21, 21–59.

  31. Liu, C.-A. (2015). Distribution theory of the least squares averaging estimator. Journal of Econometrics, 186, 142–159.

  32. Liu, C.-A., & Kuo, B.-S. (2016). Model averaging in predictive regressions. The Econometrics Journal, 19, 203–231.

  33. Liu, Q., & Okui, R. (2013). Heteroscedasticity-robust \(C_p\) model averaging. The Econometrics Journal, 16, 463–472.

  34. MacCallum, R. C. (1986). Specification searches in covariance structure modeling. Psychological Bulletin, 100, 107–120.

  35. MacCallum, R. C., Roznowski, M., & Necowitz, L. B. (1992). Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychological Bulletin, 111, 490–504.

  36. Madigan, D., & Raftery, A. E. (1994). Model selection and accounting for model uncertainty in graphical models using Occam’s window. Journal of the American Statistical Association, 89, 1535–1546.

  37. Magnus, J. R., & Neudecker, H. (1986). Symmetry, 0–1 matrices and Jacobians: A review. Econometric Theory, 2, 157–190.

  38. Muthén, B. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, 43, 551–560.

  39. Muthén, B., du Toit, S. H. C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Retrieved from https://www.statmodel.com/download/Article_075.pdf. Accessed 12 Sept 2013.

  40. Pötscher, B. M. (1991). Effects of model selection on inference. Econometric Theory, 7, 163–185.

  41. Raftery, A. E., & Zheng, Y. (2003). Discussion: Performance of Bayesian model averaging. Journal of the American Statistical Association, 98, 931–938.

  42. Saris, W. E., Satorra, A., & Sörbom, D. (1987). The detection and correction of specification errors in structural equation models. Sociological Methodology, 17, 105–129.

  43. Schomaker, M., & Heumann, C. (2011). Model averaging in factor analysis: An analysis of Olympic decathlon data. Journal of Quantitative Analysis in Sports, 7, Article 4.

  44. Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.

  45. Sörbom, D. (1989). Model modification. Psychometrika, 54, 371–384.

  46. Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1–10.

  47. Turek, D., & Fletcher, D. (2012). Model-averaged Wald confidence intervals. Computational Statistics & Data Analysis, 56, 2809–2815.

  48. Turlach, B. A., & Weingessel, A. (2013). quadprog: Functions to solve quadratic programming problems. R package version 1.5-5.

  49. Vanderbei, R. J. (1999). LOQO: An interior point code for quadratic programming. Optimization Methods and Software, 11, 451–484.

  50. Wang, H., & Zhou, S. Z. F. (2013). Interval estimation by frequentist model averaging. Communications in Statistics: Theory and Methods, 42, 4342–4356.

Acknowledgements

We would like to thank the reviewers for providing valuable comments. Shaobo Jin was partly supported by Vetenskapsrådet (Swedish Research Council) under the contract 2017-01175.

Author information

Corresponding author

Correspondence to Shaobo Jin.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (R 37 KB)

Appendix

Proof of Theorem 1

From the distribution (6), \(\varvec{V}\), \(\varvec{W}\), \(\varvec{K}\), and \(\varvec{K}_{s}\) can be consistently estimated from the full model. \(\varvec{\delta }\varvec{\delta }^{T}\) can be estimated from the full model by \(\hat{\varvec{\delta }}_\mathrm{full}\hat{\varvec{\delta }}_\mathrm{full}^T - 4\hat{\varvec{K}}\) or \(\hat{\varvec{\delta }}_\mathrm{full}\hat{\varvec{\delta }}_\mathrm{full}^T\), where

$$\begin{aligned} \hat{\varvec{\delta }}_\mathrm{full} = \sqrt{n} \left( \hat{\varvec{\gamma }}_\mathrm{full} - \varvec{\gamma }_0 \right) \overset{d}{\rightarrow } \varvec{D}, \quad \text {where } \varvec{D}-\varvec{\delta } = 2\varvec{K}\varvec{N}-2\varvec{K}\varvec{J}_{10}\varvec{J}_{00}^{-1}\varvec{M} \sim N\left( \varvec{0},4\varvec{K}\right) \end{aligned}$$

(Hjort & Claeskens, 2003a). Thus, \(Q\left( \{c_{s}\}\right) \overset{d}{\rightarrow } Q^*\left( \{c_{s}\}\right) \) for some \(Q^*\left( \{c_{s}\}\right) \). The quadratic program that yields the model weights is a strictly convex minimization problem whenever the matrix of its quadratic form is positive definite, in which case \(Q^{*}\left( \{c_{s}\}\right) \) has a unique minimizer. Thus, \(\left\{ {\hat{c}}_{s}\right\} \) converges in distribution to \(\left\{ c_{s}^{*}\right\} \), the minimizer of \(Q^{*}\left( \{c_{s}\}\right) \) (Kim & Pollard, 1990). The distribution of \(Q^{*}\left( \{c_{s}\}\right) \) depends on \(\varvec{M}\) and \(\varvec{N}\), and the distribution of \(\sqrt{n}\left( \hat{\varvec{\mu }}-\varvec{\mu }_\mathrm{true}\right) \) also depends on \(\varvec{M}\) and \(\varvec{N}\), as shown in Eq. (9). Hence, \(\left\{ {\hat{c}}_{s}\right\} \) and \(\sqrt{n}\left( \hat{\varvec{\mu }} \left( \{{\hat{c}}_{s}\} \right) -\varvec{\mu }_\mathrm{true}\right) \) converge jointly in distribution. Therefore,

$$\begin{aligned} \sqrt{n}\left( \hat{\varvec{\mu }} \left( \{{\hat{c}}_{s}\} \right) -\varvec{\mu }_\mathrm{true}\right) \overset{d}{\rightarrow } 2\frac{\partial \varvec{\mu }}{\partial \varvec{\theta }^{T}}\varvec{J}_{00}^{-1}\varvec{M}+\varvec{W}\left[ \varvec{\delta }- \left( \sum _{s}c_{s}^*\varvec{\pi }_{s}^{T}\varvec{K}_{s}\varvec{\pi }_{s} \right) \varvec{K}^{-1}\varvec{D} \right] . \end{aligned}$$

\(\square \)
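
One way to carry out the weight-selection step numerically is with an off-the-shelf quadratic programming routine such as solve.QP from the quadprog package (Turlach & Weingessel, 2013), which is cited above. The R sketch below is generic rather than a reproduction of the paper's code: the matrix H and vector h are hypothetical placeholders for the quadratic and linear parts of \(Q\left( \{c_{s}\}\right) \), whose exact form is given in the main text.

```r
# Minimal sketch: choosing model-averaging weights by quadratic programming.
# H (quadratic part) and h (linear part) are hypothetical placeholders; in
# practice they would be built from the candidate-model estimates.
library(quadprog)

set.seed(1)
n_models <- 4
B <- matrix(rnorm(n_models^2), n_models)
H <- crossprod(B) + 0.1 * diag(n_models)         # positive definite => strictly convex
h <- rnorm(n_models)

# solve.QP minimizes (1/2) c' Dmat c - dvec' c subject to t(Amat) c >= bvec,
# with the first meq constraints treated as equalities.
Dmat <- 2 * H                                    # so that (1/2) c' Dmat c = c' H c
dvec <- -h
Amat <- cbind(rep(1, n_models), diag(n_models))  # sum(c) = 1 and c >= 0
bvec <- c(1, rep(0, n_models))
fit  <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)

round(fit$solution, 4)                           # nonnegative weights summing to one
```

Because solve.QP requires a positive definite Dmat, it corresponds to the strictly convex case discussed above, in which the minimizer is unique; weights that land on the boundary (some \({\hat{c}}_{s}=0\)) are handled by the inequality constraints.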

Proof of Eq. (15)

Define the duplication matrix \(\varvec{P}\) such that \(\mathrm{vec}\left( \varvec{S}\right) =\varvec{P}\text {vech}\left( \varvec{S}\right) \), where \(\text {vech}\left( \cdot \right) \) stacks the elements on and below the diagonal of the enclosed symmetric matrix. It is known from Browne (1974, 1987) that

$$\begin{aligned} \sqrt{n}\text {vech}\left( \varvec{S}-\varvec{\Sigma }_\mathrm{true}\right)&\overset{d}{\rightarrow }N\left( \varvec{0},\varvec{\Omega }\right) , \end{aligned}$$

where \(\varvec{P}^{+}=\left( \varvec{P}^{T}\varvec{P}\right) ^{-1}\varvec{P}^{T}\) and \(\varvec{\Omega }=2\varvec{P}^{+}\left( \varvec{\Sigma }_{0}\otimes \varvec{\Sigma }_{0}\right) \varvec{P}^{+T}\). Consider the symmetric matrix

$$\begin{aligned} \varvec{A}&= \frac{1}{2}\varvec{\Omega }^{1/2}\varvec{P}^{T}\left[ \left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) -\varvec{G}\left( \varvec{\mu }_{0}\right) \right] \varvec{P}\varvec{\Omega }^{1/2}. \end{aligned}$$

Lemma 14 in Magnus and Neudecker (1986) indicates that

$$\begin{aligned} \varvec{\Omega }&= 2\left[ \varvec{P}^{T}\left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P}\right] ^{-1}. \end{aligned}$$

Lemma 11 in Magnus and Neudecker (1986) indicates that

$$\begin{aligned}&\varvec{\Delta }^{T}\left( \varvec{\mu }_{0}\right) \left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P}\varvec{\Omega }\varvec{P}^{T}\left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P} = 2\varvec{\Delta }^{T}\left( \varvec{\mu }_{0}\right) \left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P}. \end{aligned}$$
(17)

Consequently, \(\varvec{A}\) can be shown to be idempotent. Lemma 14 in Magnus and Neudecker (1986) also implies that \(\left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P}\varvec{\Omega }\varvec{P}^{T}= 2\varvec{P}\left( \varvec{P}^{T}\varvec{P}\right) ^{-1}\varvec{P}^{T}\). Then, it can be shown that \(\text {tr}\left( \varvec{A}\right) = \left( p_x+p_y\right) \left( p_x+p_y+1\right) /2-r\). Therefore, Eq. (15) holds. \(\square \)
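
The identities used in this proof can be verified numerically. The R sketch below does so for a hypothetical one-factor model with four indicators; the GLS-projection form it plugs in for \(\varvec{G}\left( \varvec{\mu }_{0}\right) \) is an assumption made only for this illustration, since \(\varvec{G}\) is defined in the main text rather than in this appendix.

```r
# Numerical check of the duplication-matrix identities and of tr(A),
# using a hypothetical one-factor model with p = 4 indicators.
# The GLS-projection form of G below is an assumption for this sketch.

p      <- 4
lambda <- c(0.8, 0.7, 0.6, 0.5)
psi    <- 1 - lambda^2
Sigma0 <- tcrossprod(lambda) + diag(psi)       # model-implied covariance matrix
r      <- 2 * p                                # 4 loadings + 4 uniquenesses

# Duplication matrix P with vec(S) = P vech(S)
idx <- which(lower.tri(diag(p), diag = TRUE))
P <- matrix(0, p^2, p * (p + 1) / 2)
for (k in seq_along(idx)) {
  i <- (idx[k] - 1) %% p + 1
  j <- (idx[k] - 1) %/% p + 1
  E <- matrix(0, p, p); E[i, j] <- E[j, i] <- 1
  P[, k] <- as.vector(E)
}
Pplus <- solve(crossprod(P)) %*% t(P)          # P^+ = (P'P)^{-1} P'

# Jacobian Delta = d vec(Sigma) / d theta', theta = (lambda, psi)
Delta <- cbind(
  sapply(1:p, function(j) { e <- numeric(p); e[j] <- 1
                            as.vector(e %o% lambda + lambda %o% e) }),
  sapply(1:p, function(j) { E <- matrix(0, p, p); E[j, j] <- 1; as.vector(E) })
)

iSS   <- solve(Sigma0) %x% solve(Sigma0)       # Sigma0^{-1} (x) Sigma0^{-1}
Omega <- 2 * Pplus %*% (Sigma0 %x% Sigma0) %*% t(Pplus)

# Lemma 14: Omega = 2 [P' (Sigma0^{-1} (x) Sigma0^{-1}) P]^{-1}
max(abs(Omega - 2 * solve(t(P) %*% iSS %*% P)))           # approx. 0

# Assumed GLS projection standing in for G(mu_0)
G <- iSS %*% Delta %*% solve(t(Delta) %*% iSS %*% Delta) %*% t(Delta) %*% iSS

# Symmetric square root of Omega
eo <- eigen(Omega, symmetric = TRUE)
Omega_half <- eo$vectors %*% diag(sqrt(eo$values)) %*% t(eo$vectors)

A <- 0.5 * Omega_half %*% t(P) %*% (iSS - G) %*% P %*% Omega_half
max(abs(A %*% A - A))                                     # idempotent: approx. 0
sum(diag(A))                                              # p(p+1)/2 - r = 10 - 8 = 2
```

In this example the trace equals \(10-8=2\), matching \(\left( p_x+p_y\right) \left( p_x+p_y+1\right) /2-r\) for \(p_x+p_y=4\) observed variables and \(r=8\) free parameters.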

Proof of Theorem 3

For the full model, \(\partial F\left( \hat{\varvec{\beta }}_\mathrm{full}\right) /\partial \varvec{\beta }=\varvec{0}\), and a second-order Taylor expansion of \(F\left( \varvec{\beta }_\mathrm{true}\right) \) around \(\hat{\varvec{\beta }}_\mathrm{full}\) gives

$$\begin{aligned} F\left( \hat{\varvec{\beta }}_\mathrm{full}\right)&= F\left( \varvec{\beta }_\mathrm{true}\right) -\frac{n}{4}\left( \varvec{\beta }_\mathrm{true}-\hat{\varvec{\beta }}_\mathrm{full}\right) ^{T}\varvec{J}_\mathrm{full}\left( \varvec{\beta }_\mathrm{true}-\hat{\varvec{\beta }}_\mathrm{full}\right) +o_{p}\left( 1\right) . \end{aligned}$$

The distribution (6) indicates that

$$\begin{aligned} \sqrt{n}\left( \hat{\varvec{\beta }}_\mathrm{full}-\varvec{\beta }_\mathrm{true}\right)&= 2\varvec{J}_\mathrm{full} \varvec{\Delta }_{0}^{T}\left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \sqrt{n}\mathrm{vec}\left( \varvec{S}-\varvec{\Sigma }_\mathrm{true}\right) . \end{aligned}$$

Thus, Eq. (12) becomes

$$\begin{aligned} F\left( \hat{\varvec{\beta }}_\mathrm{full}\right)&= \frac{n}{2}\mathrm{vec}^{T}\left( \varvec{S}-\varvec{\Sigma }_\mathrm{true}\right) \left[ \left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) - \varvec{G}\left( \varvec{\beta }_0 \right) \right] \mathrm{vec}\left( \varvec{S}-\varvec{\Sigma }_\mathrm{true}\right) + o_{p}\left( 1\right) \\&= F_{1} + o_{p}\left( 1\right) , \end{aligned}$$

which is asymptotically the same as the FMA test statistic. \(\square \)
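
Under a correctly specified model with normal data, the quadratic form \(F_{1}\) should behave approximately as \(\chi ^2\) with \(\left( p_x+p_y\right) \left( p_x+p_y+1\right) /2-r\) degrees of freedom (cf. the trace calculation in the proof of Eq. (15)). The Monte Carlo sketch below illustrates this for the same hypothetical one-factor setup and assumed GLS-projection form of \(\varvec{G}\) as in the previous sketch, neither of which is taken from the paper.

```r
# Monte Carlo sketch: under a correct model, F1 is approximately chi-square
# with p(p+1)/2 - r degrees of freedom. The one-factor setup and the
# GLS-projection form of G are assumptions made for this illustration.
set.seed(1)

p      <- 4
lambda <- c(0.8, 0.7, 0.6, 0.5)
psi    <- 1 - lambda^2
Sigma0 <- tcrossprod(lambda) + diag(psi)       # true (and model-implied) covariance
r      <- 2 * p
df     <- p * (p + 1) / 2 - r                  # = 2

Delta <- cbind(                                # d vec(Sigma) / d theta'
  sapply(1:p, function(j) { e <- numeric(p); e[j] <- 1
                            as.vector(e %o% lambda + lambda %o% e) }),
  sapply(1:p, function(j) { E <- matrix(0, p, p); E[j, j] <- 1; as.vector(E) })
)
iSS <- solve(Sigma0) %x% solve(Sigma0)
G   <- iSS %*% Delta %*% solve(t(Delta) %*% iSS %*% Delta) %*% t(Delta) %*% iSS

n    <- 1000
reps <- 2000
F1 <- replicate(reps, {
  S <- rWishart(1, n - 1, Sigma0)[, , 1] / (n - 1)   # simulated sample covariance
  d <- as.vector(S - Sigma0)
  (n / 2) * drop(t(d) %*% (iSS - G) %*% d)
})

c(mean = mean(F1), df = df)                          # mean should be close to df
rbind(empirical  = quantile(F1, c(0.50, 0.90, 0.95)),
      chi_square = qchisq(c(0.50, 0.90, 0.95), df))  # quantiles roughly agree
```

For moderate to large n, the empirical mean and upper quantiles of \(F_{1}\) should be close to those of the \(\chi ^2_{2}\) reference distribution.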

About this article

Cite this article

Jin, S., & Ankargren, S. (2019). Frequentist model averaging in structural equation modelling. Psychometrika, 84, 84–104. https://doi.org/10.1007/s11336-018-9624-y

Keywords

  • model selection
  • post-selection inference
  • coverage probability
  • local asymptotic
  • goodness-of-fit