Abstract
Sijtsma and Pfadt (Psychometrika, 2021) provide a wide-ranging defense for the use of coefficient alpha. Alpha is practical and useful when its limitations are acceptable. This paper discusses several methodologies for reliability, some new here, that go beyond alpha and were not emphasized by Sijtsma and Pfadt. Bentler’s (Psychometrika 33:335–345, 1968. https://doi.org/10.1007/BF02289328) combined factor analysis (FA) and classical test theory (CTT) model, FACTT, provides a key conceptual foundation.
Notes
This notation is different from S&P. For simplicity, we assume that the variables are linearly independent with means of zero. S&P also recommend the use of estimated factor scores \(y_{w} ={w}'x\) for some weight vector w, a topic previously developed in Bentler (1968) but not discussed here.
The p observed variables x are dependent variables, while the 2p variables \(\tau \) and \(\varepsilon \) are independent variables in the sense of Bentler and Weeks (1980).
Here and elsewhere, if the variables involved are multivariate normally distributed, uncorrelated implies independent.
S&P remind us that \(\Sigma _{\varepsilon } =\Delta _{\varepsilon } \) “underlies the lower bound theorem.”
It is also an insurance policy against possibly subjective decisions, since \(\alpha =p^{2}\bar{{\sigma }}_{ij} /\sigma _{y}^{2} \) (S&P’s Eq. 16) depends only on data: number of parts (items) p, average covariance \(\bar{{\sigma }}_{ij} \) of parts, and \(\sigma _{y}^{2} \), the sum of all variances and covariances of parts. No additional parameter estimation or modeling decisions are needed.
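As a minimal sketch of this point, the following Python fragment computes \(\alpha \) from nothing but a covariance matrix of the parts, taking \(\bar{{\sigma }}_{ij} \) to be the average of the \(p(p-1)\) off-diagonal covariances. The function names and the example matrix are illustrative assumptions, not taken from S&P; the second function uses the familiar textbook form of alpha only as a cross-check.

```python
import numpy as np


def coefficient_alpha(cov):
    """Alpha as p^2 * (average off-diagonal covariance) / (variance of the sum).

    `cov` is a p x p covariance matrix of the parts (items). The name and
    interface are hypothetical, chosen for illustration only.
    """
    cov = np.asarray(cov, dtype=float)
    if cov.ndim != 2 or cov.shape[0] != cov.shape[1] or cov.shape[0] < 2:
        raise ValueError("cov must be a square matrix with at least 2 parts")
    p = cov.shape[0]
    total_var = cov.sum()                       # sigma_y^2: all variances and covariances
    off_diag_sum = total_var - np.trace(cov)    # sum of the p(p-1) covariances
    mean_cov = off_diag_sum / (p * (p - 1))     # average covariance of parts
    return p ** 2 * mean_cov / total_var


def coefficient_alpha_classic(cov):
    """Textbook form (p / (p - 1)) * (1 - trace / total), used as a cross-check."""
    cov = np.asarray(cov, dtype=float)
    p = cov.shape[0]
    return (p / (p - 1)) * (1.0 - np.trace(cov) / cov.sum())


# Illustrative (made-up) covariance matrix: three parts, unit variances,
# all covariances equal to 0.5.
example_cov = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])

alpha_value = coefficient_alpha(example_cov)
alpha_check = coefficient_alpha_classic(example_cov)
print(alpha_value, alpha_check)
```

For this example matrix, \(p=3\), \(\bar{{\sigma }}_{ij} =0.5\), and \(\sigma _{y}^{2} =6\), so both forms give \(9(0.5)/6=0.75\). No factor model, rotation, or estimation choice enters the computation.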
This is 40+ years before the 2009 paper cited by S&P, and in the same year as Lord & Novick (1968).
The FACTT variance composition is illustrated with a Venn diagram in Bentler (2017).
Or internal consistency reliability, and no doubt misleadingly shortened to “reliability” on occasion. S&P state “Thus, in Bentler’s conception, internal consistency refers to unidimensionality operationalized by a common factor.” Actually, a 1-factor model is not assumed in either (2) or (3), although it is not disallowed.
As did Heise and Bohrnstedt (1970).
This is the greatest lower bound (glb) to reliability if \(\Delta _{u}\) contains non-negative variances, though \(\hat{{\Delta }}_{u} \) may contain negative estimates. The possibly larger—and more famous—glb forces \(\hat{{\Delta }}_{u} \) to have non-negative elements; it is discussed further below.
Though at the time Bentler wrote, Jöreskog (1969) had not yet published on CFA.
This recommendation is a bit strange, since internal consistency coefficients (3) and chains of lower bounds do not require \(\Sigma _{c} \) to be rank 1. For example, the glb does not require any specification for number of factors.
Any bias will be trivial, though, with very large datasets such as those available for some internet samples or national testing programs.
The optimization problem also has been called constrained minimum trace factor analysis (ten Berge, Snijders, & Zegers, 1981).
They also provided indirectly corrected versions of these bias-corrected coefficients based on the degree of reliability underestimation by \(\alpha \).
S&P emphasize that “reliability values are dependent on the triplet test, group, and procedure.” They also point out the “misconception…that each particular test allegedly has only one reliability.”
In the context of multiple factors, an application of this partition is to take the covariate-free \(\tau ^{(\bot Z)}\) as that part of the true score due to one or more relevant factors, with the covariate-dependent part as \(\tau ^{(Z)}=\tau -\tau ^{(\bot Z)}\).
S&P are explicit in this assumption in their Eq. (4), stating “…measurement error covaries 0 with any other variable Y, not necessarily a test score, in which E is not included” (S&P’s Y is not this paper’s y). Note, however, that a reviewer of Bentler (2017) explicitly rejected this idea (see 2017, Footnote 5).
References
Arruda, E. H., & Bentler, P. M. (2017). A regularized GLS for structural equation modeling. Structural Equation Modeling, 24, 657–665. https://doi.org/10.1080/10705511.2017.1318392.
Bentler, P. M. (1968). Alpha-maximized factor analysis (Alphamax): Its relation to alpha and canonical factor analysis. Psychometrika, 33, 335–345. https://doi.org/10.1007/BF02289328.
Bentler, P. M. (1972). A lower-bound method for the dimension-free measurement of internal consistency. Social Science Research, 1, 343–357. https://doi.org/10.1016/0049-089X(72)90082-8.
Bentler, P. M. (2007). Covariance structure models for maximal reliability of unit-weighted composites. In S.-Y. Lee (Ed.), Handbook of latent variable and related models (pp. 1–19). North-Holland.
Bentler, P. M. (2009). Alpha, dimension-free, and model-based internal consistency reliability. Psychometrika, 74, 137–143. https://doi.org/10.1007/s11336-008-9100-1.
Bentler, P. M. (2016). Covariate-free and covariate-dependent reliability. Psychometrika, 81, 907–920. https://doi.org/10.1007/s11336-016-9524-y.
Bentler, P. M. (2017). Specificity-enhanced reliability coefficients. Psychological Methods, 22, 527–540. https://doi.org/10.1037/met0000092.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588–606. https://doi.org/10.1037/0033-2909.88.3.588.
Bentler, P. M., & Weeks, D. G. (1980). Linear structural equations with latent variables. Psychometrika, 45, 289–308. https://doi.org/10.1007/BF02293905.
Bentler, P. M., & Woodward, J. A. (1980). Inequalities among lower bounds to reliability: With applications to test construction and factor analysis. Psychometrika, 45, 249–267. https://doi.org/10.1007/BF02294079.
Bentler, P. M., & Woodward, J. A. (1983). The greatest lower bound to reliability. In H. Wainer & S. Messick (Eds.), Principals of modern psychological measurement: A Festschrift for Frederic M. Lord (pp. 237–253). Erlbaum.
Bentler, P. M., & Woodward, J. A. (1985). On the greatest lower bound to reliability. Psychometrika, 50, 245–246. https://doi.org/10.1007/BF02294250.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334. https://doi.org/10.1007/BF02310555.
Cronbach, L. J., & Gleser, G. C. (1964). The signal/noise ratio in the comparison of reliability coefficients. Educational and Psychological Measurement, 24, 467–480. https://doi.org/10.1177/001316446402400303.
Du, H., & Bentler, P. M. (2021). Distributionally weighted least squares in structural equation modeling. Psychological Methods. https://doi.org/10.1037/met0000388.
Guttman, L. A. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10, 255–282. https://doi.org/10.1007/BF02288892.
Heise, D. R., & Bohrnstedt, G. W. (1970). Validity, invalidity, and reliability. In E. F. Borgatta & G. W. Bohrnstedt (Eds.), Sociological methodology (pp. 104–129). Jossey-Bass.
Hunt, T. D., & Bentler, P. M. (2015). Quantile lower bounds to reliability based on locally optimal splits. Psychometrika, 80, 182–195. https://doi.org/10.1007/s11336-013-9393-6.
Jackson, P. H., & Agunwamba, C. C. (1977). Lower bounds for the reliability of total scores on a test composed of nonhomogeneous items: I. Algebraic lower bounds. Psychometrika, 42, 567–578. https://doi.org/10.1007/BF02295979.
Jalal, S., & Bentler, P. M. (2018). Using Monte Carlo normal distributions to evaluate structural models with nonnormal data. Structural Equation Modeling, 25, 541–557. https://doi.org/10.1080/10705511.2017.1390753.
Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183–202. https://doi.org/10.1007/BF02289343.
Jöreskog, K. G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36, 109–133. https://doi.org/10.1007/BF02291393.
Kim, D. S., Reise, S. P., & Bentler, P. M. (2018). Identifying aberrant data in structural equation models with IRLS-ADF. Structural Equation Modeling, 25, 343–358. https://doi.org/10.1080/10705511.2017.1379881.
Li, L., & Bentler, P. M. (2011). The greatest lower bound to reliability: Corrected and resampling estimators. Modelling and Data Analysis, 1, 87–104.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Addison-Wesley.
McCrae, R. R. (2015). A more nuanced view of reliability: Specificity in the trait hierarchy. Personality and Social Psychology Review, 19, 97–112. https://doi.org/10.1177/1088868314541857.
McDonald, R. P. (1970). The theoretical foundations of principal factor analysis, canonical factor analysis, and alpha factor analysis. British Journal of Mathematical and Statistical Psychology, 23, 1–21. https://doi.org/10.1111/j.2044-8317.1970.tb00432.x.
McDonald, R. P. (1999). Test theory: A unified treatment. Erlbaum.
Raykov, T. (2007). Reliability of multiple-component measuring instruments: Improved evaluation in repeated measure designs. The British Journal of Mathematical and Statistical Psychology, 60, 119–136. https://doi.org/10.1348/000711006X100464.
Raykov, T., & Tisak, J. (2004). Examining time-invariance in reliability in multi-wave, multi-indicator models: A covariance structure analysis approach accounting for indicator specificity. The British Journal of Mathematical and Statistical Psychology, 57, 253–263. https://doi.org/10.1348/0007110042307267.
Shapiro, A., & ten Berge, J. M. F. (2000). The asymptotic bias of minimum trace factor analysis, with applications to the greatest lower bound to reliability. Psychometrika, 65, 413–425. https://doi.org/10.1007/BF02296154.
Sijtsma, K., & Pfadt, J. (2021). Part II: On the use, the misuse, and the very limited usefulness of Cronbach’s alpha: Discussing lower bounds and correlated errors. Psychometrika.
ten Berge, J. M. F., Snijders, T. A. B., & Zegers, F. E. (1981). Computational aspects of the greatest lower bound to the reliability and constrained minimum trace factor analysis. Psychometrika, 46, 201–213. https://doi.org/10.1007/BF02293900.
Wilkinson, L., & APA Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604. https://doi.org/10.1037/0003-066X.54.8.594.
Woodhouse, B., & Jackson, P. H. (1977). Lower bounds for the reliability of the total score on a test composed of nonhomogeneous items: II. A search procedure to locate the greatest lower bound. Psychometrika, 42, 579–591. https://doi.org/10.1007/BF02295980.
Woodward, J. A., & Bentler, P. M. (1978). A statistical lower-bound to population reliability. Psychological Bulletin, 85, 1323–1326. https://doi.org/10.1037/0033-2909.85.6.1323.
Yuan, K.-H., & Bentler, P. M. (2017). Improving the convergence rate and speed of Fisher-scoring algorithm: Ridge and anti-ridge methods in structural equation modeling. Annals of the Institute of Statistical Mathematics, 69, 571–597.
Bentler, P.M. Alpha, FACTT, and Beyond. Psychometrika 86, 861–868 (2021). https://doi.org/10.1007/s11336-021-09797-8
DOI: https://doi.org/10.1007/s11336-021-09797-8