Covariate-free and Covariate-dependent Reliability

Psychometrika

Abstract

Classical test theory reliability coefficients are said to be population specific. Reliability generalization, a meta-analysis method, is the main procedure for evaluating the stability of reliability coefficients across populations. A new approach is developed to evaluate the degree of invariance of reliability coefficients to population characteristics. Factor or common variance of a reliability measure is partitioned into parts that are, and are not, influenced by control variables, resulting in a partition of reliability into a covariate-dependent and a covariate-free part. The approach can be implemented in a single sample and can be applied to a variety of reliability coefficients.
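As a rough illustration of the partition described above, the following Python sketch assumes a one-factor model in which the factor variance splits additively into a part explained by the covariates and a residual part. The function name, loadings, unique variances, and variance components are hypothetical values chosen for illustration; this is not the article's estimation procedure, which is not reproduced in this preview.

```python
import numpy as np


def partition_reliability(lam, psi, phi_z, phi_perp):
    """Split a one-factor composite reliability into two additive parts.

    lam      : factor loadings (hypothetical values)
    psi      : unique variances (hypothetical values)
    phi_z    : factor variance explained by the covariates Z (assumed known)
    phi_perp : factor variance not explained by Z (assumed known)
    """
    lam = np.asarray(lam, dtype=float)
    psi = np.asarray(psi, dtype=float)
    phi = phi_z + phi_perp  # total factor variance

    # Model-implied covariance matrix of the items under the one-factor model.
    sigma = phi * np.outer(lam, lam) + np.diag(psi)

    # Variance of the unit-weighted composite, 1' Sigma 1.
    total_var = sigma.sum()

    # Common-variance weight (1' lam)^2 / (1' Sigma 1).
    weight = lam.sum() ** 2 / total_var

    rel = weight * phi             # overall reliability of the composite
    rel_dependent = weight * phi_z  # covariate-dependent part
    rel_free = weight * phi_perp    # covariate-free part
    return rel, rel_dependent, rel_free


rel, rel_dependent, rel_free = partition_reliability(
    lam=[0.7, 0.8, 0.6, 0.9],
    psi=[0.51, 0.36, 0.64, 0.19],
    phi_z=0.3,
    phi_perp=0.7,
)
print(rel, rel_dependent, rel_free)
```

Under these assumptions the two parts add up to the overall coefficient by construction, because the factor variance itself is additive.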

Notes

  1. The theorem states that when \(\Sigma _c\) is not rank 1, composite reliability coefficients defined on it are identical to a 1-factor-based reliability coefficient for the composite based on a rotated factor whose loading vector \(\lambda \) maximizes \(({1}'\lambda )^{2}\). Subsequent rotated factors have loadings whose columns sum to zero.

  2. Other approaches are possible. We could take \(\tilde{\lambda }^{(Z)}=({1}'\Sigma _c 1)^{-.5}\Sigma _c^{(Z)} 1\) and \(\tilde{\lambda }^{\bot Z}=({1}'\Sigma _c 1)^{-.5}\Sigma _c^{\bot Z} 1\) but these would not have the desired property of (11).

  3. A recent discussion on the interpretation of \(\alpha \) in terms of all possible k-split alphas is given by Warrens (2014).

  4. The greatest lower bound can be biased in small samples; Li and Bentler (2011) remove this bias. Note that Jackson and Agunwamba (1977) provide a condition under which \(\lambda _{4(\max )} \) is the greatest lower bound.

  5. Alternatively, step 2 can produce \(\varphi ^{\xi (Z)}\) and \(\varphi ^{\bot Z}\) as described in the first approach, but these values are not guaranteed to precisely add to \(\varphi \) from step 1 in the 2-step approach.

  6. We treat the correlations as covariances and ignore the fact that they are based on different sample sizes: N = 258 for brain volumes, N = 135 for inter-domain correlations, and N = 688 for WAIS-III variables.
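Notes 3 and 4 refer to split-based views of \(\alpha \) and to \(\lambda _{4(\max )} \). The Python sketch below illustrates only the classical special case of splits into two equal halves, where the mean of Guttman's split-half coefficient \(\lambda _4\) over all such splits equals \(\alpha \). The covariance matrix and function names are hypothetical; the general k-split result of Warrens (2014) is not implemented here, and the maximum is taken over equal-sized halves only, so it need not equal \(\lambda _{4(\max )} \) over all possible splits.

```python
from itertools import combinations

import numpy as np


def cronbach_alpha(cov):
    """Coefficient alpha from an item covariance matrix."""
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    return n / (n - 1) * (1.0 - np.trace(cov) / cov.sum())


def split_half_lambda4(cov):
    """Guttman's lambda-4 for every split of the items into two equal halves."""
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    if n % 2 != 0:
        raise ValueError("an even number of items is required for equal halves")
    total_var = cov.sum()  # variance of the total score
    values = []
    for half in combinations(range(n), n // 2):
        # Keep item 0 in the first half so that each split is counted once.
        if 0 not in half:
            continue
        first = list(half)
        second = [i for i in range(n) if i not in half]
        cross = cov[np.ix_(first, second)].sum()  # covariance between half scores
        values.append(4.0 * cross / total_var)
    return np.array(values)


# Hypothetical covariance matrix for four items.
cov = np.array([
    [1.0, 0.5, 0.4, 0.3],
    [0.5, 1.0, 0.6, 0.2],
    [0.4, 0.6, 1.0, 0.5],
    [0.3, 0.2, 0.5, 1.0],
])

alpha = cronbach_alpha(cov)
lambda4_values = split_half_lambda4(cov)
print(alpha, lambda4_values.mean(), lambda4_values.max())
```

Because \(\alpha \) is the average of these coefficients, the best split is never worse than \(\alpha \) as a lower bound.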

References

  • Bentler, P. M. (1968). Alpha-maximized factor analysis (Alphamax): Its relation to alpha and canonical factor analysis. Psychometrika, 33, 335–345.

  • Bentler, P. M. (1972). A lower-bound method for the dimension-free measurement of internal consistency. Social Science Research, 1, 343–357.

  • Bentler, P. M. (2007). Covariance structure models for maximal reliability of unit-weighted composites. In S.-Y. Lee (Ed.), Handbook of latent variable and related models (pp. 1–19). Amsterdam: North-Holland.

  • Bentler, P. M. (2009). Alpha, dimension-free, and model-based internal consistency reliability. Psychometrika, 74, 137–143.

  • Bentler, P. M. (2016). Specificity-enhanced reliability coefficients. Psychological Methods. http://dx.doi.org/10.1037/met0000092.

  • Bentler, P. M., & Weeks, D. G. (1980). Linear structural equations with latent variables. Psychometrika, 45, 289–308.

  • Bentler, P. M., & Woodward, J. A. (1980). Inequalities among lower bounds to reliability: With applications to test construction and factor analysis. Psychometrika, 45, 249–267.

  • Bentler, P. M., & Wu, E. J. C. (2015). EQS 6.3 structural equations program. Temple City, CA: Multivariate Software.

  • Beretvas, S. N., & Pastor, D. A. (2003). Using mixed-effects models in reliability generalization studies. Educational and Psychological Measurement, 63, 75–95.

  • Bonett, D. G. (2010). Varying coefficient meta-analytic methods for alpha reliability. Psychological Methods, 15, 368–385.

  • Botella, J., Suero, M., & Gambara, H. (2010). Psychometric inferences from a meta-analysis of reliability and internal consistency coefficients. Psychological Methods, 15, 386–397.

  • Brannick, M. T., & Zhang, N. (2013). Bayesian meta-analysis of coefficient alpha. Research Synthesis Methods, 4, 198–207.

  • Brennan, R. L. (2001). Generalizability theory. New York: Springer.

  • Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456–466.

  • Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255.

  • Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.

  • Deng, L., & Yuan, K.-H. (2015). Multiple-group analysis for structural equation modeling with dependent samples. Structural Equation Modeling. doi:10.1080/10705511.2014.950534.

  • Geldhof, G. J., Preacher, K. J., & Zyphur, M. J. (2014). Reliability estimation in a multilevel confirmatory factor analysis framework. Psychological Methods, 19, 72–91.

  • Guttman, L. A. (1945). A basis for analyzing test–retest reliability. Psychometrika, 10, 255–282.

  • Heise, D. R., & Bohrnstedt, G. W. (1970). Validity, invalidity, and reliability. In E. F. Borgatta & G. W. Bohrnstedt (Eds.), Sociological methodology (pp. 104–129). San Francisco: Jossey-Bass.

  • Hsu, H.-Y., Kwok, O.-M., Lin, J. H., & Acosta, S. (2015). Detecting misspecified multilevel structural equation models with common fit indices: A Monte Carlo study. Multivariate Behavioral Research, 50, 197–215.

  • Hunt, T. D., & Bentler, P. M. (2015). Quantile lower bounds to reliability based on locally optimal splits. Psychometrika, 80, 182–195.

  • Jackson, P. H. (1979). A note on the relation between coefficient alpha and Guttman’s "split-half" lower bounds. Psychometrika, 44, 251–252.

  • Jackson, P. H., & Agunwamba, C. C. (1977). Lower bounds for the reliability of the total score on a test composed of non-homogeneous items: I. Algebraic lower bounds. Psychometrika, 42, 567–578.

  • Jak, S., Oort, F. J., & Dolan, C. V. (2013). A test for cluster bias: Detecting violations of measurement invariance across clusters in multilevel data. Structural Equation Modeling, 20, 265–282.

  • Jamshidian, M., & Bentler, P. M. (1998). A quasi-Newton method for minimum trace factor analysis. Journal of Statistical Computation and Simulation, 62, 73–89.

  • Jöreskog, K. G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36, 109–133.

  • Jöreskog, K. G., & Sörbom, D. G. (1996). LISREL 8 user’s guide. Chicago: Scientific Software International.

  • Kaiser, H. F., & Caffrey, J. (1965). Alpha factor analysis. Psychometrika, 30, 1–14.

  • Kelley, K., & Pornprasertmanit, S. (2016). Confidence intervals for population reliability coefficients: Evaluation of methods, recommendations, and software for composite measures. Psychological Methods, 21, 69–92.

  • Labouvie, E., & Ruetsch, C. (1995). Testing for equivalence of measurement scales: Simple structure and metric invariance reconsidered. Multivariate Behavioral Research, 30, 63–76.

  • Li, L., & Bentler, P. M. (2011). The greatest lower bound to reliability: Corrected and resampling estimators. Modelling and Data Analysis, 1, 87–104.

  • Liang, J., & Bentler, P. M. (2004). An EM algorithm for fitting two-level structural equation models. Psychometrika, 69, 101–122.

  • López-López, J. A., Botella, J., Sánchez-Meca, J., & Marín-Martínez, F. (2013). Alternatives for mixed-effects meta-regression models in the reliability generalization approach: A simulation study. Journal of Educational and Behavioral Statistics, 38, 443–469.

  • McDaniel, M. A. (2005). Big-brained people are smarter: A meta analysis of the relationship between in vivo brain volume and intelligence. Intelligence, 33, 337–346.

  • McDonald, R. P. (1970). The theoretical foundations of principal factor analysis, canonical factor analysis, and alpha factor analysis. British Journal of Mathematical and Statistical Psychology, 23, 1–21.

  • McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Erlbaum.

  • Meade, A. W., Johnson, E. C., & Braddy, P. W. (2008). Power and sensitivity of alternative fit indices in tests of measurement invariance. Journal of Applied Psychology, 93, 568–592.

  • Merkle, E. C., & Zeileis, A. (2013). Tests of measurement invariance without subgroups: A generalization of classical methods. Psychometrika, 78, 59–82.

  • Millsap, R. E. (2011). Statistical approaches to measurement invariance. New York: Routledge.

  • Posthuma, D., Baaré, W. F. C., Hulshoff Pol, H. E., Kahn, R. S., Boomsma, D. I., & De Geus, E. J. C. (2003). Genetic correlations between brain volumes and the WAIS-III dimensions of verbal comprehension, working memory, perceptual organization, and processing speed. Twin Research, 6, 131–139.

  • Raykov, T., & Marcoulides, G. A. (2011). Introduction to psychometric theory. New York: Routledge.

  • Raykov, T., & Marcoulides, G. A. (2013). Meta-analysis of scale reliability using latent variable modeling. Structural Equation Modeling, 20, 338–353.

  • Raykov, T., Marcoulides, G. A., & Millsap, R. E. (2012). Factorial invariance in multiple populations: A multiple testing procedure. Educational and Psychological Measurement, 73, 713–727.

  • Ryu, E., & West, S. G. (2009). Level-specific evaluation of model fit in multilevel structural equation modeling. Structural Equation Modeling, 16, 583–601.

  • Sawilowsky, S. S. (2000). Psychometrics versus datametrics: Comment on Vacha-Haase’s "reliability generalization" method and some EPM editorial policies. Educational and Psychological Measurement, 60, 157–173.

  • Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of generalization. Journal of Applied Psychology, 62, 529–540.

  • Schweig, J. (2014). Multilevel factor analysis by model segregation: New applications for robust test statistics. Journal of Educational and Behavioral Statistics, 39, 394–422.

  • Shavelson, R. J., & Webb, N. (1991). Generalizability theory: A primer. Thousand Oaks, CA: Sage.

  • Sörbom, D. (1974). A general method for studying differences in factor means and factor structures between groups. British Journal of Mathematical and Statistical Psychology, 27, 229–239.

  • Tarkkonen, L., & Vehkalahti, K. (2005). Measurement errors in multivariate measurement scales. Journal of Multivariate Analysis, 96, 172–189.

  • Thompson, B. (1994). Guidelines for authors. Educational and Psychological Measurement, 54, 837–847.

  • Thompson, B., & Vacha-Haase, T. (2000). Psychometrics is datametrics: The test is not reliable. Educational and Psychological Measurement, 60, 174–195.

  • Vacha-Haase, T. (1998). Reliability generalization: Exploring variance in measurement error affecting score reliability across studies. Educational and Psychological Measurement, 58, 6–20.

  • Vacha-Haase, T., & Thompson, B. (2011). Score reliability: A retrospective look back at 12 years of reliability generalization studies. Measurement and Evaluation in Counseling and Development, 44, 159–168.

  • Van de Schoot, R., Kluytmans, A., Tummers, L., Lugtig, P., Hox, J., & Muthén, B. (2013). Facing off with Scylla and Charybdis: A comparison of scalar, partial, and the novel possibility of approximate measurement invariance. Frontiers in Psychology, 4, 770. doi:10.3389/fpsyg.2013.00770.

  • Warrens, M. J. (2014). On Cronbach’s alpha as the mean of all possible k-split alphas. Advances in Statistics, 1–5. doi:10.1155/2014/742863.

  • Werts, C. E., Rock, D. R., Linn, R. L., & Jöreskog, K. G. (1978). A general method of estimating the reliability of a composite. Educational and Psychological Measurement, 38, 933–938.

  • Wilkinson, L., & APA Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604.

  • Woodhouse, B., & Jackson, P. H. (1977). Lower bounds for the reliability of the total score on a test composed of non-homogeneous items. II: A search procedure to locate the greatest lower bound. Psychometrika, 42, 579–591.

  • Yuan, K.-H., & Bentler, P. M. (2003). Eight test statistics for multilevel structural equation models. Computational Statistics & Data Analysis, 44, 89–107.

  • Yuan, K.-H., & Bentler, P. M. (2007). Multilevel covariance structure analysis by fitting multiple single-level models. Sociological Methodology, 37, 53–82.

Author information

Corresponding author

Correspondence to Peter M. Bentler.

Additional information

Based on the invited Lifetime Achievement Award Address, International Meeting of the Psychometric Society 2014, Madison WI, July 23, 2014.

About this article

Cite this article

Bentler, P.M. Covariate-free and Covariate-dependent Reliability. Psychometrika 81, 907–920 (2016). https://doi.org/10.1007/s11336-016-9524-y

  • DOI: https://doi.org/10.1007/s11336-016-9524-y
