The Reliability Factor: Modeling Individual Reliability with Multiple Items from a Single Assessment

Martin, Stephen R.; Rast, Philippe

doi:10.1007/s11336-022-09847-9

Theory and Methods
Published: 21 March 2022

Volume 87, pages 1318–1342, (2022)

Psychometrika

Abstract

Reliability is a crucial concept in psychometrics. Although it is typically estimated as a single fixed quantity, previous work suggests that reliability can vary across persons, groups, and covariates. We propose a novel method for estimating and modeling case-specific reliability without repeated measurements or parallel tests. The proposed method employs a “Reliability Factor” that models the error variance of each case across multiple indicators, thereby producing case-specific reliability estimates. Additionally, we use Gaussian process modeling to estimate a nonlinear, non-monotonic function between the latent factor itself and the reliability of the measure, providing an analogue to test information functions in item response theory. The reliability factor model is a new tool for examining latent regions with poor conditional reliability, and correlates thereof, in a classical test theory framework.
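As a rough illustration of how case-specific error variances translate into case-specific reliability, the sketch below computes an omega-type coefficient for a single case. It is not the authors' implementation: the log-linear link between a reliability factor score and the error variances, the unit factor variance, and all names and numeric values (`case_error_variances`, `case_omega`, `intercepts`, `scale_loadings`) are assumptions made for this example.

```python
import math


def case_error_variances(intercepts, scale_loadings, reliability_score):
    """Error variance of each indicator for one case.

    Assumed form: log(variance_j) = intercept_j + scale_loading_j * score,
    so every variance is positive by construction.
    """
    return [math.exp(a + b * reliability_score)
            for a, b in zip(intercepts, scale_loadings)]


def case_omega(loadings, error_variances):
    """Omega-type reliability of the unit-weighted sum score.

    Assumes a single latent factor whose variance is fixed to 1.
    """
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(error_variances))


# Hypothetical parameter values, chosen only for illustration.
loadings = [0.8, 0.7, 0.6]
intercepts = [-0.5, -0.4, -0.3]
scale_loadings = [0.5, 0.5, 0.5]

for score in (-1.0, 0.0, 1.0):
    variances = case_error_variances(intercepts, scale_loadings, score)
    omega = case_omega(loadings, variances)
    print(f"reliability factor score {score:+.1f}: omega = {omega:.3f}")
```

With positive scale loadings, a higher score on the reliability factor implies larger error variances and therefore lower reliability for that case; the sign convention here is an arbitrary choice of the sketch.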

Notes

Note that the reliability factor models error variance in responses, not intraindividual variability of responses or variance in the latent factors.
A traditional Gaussian model assumes that, e.g., \(y \sim {\mathcal {N}}(\mu = f(X), \sigma ^2)\). A location-scale Gaussian model includes a second submodel on the scale parameter, \(\sigma ^2\), to model variance: \(y \sim {\mathcal {N}}(\mu = f(X), \sigma ^2 = g(X))\). Because \(\sigma ^2\) must be positive, a log-link function is used in the submodel, \(\sigma ^2 = \exp (g(X))\). The reliability factor approach uses this exact strategy in its formulation.
The latent factors (\(\boldsymbol{\eta}_i\)) can be modeled endogenously as usual, but this is not discussed here.
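The location-scale strategy described in the second note can be sketched in a few lines. The example below assumes linear forms for both submodels, \(f(x) = \beta_0 + \beta_1 x\) and \(g(x) = \gamma_0 + \gamma_1 x\); these forms, the function names, and the parameter values are assumptions for illustration, not the model fitted in the article.

```python
import math
import random


def location_scale_loglik(y, x, beta, gamma):
    """Gaussian log-likelihood with mean f(x) and variance exp(g(x)).

    beta  = (intercept, slope) of the location submodel f(x).
    gamma = (intercept, slope) of the scale submodel g(x), on the log-variance scale.
    """
    total = 0.0
    for yi, xi in zip(y, x):
        mu = beta[0] + beta[1] * xi
        log_var = gamma[0] + gamma[1] * xi
        var = math.exp(log_var)  # log link keeps the variance positive
        total += -0.5 * (math.log(2 * math.pi) + log_var + (yi - mu) ** 2 / var)
    return total


def simulate(n, beta, gamma, seed=1):
    """Draw n observations from the location-scale model above."""
    rng = random.Random(seed)
    x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    y = [rng.gauss(beta[0] + beta[1] * xi,
                   math.exp(0.5 * (gamma[0] + gamma[1] * xi)))  # sd = sqrt(variance)
         for xi in x]
    return x, y


# Hypothetical values: the variance grows with x because gamma[1] > 0.
x, y = simulate(2000, beta=(1.0, 0.5), gamma=(-1.0, 1.0))
heteroscedastic = location_scale_loglik(y, x, (1.0, 0.5), (-1.0, 1.0))
homoscedastic = location_scale_loglik(y, x, (1.0, 0.5), (-1.0, 0.0))
print(heteroscedastic > homoscedastic)
```

Setting the scale slope to zero recovers the traditional constant-variance Gaussian model, which is why the two log-likelihoods can be compared directly on the same data.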

References

Asparouhov, T., Hamaker, E. L., & Muthén, B. (2018). Dynamic structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 25(3), 359–388. https://doi.org/10.1080/10705511.2017.1406803
Bacon, D. R., Sauer, P. L., & Young, M. (1995). Composite reliability in structural equations modeling. Educational and Psychological Measurement, 55(3), 394–406. https://doi.org/10.1177/0013164495055003003
Barnard, J., McCulloch, R., & Meng, X.-L. (2000). Modeling covariance matrices in terms of standard deviations and correlations, with application to shrinkage. Statistica Sinica, 10(4), 1281–1311.
Bauer, D. J. (2017). A more general model for testing measurement invariance and differential item functioning. Psychological Methods, 22(3), 507–526. https://doi.org/10.1037/met0000077
Bentler, P. M. (2009). Alpha, dimension-free, and model-based internal consistency reliability. Psychometrika, 74(1), 137–143. https://doi.org/10.1007/s11336-008-9100-1
Betancourt, M. (2017). A conceptual introduction to Hamiltonian Monte Carlo. Retrieved from arxiv.org/abs/1701.02434
Brennan, R. L. (2005). Generalizability theory. Educational Measurement: Issues and Practice, 11(4), 27–34. https://doi.org/10.1111/j.1745-3992.1992.tb00260.x
Brunton-Smith, I., Sturgis, P., & Leckie, G. (2017). Detecting and understanding interviewer effects on survey data by using a cross-classified mixed effects location-scale model. Journal of the Royal Statistical Society: Series A (Statistics in Society), 180(2), 551–568. https://doi.org/10.1111/rssa.12205
Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., & Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software. https://doi.org/10.18637/jss.v076.i01
de Ayala, R. J. (2009). The theory and practice of item response theory. New York: The Guilford Press.
Dunn, T. J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105(3), 399–412. https://doi.org/10.1111/bjop.12046
Ellis, J. L., & van den Wollenberg, A. L. (1993). Local homogeneity in latent trait models. A characterization of the homogeneous monotone IRT model. Psychometrika, 58(3), 417–429. https://doi.org/10.1007/BF02294649
Feldt, L. S., & Qualls, A. L. (1996). Estimation of measurement error variance at specific score levels. Journal of Educational Measurement, 33(2), 141–156. https://doi.org/10.1111/j.1745-3984.1996.tb00486.x
Feldt, L. S., Steffen, M., & Gupta, N. C. (1985). A comparison of five methods for estimating the standard error of measurement at specific score levels. Applied Psychological Measurement, 9(4), 351–361. https://doi.org/10.1177/014662168500900402
Geldhof, G. J., Preacher, K. J., & Zyphur, M. J. (2014). Reliability estimation in a multilevel confirmatory factor analysis framework. Psychological Methods, 19(1), 72–91. https://doi.org/10.1037/a0032138
Gelman, A., Hill, J., & Yajima, M. (2012). Why we (usually) don’t have to worry about multiple comparisons. Journal of Research on Educational Effectiveness, 5(2), 189–211. https://doi.org/10.1080/19345747.2011.618213
Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7(4), 457–472. https://doi.org/10.1214/ss/1177011136
Gelman, A., Vehtari, A., Simpson, D., Margossian, C. C., Carpenter, B., Yao, Y., Kennedy, L., Gabry, J., Bürkner, P.-C., & Modrák, M. (2020). Bayesian workflow. Retrieved from arXiv:2011.01808
Harvill, L. M. (1991). An NCME instructional module on standard error of measurement. Educational Measurement: Issues and Practice, 10(2), 33–41. https://doi.org/10.1111/j.1745-3992.1991.tb00195.x
Hedeker, D., Mermelstein, R. J., Berbaum, M. L., & Campbell, R. T. (2009). Modeling mood variation associated with smoking: An application of a heterogeneous mixed-effects model for analysis of ecological momentary assessment (EMA) data. Addiction, 104(2), 297–307. https://doi.org/10.1111/j.1360-0443.2008.02435.x
Hedeker, D., Mermelstein, R. J., & Demirtas, H. (2008). An application of a mixed-effects location scale model for analysis of ecological momentary assessment (EMA) data. Biometrics, 64(2), 627–634. https://doi.org/10.1111/j.1541-0420.2007.00924.x
Hedeker, D., Mermelstein, R. J., & Demirtas, H. (2012). Modeling between-subject and within-subject variances in ecological momentary assessment data using mixed-effects location scale models. Statistics in Medicine, 31(27), 3328–3336. https://doi.org/10.1002/sim.5338
Holzinger, K. J., & Swineford, F. A. (1939). A study in factor analysis: The stability of a bi-factor solution. Supplementary Education Monographs, 48.
Hu, Y., Nesselroade, J. R., Erbacher, M. K., Boker, S. M., Burt, S. A., Keel, P. K., & Klump, K. (2016). Test reliability at the individual level. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 532–543. https://doi.org/10.1080/10705511.2016.1148605
Jöreskog, K. G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36(2), 109–133. https://doi.org/10.1007/BF02291393
Kapur, K., Li, X., Blood, E. A., & Hedeker, D. (2015). Bayesian mixed-effects location and scale models for multivariate longitudinal outcomes: An application to ecological momentary assessment data. Statistics in Medicine, 34(4), 630–651. https://doi.org/10.1002/sim.6345
Leckie, G., French, R., Charlton, C., & Browne, W. (2014). Modeling heterogeneous variance-covariance components in two-level models. Journal of Educational and Behavioral Statistics, 39(5), 307–332. https://doi.org/10.3102/1076998614546494
Lee, Y., & Nelder, J. A. (2006). Double hierarchical generalized linear models (with discussion). Journal of the Royal Statistical Society: Series C (Applied Statistics), 55(2), 139–185. https://doi.org/10.1111/j.1467-9876.2006.00538.x
Lek, K. M., & Van De Schoot, R. (2018). A comparison of the single, conditional and person-specific standard error of measurement: What do they measure and when to use them? Frontiers in Applied Mathematics and Statistics. https://doi.org/10.3389/fams.2018.00040
Lewandowski, D., Kurowicka, D., & Joe, H. (2009). Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis. https://doi.org/10.1016/j.jmva.2009.04.008
Li, X., & Hedeker, D. (2012). A three-level mixed-effects location scale model with an application to ecological momentary assessment data. Statistics in Medicine, 31(26), 3192–3210. https://doi.org/10.1002/sim.5393
Liu, H., Zhang, Z., & Grimm, K. J. (2016). Comparison of inverse Wishart and separation-strategy priors for Bayesian estimation of covariance parameter matrix in growth curve analysis. Structural Equation Modeling: A Multidisciplinary Journal, 23(3), 354–367. https://doi.org/10.1080/10705511.2015.1057285
Lord, F. M., & Novick, M. R. (2008). Statistical theories of mental test scores. Information Age Publishing.
Martin, S. R., Williams, D. R., & Rast, P. (2019). Measurement invariance assessment with Bayesian hierarchical inclusion modeling. PsyArXiv. https://doi.org/10.31234/osf.io/qbdjt
Martin, S. R., Williams, D. R., & Rast, P. (2020). Omegad. Retrieved from http://github.com/stephensrmmartin/omegad
McNeish, D. (2018). Thanks coefficient alpha, we’ll take it from here. Psychological Methods, 23(3), 412–433. https://doi.org/10.1037/met0000144
Mehta, P. D., & Neale, M. C. (2005). People are variables too: Multilevel structural equations modeling. Psychological Methods, 10(3), 259–284. https://doi.org/10.1037/1082-989X.10.3.259
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525–543. https://doi.org/10.1007/BF02294825
Merkle, E. C., & Wang, T. (2018). Bayesian latent variable models for the analysis of experimental psychology data. Psychonomic Bulletin & Review, 25(1), 256–270. https://doi.org/10.3758/s13423-016-1016-7
Muthén, B. O. (1994). Multilevel covariance structure analysis. Sociological Methods & Research, 22(3), 376–398. https://doi.org/10.1177/0049124194022003006
Nestler, S. (2020). Modelling inter-individual differences in latent within-person variation: The confirmatory factor level variability model. British Journal of Mathematical and Statistical Psychology, 73(3), 452–473. https://doi.org/10.1111/bmsp.12196
Raju, N. S., Price, L. R., Oshima, T., & Nering, M. L. (2007). Standardized conditional SEM: A case for conditional reliability. Applied Psychological Measurement, 31(3), 169–180. https://doi.org/10.1177/0146621606291569
Rast, P., & Ferrer, E. (2018). A mixed-effects location scale model for dyadic interactions. Multivariate Behavioral Research, 53(5), 756–775. https://doi.org/10.1080/00273171.2018.1477577
Rast, P., Hofer, S. M., & Sparks, C. (2012). Modeling individual differences in within-person variation of negative and positive affect in a mixed effects location scale model using BUGS/JAGS. Multivariate Behavioral Research, 47(2), 177–200. https://doi.org/10.1080/00273171.2012.658328
Rast, P., Martin, S. R., Liu, S., & Williams, D. R. (2020). A new frontier for studying within-person variability: Bayesian multivariate generalized autoregressive conditional heteroskedasticity models. Psychological Methods. https://doi.org/10.1037/met0000357
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.
Raykov, T. (1997). Estimation of composite reliability for congeneric measures. Applied Psychological Measurement, 21(2), 173–184. https://doi.org/10.1177/01466216970212006
Raykov, T. (2001). Estimation of congeneric scale reliability using covariance structure analysis with nonlinear constraints. British Journal of Mathematical and Statistical Psychology, 54(2), 315–323. https://doi.org/10.1348/000711001159582
Raykov, T., & du Toit, S. H. C. (2005). Estimation of reliability for multiple-component measuring instruments in hierarchical designs. Structural Equation Modeling: A Multidisciplinary Journal, 12(4), 536–550. https://doi.org/10.1207/s15328007sem1204_2
Rothenberg, T. J. (1971). Identification in parametric models. Econometrica, 39(3), 577. https://doi.org/10.2307/1913267
Schad, D. J., Betancourt, M., & Vasishth, S. (2019). Toward a principled Bayesian workflow in cognitive science.
Solin, A., & Särkkä, S. (2019). Hilbert space methods for reduced-rank Gaussian process regression. Statistics and Computing. https://doi.org/10.1007/s11222-019-09886-w
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432. https://doi.org/10.1007/s11222-016-9696-4
Viallefont, A., Lebreton, J.-D., Reboulet, A.-M., & Gory, G. (1998). Parameter identifiability and model selection in capture-recapture models: A numerical approach. Biometrical Journal, 40(3), 313–325. https://doi.org/10.1002/(SICI)1521-4036(199807)40:3<313::AID-BIMJ313>3.0.CO;2-2
Williams, D. R., Liu, S., Martin, S. R., & Rast, P. (2019). Bayesian multivariate mixed-effects location scale modeling of longitudinal relations among affective traits, states, and physical activity. PsyArXiv. https://doi.org/10.31234/osf.io/4kfjp
Williams, D. R., Martin, S. R., & Rast, P. (2019). Putting the individual into reliability: Bayesian testing of homogeneous within-person variance in hierarchical models. PsyArXiv. https://doi.org/10.31234/OSF.IO/HPQ7W
Yang, Y., Bhattacharya, A., & Pati, D. (2017). Frequentist coverage and sup-norm convergence rate in Gaussian process regression. Retrieved from arxiv.org/abs/1708.04753
Zhang, X., & Savalei, V. (2019). Examining the effect of missing data on RMSEA and CFI under normal theory full-information maximum likelihood. Structural Equation Modeling: A Multidisciplinary Journal. https://doi.org/10.1080/10705511.2019.1642111


Author information

Authors and Affiliations

Department of Psychology, University of California, Davis, 135 Young Hall, 1 Shields Avenue, Davis, CA, 95616, USA
Stephen R. Martin & Philippe Rast

Ethics declarations

Data Availability Statement

The datasets generated during the current study are available from the corresponding author on request. The Holzinger–Swineford (1939) dataset is freely available in the lavaan (https://cran.r-project.org/web/packages/lavaan/index.html) and MBESS (https://cran.r-project.org/web/packages/MBESS/index.html) R packages.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Research reported in this publication was supported by the National Institute On Aging of the National Institutes of Health under Award Number R01AG050720 to PR. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

About this article

Cite this article

Martin, S.R., Rast, P. The Reliability Factor: Modeling Individual Reliability with Multiple Items from a Single Assessment. Psychometrika 87, 1318–1342 (2022). https://doi.org/10.1007/s11336-022-09847-9


Received: 22 January 2020
Revised: 04 September 2021
Published: 21 March 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s11336-022-09847-9
