Latent variable mixture models: a promising approach for the validation of patient reported outcomes
A fundamental assumption of patient-reported outcomes (PRO) measurement is that all individuals interpret questions about their health status in a consistent manner, such that a measurement model can be constructed that is equivalently applicable to all people in the target population. The related assumption of sample homogeneity has been assessed in various ways, including the many approaches to differential item functioning analysis.
This expository paper describes the use of latent variable mixture modeling (LVMM), in conjunction with item response theory (IRT), to examine: (a) whether a sample is homogeneous with respect to a unidimensional measurement model, (b) implications of sample heterogeneity with respect to model-predicted scores (theta), and (c) sources of sample heterogeneity. An example is provided using the 10 items of the Short-Form Health Survey (SF-36®) physical functioning subscale with data from the Canadian Community Health Survey (2003) (N = 7,030 adults in Manitoba).
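To make the modelling idea concrete, the following is a minimal sketch, not the authors' software specification, of how a graded response model assigns response-category probabilities to one ordinal item, and how a mixture averages over latent classes whose item parameters differ. The function names, class weights, discriminations, and thresholds are all hypothetical and chosen only for illustration. The three response categories mirror the three-level response format of the SF-36 physical functioning items.

```python
import math
from typing import List, Sequence, Tuple


def grm_category_probs(theta: float, a: float, thresholds: Sequence[float]) -> List[float]:
    """Graded response model for a single ordinal item.

    P(Y >= k | theta) = logistic(a * (theta - b_k)) for ordered thresholds b_k.
    Category probabilities are differences of adjacent cumulative curves.
    """
    cumulative = [1.0]  # responding in the lowest category or above is certain
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)  # nothing lies above the highest category
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


def mixture_category_probs(
    theta: float,
    class_params: Sequence[Tuple[float, float, Sequence[float]]],
) -> List[float]:
    """Category probabilities at a given theta, averaged over latent classes.

    class_params holds one (class_weight, discrimination, thresholds) tuple
    per latent class; the weights are assumed to sum to 1.
    """
    n_categories = len(class_params[0][2]) + 1
    marginal = [0.0] * n_categories
    for weight, a, thresholds in class_params:
        for k, p in enumerate(grm_category_probs(theta, a, thresholds)):
            marginal[k] += weight * p
    return marginal


# Hypothetical two-class example: the same item behaves differently per class.
example_classes = [
    (0.6, 2.0, [-1.0, 0.0]),  # class 1: weight, discrimination, thresholds
    (0.4, 1.0, [-0.5, 1.0]),  # class 2
]
example_probs = mixture_category_probs(0.0, example_classes)
print(example_probs)
```

If the item parameters were identical across classes, the mixture would collapse to a single graded response model, which corresponds to the homogeneous-sample case in (a).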
The sample was not homogeneous with respect to a unidimensional measurement structure. Specification of three latent classes, to account for sample heterogeneity, resulted in significantly improved model fit. The latent classes were partially explained by demographic and health-related variables.
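Comparisons of fit across models with different numbers of latent classes are commonly summarized with information criteria such as those named in the abbreviation list. The sketch below uses the standard formulas for the AIC, BIC, and sample-size-adjusted BIC. The log-likelihoods and parameter counts are invented placeholders, not results from this study; only the sample size (N = 7,030) is taken from the text. The function name and the `SABIC` label are likewise assumptions made for illustration.

```python
import math
from typing import Dict


def information_criteria(log_likelihood: float, n_params: int, n_obs: int) -> Dict[str, float]:
    """Standard information criteria; lower values indicate better relative fit."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        # Sample-size-adjusted BIC replaces n with (n + 2) / 24 in the penalty.
        "SABIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


N_OBS = 7030  # sample size reported in the text

# Placeholder fits: number of classes -> (log-likelihood, number of parameters).
# These values are invented for illustration only.
hypothetical_fits = {
    1: (-40000.0, 30),
    2: (-39000.0, 61),
    3: (-38700.0, 92),
}

criteria_by_classes = {
    k: information_criteria(ll, p, N_OBS) for k, (ll, p) in hypothetical_fits.items()
}
best_by_bic = min(criteria_by_classes, key=lambda k: criteria_by_classes[k]["BIC"])
print(best_by_bic, criteria_by_classes[best_by_bic])
```

Information criteria only rank the candidate models. Likelihood ratio tests such as the VLMR LRT and the BLRT address whether adding a class improves fit significantly, and they require refitting the models rather than a closed-form calculation.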
The illustrative analyses demonstrate the value of LVMM in revealing the potential implications of sample heterogeneity in the measurement of PROs.
Keywords: Self-report measurement; Psychometrics; Measurement validity; Physical function
Akaike information criterion
Bootstrapped likelihood ratio test
Bayesian information criterion
Sample-adjusted Bayesian information criterion
Canadian Community Health Survey
Differential item functioning
Item response theory
Graded response model
Latent variable mixture model
Patient reported outcomes
Vuong-Lo-Mendell-Rubin likelihood ratio test (VLMR LRT)
This research was completed with support from the Michael Smith Foundation for Health Research, the Arthritis Research Centre of Canada, and the Canadian Arthritis Network. The research and analysis are based on data from Statistics Canada, and the opinions expressed do not represent the views of Statistics Canada.