Date: 11 Nov 2012

Joint modeling of multiple longitudinal cost outcomes using multivariate generalized linear mixed models


Healthcare cost data are commonly modeled using total cost aggregated across multiple categories or sources (e.g., inpatient, outpatient, prescriptions) as the dependent variable. However, this approach can hide the differential impact of covariates on the individual cost categories. An alternative is to model each cost category separately, but this can also lead to wrong conclusions because it fails to account for the interdependence among the multiple cost outcomes. We therefore propose a multivariate generalized linear mixed model (mGLMM) that allows joint modeling of longitudinal cost data from multiple sources. We assessed four approaches: (1) shared random intercept, (2) shared random intercept and slope, (3) separate random intercepts from a joint multivariate distribution, and (4) separate random intercepts and slopes from a joint multivariate distribution. The approaches differ in how they account for the correlation among the multiple cost outcomes. Models were compared using goodness-of-fit measures and residual plots. Longitudinal cost data from a national cohort of 740,195 veterans with diabetes, followed from 2002 to 2006, were used to demonstrate joint modeling. Among the models examined, the separate random intercept approach had the lowest AIC/BIC in both the log-normal and gamma GLMMs. However, for our data example, the shared random intercept approach appeared sufficient, as the more complex models did not lead to qualitatively different conclusions.
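
As an illustrative sketch only, the four random-effects structures named above could be written along the following lines. All notation is an assumption introduced here and is not taken from the source: Y_{ijk} denotes the cost in category k for subject i at occasion j (time t_{ij}), g_k is a link function, x_{ij} is a covariate vector, beta_k are category-specific coefficients, and the lambda terms are hypothetical scaling factors that let a shared random effect act with different strength on each category (one of them fixed at 1 for identifiability).

```latex
% Sketch under assumed notation; not the authors' exact parameterization.
\begin{align*}
\text{(1) shared intercept:}\quad
  & g_k\!\left(E[Y_{ijk}\mid b_i]\right)
    = x_{ij}^{\top}\beta_k + \lambda_k b_i,
  && b_i \sim N(0,\sigma_b^2) \\[4pt]
\text{(2) shared intercept and slope:}\quad
  & g_k\!\left(E[Y_{ijk}\mid b_{0i}, b_{1i}]\right)
    = x_{ij}^{\top}\beta_k + \lambda_{0k} b_{0i} + \lambda_{1k} b_{1i} t_{ij},
  && (b_{0i}, b_{1i})^{\top} \sim N(0, D) \\[4pt]
\text{(3) separate intercepts:}\quad
  & g_k\!\left(E[Y_{ijk}\mid b_{ik}]\right)
    = x_{ij}^{\top}\beta_k + b_{ik},
  && (b_{i1},\dots,b_{iK})^{\top} \sim N(0, \Sigma) \\[4pt]
\text{(4) separate intercepts and slopes:}\quad
  & g_k\!\left(E[Y_{ijk}\mid b_{0ik}, b_{1ik}]\right)
    = x_{ij}^{\top}\beta_k + b_{0ik} + b_{1ik} t_{ij},
  && (b_{0i1}, b_{1i1},\dots,b_{0iK}, b_{1iK})^{\top} \sim N(0, \Sigma^{*})
\end{align*}
```

In this sketch, structures (1) and (2) induce correlation among cost categories through a common subject-level effect, whereas (3) and (4) induce it through the off-diagonal elements of the joint covariance matrix of the category-specific random effects.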