Abstract
Overdispersion and intracluster correlation are two important issues in clustered categorical/ordinal data, and failure to account for them can result in misleading inferences. Generalized estimating equations and mixed effects models are two common frameworks for analyzing clustered data, and they have recently been combined and extended into the marginalized random effects model. Location scale models are a different extension of mixed effects models that additionally allow the variance to vary as a function of covariates. In this paper, we extend a marginalized location scale model for longitudinal ordinal responses by allowing a log-linear model for variance components that facilitates both population-averaged and subject-specific interpretations. We then extend the marginalized location scale model by incorporating an additional random term to handle overdispersion in the data. We conduct extensive simulation studies to investigate the statistical properties of the maximum likelihood estimators of the model parameters. We illustrate this methodology using a dataset on children’s growth failure.
References
Agresti, A. and Lang, J.B. (1993). A proportional odds model with subject-specific effects for repeated ordered categorical responses. Biometrika 80, 527–534.
Breslow, N.E. (1984). Extra-Poisson variation in log-linear models. Appl. Stat. 33, 38–44.
Choo-Wosoba, H. and Datta, S. (2018). Analyzing clustered count data with a cluster-specific random effect zero-inflated Conway–Maxwell–Poisson distribution. J. Appl. Stat. 45, 799–814.
Choo-Wosoba, H., Levy, S.M. and Datta, S. (2016). Marginal regression models for clustered count data based on zero-inflated Conway–Maxwell–Poisson distribution with applications. Biometrics 72, 606–618.
Cole, S.Z. and Lanham, J.S. (2011). Failure to thrive: an update. Am. Fam. Physician 83, 829–834.
Cox, C. (1995). Location-scale cumulative odds models for ordinal data: a generalized non-linear model approach. Stat. Med. 14, 1191–1203.
Ezzet, F. and Whitehead, J. (1991). A random effects model for ordinal responses from a crossover trial. Stat. Med. 10, 901–907.
Fielding, A., Yang, M. and Goldstein, H. (2003). Multilevel ordinal models for examination grades. Stat. Model. 3, 127–153.
Gelman, A. and Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, New York.
Griswold, M., Swihart, B., Caffo, B. and Zeger, S. (2013). Practical marginalized multilevel models. Stat 2, 129–142.
Heagerty, P. and Zeger, S. (2000). Marginalized multilevel models and likelihood inference. Stat. Sci. 15, 1–19.
Hedeker, D. and Gibbons, R.D. (1994). A random-effects ordinal regression model for multilevel analysis. Biometrics 50, 933–944.
Hedeker, D. and Gibbons, R.D. (2006). Wiley, New York.
Hedeker, D. and Mermelstein, R.J. (1998). A multilevel thresholds of change model for analysis of stages of change data. Multivar. Behav. Res. 33, 427–455.
Hedeker, D., Demirtas, H. and Mermelstein, R.J. (2009). A mixed ordinal location scale model for analysis of Ecological Momentary Assessment (EMA). Stat. Interface 2, 391–401.
Hinde, J. and Demetrio, C.G.B. (1998a). Overdispersion: models and estimation. Comput. Stat. Data Anal. 27, 151–170.
Iddi, S. and Molenberghs, G. (2013). A marginalized model for zero-inflated, overdispersed and correlated count data. Electron. J. Appl. Stat. Anal. 6, 149–165.
Ivanova, A., Molenberghs, G. and Verbeke, G. (2014). A model for overdispersed hierarchical ordinal data. Stat. Model. 14, 399–415.
Kong, M., Xu, S., Levy, S.M. and Datta, S. (2015). GEE type inference for clustered zero-inflated negative binomial regression with application to dental caries. Comput. Stat. Data Anal. 85, 54–66.
Kuczmarski, R.J., Ogden, C.L., Guo, S.S., Grummer-Strawn, L.M., Flegal, K.M., Mei, Z. et al. (2002). 2000 CDC growth charts for the United States: methods and development. Vital Health Stat. 11, 1–190.
Laird, N.M. and Ware, J.H. (1982). Random-effects models for longitudinal data. Biometrics 38, 963–974.
Lawless, J. (1987). Negative binomial and mixed Poisson regression. Can. J. Stat. 15, 209–225.
Lee, K. and Daniels, M. (2008). Marginalized models for longitudinal ordinal data with application to quality of life studies. Stat. Med. 27, 4359–4380.
Lesaffre, E. and Spiessens, B. (2001). On the effect of the number of quadrature points in a logistic random effects model: an example. Appl. Stat. 50, 325–335.
McCullagh, P. (1980). Regression models for ordinal data (with discussion). J. R. Stat. Soc. Ser. B 42, 109–142.
Molenberghs, G., Verbeke, G. and Demétrio, C. (2007). An extended random-effects approach to modeling repeated, overdispersed count data. Lifetime Data Anal. 13, 513–531.
Molenberghs, G., Verbeke, G., Demétrio, C. and Vieira, A. (2010). A family of generalized linear models for repeated measures with normal and conjugate random effects. Stat. Sci. 25, 325–347.
Peterson, B. and Harrell, F.E. (1990). Partial proportional odds models for ordinal response variables. Appl. Stat. 39, 205–217.
Rao, T.J. (1999). Mahalanobis’ contributions to sample survey. Resonance 4, 27–33.
Raudenbush, S.W., Bryk, A.S., Cheong, Y.F. and Congdon, R. (2004). HLM 6: Hierarchical Linear and Nonlinear Modeling. Scientific Software International Inc, Chicago.
Sellers, K.F. and Shmueli, G. (2010). A flexible regression model for count data. Ann. Appl. Stat. 4, 943–961.
Tosteson, A.N. and Begg, C.B. (1988). A general regression methodology for ROC curve estimation. Med. Decis. Mak. 8, 204–215.
Tutz, G. and Hennevogl, W. (1996). Random effects in ordinal regression models. Comput. Stat. Data Anal. 22, 537–557.
Vahabi, N., Salehi, M., Azarbar, A., Zayeri, F. and Kholdi, N. (2014). Application of multilevel model for assessing the affected factors on failure to thrive in children less than two years old. Razi J. Med. Sci. 14, 91–99.
Vahabi, N., Kazemnejad, A. and Fallah, R. (2016). Length and weight growth trends for children less than two years old in Zanjan, Iran: Longitudinal modeling. Med. J. Islam Repub. Iran 30, 374.
Vahabi, N., Kazemnejad, A. and Datta, S. (2017). A joint overdispersed marginalized random effects model for analyzing two or more longitudinal ordinal responses. Stat. Methods Med. Res. Epub ahead of print. https://doi.org/10.1177/0962280217714616.
WHO Multicenter Growth Reference Study Group (2006). WHO Child Growth Standards based on length/height, weight and age. Acta Paediatr. Suppl. 450, 76–85.
Zeger, S., Liang, K. and Albert, P. (1988). Models for longitudinal data: a generalized estimating equation approach. Biometrics 44, 1049–1060.
Acknowledgments
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest.
Appendices
Appendix
We provide a proof of Theorem 2 below. Since Theorem 1 can be obtained by essentially replacing 𝜃 by the constant 1, a separate proof of Theorem 1 is not provided.
The following lemma is needed for the proof of Theorem 2.
Lemma 1.
For a normally distributed random variable \(b_{i}\), \(\int \Phi\left(\frac{\Delta_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i} = \sigma_{\varepsilon_{ij}}\, \Phi\left(\frac{\Delta_{ijk}}{\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}}\right)\).
Proof of Lemma 1.
\(\int \Phi\left(\frac{\Delta_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i} = \sigma_{\varepsilon_{ij}}\, \Phi\left(\frac{\Delta_{ijk}}{\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}}\right)\). □
By definition,
Hence
Here we make a change of variable \(t=\frac {z+b_{i}}{\sigma _{\varepsilon _{ij}}}\) to conclude that the above expression equals
Furthermore, letting \(\acute {z}=\frac {z}{\sigma _{\varepsilon _{ij}}}\) and \(\acute {b}_{i}=\frac {b_{i}}{\sigma _{\varepsilon _{ij}}}\) we see that the above expression further equals
Further, let us define:
hence the above expression equals
Since \(\acute{b}_{i}^{2}\, \sigma_{\varepsilon_{ij}}^{2}\, \sigma_{b}^{-2} + (\acute{z}+\acute{b}_{i})^{2} = \acute{b}_{i}^{2} A + 2\acute{z}\acute{b}_{i} + \acute{z}^{2} = (\acute{b}_{i}-B)^{2} A + C\), the above expression equals
Multiplying the above formula by \(\frac{1}{A^{\frac{1}{2}} A^{-\frac{1}{2}}\, E^{\frac{1}{2}} E^{-\frac{1}{2}}}\), which equals one, it further equals
With a slight rearrangement the above equation equals
Since the first integral equals one, the above formula equals
Here we make a change of variable \(q = \acute{z}\sqrt{E}\) to conclude that the above expression equals
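The completing-the-square identity used in the proof of Lemma 1, \(\acute{b}_{i}^{2}\sigma_{\varepsilon_{ij}}^{2}\sigma_{b}^{-2} + (\acute{z}+\acute{b}_{i})^{2} = (\acute{b}_{i}-B)^{2}A + C\), can be checked numerically. The sketch below is only an illustration: the closed forms it uses for \(A\), \(B\) and \(C\) are assumptions obtained by matching the coefficients of \(\acute{b}_{i}\) on both sides of that identity, and the function names and sampled parameter ranges are hypothetical.

```python
import random


def square_constants(z, sigma_eps, sigma_b):
    """Return (A, B, C) with
    b^2 * sigma_eps^2 / sigma_b^2 + (z + b)^2 == A * (b - B)^2 + C for all b.

    Assumed closed forms, found by matching coefficients of b:
    A from the b^2 term, B from the b term, C from the constant term.
    """
    A = 1.0 + sigma_eps**2 / sigma_b**2
    B = -z / A
    C = z**2 * (1.0 - 1.0 / A)
    return A, B, C


def lhs(b, z, sigma_eps, sigma_b):
    """Left-hand side: the exponent before completing the square."""
    return b**2 * sigma_eps**2 / sigma_b**2 + (z + b) ** 2


def rhs(b, z, sigma_eps, sigma_b):
    """Right-hand side: the exponent after completing the square."""
    A, B, C = square_constants(z, sigma_eps, sigma_b)
    return A * (b - B) ** 2 + C


def max_abs_gap(n_draws=1000, seed=0):
    """Largest |lhs - rhs| over random draws of (b, z, sigma_eps, sigma_b)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_draws):
        b = rng.uniform(-3.0, 3.0)
        z = rng.uniform(-3.0, 3.0)
        sigma_eps = rng.uniform(0.2, 3.0)
        sigma_b = rng.uniform(0.2, 3.0)
        gap = abs(lhs(b, z, sigma_eps, sigma_b) - rhs(b, z, sigma_eps, sigma_b))
        worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    print("largest absolute gap:", max_abs_gap())
```

Under these assumed forms, \(C = \acute{z}^{2}\,\sigma_{\varepsilon_{ij}}^{2}/(\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2})\), which is consistent with the denominator \(\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}\) appearing in the statement of Lemma 1.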
Proof of Theorem 2.
\(\Delta_{ijk} = \Phi^{-1}\left\{\frac{1}{E(\theta)\, \sigma_{e_{ij}}}\, \mathrm{expit}\left(\alpha_{0k} + \boldsymbol{x}_{ij}^{T} \boldsymbol{\beta}\right)\right\} \times \sqrt{\sigma_{e_{ij}}^{2}+\sigma_{b}^{2}}.\) □
Knowing the relationship between the marginal and conditional probabilities as follows:
we can write:
where \(\int \Phi\left(\frac{\Delta_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i}\) simplifies to \(\sigma_{\varepsilon_{ij}}\, \Phi\left(\frac{\Delta_{ijk}}{\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}}\right)\) by Lemma 1.
Therefore,
and hence
A slight rearrangement shows
implying
and finally
Proof of Theorem 3.
For the OMLS model in equation (2.7), if \(\alpha_{01} < \cdots < \alpha_{0,K-1}\) then \(\Delta_{ij1} < \cdots < \Delta_{ij,K-1}\). □
The proof follows along the same lines as in Vahabi et al. (2017).
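The ordering in Theorem 3 can be illustrated numerically with the expression for \(\Delta_{ijk}\) in Theorem 2. The sketch below is not the authors' code: it simply evaluates that expression for increasing intercepts. All numeric values are hypothetical, `mean_theta` stands in for \(E(\theta)\), `xb` for \(\boldsymbol{x}_{ij}^{T}\boldsymbol{\beta}\), and the parameters are chosen so that the argument of \(\Phi^{-1}\) stays inside \((0, 1)\).

```python
import math
from statistics import NormalDist


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def delta(alpha_0k, xb, mean_theta, sigma_e, sigma_b):
    """Evaluate Delta_ijk using the expression stated in Theorem 2.

    mean_theta plays the role of E(theta); all inputs are illustrative.
    """
    p = expit(alpha_0k + xb) / (mean_theta * sigma_e)
    if not 0.0 < p < 1.0:
        raise ValueError("argument of the inverse normal CDF must lie in (0, 1)")
    return NormalDist().inv_cdf(p) * math.sqrt(sigma_e**2 + sigma_b**2)


# Hypothetical, strictly increasing intercepts alpha_01 < ... < alpha_0,K-1.
alphas = [-1.5, -0.3, 0.8, 2.0]

# Hypothetical parameter values; mean_theta * sigma_e > 1 keeps p below 1.
deltas = [
    delta(a, xb=0.4, mean_theta=0.8, sigma_e=1.5, sigma_b=1.0) for a in alphas
]

# Delta should inherit the ordering of the intercepts.
is_increasing = all(d1 < d2 for d1, d2 in zip(deltas, deltas[1:]))

if __name__ == "__main__":
    print(deltas)
    print("ordered:", is_increasing)
```

The ordering carries over because, for fixed covariates and variance components, both the expit function and \(\Phi^{-1}\) are strictly increasing, so \(\Delta_{ijk}\) is a strictly increasing function of \(\alpha_{0k}\).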
Web Supplements
SAS Code for Generating Simulated Data Based on the Proposed OMLS Model in Section 2.3
SAS Code for the LS Model Reviewed in Section 2.1
SAS Code for the MLS Model Introduced in Section 2.2
SAS Code for the OMLS Model Introduced in Section 2.3
Cite this article
Vahabi, N., Kazemnejad, A. & Datta, S. A Marginalized Overdispersed Location Scale Model for Clustered Ordinal Data. Sankhya B 80 (Suppl 1), 103–134 (2018). https://doi.org/10.1007/s13571-018-0162-5
DOI: https://doi.org/10.1007/s13571-018-0162-5
Keywords and phrases.
- Longitudinal ordinal outcome
- Overdispersion
- Beta distribution
- Location-scale model
- Marginalized framework.