A Marginalized Overdispersed Location Scale Model for Clustered Ordinal Data

Sankhya B

Abstract

Overdispersion and intra-cluster correlation are two important issues in clustered categorical/ordinal data, and failure to account for them can result in misleading inferences. Generalized estimating equations and mixed effects models are two common frameworks for analyzing clustered data, which have recently been combined and extended into the marginalized random effects model. Location scale models are a different extension of mixed effects models that additionally allow the variance to vary as a function of covariates. In this paper, we extend a marginalized location scale model for longitudinal ordinal responses by allowing a log-linear model for the variance components, which facilitates both population-averaged and subject-specific interpretations. We then extend the marginalized location scale model by incorporating an additional random term to handle overdispersion in the data. We conduct extensive simulation studies to investigate the statistical properties of the maximum likelihood estimators of the model parameters. We illustrate this methodology using a dataset on children’s growth failure.

References

  • Agresti, A. and Lang, J.B. (1993). A proportional odds model with subject-specific effects for repeated ordered categorical responses. Biometrika 80, 527–534.

  • Breslow, N.E. (1984). Extra-Poisson variation in log-linear models. Appl. Stat. 33, 38–44.

  • Choo-Wosoba, H. and Datta, S. (2018). Analyzing clustered count data with a cluster-specific random effect zero-inflated Conway–Maxwell–Poisson distribution. J. Appl. Stat. 45, 799–814.

  • Choo-Wosoba, H., Levy, S.M. and Datta, S. (2016). Marginal regression models for clustered count data based on zero-inflated Conway–Maxwell–Poisson distribution with applications. Biometrics 72, 606–618.

  • Cole, S.Z. and Lanham, J.S. (2011). Failure to thrive: an update. Am. Fam. Physician 83, 829–834.

  • Cox, C. (1995). Location-scale cumulative odds models for ordinal data: a generalized non-linear model approach. Stat. Med. 14, 1191–1203.

  • Ezzet, F. and Whitehead, J. (1991). A random effects model for ordinal responses from a crossover trial. Stat. Med. 10, 901–907.

  • Fielding, A., Yang, M. and Goldstein, H. (2003). Multilevel ordinal models for examination grades. Stat. Model. 3, 127–153.

  • Gelman, A. and Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, New York.

  • Griswold, M., Swihart, B., Caffo, B. and Zeger, S. (2013). Practical marginalized multilevel models. Stat 2, 129–142.

  • Heagerty, P. and Zeger, S. (2000). Marginalized multilevel models and likelihood inference. Stat. Sci. 15, 1–19.

  • Hedeker, D. and Gibbons, R.D. (1994). A random-effects ordinal regression model for multilevel analysis. Biometrics 50, 933–944.

  • Hedeker, D. and Gibbons, R.D. (2006). Wiley, New York.

  • Hedeker, D. and Mermelstein, R.J. (1998). A multilevel thresholds of change model for analysis of stages of change data. Multivar. Behav. Res. 33, 427–455.

  • Hedeker, D., Demirtas, H. and Mermelstein, R.J. (2009). A mixed ordinal location scale model for analysis of Ecological Momentary Assessment (EMA). Stat. Interface 2, 391–401.

  • Hinde, J. and Demetrio, C.G.B. (1998a). Overdispersion: models and estimation. Comput. Stat. Data Anal. 27, 151–170.

  • Iddi, S. and Molenberghs, G. (2013). A marginalized model for zero-inflated, overdispersed and correlated count data. Electron. J. Appl. Stat. Anal. 6, 149–165.

  • Ivanova, A., Molenberghs, G. and Verbeke, G. (2014). A model for overdispersed hierarchical ordinal data. Stat. Model. 14, 399–415.

  • Kong, M., Xu, S., Levy, S.M. and Datta, S. (2015). GEE type inference for clustered zero-inflated negative binomial regression with application to dental caries. Comput. Stat. Data Anal. 85, 54–66.

  • Kuczmarski, R.J., Ogden, C.L., Guo, S.S., Grummer-Strawn, L.M., Flegal, K.M., Mei, Z. et al. (2002). 2000 CDC growth charts for the United States: methods and development. Vital Health Stat. 11, 1–190.

  • Laird, N.M. and Ware, J.H. (1982). Random-effects models for longitudinal data. Biometrics 38, 963–974.

  • Lawless, J. (1987). Negative binomial and mixed Poisson regression. Can. J. Stat. 15, 209–225.

  • Lee, K. and Daniels, M. (2008). Marginalized models for longitudinal ordinal data with application to quality of life studies. Stat. Med. 27, 4359–4380.

  • Lesaffre, E. and Spiessens, B. (2001). On the effect of the number of quadrature points in a logistic random effects model: an example. Appl. Stat. 50, 325–335.

  • McCullagh, P. (1980). Regression models for ordinal data (with discussion). J. R. Stat. Soc. Ser. B 42, 109–142.

  • Molenberghs, G., Verbeke, G. and Demétrio, C. (2007). An extended random-effects approach to modeling repeated, overdispersed count data. Lifetime Data Anal. 13, 513–531.

  • Molenberghs, G., Verbeke, G., Demétrio, C. and Vieira, A. (2010). A family of generalized linear models for repeated measures with normal and conjugate random effects. Stat. Sci. 25, 325–347.

  • Peterson, B. and Harrell, F.E. (1990). Partial proportional odds models for ordinal response variables. Appl. Stat. 39, 205–217.

  • Rao, T.J. (1999). Mahalanobis’ contributions to sample survey. Resonance 4, 27–33.

  • Raudenbush, S.W., Bryk, A.S., Cheong, Y.F. and Congdon, R. (2004). HLM 6: Hierarchical Linear and Nonlinear Modeling. Scientific Software International Inc, Chicago.

  • Sellers, K.F. and Shmueli, G. (2010). A flexible regression model for count data. Ann. Appl. Stat. 4, 943–961.

  • Tosteson, A.N. and Begg, C.B. (1988). A general regression methodology for ROC curve estimation. Med. Decis. Mak. 8, 204–215.

  • Tutz, G. and Hennevogl, W. (1996). Random effects in ordinal regression models. Comput. Stat. Data Anal. 22, 537–557.

  • Vahabi, N., Salehi, M., Azarbar, A., Zayeri, F. and Kholdi, N. (2014). Application of multilevel model for assessing the affected factors on failure to thrive in children less than two years old. Razi J. Med. Sci. 14, 91–99.

  • Vahabi, N., Kazemnejad, A. and Fallah, R. (2016). Length and weight growth trends for children less than two years old in Zanjan, Iran: Longitudinal modeling. Med. J. Islam Repub. Iran 30, 374.

  • Vahabi, N., Kazemnejad, A. and Datta, S. (2017). A joint overdispersed marginalized random effects model for analyzing two or more longitudinal ordinal responses. Stat. Methods Med. Res. Epub ahead of print. https://doi.org/10.1177/0962280217714616.

  • WHO Multicenter Growth Reference Study Group (2006). WHO Child Growth Standards based on length/height, weight and age. Acta Paediatr. Suppl. 450, 76–85.

  • Zeger, S., Liang, K. and Albert, P. (1988). Models for longitudinal data: a generalized estimating equation approach. Biometrics 44, 1049–1060.

Acknowledgments

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Somnath Datta.

Ethics declarations

Conflict of interest

The Authors declare that there is no conflict of interest.

Appendices

Appendix

We provide a proof of Theorem 2 below. Since Theorem 1 can be obtained by essentially replacing \(\theta\) by the constant 1, a separate proof of Theorem 1 is not provided.

The following lemma is needed for the proof of Theorem 2.

Lemma 1.

For a normally distributed random variable \(b_{i}\) with density \(f\), mean zero and variance \(\sigma_{b}^{2}\),

$$\int {\Phi} \left( \frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i} =\sigma_{\varepsilon_{ij}} {\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}} \right). $$

Proof of Lemma 1.

By definition,

$${\Phi} (x)=\int\limits_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\left( \frac{-t^{2}}{2}\right) dt. $$

Hence

$$\begin{array}{@{}rcl@{}} \int {\Phi} \left( \frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i} &=&\int\limits_{-\infty}^{+\infty}\int\limits_{-\infty }^{\frac{{\Delta}_{ijk}+b_{i}}{\sigma _{\varepsilon_{ij}}}} \frac{1}{\sqrt {2\pi} } \exp\left( \frac{-t^{2}}{2}\right) \cdot \frac{1}{\sqrt {2\pi}\,\sigma_{b}}\exp\left( \frac{-b_{i}^{2}}{2\sigma_{b}^{2}}\right) dt\, db_{i}\\ &=&\int\limits_{-\infty}^{+\infty} \int\limits_{-\infty}^{\frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}} \left( \frac{1}{\sqrt{2\pi} }\right)^{2}\frac{1}{\sigma_{b}} \exp\left\{\frac{-1}{2} \left[b_{i}^{2} \sigma_{b}^{-2}+t^{2}\right]\right\} dt\, db_{i}. \end{array} $$

Here we make a change of variable \(t=\frac {z+b_{i}}{\sigma _{\varepsilon _{ij}}}\) to conclude that the above expression equals

$$\int\limits_{-\infty}^{+\infty} {\int\limits_{-\infty}^{{\Delta}_{ijk}} {{\left( \frac{1}{\sqrt{2\pi} } \right)}^{2}\frac{1}{\sigma_{b}} {exp}\left\{\frac{-1}{2} \left[{b_{i}^{2}}\sigma_{b}^{-2}+ {\left( \frac{z+b_{i}}{\sigma_{\varepsilon_{ij}}}\right)}^{2}\right]\right\}} \frac{1}{\sigma_{\varepsilon_{ij}}}dz} db_{i}. $$

Furthermore, letting \(\acute {z}=\frac {z}{\sigma _{\varepsilon _{ij}}}\) and \(\acute {b}_{i}=\frac {b_{i}}{\sigma _{\varepsilon _{ij}}}\) we see that the above expression further equals

$$\int\limits_{-\infty}^{+\infty}{\int\limits_{-\infty}^{{\Delta}_{ijk}} {{\left( \frac{1}{\sqrt {2\pi} } \right)}^{2}\frac{1}{\sigma_{b}} {exp}\left\{\frac{-1}{2} \left[\acute{b}_{i}^{2} \sigma _{\varepsilon_{ij}}^{2} \sigma_{b}^{-2}+ {(\acute{z}+\acute{b}_{i})}^{2}\right]\right\}} \frac{1}{\sigma_{\varepsilon_{ij}}}\sigma_{\varepsilon_{ij}}d\acute{z}} \sigma_{\varepsilon _{ij}}d\acute{b}_{i}. $$

Further, let us define:

$$\left\{ \begin{array}{l} A = 1+\frac{\sigma_{\varepsilon_{ij}}^{2}}{\sigma_{b}^{2}}, \\ B=-\acute{z}\, \sigma_{b}^{2}, \\ C=\acute{z}^{2}E, \\ E=\frac{1}{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}, \end{array} \right. $$

hence the above expression equals

$$\int\limits_{-\infty}^{+\infty}{\int\limits_{-\infty}^{{\Delta}_{ijk}} {{\left( \frac{1}{\sqrt {2\pi} } \right)}^{2}\frac{\sigma_{\varepsilon_{ij}}}{\sigma_{b}} {exp}\left\{\frac{-1}{2} [\acute{b}_{i}^{2}A + 2\acute{z}\acute{b}_{i}+\acute{z}^{2}]\right\}} d\acute{z}} d\acute{b}_{i}. $$

Since \(\acute{b}_{i}^{2} \sigma_{\varepsilon_{ij}}^{2} \sigma_{b}^{-2}+ (\acute{z}+\acute{b}_{i})^{2}=\acute{b}_{i}^{2}A + 2\acute{z}\acute{b}_{i}+\acute{z}^{2}=(\acute{b}_{i}-B)^{2}A+C\), the above expression equals

$$\int\limits_{-\infty}^{+\infty} {\int\limits_{-\infty}^{{\Delta}_{ijk}} {{\left( \frac{1}{\sqrt {2\pi} } \right)}^{2}\frac{\sigma_{\varepsilon_{ij}}}{\sigma_{b}} {exp}\left\{\frac{-1}{2} \frac{{(\acute{b}_{i}-B)}^{2}}{\frac{1}{A}}\right\} . {exp}\left\{\frac{-1}{2}\frac{\acute{z}^{2}}{\frac{1}{E}}\right\}} d\acute{z}} d\acute{b}_{i}. $$

Multiplying the above formula by \(\frac{1}{A^{\frac{1}{2}}A^{-\frac{1}{2}}E^{\frac{1}{2}}E^{-\frac{1}{2}}}\), which equals one, we see that it further equals

$$\int\limits_{-\infty}^{+\infty} \int\limits_{-\infty}^{{\Delta}_{ijk}} \frac{1}{A^{\frac{1}{2}}A^{-\frac{1}{2}}E^{\frac{1}{2}}E^{-\frac{1}{2}}} \left( \frac{1}{\sqrt{2\pi}}\right)^{2}\frac{\sigma_{\varepsilon_{ij}}}{\sigma_{b}} \exp\left\{\frac{-1}{2} \frac{(\acute{b}_{i}-B)^{2}}{\frac{1}{A}}\right\} \cdot \exp\left\{\frac{-1}{2}\frac{\acute{z}^{2}}{\frac{1}{E}}\right\} d\acute{z}\, d\acute{b}_{i}. $$

With a slight rearrangement the above equation equals

$$\sigma_{\varepsilon_{ij}}\times \int\limits_{-\infty}^{+\infty} \frac{1}{\sqrt {2\pi }\, A^{-\frac{1}{2}}}\exp\left\{\frac{-1}{2} \frac{(\acute{b}_{i}-B)^{2}}{\frac{1}{A}}\right\}d\acute{b}_{i} \int\limits_{-\infty}^{{\Delta}_{ijk}} \frac{1}{\sqrt{2\pi}\, E^{-\frac{1}{2}}}\exp\left\{\frac{-1}{2} \frac{\acute{z}^{2}}{\frac{1}{E}}\right\} d\acute{z}. $$

Since the first integral equals one, the above formula equals

$$\sigma_{\varepsilon_{ij}}\times 1\times \int\limits_{-\infty}^{{\Delta}_{ijk}} \frac{1}{\sqrt {2\pi}\, E^{-\frac{1}{2}}}\exp\left\{\frac{-1}{2} \frac{\acute{z}^{2}}{\frac{1}{E}}\right\} d\acute{z}. $$

Here we make a change of variable \(q= \acute {z}\sqrt E \) to conclude that the above expression equals

$$\begin{array}{l} \sigma_{\varepsilon_{ij}}\times \int\limits_{-\infty}^{\frac{{\Delta}_{ijk}}{\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}}} \frac{1}{\sqrt {2\pi}\, E^{-\frac{1}{2}}}\exp\left\{\frac{-1}{2} q^{2}\right\} E^{-\frac{1}{2}}\, dq \\ \quad =\sigma_{\varepsilon_{ij}}{\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}} \right). \end{array} $$
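The Gaussian integral in Lemma 1 can also be checked numerically. The following Python sketch is an illustration only and is not part of the original appendix: the function names, the Simpson grid and the parameter values are all assumptions. It approximates the left-hand side by quadrature and compares it with \({\Phi}\left(\Delta_{ijk}/\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}\right)\), the value given by the standard identity for the difference of two independent normal variables. For the values tried, the quadrature agrees with this quantity, so it matches the right-hand side as printed in Lemma 1 when \(\sigma_{\varepsilon_{ij}}=1\); for other values the two differ by the leading factor \(\sigma_{\varepsilon_{ij}}\), which may be worth checking against the parameterization used in the main text.

```python
from statistics import NormalDist
import math

STD = NormalDist()  # standard normal: cdf is Phi, pdf is the N(0, 1) density


def lhs_integral(delta, sigma_eps, sigma_b, n=4001, width=10.0):
    """Approximate the integral over b of Phi((delta + b) / sigma_eps) * f(b),
    where f is the N(0, sigma_b^2) density, using composite Simpson's rule
    on [-width * sigma_b, width * sigma_b]. n must be odd."""
    f_b = NormalDist(0.0, sigma_b)
    lo = -width * sigma_b
    h = 2.0 * width * sigma_b / (n - 1)
    total = 0.0
    for i in range(n):
        b = lo + i * h
        # Simpson weights: 1 at the ends, 4 at odd nodes, 2 at even interior nodes
        w = 1 if i in (0, n - 1) else (4 if i % 2 else 2)
        total += w * STD.cdf((delta + b) / sigma_eps) * f_b.pdf(b)
    return total * h / 3.0


def convolution_value(delta, sigma_eps, sigma_b):
    """Phi(delta / sqrt(sigma_eps^2 + sigma_b^2))."""
    return STD.cdf(delta / math.hypot(sigma_eps, sigma_b))


if __name__ == "__main__":
    # Illustrative (hypothetical) parameter values
    for s_eps in (1.0, 2.0):
        num = lhs_integral(0.7, s_eps, 1.3)
        ref = convolution_value(0.7, s_eps, 1.3)
        print(f"sigma_eps={s_eps}: quadrature={num:.8f}, Phi term={ref:.8f}")
```

Simpson's rule is used here only to keep the sketch free of third-party dependencies; Gauss-Hermite quadrature would be a more common choice for integrals against a normal density.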

Proof of Theorem 2.

Theorem 2 gives

$${\Delta}_{ijk} = {\Phi}^{-1}\left\{\frac{1}{E(\theta)\, \sigma_{\varepsilon_{ij}}}\, expit\left(\alpha_{0k}+ \boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta}\right)\right\} \times \sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}. $$

Knowing the relationship between the marginal and conditional probabilities as follows:

$$F_{ijk}=\int F_{ijk} (b_{i}\theta ) f(b_{i}) db_{i}, $$

we can write:

$$\begin{array}{@{}rcl@{}} \beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} =Logit\left[\int {E(\theta ) {\Phi} \left( \frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i}) db_{i}} \right],\\ \beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} =Logit\left[E(\theta )\int {{\Phi} \left( \frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i}) db_{i}} \right], \end{array} $$
(A1)

where \(\int {{\Phi } (\frac {{\Delta }_{ijk}+b_{i}}{\sigma _{\varepsilon _{ij}}}) f(b_{i}) db_{i}} \) simplifies to \(\sigma _{\varepsilon _{ij}} {\Phi } (\frac {{\Delta }_{ijk}}{\sqrt {\sigma _{\varepsilon _{ij}}^{2}+{\sigma _{b}^{2}}}} )\) by Lemma 1.

Therefore,

$$\beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} =Logit\left[E(\theta ) \sigma_{\varepsilon_{ij}} {\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+{\sigma_{b}^{2}}}} \right)\right],$$

and hence

$$expit[\beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} ]=E(\theta ) \sigma_{\varepsilon_{ij}} {\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}} \right). $$

A slight rearrangement shows

$$\frac{1}{\sigma_{\varepsilon_{ij}} E(\theta )}expit[\beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} ]={\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}} \right), $$

implying

$${\Phi}^{-1}\left[\frac{1}{\sigma_{\varepsilon_{ij}} E(\theta )} expit \{\beta_{0k}+ \boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta }\}\right]=\frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}}, $$

and finally

$${{\Delta}_{ijk}={\Phi}}^{-1}\left[\frac{1}{\sigma_{\varepsilon_{ij}} E(\theta )} expit \{\beta_{0k}+ \boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta}\} \right]\times \sqrt {\sigma_{\varepsilon_{ij}}^{2}+{\sigma_{b}^{2}}}. $$
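Equation (A1) also defines \({\Delta}_{ijk}\) implicitly, so it can be solved numerically without using any closed form. The Python sketch below is a minimal illustration under stated assumptions: the distribution of \(\theta\) is defined in the main text and is not reproduced here, so \(E(\theta)\) is passed in as a plain number `e_theta`; `eta` stands for the marginal linear predictor \(\beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}}\boldsymbol{\beta}\); and the function names, quadrature grid and bisection bounds are hypothetical choices. It finds the \({\Delta}_{ijk}\) for which \(E(\theta)\int {\Phi}\left(\frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i}\) equals \(expit(\text{eta})\).

```python
from statistics import NormalDist
import math

STD = NormalDist()  # standard normal distribution


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_prob(delta, sigma_eps, sigma_b, e_theta=1.0, n=2001, width=8.0):
    """e_theta times the integral of Phi((delta + b) / sigma_eps) * f(b) db,
    with f the N(0, sigma_b^2) density, by composite Simpson's rule."""
    f_b = NormalDist(0.0, sigma_b)
    lo = -width * sigma_b
    h = 2.0 * width * sigma_b / (n - 1)
    total = 0.0
    for i in range(n):
        b = lo + i * h
        w = 1 if i in (0, n - 1) else (4 if i % 2 else 2)
        total += w * STD.cdf((delta + b) / sigma_eps) * f_b.pdf(b)
    return e_theta * total * h / 3.0


def solve_delta(eta, sigma_eps, sigma_b, e_theta=1.0,
                lo=-50.0, hi=50.0, tol=1e-10):
    """Solve marginal_prob(delta) = expit(eta) for delta by bisection.
    marginal_prob is increasing in delta, so a sign change brackets the root."""
    target = expit(eta)
    if not (marginal_prob(lo, sigma_eps, sigma_b, e_theta) < target
            < marginal_prob(hi, sigma_eps, sigma_b, e_theta)):
        raise ValueError("target probability is not attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal_prob(mid, sigma_eps, sigma_b, e_theta) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Illustrative (hypothetical) values; e_theta = 1 corresponds to Theorem 1
    d = solve_delta(eta=0.4, sigma_eps=1.5, sigma_b=0.8)
    print(d, marginal_prob(d, 1.5, 0.8), expit(0.4))
```

Solving the constraint directly is slower than evaluating a closed form, but it gives an independent value against which any closed-form expression for \({\Delta}_{ijk}\) can be compared.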

Proof of Theorem 3.

For the OMLS model in equation (2.7), if \(\alpha_{01} < \cdots < \alpha_{0,K-1}\) then \({\Delta}_{ij1} < \cdots < {\Delta}_{ij,K-1}\). □

The proof follows along the same lines as in Vahabi et al. (2017).
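The ordering in Theorem 3 can be illustrated with the expression in Theorem 2: both \(expit\) and \({\Phi}^{-1}\) are strictly increasing, so increasing thresholds map to increasing \({\Delta}_{ijk}\) whenever the argument of \({\Phi}^{-1}\) stays inside \((0, 1)\). The Python sketch below is an illustration of that ordering, not a reproduction of the proof in Vahabi et al. (2017). The function name, the constants `scale_inside` (standing for \(1/(\sigma_{\varepsilon_{ij}} E(\theta))\)) and `scale_outside` (standing for \(\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}\)), and all numerical values are assumptions.

```python
from statistics import NormalDist
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def deltas(alphas, xb, scale_inside, scale_outside):
    """Return scale_outside * Phi^{-1}(scale_inside * expit(alpha_k + xb))
    for each threshold alpha_k, following the form of Theorem 2."""
    std = NormalDist()
    out = []
    for a in alphas:
        p = scale_inside * expit(a + xb)
        if not 0.0 < p < 1.0:
            raise ValueError("argument of the inverse normal cdf must lie in (0, 1)")
        out.append(scale_outside * std.inv_cdf(p))
    return out


if __name__ == "__main__":
    # Hypothetical values: sigma_eps = 1.2, sigma_b = 0.8, E(theta) = 1
    sigma_eps, sigma_b, e_theta = 1.2, 0.8, 1.0
    thresholds = [-1.0, 0.0, 1.2]  # increasing marginal thresholds
    d = deltas(thresholds, xb=0.3,
               scale_inside=1.0 / (sigma_eps * e_theta),
               scale_outside=math.hypot(sigma_eps, sigma_b))
    print(d)  # the values come out in increasing order
```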

Web Supplements

SAS Code for Generating Simulated Data Based on the Proposed OMLS Model in Section 2.3

figure a

SAS Code for the LS Model Reviewed in Section 2.1

figure b

SAS Code for the MLS Model Introduced in Section 2.2

figure c

SAS Code for the OMLS Model Introduced in Section 2.3

figure d

Cite this article

Vahabi, N., Kazemnejad, A. & Datta, S. A Marginalized Overdispersed Location Scale Model for Clustered Ordinal Data. Sankhya B 80 (Suppl 1), 103–134 (2018). https://doi.org/10.1007/s13571-018-0162-5

  • DOI: https://doi.org/10.1007/s13571-018-0162-5
