A Marginalized Overdispersed Location Scale Model for Clustered Ordinal Data

Vahabi, Nasim; Kazemnejad, Anoshirvan; Datta, Somnath

doi:10.1007/s13571-018-0162-5

Published: 28 June 2018

Volume 80, pages 103–134, (2018)
Sankhya B


Abstract

Overdispersion and intra-cluster correlation are two important issues in clustered categorical/ordinal data, and failure to account for them can result in misleading inferences. Generalized estimating equations and mixed effects models are two common frameworks for analyzing clustered data, which have recently been combined and extended into the marginalized random effects model. Location scale models are a different extension of mixed effects models that additionally allow the variance to vary as a function of covariates. In this paper, we extend a marginalized location scale model for longitudinal ordinal responses by allowing a log-linear model for the variance components that facilitates both population-averaged and subject-specific interpretations. We then extend the marginalized location scale model by incorporating an additional random term to handle the overdispersion in the data. We conduct extensive simulation studies to investigate the statistical properties of the maximum likelihood estimators of the model parameters. We illustrate this methodology using a dataset on children's growth failure.


References

Agresti, A. and Lang, J.B. (1993). A proportional odds model with subject-specific effects for repeated ordered categorical responses. Biometrika 80, 527–534.
Breslow, N.E. (1984). Extra-Poisson variation in log-linear models. Appl. Stat. 33, 38–44.
Choo-Wosoba, H. and Datta, S. (2018). Analyzing clustered count data with a cluster-specific random effect zero-inflated Conway–Maxwell–Poisson distribution. J. Appl. Stat. 45, 799–814.
Choo-Wosoba, H., Levy, S.M. and Datta, S. (2016). Marginal regression models for clustered count data based on zero-inflated Conway–Maxwell–Poisson distribution with applications. Biometrics 72, 606–618.
Cole, S.Z. and Lanham, J.S. (2011). Failure to thrive: an update. Am. Fam. Physician 83, 829–834.
Cox, C. (1995). Location-scale cumulative odds models for ordinal data: a generalized non-linear model approach. Stat. Med. 14, 1191–1203.
Ezzet, F. and Whitehead, J. (1991). A random effects model for ordinal responses from a crossover trial. Stat. Med. 10, 901–907.
Fielding, A., Yang, M. and Goldstein, H. (2003). Multilevel ordinal models for examination grades. Stat. Model. 3, 127–153.
Gelman, A. and Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, New York.
Griswold, M., Swihart, B., Caffo, B. and Zeger, S. (2013). Practical marginalized multilevel models. Stat 2, 129–142.
Heagerty, P. and Zeger, S. (2000). Marginalized multilevel models and likelihood inference. Stat. Sci. 15, 1–19.
Hedeker, D. and Gibbons, R.D. (1994). A random-effects ordinal regression model for multilevel analysis. Biometrics 50, 933–944.
Hedeker, D. and Gibbons, R.D. (2006). Wiley, New York.
Hedeker, D. and Mermelstein, R.J. (1998). A multilevel thresholds of change model for analysis of stages of change data. Multivar. Behav. Res. 33, 427–455.
Hedeker, D., Demirtas, H. and Mermelstein, R.J. (2009). A mixed ordinal location scale model for analysis of Ecological Momentary Assessment (EMA). Stat. Interface 2, 391–401.
Hinde, J. and Demetrio, C.G.B. (1998a). Overdispersion: models and estimation. Comput. Stat. Data Anal. 27, 151–170.
Iddi, S. and Molenberghs, G. (2013). A marginalized model for zero-inflated, overdispersed and correlated count data. Electron. J. Appl. Stat. Anal. 6, 149–165.
Ivanova, A., Molenberghs, G. and Verbeke, G. (2014). A model for overdispersed hierarchical ordinal data. Stat. Model. 14, 399–415.
Kong, M., Xu, S., Levy, S.M. and Datta, S. (2015). GEE type inference for clustered zero-inflated negative binomial regression with application to dental caries. Comput. Stat. Data Anal. 85, 54–66.
Kuczmarski, R.J., Ogden, C.L., Guo, S.S., Grummer-Strawn, L.M., Flegal, K.M., Mei, Z. et al. (2002). 2000 CDC growth charts for the United States: methods and development. Vital Health Stat. 11, 1–190.
Laird, N.M. and Ware, J.H. (1982). Random-effects models for longitudinal data. Biometrics 38, 963–974.
Lawless, J. (1987). Negative binomial and mixed Poisson regression. Can. J. Stat. 15, 209–225.
Lee, K. and Daniels, M. (2008). Marginalized models for longitudinal ordinal data with application to quality of life studies. Stat. Med. 27, 4359–4380.
Lesaffre, E. and Spiessens, B. (2001). On the effect of the number of quadrature points in a logistic random effects model: an example. Appl. Stat. 50, 325–335.
McCullagh, P. (1980). Regression models for ordinal data (with discussion). J. R. Stat. Soc. Ser. B 42, 109–142.
Molenberghs, G., Verbeke, G. and Demétrio, C. (2007). An extended random-effects approach to modeling repeated, overdispersed count data. Lifetime Data Anal. 13, 513–531.
Molenberghs, G., Verbeke, G., Demétrio, C. and Vieira, A. (2010). A family of generalized linear models for repeated measures with normal and conjugate random effects. Stat. Sci. 25, 325–347.
Peterson, B. and Harrell, F.E. (1990). Partial proportional odds models for ordinal response variables. Appl. Stat. 39, 205–217.
Rao, T.J. (1999). Mahalanobis’ contributions to sample survey. Resonance 4, 27–33.
Raudenbush, S.W., Bryk, A.S., Cheong, Y.F. and Congdon, R. (2004). HLM 6: Hierarchical Linear and Nonlinear Modeling. Scientific Software International Inc, Chicago.
Sellers, K.F. and Shmueli, G. (2010). A flexible regression model for count data. Ann. Appl. Stat. 4, 943–961.
Tosteson, A.N. and Begg, C.B. (1988). A general regression methodology for ROC curve estimation. Med. Decis. Mak. 8, 204–215.
Tutz, G. and Hennevogl, W. (1996). Random effects in ordinal regression models. Comput. Stat. Data Anal. 22, 537–557.
Vahabi, N., Salehi, M., Azarbar, A., Zayeri, F. and Kholdi, N. (2014). Application of multilevel model for assessing the affected factors on failure to thrive in children less than two years old. Razi J. Med. Sci. 14, 91–99.
Vahabi, N., Kazemnejad, A. and Fallah, R. (2016). Length and weight growth trends for children less than two years old in Zanjan, Iran: Longitudinal modeling. Med. J. Islam Repub. Iran 30, 374.
Vahabi, N., Kazemnejad, A. and Datta, S. (2017). A joint overdispersed marginalized random effects model for analyzing two or more longitudinal ordinal responses. Stat. Methods Med. Res. Epub ahead of print. https://doi.org/10.1177/0962280217714616.
WHO Multicenter Growth Reference Study Group (2006). WHO Child Growth Standards based on length/height, weight and age. Acta Paediatr. Suppl. 450, 76–85.
Zeger, S., Liang, K. and Albert, P. (1988). Models for longitudinal data: a generalized estimating equation approach. Biometrics 44, 1049–1060.


Acknowledgments

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Authors and Affiliations

Department of Pharmacotherapy and Translational Research, College of Pharmacy, University of Florida, Florida, Gainesville, FL, 32610, USA
Nasim Vahabi
Tarbiat Modares University, Tehran, Iran
Anoshirvan Kazemnejad
Department of Biostatistics, College of Public Health & Health Professions College of Medicine, University of Florida, Florida, Gainesville, FL, 32610, USA
Somnath Datta


Corresponding author

Correspondence to Somnath Datta.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Appendix

We provide a proof of Theorem 2 below. Since Theorem 1 can be obtained by essentially replacing 𝜃 by the constant 1, a separate proof of Theorem 1 is not provided.

The following lemma is needed for the proof of Theorem 2.

Lemma 1.

For a normally distributed random variable $b_{i}$, $\int {\Phi} \left( \frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i} = \sigma_{\varepsilon_{ij}} {\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}} \right)$.

Proof of Lemma 1.

$\int {\Phi} \left( \frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i} = \sigma_{\varepsilon_{ij}} {\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}} \right)$. □

By definition,

$${\Phi} (x)=\int\limits_{-\infty}^{x} {\frac{1}{\sqrt{2\pi} } exp {\left( \frac{-t^{2}}{2}\right)}dt.} $$

Hence

$$\begin{array}{@{}rcl@{}} \int {\Phi} \left( \frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i} &=& \int\limits_{-\infty}^{+\infty} \int\limits_{-\infty}^{\frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}} \frac{1}{\sqrt{2\pi}} {exp}\left( \frac{-t^{2}}{2}\right) \cdot \frac{1}{\sqrt{2\pi}\,\sigma_{b}} {exp}\left( \frac{-b_{i}^{2}}{2\sigma_{b}^{2}}\right) dt\, db_{i},\\ &=& \int\limits_{-\infty}^{+\infty} \int\limits_{-\infty}^{\frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}} {\left( \frac{1}{\sqrt{2\pi}}\right)}^{2} \frac{1}{\sigma_{b}} {exp}\left\{\frac{-1}{2} \left[b_{i}^{2} \sigma_{b}^{-2}+t^{2}\right]\right\} dt\, db_{i}. \end{array} $$

Here we make a change of variable $t=\frac {z+b_{i}}{\sigma _{\varepsilon _{ij}}}$ to conclude that the above expression equals

$$\int\limits_{-\infty}^{+\infty} {\int\limits_{-\infty}^{{\Delta}_{ijk}} {{\left( \frac{1}{\sqrt{2\pi} } \right)}^{2}\frac{1}{\sigma_{b}} {exp}\left\{\frac{-1}{2} \left[{b_{i}^{2}}\sigma_{b}^{-2}+ {\left( \frac{z+b_{i}}{\sigma_{\varepsilon_{ij}}}\right)}^{2}\right]\right\}} \frac{1}{\sigma_{\varepsilon_{ij}}}dz} db_{i}. $$

Furthermore, letting $\acute {z}=\frac {z}{\sigma _{\varepsilon _{ij}}}$ and $\acute {b}_{i}=\frac {b_{i}}{\sigma _{\varepsilon _{ij}}}$ we see that the above expression further equals

$$\int\limits_{-\infty}^{+\infty}{\int\limits_{-\infty}^{{\Delta}_{ijk}} {{\left( \frac{1}{\sqrt {2\pi} } \right)}^{2}\frac{1}{\sigma_{b}} {exp}\left\{\frac{-1}{2} \left[\acute{b}_{i}^{2} \sigma _{\varepsilon_{ij}}^{2} \sigma_{b}^{-2}+ {(\acute{z}+\acute{b}_{i})}^{2}\right]\right\}} \frac{1}{\sigma_{\varepsilon_{ij}}}\sigma_{\varepsilon_{ij}}d\acute{z}} \sigma_{\varepsilon _{ij}}d\acute{b}_{i}. $$

Further, let us define:

$$\left\{ \begin{array}{l} A = 1+\frac{\sigma_{\varepsilon_{ij}}^{2}}{\sigma_{b}^{2}} \\ B=-\acute{z}\, \sigma_{b}^{2} \\ C=\acute{z}^{2}E \\ E=\frac{1}{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}} \end{array} \right. $$

hence the above expression equals

$$\int\limits_{-\infty}^{+\infty}{\int\limits_{-\infty}^{{\Delta}_{ijk}} {{\left( \frac{1}{\sqrt {2\pi} } \right)}^{2}\frac{\sigma_{\varepsilon_{ij}}}{\sigma_{b}} {exp}\left\{\frac{-1}{2} [\acute{b}_{i}^{2}A + 2\acute{z}\acute{b}_{i}+\acute{z}^{2}]\right\}} d\acute{z}} d\acute{b}_{i}. $$

Noting that $\acute{b}_{i}^{2} \sigma_{\varepsilon_{ij}}^{2} \sigma_{b}^{-2}+ {(\acute{z}+\acute{b}_{i})}^{2} = \acute{b}_{i}^{2}A + 2\acute{z}\acute{b}_{i}+\acute{z}^{2} = {(\acute{b}_{i}-B)}^{2}A + C$, the above expression equals

$$\int\limits_{-\infty}^{+\infty} {\int\limits_{-\infty}^{{\Delta}_{ijk}} {{\left( \frac{1}{\sqrt {2\pi} } \right)}^{2}\frac{\sigma_{\varepsilon_{ij}}}{\sigma_{b}} {exp}\left\{\frac{-1}{2} \frac{{(\acute{b}_{i}-B)}^{2}}{\frac{1}{A}}\right\} . {exp}\left\{\frac{-1}{2}\frac{\acute{z}^{2}}{\frac{1}{E}}\right\}} d\acute{z}} d\acute{b}_{i}. $$

Multiplying the above formula by $\frac{1}{A^{\frac{1}{2}}A^{-\frac{1}{2}} E^{\frac{1}{2}}E^{-\frac{1}{2}}}$, which equals one, we see that it further equals

$$\int\limits_{-\infty}^{+\infty} \int\limits_{-\infty}^{{\Delta}_{ijk}} \frac{1}{A^{\frac{1}{2}}A^{-\frac{1}{2}}E^{\frac{1}{2}}E^{-\frac{1}{2}}} {\left( \frac{1}{\sqrt{2\pi}}\right)}^{2}\frac{\sigma_{\varepsilon_{ij}}}{\sigma_{b}}{exp}\left\{\frac{-1}{2} \frac{{(\acute{b}_{i}-B)}^{2}}{\frac{1}{A}}\right\} \cdot {exp}\left\{\frac{-1}{2}\frac{\acute{z}^{2}}{\frac{1}{E}}\right\} d\acute{z}\, d\acute{b}_{i}. $$

With a slight rearrangement the above equation equals

$$\sigma_{\varepsilon_{ij}}\times \int\limits_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\, A^{-\frac{1}{2}}}{exp}\left\{\frac{-1}{2} \frac{{(\acute{b}_{i}-B)}^{2}}{\frac{1}{A}}\right\}d\acute{b}_{i} \int\limits_{-\infty}^{{\Delta}_{ijk}} \frac{1}{\sqrt{2\pi}\, E^{-\frac{1}{2}}}{exp}\left\{\frac{-1}{2} \frac{\acute{z}^{2}}{\frac{1}{E}}\right\} d\acute{z}. $$

Since the first integral equals one, the above formula equals

$$\sigma_{\varepsilon_{ij}}\times 1\times \int\limits_{-\infty}^{{\Delta}_{ijk}} \frac{1}{\sqrt{2\pi}\, E^{-\frac{1}{2}}}{exp}\left\{\frac{-1}{2} \frac{\acute{z}^{2}}{\frac{1}{E}}\right\} d\acute{z}. $$

Here we make a change of variable $q= \acute {z}\sqrt E $ to conclude that the above expression equals

$$\begin{array}{@{}rcl@{}} \sigma_{\varepsilon_{ij}}\times \int\limits_{-\infty}^{\frac{{\Delta}_{ijk}}{\sqrt{\sigma_{\varepsilon_{ij}}^{2}+{\sigma_{b}^{2}}}}} \frac{1}{\sqrt {2\pi} E^{\frac{\mathrm{-1}}{2}}}{exp}\left\{\frac{-1}{2} q^{2}\right\} {E}^{\frac{\mathrm{-1}}{2}} dq,\\ =\sigma_{\varepsilon_{ij}}{\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}} \right). \end{array} $$
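A Gaussian integral of this type can also be checked numerically. The sketch below is an illustration only and is not the authors' SAS code: the function names, the integration range, and the parameter values are all assumptions. It evaluates $\int {\Phi}\left(\frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i}$ by Simpson's rule and compares it with ${\Phi}\left({\Delta}_{ijk}/\sqrt{\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}\right)$. The example fixes $\sigma_{\varepsilon_{ij}} = 1$, so the prefactor in Lemma 1 is one; for other values of $\sigma_{\varepsilon_{ij}}$, readers may want to compare the quadrature value against both sides of the lemma as printed before relying on it.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides cdf (Phi)


def marginal_probit(delta, sigma_eps, sigma_b, half_width=10.0, n=4000):
    """Simpson's rule for the integral of Phi((delta + b) / sigma_eps) * f(b),
    where f is the N(0, sigma_b^2) density. n must be even."""
    lo, hi = -half_width * sigma_b, half_width * sigma_b
    h = (hi - lo) / n

    def integrand(b):
        density = exp(-b * b / (2.0 * sigma_b ** 2)) / (sqrt(2.0 * pi) * sigma_b)
        return STD_NORMAL.cdf((delta + b) / sigma_eps) * density

    total = integrand(lo) + integrand(hi)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0  # Simpson weights 4, 2, 4, ...
        total += weight * integrand(lo + i * h)
    return total * h / 3.0


def phi_term(delta, sigma_eps, sigma_b):
    """Phi(delta / sqrt(sigma_eps^2 + sigma_b^2))."""
    return STD_NORMAL.cdf(delta / sqrt(sigma_eps ** 2 + sigma_b ** 2))


# Hypothetical values chosen only for illustration (sigma_eps = 1).
delta, sigma_eps, sigma_b = 0.7, 1.0, 1.5
numeric = marginal_probit(delta, sigma_eps, sigma_b)
closed = sigma_eps * phi_term(delta, sigma_eps, sigma_b)
print(f"quadrature: {numeric:.8f}  closed form: {closed:.8f}")
```

The integration range of ten standard deviations of $b_{i}$ on either side is an arbitrary but generous truncation; the Gaussian density is negligible beyond it.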

Proof of Theorem 2.

$${\Delta}_{ijk} = {\Phi}^{-1}\left\{\frac{1}{E(\theta)\, \sigma_{e_{ij}}}\, expit\left(\alpha_{0k}+ \boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta}\right)\right\} \times \sqrt{\sigma_{e_{ij}}^{2}+\sigma_{b}^{2}}. $$ □

Knowing the relationship between the marginal and conditional probabilities as follows:

$$F_{ijk}=\int F_{ijk} (b_{i}\theta ) f(b_{i}) db_{i}, $$

we can write:

$$\begin{array}{@{}rcl@{}} \beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} &=& Logit\left[\int E(\theta )\, {\Phi} \left( \frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i} \right],\\ \beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} &=& Logit\left[E(\theta )\int {\Phi} \left( \frac{{\Delta}_{ijk}+b_{i}}{\sigma_{\varepsilon_{ij}}}\right) f(b_{i})\, db_{i} \right], \end{array} \tag{A1} $$

where $\int {{\Phi } (\frac {{\Delta }_{ijk}+b_{i}}{\sigma _{\varepsilon _{ij}}}) f(b_{i}) db_{i}} $ simplifies to $\sigma _{\varepsilon _{ij}} {\Phi } (\frac {{\Delta }_{ijk}}{\sqrt {\sigma _{\varepsilon _{ij}}^{2}+{\sigma _{b}^{2}}}} )$ by Lemma 1.

Therefore,

$$\beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} =Logit\left[E(\theta ) \sigma_{\varepsilon_{ij}} {\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+{\sigma_{b}^{2}}}} \right)\right],$$

and hence

$$expit[\beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} ]=E(\theta ) \sigma_{\varepsilon_{ij}} {\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}} \right). $$

A slight rearrangement shows

$$\frac{1}{\sigma_{\varepsilon_{ij}} E(\theta )}expit[\beta_{0k}+\boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta} ]={\Phi} \left( \frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}} \right), $$

implying

$${\Phi}^{-1}\left[\frac{1}{\sigma_{\varepsilon_{ij}} E(\theta )} expit \{\beta_{0k}+ \boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta }\}\right]=\frac{{\Delta}_{ijk}}{\sqrt {\sigma_{\varepsilon_{ij}}^{2}+\sigma_{b}^{2}}}, $$

and finally

$${{\Delta}_{ijk}={\Phi}}^{-1}\left[\frac{1}{\sigma_{\varepsilon_{ij}} E(\theta )} expit \{\beta_{0k}+ \boldsymbol{x}_{\boldsymbol{ij}}^{\boldsymbol{T}} \boldsymbol{\beta}\} \right]\times \sqrt {\sigma_{\varepsilon_{ij}}^{2}+{\sigma_{b}^{2}}}. $$
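The last few steps are a pure inversion, which can be sketched in code. The snippet below is a minimal illustration, not the authors' implementation: it transcribes the final displayed expression for ${\Delta}_{ijk}$ and the earlier identity $expit[\cdot] = E(\theta)\, \sigma_{\varepsilon_{ij}} {\Phi}(\cdot)$, then confirms that one undoes the other. The function names and numerical inputs are hypothetical, and the guard on the argument of ${\Phi}^{-1}$ is an added assumption, since the inverse is only defined on the open interval (0, 1).

```python
from math import exp, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # provides cdf (Phi) and inv_cdf (Phi^{-1})


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-x))


def delta_ijk(intercept_k, xb, sigma_eps, sigma_b, e_theta):
    """Final displayed expression: Phi^{-1}[expit(eta) / (sigma_eps * E(theta))]
    times sqrt(sigma_eps^2 + sigma_b^2), with eta = intercept_k + xb."""
    p = expit(intercept_k + xb) / (sigma_eps * e_theta)
    if not 0.0 < p < 1.0:
        raise ValueError("argument of the inverse normal cdf must lie in (0, 1)")
    return STD_NORMAL.inv_cdf(p) * sqrt(sigma_eps ** 2 + sigma_b ** 2)


def marginal_prob_from_delta(delta, sigma_eps, sigma_b, e_theta):
    """Earlier displayed identity: E(theta) * sigma_eps * Phi(delta / sqrt(...))."""
    return e_theta * sigma_eps * STD_NORMAL.cdf(delta / sqrt(sigma_eps ** 2 + sigma_b ** 2))


# Hypothetical inputs for illustration only.
intercept_k, xb = 0.4, -0.1
sigma_eps, sigma_b, e_theta = 1.1, 0.6, 1.0

d = delta_ijk(intercept_k, xb, sigma_eps, sigma_b, e_theta)
recovered = marginal_prob_from_delta(d, sigma_eps, sigma_b, e_theta)
print(f"Delta = {d:.6f}; recovered probability = {recovered:.6f}; "
      f"target = {expit(intercept_k + xb):.6f}")
```

The round trip only checks the algebra from the "Therefore" step onward; it says nothing about the integral identity used to reach that step.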

Proof of Theorem 3.

For the OMLS model in equation (2.7), if $\alpha_{01} < \cdots < \alpha_{0,K-1}$ then ${\Delta}_{ij1} < \cdots < {\Delta}_{ij,K-1}$. □

The proof follows along the same lines as in Vahabi et al. (2017).
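The ordering can be illustrated numerically under the Theorem 2 expression. The sketch below is not taken from Vahabi et al. (2017) and makes no claim about how that proof proceeds; it only relies on the observation that $expit$ and ${\Phi}^{-1}$ are strictly increasing and that the multiplier $\sqrt{\sigma_{e_{ij}}^{2}+\sigma_{b}^{2}}$ is positive. All names and parameter values are hypothetical, and the inputs are chosen so that every argument of ${\Phi}^{-1}$ stays inside (0, 1).

```python
from math import exp, sqrt
from statistics import NormalDist


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-x))


def deltas_for_thresholds(alphas, xb, sigma_e, sigma_b, e_theta):
    """Apply the Theorem 2 expression to each marginal intercept alpha_{0k}."""
    scale = sqrt(sigma_e ** 2 + sigma_b ** 2)  # positive, so it preserves order
    std_normal = NormalDist()
    deltas = []
    for alpha in alphas:
        p = expit(alpha + xb) / (e_theta * sigma_e)
        if not 0.0 < p < 1.0:
            raise ValueError("argument of the inverse normal cdf must lie in (0, 1)")
        deltas.append(std_normal.inv_cdf(p) * scale)
    return deltas


def is_strictly_increasing(values):
    """True when each element is smaller than its successor."""
    return all(a < b for a, b in zip(values, values[1:]))


# Hypothetical ordered intercepts for K = 5 categories.
alphas = [-1.0, 0.0, 0.8, 2.0]
deltas = deltas_for_thresholds(alphas, xb=0.3, sigma_e=1.2, sigma_b=0.7, e_theta=1.0)
print([round(d, 4) for d in deltas], is_strictly_increasing(deltas))
```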

Web Supplements

SAS Code for Generating Simulated Data Based on the Proposed OMLS Model in Section 2.3

SAS Code for the LS Model Reviewed in Section 2.1

SAS Code for the MLS Model Introduced in Section 2.2

SAS Code for the OMLS Model Introduced in Section 2.3


About this article

Cite this article

Vahabi, N., Kazemnejad, A. & Datta, S. A Marginalized Overdispersed Location Scale Model for Clustered Ordinal Data. Sankhya B 80 (Suppl 1), 103–134 (2018). https://doi.org/10.1007/s13571-018-0162-5


Received: 24 November 2017
Published: 28 June 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s13571-018-0162-5

Keywords and phrases.

AMS (2000) subject classification
