Robustness of Bayesian D-optimal design for the logistic mixed model against misspecification of autocorrelation

  • Original Paper
  • Computational Statistics

Abstract

In medicine and health sciences, mixed effects models are often used to study time-structured data. Optimal designs for such studies have been shown to be useful for improving the precision of the parameter estimators. However, optimal designs for such studies are often derived under the assumption of zero autocorrelation between the errors, especially for binary data. Ignoring or misspecifying the autocorrelation at the design stage can result in a loss of efficiency. This paper addresses the robustness of Bayesian D-optimal designs for the logistic mixed effects model for longitudinal data with a linear or quadratic time effect against incorrect specification of the autocorrelation. To find the Bayesian D-optimal allocations of time points for different values of the autocorrelation, under different priors for the fixed effects and different covariance structures of the random effects, a scalar function of the approximate variance–covariance matrix of the fixed effects is optimized. Two approximations are compared: one based on first-order penalized quasi-likelihood (PQL1) and one based on an extended version of the generalized estimating equations (GEE). The results show that Bayesian D-optimal allocations of time points are robust against misspecification of the autocorrelation and are approximately equally spaced. Moreover, PQL1 and extended GEE give essentially the same Bayesian D-optimal allocation of time points for a given subject-to-measurement cost ratio. Furthermore, Bayesian optimal designs are hardly affected by the choice of either the covariance structure or the prior distribution.

References

  • Abebe HT, Tan FES, van Breukelen GJP, Berger MPF (2013) Bayesian design for dichotomous repeated measurements with autocorrelation. Stat Methods Med Res. doi:10.1177/0962280213508850

  • Abebe HT, Tan FES, van Breukelen GJP, Berger MPF (2014a) Bayesian D-optimal designs for the two parameter logistic mixed effects model. Comput Stat Data Anal 71:1166–1176

  • Abebe HT, Tan FES, van Breukelen GJP, Serroyen J, Berger MPF (2014b) On the choice of a prior for Bayesian D-optimal designs for the logistic regression model with a single predictor. Commun Stat Simul Comput 43(7):1811–1824

  • Atkinson AC, Haines LM (1996) Designs for nonlinear and generalized linear models. Handb Stat 13:437–475

  • Atkinson AC, Donev AN, Tobias RD (2007) Optimum Experimental Designs, with SAS. Clarendon, Oxford

  • Ausems M, Mesters I, Van Breukelen G, De Vries H (2002) Short-term effects of a randomized computer-based out-of-school smoking prevention trial aimed at Dutch elementary schoolchildren. Prev Med 34:581–589

  • Berger MPF, Tan FES (2004) Robust designs for linear mixed effects models. Appl Stat 53:569–581

  • Berger MPF, Wong WK (2009) An introduction to optimal designs for social and biomedical research. Wiley, Chichester

  • Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Assoc 88:9–25

  • Carrière I, Bouyer J (2002) Choosing marginal or random-effects models for longitudinal binary studies: application to self-reported disability among older persons. BMC Med Res Methodol 2:15

  • Chaloner K, Larntz K (1989) Optimal Bayesian designs applied to logistic regression experiments. J Stat Plan Inference 18:191–208

  • Chaloner K, Verdinelli I (1995) Bayesian experimental design. A review. Stat Sci 10:273–304

  • Chernoff H (1953) Locally optimal designs for estimating parameters. Ann Math Stat 24:586–602

  • Chipman H, Welch W (1996) D-optimal design for generalized linear models, unpublished manuscript downloaded from http://math.acadiau.ca/chipmanh/publications.html

  • Dargent-Molina P, Favier F, Grandjean H, Baudoin C, Schott AM, Hausherr E, Meunier PJ, Bréart G (1996) Fall-related factors and risk of hip fracture: the EPIDOS prospective study. Lancet 348:145–149

  • Diggle PJ, Liang K-Y, Zeger S (1994) Analysis of longitudinal data. Clarendon Press, Oxford

  • Gilks WR, Wild P (1992) Adaptive rejection sampling for Gibbs sampling. Appl Stat 41:337–348

  • Gilks WR, Best NG, Tan KKC (1995) Adaptive rejection metropolis sampling within Gibbs sampling. Appl Stat 44:455–472

  • Han C, Chaloner K (2004) Bayesian experimental designs for nonlinear mixed models with application to HIV dynamics. Biometrics 60:25–33

  • Jang W, Lim J (2009) A numerical study of PQL estimation biases in generalized linear mixed models under heterogeneity of random effects. Commun Stat Simul Comput 38:692–702

  • Khuri AI, Mukherjee B, Sinha BK, Ghosh M (2006) Design issues for generalized linear models: a review. Stat Sci 21:376–399

  • Matlab (2010) Handbook Matlab, version 7.10.0.499(R2010a). MathWorks Inc., MA, USA

  • Moerbeek M, Breukelen GJP, Berger MPF (2001) Optimal experimental designs for multilevel logistic models. Statistician 50(1):17–30

  • Moerbeek M, Breukelen GJP, Berger MPF (2003) A comparison of estimation methods for multilevel logistic models. Comput Stat 18(1):19–37

  • Moerbeek M, Maas CJM (2005) Optimal experimental designs for multilevel logistic models with two binary predictors. Commun Stat Theory Methods 34:1151–1167

  • Molenberghs G, Verbeke G (2005) Models for discrete longitudinal data. Springer, New York

  • Niaparast M (2009) On optimal design for a Poisson regression model with random intercept. Stat Probab Lett 79:741–747

  • Niaparast M, Schwabe R (2013) Optimal design for quasi-likelihood estimation in Poisson regression with random coefficients. J Stat Plan Inference 143:296–306

  • Ouwens MJNM, Tan FES, Berger MPF (2006) A maximin criterion for the logistic random intercept model with covariates. J Stat Plan Inference 136:962–981

  • Silvey SD (1980) Optimal design: an introduction to the theory for parameter estimation. Chapman and Hall, New York

  • Sinha SK, Xu X (2011) Sequential D-optimal designs for generalized linear mixed models. J Stat Plan Inference 141:1394–1402

  • Sommer A, Katz J, Tarwotjo I (1984) Increased risk of respiratory infection and diarrhea in children with pre-existing mild vitamin A deficiency. Am J Clin Nutr 40:1090–1095

  • Tan FES, Berger MPF (1999) Optimal allocation of time points for random effects models. Commun Stat Simul Comput 28:517–540

  • Tekle FB, Tan FES, Berger MPF (2008) Maximin D-optimal designs for binary longitudinal responses. Comput Stat Data Anal 52:5253–5262

  • Woods DC, van de Ven P (2006) Blocked designs for experiments with correlated non-normal response. Technometrics 53:173–182

  • Zeger SL, Liang K-Y, Albert PS (1988) Model for longitudinal data: a generalized estimating equation approach. Biometrics 44:1049–1060

Author information

Corresponding author

Correspondence to H. T. Abebe.

Appendix: Derivation for the relative efficiency Eq. (14)

To compare designs, we compute their efficiencies using the concept of equivalent sample size (see Atkinson et al. 2007, page 152, and Berger and Wong 2009, page 37). Let \( \text {Var}\left( \hat{\varvec{\beta }}_{\xi _s} \right) \) and \( \text {Var}\left( \hat{\varvec{\beta }}_{\xi _q} \right) \) be the variance–covariance matrices of \( \hat{\varvec{\beta }} \) for the design \( \xi _s \) with \( s \) time points and the design \( \xi _q \) with \( q \) time points, respectively, and let \( N_{s} \) and \( N_{q} \) be the numbers of subjects for the designs \( \xi _s \) and \( \xi _q \), respectively. For the D-criterion and a given model with \( p \) parameters, the relative efficiency (RE) of design \( \xi _s \) compared to design \( \xi _q \) is given by:

$$\begin{aligned} \text {RE}\left( \xi _s ;\xi _q \right) =\frac{N_s }{N_q }\left[ \frac{\det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _s } \right) \right] ^{-1} \right\} }{\det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _q } \right) \right] ^{-1} \right\} } \right] ^{\frac{1}{p}}, \end{aligned}$$
(15)

where the two determinants in (15) are each based on a single subject, and the factor \( N_{s}/N_{q} \) accounts for the number of subjects in each design.

This RE (15) can be rewritten as follows:

$$\begin{aligned} \text {RE}\left( \xi _s ;\xi _q \right)&= \frac{N_s }{N_q }\exp \left\{ \log \left\{ \left[ \frac{\det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _s } \right) \right] ^{-1} \right\} }{\det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _q } \right) \right] ^{-1} \right\} } \right] ^{\frac{1}{p}} \right\} \right\} \nonumber \\&= \frac{N_s }{N_q }\exp \left\{ \frac{\log \det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _s } \right) \right] ^{-1} \right\} -\log \det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _q } \right) \right] ^{-1} \right\} }{p} \right\} . \end{aligned}$$
(16)
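
As a numerical check that the determinant-ratio form (15) and the log-determinant form (16) agree, the following is a minimal sketch for a model with \( p=2 \) parameters. The covariance matrices, the sample sizes, and the helper names `det2`, `re_direct`, and `re_log` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def re_direct(var_s, var_q, n_s, n_q, p=2):
    """Eq. (15): ratio of determinants of the inverse covariance matrices."""
    # For an invertible matrix V, det(V^{-1}) = 1 / det(V).
    ratio = (1.0 / det2(var_s)) / (1.0 / det2(var_q))
    return (n_s / n_q) * ratio ** (1.0 / p)


def re_log(var_s, var_q, n_s, n_q, p=2):
    """Eq. (16): the same quantity written with log-determinants."""
    log_det_inv_s = -math.log(det2(var_s))
    log_det_inv_q = -math.log(det2(var_q))
    return (n_s / n_q) * math.exp((log_det_inv_s - log_det_inv_q) / p)


# Hypothetical per-subject covariance matrices (illustrative numbers only).
var_s = [[0.50, 0.10], [0.10, 0.40]]
var_q = [[0.60, 0.05], [0.05, 0.45]]

direct = re_direct(var_s, var_q, n_s=40, n_q=50)
via_log = re_log(var_s, var_q, n_s=40, n_q=50)
```

The two values coincide up to floating-point rounding. The log form is the one that carries over to the Bayesian case, because an expectation can be taken of each log-determinant separately.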

Using the cost function Eq. (13), \( N_{s} \) and \( N_{q} \) can be expressed in terms of the cost ratio \( k \) and the number of time points for the same total cost \( C \), i.e., \( N_s =\frac{C}{C_2 \left( {k+s} \right) } \) and \( N_q =\frac{C}{C_2 \left( {k+q} \right) }\), so that \( \frac{N_s }{N_q }=\frac{k+q}{k+s} \). Substituting this ratio into (16), we obtain

$$\begin{aligned} \text {RE}\left( \xi _s ;\xi _q \right) =\frac{k+q}{k+s}\left[ \exp \left\{ \frac{\log \det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _s } \right) \right] ^{-1} \right\} -\log \det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _q } \right) \right] ^{-1} \right\} }{p} \right\} \right] . \end{aligned}$$
(17)
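
The substitution of the sample sizes can be illustrated with a small worked example. This is a sketch under stated assumptions: Eq. (13) is not reproduced in this appendix, so the code uses only the form \( N=C/\{C_2 (k+s)\} \) quoted above, reads \( C_2 \) as the cost per measurement, and treats \( N \) as continuous (no rounding to whole subjects). The budget, the cost values, the log-determinants, and the function names are hypothetical.

```python
import math


def subjects_for_budget(total_cost, c2, k, n_time_points):
    """N = C / (C2 * (k + s)), the form quoted in the text for Eq. (13)."""
    return total_cost / (c2 * (k + n_time_points))


def re_equal_cost(log_det_inv_s, log_det_inv_q, k, s, q, p):
    """Eq. (17): RE of design xi_s versus xi_q at the same total cost."""
    return (k + q) / (k + s) * math.exp((log_det_inv_s - log_det_inv_q) / p)


# Hypothetical budget: C = 10000, C2 = 5, cost ratio k = 4.
C, C2, k = 10_000.0, 5.0, 4.0
s, q = 6, 4  # numbers of time points in the two designs

n_s = subjects_for_budget(C, C2, k, s)  # 10000 / (5 * 10)
n_q = subjects_for_budget(C, C2, k, q)  # 10000 / (5 * 8)

# The sample-size ratio reduces to the cost factor used in Eq. (17).
cost_factor = (k + q) / (k + s)

# Hypothetical per-subject log-determinants of the inverse covariance matrices.
re_value = re_equal_cost(1.9, 1.5, k=k, s=s, q=q, p=2)
```

With more time points per subject, fewer subjects fit in the same budget, which is why the factor \( (k+q)/(k+s) \) penalizes the design with the larger number of time points.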

The RE in (17) applies to locally optimal designs, i.e., to given parameter values. Generalizing to Bayesian designs by taking the expectation \( E_\beta \) of each log-determinant over a prior distribution \( \pi \) for \( \varvec{\beta } \), the RE of design \( \xi _s \) compared to design \( \xi _q \) becomes:

$$\begin{aligned} \text {RE}\left( \xi _s ;\xi _q (\pi )\left| \pi \right. \right) =\frac{k+q}{k+s}\left[ \exp \left\{ \frac{E_\beta \log \det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _s } \right) \right] ^{-1} \right\} -E_\beta \log \det \left\{ \left[ \text {Var}\left( \hat{\varvec{\beta }}_{\xi _q } \right) \right] ^{-1} \right\} }{p} \right\} \right] . \end{aligned}$$
(18)

Thus, using the Bayesian D-optimality criterion (12), the RE will be:

$$\begin{aligned} \hbox {RE}\left( {\xi _s ;\xi _q (\pi )\left| \pi \right. } \right) =\frac{k+q}{k+s}\left[ {\exp \left\{ {\frac{\phi _\mathrm{D} \left( {\xi _s \left| \pi \right. } \right) -\phi _\mathrm{D} \left( {\xi _q \left| \pi \right. } \right) }{p}} \right\} } \right] . \end{aligned}$$
(19)

When the ratio \( \frac{k+q}{k+s} \) equals one (\( q=s \)) or approaches one (the cost ratio \( k \) is very large relative to \( s \) and \( q \)), the RE in (19) reduces to the RE given by Chaloner and Larntz (1989).
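
The Bayesian RE in (19) and its behaviour for large \( k \) can be sketched with a Monte Carlo approximation of \( \phi _\mathrm{D} \). Note the assumptions: the information matrix below is that of an ordinary fixed-effects logistic model with a linear time effect, used purely as a stand-in for the PQL1 and extended GEE approximations of the paper, and the normal priors, the time points, and all function names are hypothetical.

```python
import math
import random


def log_det_information(time_points, beta0, beta1):
    """Log-determinant of the 2x2 per-subject information matrix.

    Stand-in only: ordinary logistic regression with intercept and linear
    time effect, not the mixed-model approximations used in the paper.
    """
    a = b = c = 0.0
    for t in time_points:
        prob = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * t)))
        w = prob * (1.0 - prob)  # binomial variance weight
        a += w
        b += w * t
        c += w * t * t
    return math.log(a * c - b * b)


def bayesian_criterion(time_points, prior_draws):
    """Monte Carlo estimate of phi_D(xi | pi) = E_beta log det{Var^{-1}}."""
    values = [log_det_information(time_points, b0, b1) for b0, b1 in prior_draws]
    return sum(values) / len(values)


def bayesian_re(design_s, design_q, prior_draws, k, p=2):
    """Eq. (19), including the cost factor (k + q) / (k + s)."""
    s, q = len(design_s), len(design_q)
    diff = bayesian_criterion(design_s, prior_draws) - bayesian_criterion(
        design_q, prior_draws
    )
    return (k + q) / (k + s) * math.exp(diff / p)


rng = random.Random(0)
# Hypothetical independent normal priors for intercept and slope.
prior_draws = [(rng.gauss(0.0, 1.0), rng.gauss(0.5, 0.5)) for _ in range(2000)]

design_s = [-1.0, -0.5, 0.0, 0.5, 1.0]  # five equally spaced time points
design_q = [-1.0, 0.0, 1.0]             # three equally spaced time points

re_small_k = bayesian_re(design_s, design_q, prior_draws, k=1.0)
re_large_k = bayesian_re(design_s, design_q, prior_draws, k=1e9)
re_self = bayesian_re(design_s, design_s, prior_draws, k=1.0)
```

In this sketch the same prior draws are reused for both designs, so the Monte Carlo noise largely cancels in the difference of the two criterion values. For very large `k` the cost factor is effectively one and only the exponential term remains.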

Cite this article

Abebe, H.T., Tan, F.E.S., Van Breukelen, G.J.P. et al. Robustness of Bayesian D-optimal design for the logistic mixed model against misspecification of autocorrelation. Comput Stat 29, 1667–1690 (2014). https://doi.org/10.1007/s00180-014-0512-3

  • DOI: https://doi.org/10.1007/s00180-014-0512-3
