# Estimation of random-effects model for longitudinal data with nonignorable missingness using Gibbs sampling

## Abstract

The missing data problem is common in longitudinal or repeated measurements data. When the missingness mechanism is nonignorable, the distribution of the observed response and the indicators of missingness should be modelled jointly using either a ‘shared random-effects model’ or a ‘correlated random-effects model’. However, computational challenges arise in model fitting because the log-likelihood function involves intractable numerical integration. We provide an alternative formulation of the ‘correlated random-effects model’ using latent variables and propose a simple algorithm based on Gibbs sampling for estimating the associated parameters. The method is illustrated through simulation and the analysis of a real data set arising from an autism study.

## Keywords

Latent variable · Legendre polynomial · Time-varying coefficients · MCMC · Non-informative prior

## 1 Introduction

In designed longitudinal studies, the aim is to estimate the mean response at a certain time based on fixed or time-varying covariates. For studies with long follow-up periods, the proportion of individuals with missing data can be substantial, and inference based on the observed data alone may lead to biased and unreliable results. A large body of research on the modeling of longitudinal data with ignorable missingness is available in the literature (Little and Rubin 2002). In this paper, we focus on modeling longitudinal data when the missingness mechanism is nonignorable. In such cases, the distribution of the observed response and the indicators of missingness should be modelled jointly. Such joint models can be classified into either pattern mixture models or selection models. Little (1995) provided a detailed overview of pattern mixture models and selection models for longitudinal studies with missing data due to informative dropout. Siddiqui and Ali (1998) considered random-effects pattern-mixture models and provided estimates of the associated parameters by averaging the estimates obtained from different subsets of the data, defined by the missing-data patterns. Daniels and Hogan (2000) proposed a reparameterization of the pattern mixture model that allows consideration of a wide range of nonignorable missing-data mechanisms. Diggle and Kenward (1994) proposed the use of a selection model with a logistic regression form to deal with informative dropout. Baker (1995) considered repeated binary data with nonignorable and non-monotone missingness. Troxel et al. (1998) extended the approach of Diggle and Kenward (1994) to non-monotone and nonignorable missing data; however, its implementation is computationally challenging. Rotnitzky et al. (1998) considered inverse probability weighted estimating equations for nonignorable missing data and provided simultaneous estimation of the dropout probability and the mean response based on the selection model. Minini and Chavance (2004) considered a log-linear model and provided a sensitivity analysis for longitudinal binary data with nonignorable missingness.

In the context of survival analysis, the shared random-effects (SRE) model proposed by Wu and Carrol (1998) is widely used as an alternative to selection models. De and Tu (1994) and Schluchter (1994) suggested some extensions of SRE models. Have et al. (1998) and Pulkstenis et al. (1998) adopted the SRE model for binary longitudinal data with informative dropout. Tsonaka et al. (2009) considered a semi-parametric shared parameter model for a response variable with non-monotone and nonignorable missingness. In the aforementioned SRE models, the selection model and the response model have exactly the same random component.

In many situations, the latent factors affecting the missingness could be different from those affecting the response; however, they are correlated due to common risk factors. In order to model such phenomena, Lin et al. (2010) considered an interesting generalization of the SRE model based on correlated random effects. An underlying assumption of the random-effects model is that, conditional on the random effects, the missingness is independent of the response. Note that ignorable missingness is a special case of the correlated random-effects model in which the two random effects are independent. A main concern with random-effects models is the computational challenge that arises in model fitting because the log-likelihood function involves intractable numerical integration. In order to overcome such difficulties, well-known approximation methods such as Gauss–Hermite quadrature (Pinheiro and Bates 1995) and the Laplace approximation (Breslow and Clayton 1993) are used for estimation. Lin et al. (2010) expressed the likelihood as a ratio of two integrals and then approximated the numerator and the denominator using the Laplace formula. In order to estimate the associated model parameters, one needs to evaluate first and second derivatives and maximize the two integrands.

We propose an alternative formulation for the observed response and the indicators of missingness based on correlated latent variables. In particular, we develop regression models in which the covariates have time-varying and time-invariant effects on the latent variables, which in turn involve correlated random effects. A simple Gibbs sampler is developed following Albert and Chib (1993), where in each iteration we sample the model parameters as well as the latent variables. Our method is simple because it is based on the Gibbs sampler, and it is fast because the parameters of both models are estimated simultaneously in an automated manner; it also avoids the computational challenges posed by the intractable log-likelihood functions typically encountered in frequentist methods.

The rest of the paper is organized as follows. In Sect. 2, we discuss our proposed model and the Bayesian estimation method in detail. We analyze data from a longitudinal study of the social development of children with autism in Sect. 3. Simulation studies assessing the effectiveness of the proposed method are summarized in Sect. 4. In Sect. 5, we outline possible future work and provide some concluding remarks.

## 2 Proposed model

Consider longitudinal data observed at *m* different time points from *n* subjects. We consider a set of covariates, some of which possibly have a time-varying effect on the response. The response for the *i*th subject at the *t*th time point, which we denote by \(Y_i(t)\), can thus be modelled as the following:

Here, *J*, \(J'\) and *L* denote the number of covariates with time-varying fixed effects, time-invariant fixed effects and random effects on the response, respectively. The subject-specific random effects \(\mathbf u _i=(u_{1i},\ldots , u_{Li})\) capture the longitudinal dependence and are assumed to be iid \(N_{L}(\mathbf 0 ,\varSigma _{u})\). The residuals \(e_i(t)\) are assumed to be iid \(N(0,\sigma ^2_{e})\). Note that the above model is a special case of the generalized varying coefficient model for longitudinal data introduced by Şentürk et al. (2013).
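For orientation, one form of the response model that is consistent with the definitions above can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the exact displayed equation: the covariate symbols \(x_{ji}(t)\), \(w_{j'i}(t)\) and \(z_{li}(t)\) are hypothetical names introduced here, while \(\beta _j(t)\) and \(\gamma _{j'}\) are taken to be the time-varying and time-invariant fixed-effect coefficients, respectively.

```latex
\[
Y_i(t) \;=\; \sum_{j=1}^{J} \beta_j(t)\, x_{ji}(t)
\;+\; \sum_{j'=1}^{J'} \gamma_{j'}\, w_{j'i}(t)
\;+\; \sum_{l=1}^{L} u_{li}\, z_{li}(t)
\;+\; e_i(t),
\qquad i=1,\ldots ,n,\quad t=1,\ldots ,m .
\]
```

Under this form, the first sum carries the \(J\) time-varying fixed effects, the second the \(J'\) time-invariant fixed effects, and the third the \(L\) subject-specific random effects.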

*N*(0, 1). In order to incorporate the possible correlation between the response variable \(Y_{i}(t)\) and the missing indicator \(U_{i}(t)\), we assume that \(\mathbf u _{i}\) and \(\mathbf v _{i}\) are correlated random vectors that jointly follow a multivariate normal distribution with mean vector \(\mathbf 0 \) and covariance matrix \(\varSigma =\bigl ({\begin{matrix}\varSigma _{u}&{}\quad \varSigma _{uv} \\ \varSigma _{uv} &{}\quad \varSigma _{v}\end{matrix}} \bigr )\). For the aforementioned models, we can use the usual models for multivariate longitudinal data (Diggle et al. 2002, p. 332), but this requires values of the latent variables at each step. We therefore propose a Bayesian estimation method for simultaneous estimation of the parameters associated with the joint model using Gibbs sampling.
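The role of the cross-covariance \(\varSigma _{uv}\) can be illustrated with a minimal sketch that draws one pair of scalar random effects from a zero-mean bivariate normal distribution. The function name `draw_correlated_effects` and the restriction to scalar \(u_i\) and \(v_i\) are assumptions made only for this illustration; setting `cov_uv` to zero gives independent random effects, which corresponds to the ignorable case.

```python
import math
import random


def draw_correlated_effects(var_u, var_v, cov_uv, rng):
    """Draw one pair (u_i, v_i) from a zero-mean bivariate normal.

    The draw uses the Cholesky factor of [[var_u, cov_uv], [cov_uv, var_v]].
    Scalar random effects are an assumption made for this sketch.
    """
    if var_u <= 0:
        raise ValueError("var_u must be positive")
    cond_var = var_v - cov_uv ** 2 / var_u  # Var(v_i | u_i)
    if cond_var < 0:
        raise ValueError("covariance matrix is not positive semi-definite")
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    u = math.sqrt(var_u) * z1
    v = (cov_uv / math.sqrt(var_u)) * z1 + math.sqrt(cond_var) * z2
    return u, v


def empirical_covariance(pairs):
    """Sample covariance between the first and second components."""
    n = len(pairs)
    mean_u = sum(u for u, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    return sum((u - mean_u) * (v - mean_v) for u, v in pairs) / (n - 1)


rng = random.Random(0)
pairs = [draw_correlated_effects(1.0, 1.0, 0.5, rng) for _ in range(20000)]
print(round(empirical_covariance(pairs), 2))
```

The printed value should be close to the chosen `cov_uv`; the same construction extends to vector-valued random effects through a general Cholesky factorisation.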

### 2.1 Modelling time-varying coefficients

One of the major advantages of longitudinal studies is the ability to incorporate the age effect in the modeling and to distinguish changes in the response within and across individuals over time (Diggle et al. 2002, p. 1). In order to model the effects of the respective covariates over time, we consider the time-varying coefficients \(\beta _j(t)\) and \(\theta _k(t)\) in Eqs. (2) and (3), respectively. Since the parametric form of \(\beta _j(t)\) and \(\theta _k(t)\) is not known in advance, we adopt a semi-parametric approach and model the time-varying coefficients using Legendre polynomial (LP) basis functions. These polynomials have been shown by several authors to be a powerful tool for semi-parametric regression (Marie and Sen 1985; Meyer 2000; Cui and Zhu 2006; Bhuyan et al. 2019).

The Legendre polynomial of order *r* is given by the following sum

We denote the *r*th order Legendre polynomial (LP) at time *t* by \(P_{r}(t)\). We transform the original time points *t* to the adjusted time points \( t^{'}=-1+2\left( \frac{t-t_{min}}{t_{max}-t_{min}}\right) \), for fitting the orthogonal LP over the range [−1, 1], where \(t_{min}\) and \(t_{max}\) are the smallest and the largest time points, respectively. Let \(P^{(r)}(t^{'})= [P_{0}(t^{'}), P_{1}(t^{'}),\ldots ,P_{r}(t^{'})]^{T}\) denote the family of the first \(r+1\) basis functions and express the functions \(\beta _j(t^{'})\) and \(\theta _k(t^{'})\) as linear combinations of these basis functions:
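The time rescaling and the basis expansion described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names and the coefficient list `coeffs` are hypothetical, and the basis values are computed with Bonnet's recursion, a standard way of evaluating Legendre polynomials that may differ from the sum formula used in the text.

```python
def rescale_time(t, t_min, t_max):
    """Map a time point t in [t_min, t_max] to t' in [-1, 1]."""
    return -1.0 + 2.0 * (t - t_min) / (t_max - t_min)


def legendre_basis(x, r):
    """Return [P_0(x), ..., P_r(x)] using Bonnet's recursion:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    basis = [1.0]
    if r >= 1:
        basis.append(float(x))
    for n in range(1, r):
        nxt = ((2 * n + 1) * x * basis[n] - n * basis[n - 1]) / (n + 1)
        basis.append(nxt)
    return basis


def time_varying_coefficient(t, coeffs, t_min, t_max):
    """Evaluate a coefficient function as a linear combination of the
    first len(coeffs) Legendre polynomials at the rescaled time."""
    x = rescale_time(t, t_min, t_max)
    basis = legendre_basis(x, len(coeffs) - 1)
    return sum(c * p for c, p in zip(coeffs, basis))


# First-order expansion: value = c0 * P_0(t') + c1 * P_1(t') = c0 + c1 * t'
print(legendre_basis(0.5, 2))
print(time_varying_coefficient(13.0, [1.0, 2.0], 2.0, 13.0))
```

With a first-order expansion, the coefficient function is linear in the rescaled time, so each time-varying effect contributes two unknown coefficients to the parameter vector.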

### 2.2 Bayesian estimation using Gibbs sampler

We employ a Bayesian approach to estimate the model parameters of Eqs. (2) and (3) using the Gibbs sampler. Let \(\varTheta =[\mathbf a ,\mathbf b , \varvec{\gamma },\varvec{\delta }, \sigma ^2_{e}, \varSigma ]\) denote the set of all the model parameters involved in Eqs. (2) and (3), where the bold symbols denote the vectors of the respective coefficients. Note that one needs to sample from the joint posterior of the model parameters and the unknown latent variables.
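The data-augmentation step that makes such a sampler tractable can be sketched as follows, in the spirit of Albert and Chib (1993). The sketch covers only the update of a single latent variable given its current linear predictor; the full conditionals of the remaining parameters are not shown. It assumes, for illustration, that the latent variable has unit variance and that the binary indicator equals 1 exactly when the latent variable is positive; the function name `sample_latent` is hypothetical.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def sample_latent(mean, indicator, rng):
    """Draw a latent value from N(mean, 1) truncated to [0, inf) when
    indicator == 1 and to (-inf, 0] when indicator == 0 (inverse-cdf method)."""
    p_nonpositive = _STD_NORMAL.cdf(-mean)  # P(latent <= 0)
    if indicator == 1:
        lower, upper = p_nonpositive, 1.0
    else:
        lower, upper = 0.0, p_nonpositive
    q = lower + (upper - lower) * rng.random()
    q = min(max(q, 1e-12), 1.0 - 1e-12)  # keep the inverse cdf finite
    draw = mean + _STD_NORMAL.inv_cdf(q)
    # Guard against round-off (and the clipping above) for extreme means.
    return max(draw, 0.0) if indicator == 1 else min(draw, 0.0)


rng = random.Random(1)
positive_draws = [sample_latent(0.3, 1, rng) for _ in range(5)]
negative_draws = [sample_latent(0.3, 0, rng) for _ in range(5)]
print(all(d >= 0 for d in positive_draws), all(d <= 0 for d in negative_draws))
```

Within one Gibbs iteration, a step of this kind would be applied to every subject and time point, after which the regression coefficients and random effects can be updated conditionally on the sampled latent values.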

## 3 Data analysis

The estimated parameters for the autism study

| Predictor | \(U^{*}_{i}(t)\): Mean | \(U^{*}_{i}(t)\): SD | \(U^{*}_{i}(t)\): 95% CI | \(\log ({ VSAE})\): Mean | \(\log ({ VSAE})\): SD | \(\log ({ VSAE})\): 95% CI |
|---|---|---|---|---|---|---|
| Intercept | 2.279 | 0.309 | (1.712, 2.909) | 1.503 | 0.085 | (1.332, 1.669) |
| \(\log ({ Age})\) | \(-0.873\) | 0.153 | (\(-1.171\), \(-0.586\)) | 0.681 | 0.045 | (0.593, 0.769) |
| SIC2 | \(-0.207\) | 0.390 | (\(-0.978\), 0.568) | 0.074 | 0.113 | (\(-0.152\), 0.295) |
| SIC3 | \(-0.309\) | 0.442 | (\(-1.195\), 0.560) | 0.328 | 0.126 | (0.084, 0.580) |
| SIC2 \(\times \log ({ Age})\) | 0.119 | 0.196 | (\(-0.287\), 0.491) | 0.118 | 0.059 | (\(-0.002\), 0.231) |
| SIC3 \(\times \log ({ Age})\) | 0.253 | 0.222 | (\(-0.172\), 0.685) | 0.312 | 0.067 | (0.183, 0.439) |

*i*th child at time *t*, where \(t=\log ({ Age})\). Lin et al. (2010) observed that the VSAE score generally increases with age, while there is substantial variation in the VSAE scores among the children; hence the logarithmic transformation has been considered. In order to incorporate the categorical variable SICDEGP, we introduce two dummy variables SIC2 and SIC3 representing the second and third levels of the SICD group, respectively, and we take SICDEGP \(=\) 1 as the reference group. As discussed in Sect. 2.1, we consider first-order LP for modeling the time-varying coefficients, and the models (4) and (5) can be rewritten as
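As a small illustration of the dummy coding described above, the following sketch builds one row of predictors in the order in which they appear in the table of estimates (intercept, \(\log ({ Age})\), SIC2, SIC3 and the two interactions). The function name `design_row` is hypothetical, and the sketch uses the raw value of \(\log ({ Age})\); whether the fitted model uses the raw or the rescaled time is not assumed here.

```python
import math


def design_row(sicdegp, age):
    """Return [1, log(age), SIC2, SIC3, SIC2*log(age), SIC3*log(age)],
    with SICDEGP = 1 as the reference group."""
    if sicdegp not in (1, 2, 3):
        raise ValueError("sicdegp must be 1, 2 or 3")
    t = math.log(age)
    sic2 = 1.0 if sicdegp == 2 else 0.0
    sic3 = 1.0 if sicdegp == 3 else 0.0
    return [1.0, t, sic2, sic3, sic2 * t, sic3 * t]


print(design_row(1, 2.0))
print(design_row(3, 2.0))
```

For a child in the reference group all dummy and interaction entries are zero, so the intercept and the \(\log ({ Age})\) coefficient describe that group directly, and the remaining coefficients are differences relative to it.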

## 4 Simulation study

Results of the simulation study with \(20 \%\) missing observations (\(n=100\) and \(m=5\))

| Parameter | Complete data: RB | Complete data: SD | Complete data: CP | Proposed: RB | Proposed: SD | Proposed: CP | MAR: RB | MAR: SD | MAR: CP |
|---|---|---|---|---|---|---|---|---|---|
| \(\beta _{1}\) | \(-0.791\) | 0.402 | 0.96 | 0.287 | 0.426 | 0.95 | \(-1.270\) | 0.436 | 0.94 |
| \(\beta _{20}\) | \(-0.510\) | 0.354 | 0.95 | 1.267 | 0.409 | 0.95 | 10.825 | 0.408 | 0.85 |
| \(\beta _{21}\) | 0.393 | 0.084 | 0.96 | \(-0.383\) | 0.102 | 0.94 | \(-0.802\) | 0.102 | 0.93 |
| \(\gamma _{1}\) | 0.067 | 0.403 | 0.93 | \(-0.296\) | 0.392 | 0.93 | \(-0.546\) | 0.411 | 0.94 |
| \(\sigma ^{2}_{u}\) | 1.997 | 0.589 | 0.94 | 5.538 | 0.660 | 0.94 | \(-13.042\) | 0.620 | 0.80 |
| \(\sigma ^{2}_{e}\) | 1.328 | 0.657 | 0.95 | 2.669 | 0.719 | 0.95 | 4.825 | 0.774 | 0.90 |

Results of the simulation study with \(40 \%\) missing observations (\(n=200\) and \(m=10\))

| Parameter | Complete data: RB | Complete data: SD | Complete data: CP | Proposed: RB | Proposed: SD | Proposed: CP | MAR: RB | MAR: SD | MAR: CP |
|---|---|---|---|---|---|---|---|---|---|
| \(\beta _{1}\) | 0.239 | 0.226 | 0.94 | 0.589 | 0.279 | 0.95 | \(-2.173\) | 0.297 | 0.93 |
| \(\beta _{20}\) | \(-0.511\) | 0.165 | 0.94 | \(-2.059\) | 0.280 | 0.94 | 12.71 | 0.273 | 0.85 |
| \(\beta _{21}\) | 0.019 | 0.020 | 0.96 | \(-0.004\) | 0.071 | 0.95 | \(-0.913\) | 0.070 | 0.88 |
| \(\gamma _{1}\) | 0.001 | 0.246 | 0.94 | \(-0.295\) | 0.254 | 0.93 | \(-0.670\) | 0.294 | 0.95 |
| \(\sigma ^{2}_{u}\) | \(-1.079\) | 0.295 | 0.96 | 2.731 | 0.441 | 0.95 | \(-8.465\) | 0.445 | 0.88 |
| \(\sigma ^{2}_{e}\) | 0.238 | 0.301 | 0.94 | \(-0.135\) | 0.491 | 0.94 | 0.975 | 0.513 | 0.90 |

Results of the simulation study under mis-specified selection model (\(n=100\) and \(m=5\))

| Parameter | Proposed: RB | Proposed: SD | Proposed: CP | MAR: RB | MAR: SD | MAR: CP |
|---|---|---|---|---|---|---|
| \(\beta _{1}\) | 14.556 | 0.646 | 0.40 | 17.219 | 0.672 | 0.31 |
| \(\beta _{20}\) | \(-38.817\) | 0.469 | 0.65 | \(-49.398\) | 0.482 | 0.42 |
| \(\beta _{21}\) | \(-0.202\) | 0.085 | 0.96 | 0.207 | 0.087 | 0.94 |
| \(\gamma _{1}\) | \(-1.648\) | 0.415 | 0.85 | \(-2.627\) | 0.447 | 0.83 |
| \(\sigma ^{2}_{u}\) | \(-6.261\) | 0.511 | 0.83 | \(-18.670\) | 0.669 | 0.83 |
| \(\sigma ^{2}_{e}\) | 1.143 | 0.753 | 0.92 | 1.633 | 0.840 | 0.92 |
| BIC | \(-191{,}582\) | | | \(-167{,}753\) | | |

Results of the simulation study under mis-specified selection model (\(n=200\) and \(m=10\))

| Parameter | Proposed: RB | Proposed: SD | Proposed: CP | MAR: RB | MAR: SD | MAR: CP |
|---|---|---|---|---|---|---|
| \(\beta _{1}\) | 17.473 | 0.340 | 0.01 | 18.943 | 0.363 | 0 |
| \(\beta _{20}\) | \(-51.309\) | 0.232 | 0.03 | \(-56.239\) | 0.236 | 0 |
| \(\beta _{21}\) | 0.245 | 0.021 | 0.92 | 0.326 | 0.021 | 0.85 |
| \(\gamma _{1}\) | \(-2.378\) | 0.222 | 0.65 | \(-2.687\) | 0.271 | 0.65 |
| \(\sigma ^{2}_{u}\) | \(-5.605\) | 0.263 | 0.80 | \(-6.058\) | 0.334 | 0.86 |
| \(\sigma ^{2}_{e}\) | \(-1.873\) | 0.355 | 0.87 | \(-2.460\) | 0.369 | 0.89 |
| BIC | \(-2{,}207{,}775\) | | | \(-2{,}039{,}544\) | | |
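Summary measures of the kind reported in the tables above could be computed from replicated simulation runs along the following lines. This is only a sketch under assumptions that are not confirmed by the text: RB is taken to be the percentage relative bias of the average estimate, SD the standard deviation of the estimates across replicates, and CP the proportion of 95% intervals that contain the true value. The function name `summarize_replicates` and the toy numbers are hypothetical.

```python
from statistics import mean, stdev


def summarize_replicates(true_value, estimates, intervals):
    """Return (RB, SD, CP) for one parameter.

    RB: 100 * (mean estimate - true value) / true value (assumed definition)
    SD: standard deviation of the estimates across replicates (assumed)
    CP: share of intervals (lower, upper) covering the true value (assumed)
    """
    rb = 100.0 * (mean(estimates) - true_value) / true_value
    sd = stdev(estimates)
    covered = [1.0 if lo <= true_value <= hi else 0.0 for lo, hi in intervals]
    cp = mean(covered)
    return rb, sd, cp


# Toy replicates for a parameter whose true value is 2.0
estimates = [1.9, 2.1, 2.3]
intervals = [(1.5, 2.5), (1.8, 2.4), (2.05, 2.6)]
rb, sd, cp = summarize_replicates(2.0, estimates, intervals)
print(round(rb, 2), round(sd, 2), round(cp, 2))
```

Under these definitions, a method with small RB and CP close to the nominal 0.95 would be regarded as performing well, which is the pattern the comparison between the proposed and the MAR analyses is meant to reveal.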

## 5 Discussion

In this paper, a Bayesian methodology has been developed for estimating the correlated random-effects model for longitudinal data with nonignorable missingness. Unlike the existing frequentist methods, which require approximation of the intractable log-likelihood function, we provide a simple estimation methodology using the Gibbs sampler. Our method is easy to implement and covers various other models as special cases. For example, traditional SRE models, which are popularly used for modeling nonignorable missing data, are special cases of the correlated random-effects model. Moreover, the proposed method is readily applicable to data with ignorable missingness. The simulation results indicate that the estimates from our proposed method, obtained with missing information in the response, are comparable to the estimates from the complete data analysis. Moreover, the proposed method performs better than the MAR-based analysis even under a mis-specified selection mechanism.

In this paper, we have considered longitudinal data with a continuous response variable. Since the underlying latent response is assumed to be continuous, our approach can also be used, with minor modification, to model a binary or count response variable. For models with non-normal errors and/or random effects, the full conditionals may not belong to standard distributions. One can generalize the proposed method to such cases and employ the Metropolis–Hastings algorithm for estimation. Sometimes, data sets from longitudinal studies may contain outliers along with missing information not only in the response but also in the covariates. It may be an interesting problem to deal with such data and develop a Bayesian methodology for the detection of outliers in the presence of missing values.

## Notes

### Acknowledgements

The author is thankful to Prof. Murari Mitra and Mr. Jayabrata Biswas for many helpful comments and suggestions. This project is partially funded by Science and Engineering Research Board, Government of India (Fellowship Reference No. PDF/2017/000180).

## References

- Albert J, Chib S (1993) Bayesian analysis of binary and polychotomous response data. J Am Stat Assoc 88(422):669–679
- Baker SG (1995) Marginal regression for repeated binary data with outcome subject to non-ignorable non-response. Biometrics 51:1042–1052
- Bhuyan P, Biswas J, Ghosh P, Das K (2019) A Bayesian two-stage regression approach of analysing longitudinal outcomes with endogeneity and incompleteness. Stat Model 29(2):157–173
- Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Assoc 88:9–25
- Cui Y, Zhu J (2006) Functional mapping for genetic control of programmed cell death. Physiol Genom 25(3):458–469
- Daniels MJ, Hogan JW (2000) Reparameterizing the pattern mixture model for sensitivity analyses under informative dropout. Biometrics 56(4):1241–1248
- De VG, Tu XM (1994) Modeling progression of CD4+ lymphocyte count and its relationship to survival time. Biometrics 50:1003–1014
- Diggle PJ, Kenward MG (1994) Informative dropout in longitudinal data analysis (with discussion). Appl Stat 43:49–93
- Diggle PJ, Heagerty PJ, Liang K, Zeger SL (2002) Analysis of longitudinal data. Oxford University Press, Oxford
- Have TRT, Kunselman AR, Pulkstenis EP, Landis JR (1998) Mixed effects logistic regression models for longitudinal binary response data with informative dropout. Biometrics 54:367–383
- Lin H, Liu D, Zhou X (2010) A correlated random-effects model for normal longitudinal data with nonignorable missingness. Stat Med 29:236–247
- Little RJA (1995) Modeling the drop-out mechanism in repeated-measures studies. J Am Stat Assoc 90:1112–1121
- Little RJA, Rubin DB (1987) Multiple imputation for nonresponse in surveys. Wiley, New York
- Little RJA, Rubin DB (2002) Statistical analysis with missing data. Wiley, New York
- Marie H, Sen PK (1985) On sequentially adaptive asymptotically efficient rank statistics. Seq Anal 4(3):125–151
- Meyer K (2000) Random regressions to model phenotypic variation in monthly weights of Australian beef cows. Livest Prod Sci 65:19–38
- Minini P, Chavance M (2004) Sensitivity analysis of longitudinal binary data with non-monotone missing values. Biometrics 5:531–544
- Pinheiro JC, Bates DM (1995) Approximations to the log-likelihood function in the nonlinear mixed-effects model. J Comput Graph Stat 4:12–35
- Pulkstenis EP, Have TRT, Landis JR (1998) Model for the analysis of binary longitudinal pain data subject to informative dropout through remedication. J Am Stat Assoc 93:438–450
- Rotnitzky A, Robins JM, Scharfstein DO (1998) Semiparametric regression for repeated outcomes with nonignorable nonresponse. J Am Stat Assoc 93:1321–1339
- Schluchter MD (1994) Methods for the analysis of informatively censored longitudinal data. Stat Med 11:1861–1870
- Şentürk D, Dalrymple LS, Mohammed SM, Kaysen GA, Nguyen DV (2013) Modeling time-varying effects with generalized and unsynchronized longitudinal data. Stat Med 32:2971–2987
- Siddiqui O, Ali MW (1998) A comparison of the random-effects pattern mixture model with last-observation-carried-forward (LOCF) analysis in longitudinal clinical trials with dropouts. J Biopharm Stat 8(4):61–106
- Troxel AB, Harrington DP, Lipsitz SR (1998) Analysis of longitudinal data with non-ignorable non-monotone missing values. Appl Stat 74:425–438
- Tsonaka R, Verbeke G, Lesaffre E (2009) A semi-parametric shared parameter model to handle nonmonotone nonignorable missingness. Biometrics 65(1):81–87
- West B, Welch K, Galecki A (2007) Linear mixed models: a practical guide using statistical software. Chapman & Hall, CRC Press, Boca Raton
- Wu MC, Carrol RJ (1998) Estimation and comparison of changes in the presence of informative censoring by modeling the censoring process. Biometrics 44:175–188

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.