Abstract
Longitudinal data are frequently analyzed using normal mixed effects models. Moreover, the traditional estimation methods are based on mean regression, which leads to non-robust parameter estimation under non-normal error distribution. However, at least in principle, quantile regression (QR) is more robust in the presence of outliers/influential observations and misspecification of the error distributions when compared to the conventional mean regression approach. In this context, this paper develops a likelihood-based approach for estimating QR models with correlated continuous longitudinal data using the asymmetric Laplace distribution. Our approach relies on the stochastic approximation of the EM algorithm (SAEM algorithm), obtaining maximum likelihood estimates of the fixed effects and variance components in the case of nonlinear mixed effects (NLME) models. We evaluate the finite sample performance of the SAEM algorithm and asymptotic properties of the ML estimates through simulation experiments. Moreover, two real life datasets are used to illustrate our proposed method using the \(\texttt {qrNLMM}\) package from \(\texttt {R}\).
Similar content being viewed by others
References
Aghamohammadi A, Mohammadi S (2017) Bayesian analysis of penalized quantile regression for longitudinal data. Stat Pap 58(4):1035–1053
Allassonnière S, Kuhn E, Trouvé A (2010) Construction of Bayesian deformable models via a stochastic approximation algorithm: a convergence study. Bernoulli 16(3):641–678
Andriyana Y, Gijbels I, Verhasselt A (2016) Quantile regression in varying-coefficients models: non-crossing quantile curves and heteroscedasticity. Stat Pap. https://doi.org/10.1007/s00362-016-0847-7
Barndorff-Nielsen OE, Shephard N (2001) Non-gaussian ornstein-uhlenbeck-based models and some of their uses in financial economics. J R Stat Soc Ser B 63(2):167–241
Booth JG, Hobert JP (1999) Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. J R Stat Soc Ser B 61(1):265–285
Davidian M, Giltinan D (1995) Nonlinear models for repeated measurement data. CRC Press, Boca Raton
Davidian M, Giltinan D (2003) Nonlinear models for repeated measurement data: an overview and update. J Agric Biol Environ Stat 8(4):387–419
Delyon B, Lavielle M, Moulines E (1999) Convergence of a stochastic approximation version of the EM algorithm. Ann Stat 8:94–128
Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39:1–38
Fu L, Wang Y (2012) Quantile regression for longitudinal data with a working correlation model. Comput Stat Data Anal 56(8):2526–2538
Galarza C, Lachos VH, Cabral C, Castro L (2017) Robust quantile regression using a generalized classs of skewed distributions. Statistics 6:113–130
Galvao A (2011) Quantile regression for dynamic panel data with fixed effects. J Econ 164(1):142–157
Galvao A, Montes-Rojas GV (2010) Penalized quantile regression for dynamic panel data. J Stat Plan Inference 140(11):3476–3497
Geraci M, Bottai M (2007) Quantile regression for longitudinal data using the asymmetric Laplace distribution. Biostatistics 8(1):140–154
Geraci M, Bottai M (2014) Linear quantile mixed models. Stat Comput 24(3):461–479
Grossman Z, Polis M, Feinberg M, Grossman Z, Levi I, Jankelevich S, Yarchoan R, Boon J, de Wolf F, Lange J, Goudsmit J, Dimitrov D, Paul W (1999) Ongoing HIV dissemination during HAART. Nat Med 5(10):1099–1104
Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1):97–109
Huang Y, Dagne G (2011) A Bayesian approach to joint mixed-effects models with a skew-normal distribution and measurement errors in covariates. Biometrics 67(1):260–269
Koenker R (2004) Quantile regression for longitudinal data. J Multivar Anal 91(1):74–89
Koenker R (2005) Quantile regression. Cambridge University Press, New York
Kozubowski T, Nadarajah S (2010) Multitude of Laplace distributions. Stat Pap 51:127–148
Kuhn E, Lavielle M (2004) Coupling a stochastic approximation version of EM with an MCMC procedure. ESAIM Probab Stat 8:115–131
Kuhn E, Lavielle M (2005) Maximum likelihood estimation in nonlinear mixed effects models. Comput Stat Data Anal 49(4):1020–1038
Lachos VH, Ghosh P, Arellano-Valle RB (2010) Likelihood based inference for skew-normal independent linear mixed models. Stat Sin 20(1):303–322
Lachos VH, Castro LM, Dey DK (2013) Bayesian inference in nonlinear mixed-effects models using normal independent distributions. Comput Stat Data Anal 64:237–252
Lavielle M (2014) Mixed effects models for the population approach. Chapman and Hall/CRC, Boca Raton
Lipsitz SR, Fitzmaurice GM, Molenberghs G, Zhao LP (1997) Quantile regression methods for longitudinal data with drop-outs: application to CD4 cell counts of patients infected with the human immunodeficiency virus. J R Stat Soc Ser C 46(4):463–476
Liu Y, Bottai M (2009) Mixed-effects models for conditional quantiles with longitudinal data. Int J Biostat. https://doi.org/10.2202/1557-4679.1186
Louis TA (1982) Finding the observed information matrix when using the EM algorithm. J R Stat Soc Ser B 44(2):226–233
Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1092
Meza C, Osorio F, De la Cruz R (2012) Estimation in nonlinear mixed-effects models using heavy-tailed distributions. Stat Comput 22:121–139
Mu Y, He X (2007) Power transformation toward a linear regression quantile. J Am Stat Assoc 102:269–279
Perelson AS, Essunger P, Cao Y, Vesanen M, Hurley A, Saksela K, Markowitz M, Ho DD (1997) Decay characteristics of HIV-1-infected compartments during combination therapy. Nature 387(6629):188–191
Pinheiro J, Bates D (1995) Approximations to the log-likelihood function in the nonlinear mixed effects model. J Comput Gr Stat 4:12–35
Pinheiro JC, Bates DM (2000) Mixed-effects models in S and S-PLUS. Springer, New York
R Core Team (2017) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Searle SR, Casella G, McCulloch C (1992) Variance components. Wiley, New York
Sriram K, Ramamoorthi R, Ghosh P (2013) Posterior consitency of Bayesian quantile regression based on the misspecified asymmetric Laplace distribution. Bayesian Anal 8(2):479–504
Wang J (2012) Bayesian quantile regression for parametric nonlinear mixed effects models. Stat Methods Appl 21:279–295
Wu L (2002) A joint model for nonlinear mixed-effects models with censoring and covariates measured with error, with application to AIDS studies. J Am Stat Assoc 97(460):955–964
Wu L (2010) Mixed effects models for complex data. Chapman & Hall/CRC, Boca Raton
Yu K, Moyeed R (2001) Bayesian quantile regression. Stat Probab Lett 54(4):437–447
Yu K, Zhang J (2005) A three-parameter asymmetric Laplace distribution and its extension. Commun Stat Theory Methods 34(9–10):1867–1879
Yuan Y, Yin G (2010) Bayesian quantile regression for longitudinal studies with nonignorable missing data. Biometrics 66(1):105–114
Acknowledgements
We thank the Editor and two anonymous referees whose constructive comments and suggestions led to an improved presentation of the paper. The research of C. Galarza was supported by Grant 2015/17110-9 from Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP-Brazil). L. M. Castro acknowledges Grant Fondecyt 1170258 from the Chilean government. The research of F. Louzada was supported by Grant 305351/2013-3 from from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq-Brazil) and by Grant 2013/07375-0 from Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP-Brazil).
Author information
Authors and Affiliations
Corresponding author
Appendix
Appendix
1.1 A1: Specification of initial values
It is well known that a smart choice of the initial values for the ML estimates can assure a fast convergence of the algorithm to the global maxima solution. Without considering the random effect term, i.e., \(\mathbf {b}_{i}=\mathbf {0}\), let \(\mathbf{y}_{i}\sim \text {AL}({\varvec{\eta }}({\varvec{\beta }}_{ p},\mathbf {0}),\sigma ,p)\). Next, considering the ML estimates for \({\varvec{\beta }}_{ p}\) and \(\sigma \) as defined in Yu and Zhang (2005) for this model, we follow the steps below for the QR-LME model implementation:
- 1.
Compute an initial value \(\widehat{{\varvec{\beta }}}_{ p}^{{\scriptscriptstyle (0)}}\) as
$$\begin{aligned} \widehat{{\varvec{\beta }}}_{ p}^{{\scriptscriptstyle (0)}}=\textit{argmin}{\scriptscriptstyle \beta _{ p}\in \mathbb {R}^{{k}}}\sum ^n_{i=1}\rho _{p}({\mathbf{y}_{i}-{\varvec{\eta }}({\varvec{\beta }}_{ p},\mathbf {0})}). \end{aligned}$$ - 2.
Using the initial value of \(\widehat{{\varvec{\beta }}}_{ p}^{{\scriptscriptstyle (0)}}\) obtained above, compute \(\widehat{\sigma }^{(0)}\) as
$$\begin{aligned} \widehat{\sigma }^{(0)}=\frac{1}{n}\sum ^n_{i=1}\rho _{p}({\mathbf{y}_{i}-{\varvec{\eta }}({\varvec{\beta }}_{ p},\mathbf {0})}). \end{aligned}$$ - 3.
Use a \(q\times q\) identity matrix \(\mathbf I _{{\scriptscriptstyle q\times q}}\) for the the initial value \({\varvec{\varPsi }}^{(0)}\).
1.2 A2: Computing the conditional expectations
Due to the independence between \(u_{ij}\!\mid \! y_{ij},\mathbf {b}_{i}\) and \(u_{ik}\!\mid \! y_{ik},\mathbf {b}_{i}\), for all \(j,k\!=\!1,2,\ldots ,n_{i}\) and \(j\ne k\), we can write \(\mathbf {u}_{i}\mid \mathbf {y}_{i},\mathbf {b}_{i}=[\begin{array}{cccc} u_{i1}\mid y_{i1},\mathbf {b}_{i}&u_{i2}\mid y_{i2},\mathbf {b}_{i}&\cdots&u_{in_{i}}\mid y_{in_{i}},\mathbf {b}_{i}\end{array}]^{{\scriptscriptstyle {\top }}}\). Using this fact, we are able to compute the conditional expectations \(\mathcal {E}(\mathbf {u}_{i})\) and \(\mathcal {E}({\varvec{D}}_{i}^{{\scriptscriptstyle -1}})\) in the following way:
We have \(u_{ij}\mid y_{ij},\mathbf {b}_{i}\sim \text {GIG}(\frac{1}{2},\chi _{ij},\psi )\), where \(\chi _{ij}\) and \(\psi \) are defined in Sect. 3.2. Then, using (2), we compute the moments involved in the equations above as \(\mathcal {E}({u}_{ij})=\frac{\chi _{ij}}{\psi }(1+\frac{1}{\chi _{ij}\psi })\) and \(\mathcal {E}({u}_{ij}^{-1})=\frac{\psi }{\chi _{ij}}\). Thus, for the k-th iteration of the algorithm and for the \(\ell \)-th Monte Carlo realization, we can compute \(\mathcal {E}(\mathbf {u}_{i})^{{\scriptscriptstyle (\ell ,k)}}\) and \(\mathcal {E}[{\varvec{D}}_{i}^{{\scriptscriptstyle -1}}]^{{\scriptscriptstyle (\ell ,k)}}\) using equations (12)-(13) where
1.3 A3: The empirical information matrix
Using (5), the complete log-likelihood function can be rewritten as
where \(\zeta _{i}=\mathbf {y}_{i}-{\varvec{\eta }}({\varvec{\beta }}_{ p},\mathbf {b}_{i})-\vartheta _{p}\mathbf {u}_{i}\) and \({\varvec{\theta }}=({\varvec{\beta }}_{{\scriptscriptstyle p}}^{{\scriptscriptstyle \top }},\sigma ,\mathbf {{\varvec{\alpha }}^{{\scriptscriptstyle \top }}})^{{\scriptscriptstyle \top }}\). Differentiating with respect to \({\varvec{\theta }}\), we have the following score functions:
with \(\mathbf {J_{i}}\) as defined in Sect. 3.2. and
Let \({\varvec{\alpha }}\) be the vector of reduced parameters from \({\varvec{\varPsi }}\), the dispersion matrix for \(\mathbf {b}_i\). Using the trace properties and differentiating the complete log-likelihood function, we have that
Next, taking derivatives with respect to a specific \(\alpha _{j}\) from \({\varvec{\alpha }}\) based on the chain rule, we have
where, using the fact that \(\text {tr}\{\mathbf{ABCD }\}=(\mathrm {vec}(\mathbf{A }^{\top }))^{\top }\)\(({\varvec{D}}^{\top }\otimes \mathbf{B })(\mathrm {vec}(\mathbf{C }))\), (14) can be rewritten as
Let \(\mathcal {D}_{q}\) be the elimination matrix (Lavielle 2014), which transforms the vectorized \({\varvec{\varPsi }}\) (written as \(\text{ vec }({\varvec{\varPsi }})\)) into its half-vectorized form vech(\({\varvec{\varPsi }}\)), so that \(\mathcal {D}_{q}\mathrm {vec}({\varvec{\varPsi }})\)\(=\mathrm {vech}({\varvec{\varPsi }}).\) Using the fact that for all \(j=1,\ldots ,\frac{1}{2}q(q+1)\), the vector \((\mathrm {vec}(\frac{\partial {\varvec{\varPsi }}}{\partial \alpha _{j}})^{\top })^{\top }\) corresponds to the j-th row of the elimination matrix \(\mathcal {D}_{q}\), we can generalize the derivative in (14) for the vector of parameters \({\varvec{\alpha }}\) as
Finally, at each iteration, we can compute the empirical information matrix by approximating the score for the observed log-likelihood by the stochastic approximation given in (9).
1.4 A4: Figures
1.5 A5: Sample output from R package qrNLMM()
Rights and permissions
About this article
Cite this article
Galarza, C.E., Castro, L.M., Louzada, F. et al. Quantile regression for nonlinear mixed effects models: a likelihood based perspective. Stat Papers 61, 1281–1307 (2020). https://doi.org/10.1007/s00362-018-0988-y
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00362-018-0988-y