Abstract
It is well known that specifying a covariance matrix is difficult in quantile regression with longitudinal data. This paper develops a two-step estimation procedure, based on the modified Cholesky decomposition, to improve estimation efficiency. Specifically, in the first step we obtain initial estimators of the regression coefficients by ignoring possible correlations between repeated measures; we then apply the modified Cholesky decomposition to construct the covariance models and obtain an estimator of the within-subject covariance matrix. In the second step, we construct unbiased estimating functions to obtain more efficient estimators of the regression coefficients. However, the proposed estimating functions are discrete and non-convex, so we utilize the induced smoothing method to obtain fast and accurate estimates of the parameters and their asymptotic covariance. Under some regularity conditions, we establish the asymptotic normality of the resulting estimators. Simulation studies and an analysis of longitudinal progesterone data show that the proposed approach yields highly efficient estimators.
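The modified Cholesky decomposition underlying the covariance model (Pourahmadi 1999) factors a within-subject covariance matrix \(\varvec{\Sigma }\) as \(\varvec{T}\varvec{\Sigma }\varvec{T}^T=\varvec{D}\), where \(\varvec{T}\) is unit lower triangular with autoregressive coefficients below the diagonal and \(\varvec{D}\) is diagonal with innovation variances. The following is a minimal numerical sketch of that factorization (illustrative only, not the authors' code; the function name is our own):

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular, D diagonal,
    such that T @ sigma @ T.T = D (Pourahmadi 1999)."""
    m = sigma.shape[0]
    T = np.eye(m)
    d = np.empty(m)
    d[0] = sigma[0, 0]
    for j in range(1, m):
        # Autoregressive coefficients: regress measurement j on its
        # predecessors, i.e. solve Sigma[:j,:j] @ phi = Sigma[:j,j]
        phi = np.linalg.solve(sigma[:j, :j], sigma[:j, j])
        T[j, :j] = -phi
        # Innovation variance of the j-th measurement
        d[j] = sigma[j, j] - sigma[:j, j] @ phi
    return T, np.diag(d)

# Example: AR(1)-type within-subject covariance with rho = 0.5
m, rho = 4, 0.5
sigma = rho ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
T, D = modified_cholesky(sigma)
# Verify T Sigma T' = D, hence Sigma^{-1} = T' D^{-1} T
assert np.allclose(T @ sigma @ T.T, D)
```

Because \(\varvec{\Sigma }^{-1}=\varvec{T}^T\varvec{D}^{-1}\varvec{T}\), modeling the unconstrained entries of \(\varvec{T}\) and the log-diagonal of \(\varvec{D}\) guarantees a positive definite covariance estimate.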
References
Brown BM, Wang YG (2005) Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92:149–158
Fan J, Yao Q (1998) Efficient estimation of conditional variance functions in stochastic regression. Biometrika 85:645–660
Fu L, Wang YG (2012) Quantile regression for longitudinal data with a working correlation model. Comput Stat Data Anal 56:2526–2538
Fu L, Wang YG (2016) Efficient parameter estimation via Gaussian copulas for quantile regression with longitudinal data. J Multivar Anal 143:492–502
Fu L, Wang YG, Zhu M (2015) A Gaussian pseudolikelihood approach for quantile regression with repeated measurements. Comput Stat Data Anal 84:41–53
He X, Fu B, Fung WK (2003) Median regression of longitudinal data. Stat Med 22:3655–3669
Jung SH (1996) Quasi-likelihood for median regression models. J Am Stat Assoc 91:251–257
Koenker R (2004) Quantile regression for longitudinal data. J Multivar Anal 91:74–89
Koenker R (2005) Quantile regression. Econometric Society Monographs, No. 38. Cambridge University Press, New York
Leng C, Zhang W (2014) Smoothing combined estimating equations in quantile regression for longitudinal data. Stat Comput 24:123–136
Leng C, Zhang W, Pan J (2010) Semiparametric mean-covariance regression analysis for longitudinal data. J Am Stat Assoc 105:181–193
Leung D, Wang YG, Zhu M (2009) Efficient parameter estimation in longitudinal data analysis using a hybrid GEE method. Biostatistics 10:436–445
Liang KY, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13–22
Liang H, Liang H, Wang L (2014) Generalized additive partial linear models for clustered data with diverging number of covariates using GEE. Stat Sin 24:173–196
Liu S, Li G (2015) Varying-coefficient mean-covariance regression analysis for longitudinal data. J Stat Plan Inference 160:89–106
Liu X, Zhang W (2013) A moving average Cholesky factor model in joint mean-covariance modeling for longitudinal data. Sci China Math 56:2367–2379
Lu X, Fan Z (2015) Weighted quantile regression for longitudinal data. Comput Stat 30:569–592
Mao J, Zhu Z, Fung WK (2011) Joint estimation of mean-covariance model for longitudinal data with basis function approximations. Comput Stat Data Anal 55:983–992
Mu Y, Wei Y (2009) A dynamic quantile regression transformation model for longitudinal data. Stat Sin 19:1137–1153
Pourahmadi M (1999) Joint mean-covariance models with applications to longitudinal data: unconstrained parameterisation. Biometrika 86:677–690
Qin G, Mao J, Zhu Z (2016) Joint mean-covariance model in generalized partially linear varying coefficient models for longitudinal data. J Stat Comput Simul 86:1166–1182
Tang CY, Leng C (2011) Empirical likelihood and quantile regression in longitudinal data analysis. Biometrika 98:1001–1006
Tang Y, Wang Y, Li J, Qian W (2015) Improving estimation efficiency in quantile regression with longitudinal data. J Stat Plan Inference 165:38–55
Wang L (2011) GEE analysis of clustered binary data with diverging number of covariates. Ann Stat 39:389–417
Wang YG, Lin X, Zhu M (2005) Robust estimating functions and bias correction for longitudinal data analysis. Biometrics 61:684–691
Xu P, Zhu L (2012) Estimation for a marginal generalized single-index longitudinal model. J Multivar Anal 105:285–299
Yao W, Li R (2013) New local estimation procedure for a non-parametric regression function for longitudinal data. J R Stat Soc Ser B 75:123–138
Ye H, Pan J (2006) Modelling of covariance structures in generalised estimating equations for longitudinal data. Biometrika 93:927–941
Zhang W, Leng C (2012) A moving average Cholesky factor model in covariance modeling for longitudinal data. Biometrika 99:141–150
Zhang D, Lin X, Raz J, Sowers M (1998) Semiparametric stochastic mixed models for longitudinal data. J Am Stat Assoc 93:710–719
Zhao P, Li G (2013) Modified SEE variable selection for varying coefficient instrumental variable models. Stat Methodol 12:60–70
Zheng X, Fung W, Zhu Z (2013) Robust estimation in joint mean-covariance regression model for longitudinal data. Ann Inst Stat Math 65:617–638
Zheng X, Fung W, Zhu Z (2014) Variable selection in robust joint mean and covariance model for longitudinal data analysis. Stat Sin 24:515–531
Acknowledgements
The authors are very grateful to the editor and two anonymous referees for their detailed comments on the earlier version of the manuscript, which led to a much improved paper. This work is supported by the Doctoral Grant of Southwest University (Grant No. SWU116015), the Scientific and Technological Research Program of Chongqing Municipal Education Commission (Grant Nos. KJ1703054, KJ130658, KJ1400521), the Fund of Chongqing Normal University (Grant No. 16XLB019) and the Basic and Frontier Research Program of Chongqing (Grant No. cstc2016jcyjA0510).
Appendix
To establish the asymptotic properties of the proposed estimators, the following regularity conditions are needed in this paper.
(C1) The distribution function \(F_{ij}(t)=p\left( {{Y_{ij}} - \varvec{X}_{ij}^T{\varvec{\beta } _\tau } \le t\left| {{\varvec{X}_{ij}}} \right. } \right) \) is absolutely continuous, with continuous density \(f_{ij}\left( \cdot \right) \) uniformly bounded away from 0 and \(\infty \) at the point 0, and with first derivative \({{\dot{f}}_{ij}}\left( \cdot \right) \) uniformly bounded, \(i=1,\ldots ,n\), \(j=1,\ldots ,m_i\).
(C2) The true value \({\varvec{\beta }}_\tau \) is in the interior of a bounded convex region \( {\mathcal {B}}\).
(C3) The number of repeated measurements \(m_i\) is bounded for each i as \(n\rightarrow \infty \).
(C4) For any positive definite matrix \({\varvec{W}}_i\), \({n^{ - 1}}\sum \nolimits _{i = 1}^n {\varvec{X}_i^T{\varvec{\Lambda } _i}{{\varvec{W}}_i}{\varvec{\Lambda } _i}{\varvec{X}_i}} \) converges to a positive definite matrix, where \({\varvec{\Lambda } _i}\) is an \(m_i \times m_i\) diagonal matrix with the jth diagonal element \({f_{ij}}\left( 0 \right) \). In addition, \({\sup _i}\left\| {{\varvec{X}_i}} \right\| < \infty \), where \(\left\| \cdot \right\| \) denotes the Euclidean norm.
(C5) Matrix \(\varvec{\Omega }\) is positive definite and \(\varvec{\Omega }=O\left( {{1 / n}} \right) \).
(C6) The derivative of \({{\tilde{\varvec{U}}}_{w\tau } }({\varvec{\beta }}_\tau )\), namely \(-\frac{{\partial {{\tilde{\varvec{U}}}_{w\tau } }({\varvec{\beta }}_\tau )}}{{\partial \varvec{\beta }_\tau }}\), is positive definite with probability 1.
(C7) For \({{\varvec{ \upsilon } }_{ij}} = {\left( {\sum \limits _{l = 1}^{j - 1} {{\psi _\tau }\left( {{{ \varepsilon }_{il}}} \right) W_{j,l,1}^{(i)}} ,\ldots ,\sum \limits _{l = 1}^{j - 1} {{\psi _\tau }\left( {{{ \varepsilon }_{il}}} \right) W_{j,l,s}^{(i)}} } \right) ^T}\), we have
where \(\mathop \rightarrow \limits ^p \) denotes convergence in probability.
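The estimating functions are built from the discontinuous quantile score \(\psi _\tau (u)=\tau -I(u<0)\) appearing in (C7), which the induced smoothing method (Brown and Wang 2005) replaces by its expectation under a normal perturbation. A minimal sketch of this smoothing, with illustrative function names of our own:

```python
import numpy as np
from math import erf

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

def psi_tau(u, tau):
    # Discontinuous quantile score: tau - I(u < 0)
    return tau - (u < 0.0)

def psi_tau_smooth(u, tau, r):
    # Induced-smoothing replacement: with Z ~ N(0,1),
    # E[psi_tau(u + sqrt(r) Z)] = tau - Phi(-u / sqrt(r)),
    # which is smooth in u and tends to psi_tau as r -> 0
    return tau - norm_cdf(-u / np.sqrt(r))

# The smoothed score recovers the discrete one for small r
u, tau = 0.3, 0.5
assert abs(psi_tau_smooth(u, tau, 1e-8) - psi_tau(u, tau)) < 1e-6
```

In the paper's setting the smoothing parameter \(r\) is tied to the asymptotic covariance of the estimator, so the smoothed estimating equations can be solved by Newton-type iteration while remaining asymptotically equivalent to the discrete ones.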
Proof of Theorem 1
By the definition of \( {{\hat{\varvec{\upsilon }} }_{ij}} \), we have
where \({{\hat{\varepsilon }} _{ij}}=Y_{ij}-\varvec{X}_{ij}^T{\hat{\varvec{\beta }}}_I\).
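The initial estimator \({\hat{\varvec{\beta }}}_I\) above is the step-1 fit obtained under working independence, i.e. by minimizing the pooled quantile check loss. A minimal numerical sketch (illustrative only; data and names are hypothetical, not the authors' code):

```python
import numpy as np
from scipy.optimize import minimize

def check_loss(beta, X, y, tau):
    # Pooled quantile check loss sum_i rho_tau(y_i - x_i' beta),
    # with rho_tau(u) = u * (tau - I(u < 0))
    u = y - X @ beta
    return np.sum(u * (tau - (u < 0.0)))

rng = np.random.default_rng(0)
n, tau = 500, 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=n)  # symmetric errors: median = mean

# Working-independence initial fit (Nelder-Mead handles the
# non-smooth, piecewise-linear objective in low dimensions)
res = minimize(check_loss, x0=np.zeros(2), args=(X, y, tau),
               method="Nelder-Mead")
beta_I = res.x
assert np.max(np.abs(beta_I - beta_true)) < 0.2
```

The step-1 residuals \({{\hat{\varepsilon }} _{ij}}\) computed from \(\hat{\varvec{\beta }}_I\) then feed the modified Cholesky covariance fit of step 2.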
Similar to Fu and Wang (2012), we have
where
Thus, by conditions (C1) and (C4), we have
Similarly, we have
and
Thus, we obtain
By the fact that
Together with the condition (C7), we obtain
as \(n\rightarrow \infty \). In addition
Similar to the proof of (19), we have \(I_1=o_p\left( 1\right) \), \(I_2=o_p\left( 1\right) \) and \(I_3=o_p\left( 1\right) \). Furthermore,
It remains to show that
Then, combining (20) and (21) and using Slutsky's theorem, it follows that
Next, we prove (21). For any \(s\times 1\) constant vector \(\varvec{\kappa }= {\left( {{\kappa _1},\ldots ,{\kappa _s}} \right) ^T}\) whose components are not all zero, let \(\varvec{\Psi } = {\left( {N - n} \right) ^{ - 1}}\sum \nolimits _{i = 1}^n {\sum \nolimits _{j = 2}^{{m_i}} {\left( {{\varvec{\kappa }^T}{\varvec{\upsilon } _{ij}}} \right) {e_{ij,\tau }}} } \). It is easy to show that \(E\left( \varvec{\Psi }\right) = 0\) and
Denote \({\xi _i} = \sum \nolimits _{j = 2}^{{m_i}} {\left( {{\varvec{\kappa }^T}{\varvec{\upsilon } _{ij}}} \right) {e_{ij,\tau }}} \). Then
It then follows easily, by checking the Lyapunov condition, that if
Thus, we can conclude that (21) holds. It remains to show that (22) holds. By condition (C3), we have
Thus we complete the proof of Theorem 1. \(\square \)
Proof of Theorem 2
Note that
It follows from (A.3) of Fan and Yao (1998) that
where
It is easy to see that Theorem 2 follows directly from statements (a)–(d) below.
It is easy to see that (a) follows from a Taylor expansion. \(I_2\) is asymptotically normal with mean 0 and variance
It follows from the definition of \(I_3\) that
By (18), \({\hat{\varvec{\beta }}_{\varvec{I}} }\) is a root-n consistent estimator of \(\varvec{\beta }_\tau \); together with conditions (C1) and (C4), this yields
and
Furthermore, by Theorem 1 and (24), (25), we have
On the other hand, according to \(E\left( {{\varsigma _{ij}}|{t_{ij}}} \right) = 0\), \({{Var}}\left( {{\varsigma _{ij}}|{t_{ij}}} \right) = 1\), together with (24) and (26), we have
Then \({I_{3}} = {o_p}\left( {{1 / {\sqrt{Nh} }}} \right) \). By the same arguments, \({I_{4}} = {o_p}\left( {{1 / {\sqrt{Nh} }}} \right) \). Under the conditions \(h\rightarrow 0\) and \(Nh \rightarrow \infty \) as \(n\rightarrow \infty \), and \(\limsup _{n \rightarrow \infty }N{h^5} < \infty \), the proof of Theorem 2 is complete. \(\square \)
Proof of Theorem 3
By arguments similar to those for Theorem 3.1 in Lu and Fan (2015), we can show that \({{\hat{\varvec{\beta }}_{{\varvec{w}}{\varvec{\tau }}} }}\) is a root-n consistent estimator of \(\varvec{\beta }_\tau \) and is asymptotically normal; the proof is therefore omitted. \(\square \)
Proof of Theorem 4
By arguments similar to those for Lemma 3.1 in Lu and Fan (2015), the proof of Theorem 4 follows and is therefore omitted. \(\square \)
Proof of Corollary 1
By arguments similar to those for Theorem 3.2 in Lu and Fan (2015), the proof of Corollary 1 follows and is therefore omitted. \(\square \)
Lv, J., Guo, C. Efficient parameter estimation via modified Cholesky decomposition for quantile regression with longitudinal data. Comput Stat 32, 947–975 (2017). https://doi.org/10.1007/s00180-017-0714-6