Abstract
In this paper, we focus on partially linear varying-coefficient quantile regression with observations missing at random, which allows the responses, or the responses and covariates simultaneously, to be missing. By means of the empirical likelihood method, we construct posterior distributions of the parameter in the model and investigate their large-sample properties under a fixed prior. In addition, we discuss variable selection via a Bayesian hierarchical model based on the empirical likelihood and spike-and-slab Gaussian priors. Using an MCMC algorithm, the finite-sample performance of the proposed methods is investigated via simulations, and a real data analysis is presented as well.
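To fix ideas, the posterior construction described above can be sketched numerically. The sketch below is a minimal illustration, not the authors' implementation: it assumes complete data (no missingness), a purely linear quantile model with estimating function \(g_i(\beta )=X_i[\tau -I(Y_i-X_i^T\beta \le 0)]\), a Gaussian prior, and a random-walk Metropolis sampler; all function names are illustrative.

```python
import numpy as np

def el_log_ratio(G, n_iter=50, tol=1e-10):
    """Log empirical-likelihood ratio -sum_i log(1 + lam'g_i), where lam
    maximizes the concave dual sum_i log(1 + lam'g_i) (Newton iteration)."""
    n, p = G.shape
    lam = np.zeros(p)
    for _ in range(n_iter):
        w = 1.0 + G @ lam
        grad = G.T @ (1.0 / w)                      # dual gradient
        if np.linalg.norm(grad) < tol:
            break
        hess = -(G / w[:, None] ** 2).T @ G         # dual Hessian (negative definite)
        step = np.linalg.solve(hess, -grad)         # Newton direction
        t = 1.0
        while np.any(1.0 + G @ (lam + t * step) <= 1e-8):
            t *= 0.5                                # keep all weights positive
            if t < 1e-12:
                return -np.inf                      # zero outside convex hull
        lam = lam + t * step
    return -np.sum(np.log1p(G @ lam))

def g_quantile(beta, X, Y, tau):
    # estimating function g_i(beta) = X_i [tau - I(Y_i - X_i'beta <= 0)]
    return X * (tau - (Y - X @ beta <= 0.0))[:, None]

def bel_qr_sampler(X, Y, tau, n_draws=3000, scale=0.05, seed=0):
    """Random-walk Metropolis targeting the EL posterior with a N(0, 100 I) prior."""
    rng = np.random.default_rng(seed)
    beta = np.linalg.lstsq(X, Y, rcond=None)[0]     # least-squares starting value
    logpost = lambda b: el_log_ratio(g_quantile(b, X, Y, tau)) - np.sum(b ** 2) / 200.0
    lp, draws = logpost(beta), []
    for _ in range(n_draws):
        prop = beta + scale * rng.standard_normal(beta.shape)
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            beta, lp = prop, lp_prop
        draws.append(beta.copy())
    return np.asarray(draws)
```

The inner Newton step computes the Lagrange multiplier of the empirical-likelihood program, so each Metropolis step evaluates the profile EL at the proposed \(\beta \); the paper's semiparametric setting additionally profiles out \(\widetilde{\alpha }(\cdot )\) and weights by the estimated missingness probabilities.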
Acknowledgements
The authors are grateful to the editor, associate editor, and anonymous referees for their helpful comments. This research was supported by the National Natural Science Foundation of China (12071348).
Appendix
Proof of Lemma 7.2
(1) Let \(R_i=I(Y_i-X_i^T\beta -Z_i^T\widetilde{\alpha }(U_i)\le 0)-I(\epsilon _i\le 0)\) and write
Clearly, \(B_{1n}{\mathop {\rightarrow }\limits ^{P}}E(\frac{\delta XX^T}{\pi ^2(\gamma _0)}[I(\epsilon \le 0)-\tau ]^2)=\tau (1-\tau )E[\pi ^{-1}(\gamma _0)XX^T]=V\). Then, it suffices to show that \(B_{kn}\overset{p}{\rightarrow }0\) for \(k=2,3,4\).
By (A5) and Lemma 7.1(1), \(B_{4n}=O_p(n^{-1/2})\). From \(E(\Vert B_{2n}\Vert )\le CE(|R_i|)\) and \(E(\Vert B_{3n}\Vert )\le CE(|R_i|)\), we only need to prove \(E(|R_i|)\rightarrow 0\). In fact,
From Lemma 7.1(2) and \(E\{\frac{1}{nh_n}\sum _{j=1}^n \frac{\delta _jZ_j}{\pi _{j}(\gamma _0)}K(\frac{U_j-U_i}{h_n})[\tau -I(\epsilon _j\le 0)]\}^2=O(\frac{1}{nh_n})\), we have \(\widetilde{\alpha }(U_i)-\alpha _\tau (U_i)=O_p((nh_n)^{-1/2})\). Thus
(2) Let \(g^*=\max _{1\le i\le n}\sup _{\Vert \beta -\beta _{\tau }\Vert \le c\rho _n}\Vert \widetilde{g}_i(\beta )\Vert \) and \(\alpha \in \mathbb {S}^p\) such that \(\lambda (\beta )=\Vert \lambda (\beta )\Vert \alpha \). By (2.2) and the fact \(\frac{1}{1+x}=1-\frac{x}{1+x}\), \(\left\{ \frac{1}{n}\sum _{i=1}^n\frac{\widetilde{g}_i(\beta )\widetilde{g}_i^T(\beta )}{1+\lambda ^T(\beta ) \widetilde{g}_i(\beta )}\right\} \lambda (\beta ) =\frac{1}{n}\sum _{i=1}^n\widetilde{g}_i(\beta )\). Since \(1+\lambda ^T(\beta )\widetilde{g}_i(\beta )>0\), we have
Thus, \(\Vert \lambda (\beta )\Vert \left\{ \alpha ^T\frac{1}{n} \sum _{i=1}^n\widetilde{g}_i(\beta )\widetilde{g}_i^T(\beta )\alpha -\alpha ^T\frac{g^*}{n} \sum _{i=1}^n\widetilde{g}_i(\beta )\right\} \le \alpha ^T\left\{ \frac{1}{n}\sum _{i=1}^n\widetilde{g}_i(\beta )\right\} .\) From (7.1) and Lemma 7.1(3), we have \(\frac{1}{n}\sum _{i=1}^n\widetilde{g}_i(\beta )\widetilde{g}_i^T(\beta )=O_p(1)\) and \(\frac{1}{n}\sum _{i=1}^n\widetilde{g}_i(\beta )=O_p(\rho _n+n^{-1/2})\). Clearly, \(g^*=O_p(1)\). Hence, \(\Vert \lambda (\beta )\Vert =O_p(\rho _n+n^{-1/2})\); in particular, \(\Vert \lambda (\beta _\tau )\Vert =O_p(n^{-1/2})\) by taking \(\rho _n=0\). Next, write
Observe that \(\Vert \lambda ^T(\beta )\{\frac{1}{n}\sum _{i=1}^n\frac{[\widetilde{g}_i(\beta ) \widetilde{g}^T_i(\beta )]\widetilde{g}_i(\beta )}{1+\lambda ^T(\beta )\widetilde{g}_i(\beta )}\} \lambda (\beta )\Vert =O_p(\rho _n^2+n^{-1})\); hence, from (8.1) we obtain \(\lambda (\beta )=\{\frac{1}{n}\sum _{i=1}^n\widetilde{g}_i(\beta ) \widetilde{g}^T_i(\beta )\}^{-1} \frac{1}{n}\sum _{i=1}^n\widetilde{g}_i(\beta )+O_p(\rho _n^2+n^{-1})\). \(\square \)
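The closing expansion of \(\lambda (\beta )\) can be checked numerically: the exact multiplier solving \(\frac{1}{n}\sum _i \widetilde{g}_i/(1+\lambda ^T\widetilde{g}_i)=0\) agrees with the one-step linearization \(\{\frac{1}{n}\sum _i \widetilde{g}_i\widetilde{g}_i^T\}^{-1}\frac{1}{n}\sum _i \widetilde{g}_i\) up to second-order terms. A minimal sketch with synthetic vectors standing in for \(\widetilde{g}_i(\beta )\) (all names illustrative):

```python
import numpy as np

def solve_lam(G, n_iter=60):
    """Newton iteration for the Lagrange multiplier lam solving
    (1/n) sum_i g_i / (1 + lam'g_i) = 0, with backtracking to keep
    every weight 1 + lam'g_i strictly positive."""
    lam = np.zeros(G.shape[1])
    for _ in range(n_iter):
        w = 1.0 + G @ lam
        grad = G.T @ (1.0 / w)                  # sum_i g_i / w_i
        hess = -(G / w[:, None] ** 2).T @ G     # -sum_i g_i g_i' / w_i^2
        step = np.linalg.solve(hess, -grad)
        t = 1.0
        while np.any(1.0 + G @ (lam + t * step) <= 1e-8):
            t *= 0.5
        lam = lam + t * step
    return lam

rng = np.random.default_rng(1)
n, p = 400, 2
G = rng.standard_normal((n, p)) + 0.03          # synthetic g_i with a small mean
lam_exact = solve_lam(G)

# one-step linearization from the end of the proof
S = (G.T @ G) / n
lam_lin = np.linalg.solve(S, G.mean(axis=0))
```

With a small \(\Vert \frac{1}{n}\sum _i g_i\Vert \), the gap between `lam_exact` and `lam_lin` is of second order, which is exactly the \(O_p(\rho _n^2+n^{-1})\) remainder in the expansion.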
Proof of Lemma 7.3
Set \(h_n(\beta )=\frac{1}{n}\sum _{j=1}^n\widetilde{g}_j(\beta )\) and \(L_n(\beta )=-\sum _{i=1}^n\log (1+n^{\gamma -1} h_n(\beta )^T\widetilde{g}_i(\beta ))\). (2.2) implies \(\ell _n(\beta )\le L_n(\beta )\). Lemmas 7.1–7.2 imply \(\sum _{i=1}^n\lambda ^T(\beta _{\tau })\widetilde{g}_i(\beta _{\tau })=O_p(1)\). From the fact that \(\log (1+x)\le x\) for \(x>-1\), we have \(\ell _n(\beta _{\tau })\ge -\sum _{i=1}^n\lambda ^T(\beta _{\tau })\widetilde{g}_i(\beta _{\tau })=O_p(1)\). Hence,
holds with probability one. Thus, it suffices to show that \(P\left\{ \sup _{\beta \in \mathcal {B}_\delta ^c}L_n(\beta )\le -2n^{\gamma }C_{\delta }\right\} \rightarrow 1\).
Note that \(-\log (1+n^{\gamma -1} h_n(\beta )^T\widetilde{g}_i(\beta ))\!=\!-n^{\gamma -1}h_n(\beta )^T \widetilde{g}_i(\beta )\!+\! \frac{n^{2(\gamma -1)}[h_n(\beta )^T\widetilde{g}_i(\beta )]^2}{2[1+\alpha n^{\gamma -1}h_n(\beta )^T\widetilde{g}_i(\beta )]^2}\) for some \(\alpha \in (0,1)\). From the boundedness of \(\widetilde{g}_i(\beta )\) and \(h_n(\beta )\) in probability, it follows that
Clearly, it suffices to show that \(P\{\inf _{\beta \in \mathcal {B}_\delta ^c}\Vert h_n(\beta )\Vert \ge 2\sqrt{C_{\delta }}\}\rightarrow 1\). In fact, observe that
It suffices to verify that \(P\{\inf _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{1n}\Vert \ge 3\sqrt{C_{\delta }}\}\rightarrow 1\) and \(\sup _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{kn}\Vert \overset{p}{\rightarrow }0\) for \(k=2,3\). From Lemma 7.1(1), \(\sup _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{3n}\Vert =O_p(n^{-1/2})\).
Using Bernstein’s inequality, it follows from (A0), (A3) and (A7) that
Since \(S(u)\) is positive definite, from Lemma 7.1(2) we have \(\max _{1\le i\le n}\Vert \widetilde{\alpha }(U_i)-\alpha _{\tau }(U_i)\Vert =o_p(a_n)\).
Next, we prove \(\sup _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{2n}\Vert \overset{p}{\rightarrow }0\). In fact, for each \(\eta >0\), we have
It suffices to show that \(J_n:=\sup _{\beta \in \mathcal {B}_\delta ^c}\frac{1}{n}\sum _{i=1}^nf_i(\beta )=o_p(1)\). The compact set \(\mathcal {B}_\delta ^c\) can be covered by cubes \(E_1,\ldots ,E_\kappa \) with sides of length at most \(\eta \) and centers \(\beta ^{(1)},\ldots ,\beta ^{(\kappa )}\), respectively, such that \(\kappa \le C(1/\eta )^p\). Hence
Since \(\kappa \) is finite, the law of large numbers implies \(J_{1n}=O_p(1/\sqrt{n})\). From (A1) we have \(J_{2n}\le C(a_n+\eta )\). Therefore \(J_n=o_p(1)\), and \(\sup _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{2n}\Vert \overset{p}{\rightarrow }0\) follows from (8.2).
Finally, we verify \(P\{\inf _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{1n}\Vert \ge 3\sqrt{C_{\delta }}\}\rightarrow 1\). Write \(A_{1n}=A_{1n}'+A_{1n}''\), where
Following a similar line of proof as for \(J_n\overset{p}{\rightarrow }0\), one can verify \(\sup _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{1n}'\Vert \overset{p}{\rightarrow }0\). Next, we prove \(P\{\inf _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{1n}''\Vert \ge 4\sqrt{C_{\delta }}\}\rightarrow 1\). Note that \(A_{1n}''=[\frac{1}{n}\sum _{i=1}^nf_{\epsilon }(\xi ^*|X_i,Z_i,U_i)X_iX_i^T] (\beta -\beta _{\tau })\), where \(\xi ^*\) lies between 0 and \(X_i^T(\beta -\beta _{\tau })\). By (A1) and (A4), we know that \(f_{\epsilon }(\xi ^*|X_1,Z_1,U_1)\ge c\) and \(E(XX^T)\) is positive definite. Observe that \( \Vert A_{1n}''\Vert ^2=(\beta -\beta _{\tau })^T[\frac{1}{n}\sum _{i=1}^nf_{\epsilon } (\xi ^*|X_i,Z_i,U_i)X_iX_i^T]^2(\beta -\beta _{\tau }). \) Hence \(\inf _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{1n}''\Vert \ge c_0\delta \) in probability. Thus \(P\{\inf _{\beta \in \mathcal {B}_\delta ^c}\Vert A_{1n}''\Vert >4\sqrt{C_{\delta }}\}\rightarrow 1\) with \(C_{\delta }=\frac{c_0^2\delta ^2}{16}\). \(\square \)
Cite this article
Liu, CS., Liang, HY. Bayesian empirical likelihood of quantile regression with missing observations. Metrika 86, 285–313 (2023). https://doi.org/10.1007/s00184-022-00869-y
Keywords
- Bayesian empirical likelihood
- Missing at random
- Posterior distribution
- Quantile regression
- Variable selection