Penalized weighted composite quantile regression for partially linear varying coefficient models with missing covariates

  • Original paper
  • Published in: Computational Statistics

Abstract

In this paper, we study partially linear varying coefficient models with missing covariates. Based on inverse probability weighting and B-spline approximation, we propose a weighted B-spline composite quantile regression method to estimate the non-parametric function and the regression coefficients. Under mild conditions, we establish the asymptotic normality and Horvitz–Thompson property of the proposed estimators. We further investigate a variable selection procedure by combining the proposed estimation method with the adaptive LASSO, and we establish the oracle property of the resulting variable selection method. Under the missing covariate scenario, two simulation studies with various non-normal error distributions and a real data application are conducted to assess the finite-sample performance of the proposed estimation and variable selection methods.


References

  • Du J, Zhan Z, Sun Z (2013) Variable selection for partially linear varying coefficient quantile regression model. Int J Biomath 6:135–149

  • Fan J, Huang T (2005) Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11:1031–1057

  • Fan J, Li R (2004) New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. J Am Stat Assoc 99:710–723

  • Fan Y, Härdle WK, Wang W, Zhu L (2018) Single-index-based CoVaR with very high-dimensional covariates. J Bus Econ Stat 36:212–226

  • Guo X, Xu WL (2012) Goodness-of-fit tests for general linear models with covariates missed at random. J Stat Plan Inference 142:2047–2058

  • He XM, Shi P (1994) Convergence rate of B-spline estimators of nonparametric conditional quantile functions. J Nonparametr Stat 3:299–308

  • He XM, Shi P (1996) Bivariate tensor-product B-splines in a partly linear model. J Multivar Anal 58:162–181

  • Horvitz DG, Thompson DJ (1952) A generalization of sampling without replacement from a finite universe. J Am Stat Assoc 47:663–685

  • Huang JZ, Wu CO, Zhou L (2002) Varying-coefficient models and basis function approximations for the analysis of repeated measurements. Biometrika 89:111–128

  • Jiang R, Qian WM, Zhou ZR (2017) Weighted composite quantile regression for partially linear varying coefficient models. Commun Stat Simul Comput 3:1532–1543

  • Jin J, Hao CY, Ma TF (2018) B-spline estimation for partially linear varying coefficient composite quantile regression models. Commun Stat Theory Methods 48(21):5322–5335. https://doi.org/10.1080/03610926.2018.1510006

  • Kai B, Li R, Zou H (2011) New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann Stat 39:305–332

  • Knight K (1998) Limiting distributions for L1 regression estimators under general conditions. Ann Stat 26:755–770

  • Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge

  • Koenker R, Bassett GS (1978) Regression quantiles. Econometrica 46:33–50

  • Liang H (2008) Generalized partially linear models with missing covariates. J Multivar Anal 99:880–895

  • Liu HL, Yang H, Peng CG (2019) Weighted composite quantile regression for single index model with missing covariates at random. Comput Stat 34:1711–1740

  • Robins JM, Rotnitzky A, Zhao LP (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866

  • Schumaker LL (1981) Spline functions. Wiley, New York

  • Sherwood B (2015) Variable selection for additive partial linear quantile regression with missing covariates. J Multivar Anal 152:206–223

  • Sherwood B, Wang L (2016) Additive partially linear quantile regression in ultra-high dimension. Ann Stat 44:288–317

  • Stone C (1985) Additive regression and other nonparametric models. Ann Stat 13:689–706

  • Sun J, Gai Y, Lin L (2013) Weighted local linear composite quantile estimation for the case of general error distributions. J Stat Plan Inference 143:1049–1063

  • Tang L, Zhou ZZ (2015) Weighted local linear CQR for varying-coefficient models with missing covariates. TEST 24:583–604

  • Tsiatis AA (2006) Semiparametric theory and missing data. Springer, New York

  • Wang CY, Wang S, Zhao LP, Ou ST (1997) Weighted semiparametric estimation in regression analysis with missing covariate data. J Am Stat Assoc 92:512–525

  • Wang H, Li G, Jiang G (2007a) Robust regression shrinkage and consistent variable selection via the LAD-LASSO. J Bus Econ Stat 20:347–355

  • Wang H, Li R, Tsai CL (2007b) Tuning parameter selectors for smoothly clipped absolute deviation method. Biometrika 94:553–568

  • Wang L, Li H, Huang JZ (2008) Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements. J Am Stat Assoc 103:1556–1569

  • Wang JZ, Zhu Z, Zhou J (2009) Quantile regression in partially linear varying coefficient models. Ann Stat 37:3841–3866

  • Wong H, Guo SJ, Chen M et al (2009) On locally weighted estimation and hypothesis testing on varying coefficient models. J Stat Plan Inference 139:2933–2951

  • Xue LG, Yang L (2006) Additive coefficient modeling via polynomial spline. Stat Sin 16:1423–1446

  • Xue LG, Zhu LX (2007a) Empirical likelihood for a varying coefficient model with longitudinal data. J Am Stat Assoc 102:642–652

  • Xue LG, Zhu LX (2007b) Empirical likelihood semiparametric regression analysis for longitudinal data. Biometrika 94:921–937

  • Yang H, Liu HL (2016) Penalized weighted composite quantile estimators with missing covariates. Stat Pap 57:69–88

  • Yang YP, Xue LG, Cheng WH (2009) Empirical likelihood for a partially linear model with covariate data missing at random. J Stat Plan Inference 139:4143–4153

  • Zhang W, Lee SY, Song X (2002) Local polynomial fitting in semivarying coefficient model. J Multivar Anal 82:166–188

  • Zhang R, Lv Y, Zhao W (2016) Composite quantile regression and variable selection in single-index coefficient model. J Stat Plan Inference 176:1–21

  • Zhao PX, Xue LG (2009) Variable selection for semiparametric varying coefficient partially linear models. Stat Prob Lett 79:2148–2157

  • Zou H (2006) The adaptive LASSO and its oracle properties. J Am Stat Assoc 101:1418–1429

  • Zou H, Yuan M (2008) Composite quantile regression and the oracle model selection theory. Ann Stat 36:1108–1126


Acknowledgements

We would like to thank the Editor and referees very much for their constructive comments which led to an improved manuscript. We are very grateful to Drs J.Z. Huang, C.O. Wu and L. Zhou for sharing with us the dataset “MACS Public Use Data Set Release PO4 (1984\(\sim \)1991)”. This research was supported by the National Natural Science Foundation of China (#11471264 and #11361015).

Author information


Corresponding author

Correspondence to Jun Jin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendix

Throughout the appendix, C is used to represent a generic positive constant that can change from line to line. For a vector x, ||x|| denotes its Euclidean norm and for a matrix \({\mathbf {A}}\), \(\left\| {\mathbf {A}} \right\| = \sqrt{{\lambda _{\max }}({\mathbf {A}}'{\mathbf {A}})}\) denotes its spectral norm.

1.1 Lemmas

For a given covariate \(u_i\), we let \(\tilde{\varvec{\iota }}(u_i)=({{\iota }_0(u_i)},\ldots ,{{\iota }_N(u_i)})^T\) denote the corresponding vector of B-spline basis functions of order \(h+1\). A property of the B-spline basis is that \(\sum \nolimits _{s = 0}^N {{{\iota }_s}({u_i})} = 1\); thus, to avoid collinearity when fitting models, only \({{\varvec{\iota } }(u_i)}=({{\iota }_1(u_i)},\ldots ,{{\iota }_N(u_i))}^T\) is used. For ease of proof, the B-spline basis functions are centered in the same way as in Xue and Yang (2006). Specifically, \({{\iota }_s}\) above is transformed by defining \({w_s}({u_i}) = \sqrt{{k_n}} \left[ {{{\iota }_s}({u_i}) - \frac{{E({{\iota }_s}({u_i}))}}{{E({{\iota }_0}({u_i}))}}{{\iota }_0}({u_i})} \right] .\) For a given covariate \(u_i\), let \({\mathbf {w}}(u_i) = (w_1(u_i),\ldots , w_N(u_i))^{T}\) be the vector of centered basis functions, let \({\mathbf {W}}(u_i)\) denote the \(J_n \times 1\) vector \(({\mathbf {w}}(u_i)^{T}X_{i1}, \ldots , {\mathbf {w}}(u_i)^{T}X_{ip})^{T}\), where \(J_n = p\times N\), and let \({\mathbf {W}} = ({\mathbf {W}}(u_1),\ldots , {\mathbf {W}}(u_n))^{T}\in R^{n\times J_n}\). Let

$$\begin{aligned} (\varvec{\hat{\beta }} ,\varvec{{\hat{a}}}) = \mathop {\arg \min }\limits _{(\varvec{\beta } ,{\textit{\textbf{a}}}, b_1,\ldots ,b_K)} \sum \limits _{k = 1}^K {\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi }}_i} }}} } {\rho _{{\tau _k}}}({Y_i} - {\varvec{\beta }^T}{{\mathbf {Z}}_i} - {\mathbf {W}}{({u_i})^T}{\textit{\textbf{a}}} - {b_k}), \end{aligned}$$

then \(\varvec{{\hat{\beta }}}=\varvec{{\hat{\beta }}}^{WBCQR_{{\hat{\pi }}}}\) and \(\varvec{{\hat{\alpha }}}^{WBCQR_{{\hat{\pi }}}}(u)={\mathbf {W}}{(u)^T}\varvec{\hat{a}}^{WBCQR_{{\hat{\pi }}}}\). To analyze the asymptotic behavior of \(\varvec{{\hat{\beta }}}\) while accounting for the estimation of \({\textit{\textbf{a}}}\), we follow the techniques of He and Shi (1996) and define:

$$\begin{aligned} {{\mathbf {D}}_n}= & {} \mathrm {diag}\left\{ {\sum \nolimits _{k = 1}^K {{f_i}({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}){\delta _i}\pi _{i0}^{ - 1}} } \right\} _{i=1}^n \in {R^{n \times n}},\\ {{\varvec{{\hat{D}}}}_n}= & {} \mathrm {diag}\left\{ {\sum \nolimits _{k = 1}^K {{f_i}({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}){\delta _i}\hat{\pi } _{i}^{ - 1}} } \right\} _{i=1}^n \in {R^{n \times n}},\\ {{\mathbf {Z}}^*}= & {} {(Z_1^*, \ldots ,Z_n^*)^T} = ({{\mathbf {I}}_n} - {\mathbf {P}}){\mathbf {Z}} \in {R^{n \times q}}, {\mathbf {W}}_D^2 = {{\mathbf {W}}^T}{{{\hat{\mathbf {D}}}}_n}{\mathbf {W}} \in {R^{J_n \times J_n}},\\ {\varvec{\eta } _1}= & {} \sqrt{n} (\varvec{\beta } - {{\varvec{\beta }_0}}) \in {R^q}, {\varvec{\eta } _2} = {{\mathbf {W}}_D}({\textit{\textbf{a}}} - {{\textit{\textbf{a}}}_0}) + {{\textit{\textbf{W}}}_D}^{ - 1}{W^T}{{\varvec{D_n}}}{\mathbf {Z}}({\varvec{\beta }} - {{\varvec{\beta }_0}}) \in {R^{{J_n}}},\\ {{{\tilde{\mathbf {Z}}}}_i}= & {} {n^{{{ - 1} / 2}}}{\mathbf {Z}}_i^* \in {R^{{q}}}, \tilde{ {\mathbf {W}}}({u_i}) = {{\textit{\textbf{W}}}_D}^{ - 1}{\mathbf {W}}({u_i}) \in {R^{{J_n}}}, {{{\tilde{\mathbf {s}}}}_i} = {({\tilde{\mathbf {Z}}}_i^T,\tilde{{\mathbf {W}}}({u_i}))^T} \in {R^{q + {J_n}}}\\ {\phi _{ni}}= & {} ({\textit{\textbf{W}}}{({u_i})^T}{{\textit{\textbf{a}}}_0} - \varvec{\alpha }(u_i))^T {\mathbf {X}}_i,\\ {Q_i}({a_n})= & {} \sum \nolimits _{k = 1}^K {{\rho _{{\tau _k}}}} \left\{ {{\varepsilon _i} - {a_n}{{{\tilde{\mathbf {Z}}}}_i}^T{{\varvec{\eta } _1}} - {a_n}\tilde{{\mathbf {W}}}{{({u_i})}^T}{\varvec{\eta } _2} - {\phi _{ni}}} \right\} , {E_s}({Q_i}) = E({Q_i}|{{\mathbf {Z}}_i},{{U}_i},{{\mathbf {X}}_i}). \end{aligned}$$

We observe that

$$\begin{aligned}&\sum \limits _{k = 1}^K {\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{\hat{\pi }_i }}} } {\rho _{{\tau _k}}}({Y_i} - {\varvec{\beta ^T}}{Z_i} - {\mathbf {W}}{({u_i})^T}{\textit{\textbf{a}}} - {b_k}) \\&\quad = \sum \limits _{k = 1}^K {\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{\hat{\pi }_i }}} } {\rho _{{\tau _k}}}({\varepsilon _i} - {\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}} - \tilde{{\mathbf {W}}}{({u_i})^T}{\varvec{\eta } _2} - {b_k} - {\phi _{ni}}) \end{aligned}$$

Defining

$$\begin{aligned} ({{\varvec{\hat{\eta }} }_1},{\varvec{\hat{\eta }}_2}) = \mathop {\arg \min }\limits _{({{\varvec{\eta } _1}},{\varvec{\eta } _2})} \sum \limits _{k = 1}^K {\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi }}_i} }}} } {\rho _{{\tau _k}}}({\varepsilon _i} - {\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}} - \tilde{{\mathbf {W}}}{({u_i})^T}{\varvec{\eta } _2} - {b_k} - {\phi _{ni}}), \end{aligned}$$

we then have \({{\varvec{\hat{\eta }} }_1} = \sqrt{n} ({{\varvec{\hat{\beta } }}^{WBCQ{R_{\hat{\pi } }}}} - {{\varvec{\beta }_0}})\) and

$$\begin{aligned} {{{\varvec{\hat{\eta }}}}_2} = {{{\textit{\textbf{W}}}_D}}({\varvec{{\hat{a}}}^{WBCQ{R_{\hat{\pi } }}}} - {{\textit{\textbf{a}}}_0}) + {{\textit{\textbf{W}}}}_D^{ - 1}{{{\textit{\textbf{W}}}^T}}{{\varvec{{\hat{D}}}}_n}{\mathbf {Z}}({{\varvec{\hat{\beta }} }^{WBCQ{R_{\hat{\pi } }}}} - {{\varvec{\beta }_0}}). \end{aligned}$$
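For intuition, the following minimal Python sketch computes the weighted B-spline composite quantile fit defined above by directly minimizing the inverse-probability-weighted objective on simulated data. It is our illustration only, not the authors' implementation: the function names (`check_loss`, `bspline_basis`, `wbcqr_objective`), the toy data-generating process, the generic derivative-free optimizer, and the omission of the basis-centering step are all assumptions; a careful implementation would recast the minimization as a linear program.

```python
# Minimal sketch of the weighted B-spline composite quantile fit
# (our illustration; all names and the toy data are hypothetical).
import numpy as np
from scipy.interpolate import splev
from scipy.optimize import minimize

def check_loss(r, tau):
    """Quantile (check) loss rho_tau(r) = r * (tau - I(r < 0))."""
    return r * (tau - (r < 0))

def bspline_basis(u, n_interior=4, degree=3):
    """Clamped cubic B-spline design matrix on [0, 1], built column by column
    by evaluating splev with unit coefficient vectors."""
    interior = np.linspace(0, 1, n_interior + 2)[1:-1]
    t = np.r_[[0.0] * (degree + 1), interior, [1.0] * (degree + 1)]
    n_basis = len(t) - degree - 1
    B = np.empty((len(u), n_basis))
    for j in range(n_basis):
        c = np.zeros(n_basis)
        c[j] = 1.0
        B[:, j] = splev(u, (t, c, degree))
    return B

def wbcqr_objective(theta, y, Z, BW, w, taus):
    """IPW-weighted composite quantile objective:
    sum_k sum_i (delta_i / pi_i) * rho_{tau_k}(y_i - Z_i'beta - W(u_i)'a - b_k)."""
    K, q = len(taus), Z.shape[1]
    b, beta, a = theta[:K], theta[K:K + q], theta[K + q:]
    fit = Z @ beta + BW @ a
    return sum(np.sum(w * check_loss(y - fit - b[k], taus[k])) for k in range(K))

# Toy data: one covariate with a varying coefficient, two linear covariates,
# and covariates missing at random with a known selection probability.
rng = np.random.default_rng(0)
n, taus = 300, (0.25, 0.5, 0.75)
u = rng.uniform(0, 1, n)
x = rng.normal(size=n)
Z = rng.normal(size=(n, 2))
y = np.sin(2 * np.pi * u) * x + Z @ np.array([1.0, -0.5]) + rng.standard_t(3, size=n)
pi = 1.0 / (1.0 + np.exp(-(1.0 + 0.5 * Z[:, 0])))   # selection probability
delta = rng.binomial(1, pi)                          # 1 = fully observed
w = delta / pi                                       # IPW weights delta_i / pi_i

B = bspline_basis(u)
BW = B * x[:, None]                                  # rows play the role of W(u_i)
theta0 = np.zeros(len(taus) + Z.shape[1] + B.shape[1])
res = minimize(wbcqr_objective, theta0, args=(y, Z, BW, w, taus), method="Powell")
beta_hat = res.x[len(taus):len(taus) + Z.shape[1]]
print("estimated linear coefficients:", np.round(beta_hat, 2))
```

In the notation above, `BW` stands in for the rows \({\mathbf {W}}(u_i)^T\) and `w` for the weights \(\delta _i/{\hat{\pi }}_i\).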

Lemma 1

Under Condition 3, there exists a constant \(C>0\) such that

$$\begin{aligned} \mathop {\sup }\limits _{u \in U} |{{\alpha }_l}(u) - {\varvec{\iota }^{T} }(u){{\textit{\textbf{a}}} _{0l}}| \le C{k_{n}^{ - r}}, \end{aligned}$$
(A.1)

for \(l=1,\ldots ,p\), where \(k_n\) is the number of knots and \(r\) is defined in Condition 3.

The proof follows arguments similar to those in Schumaker (1981) and is therefore omitted.

Lemma 2

If Conditions 1–5 are satisfied, then \({n^{{{ - 1} / 2}}}{{\mathbf {Z}}^*} = {n^{{{ - 1} / 2}}}{\varvec{\Delta } _n} + {o_p}(1)\). Furthermore, \({n^{ - 1}}{{\mathbf {Z}}^{*T}}{{{\hat{\mathbf {D}}}}_n}{{\mathbf {Z}}^*} = {{\varvec{\Sigma }} _1} + {o_p}(1)\), where \({{\varvec{\Sigma }} _1}\) is defined in Sect. 2.

Proof

By the definition of \(Z^{*}\),

$$\begin{aligned} {n^{{{ - 1} / 2}}}{{\mathbf {Z}}^*} = {n^{{{ - 1} / 2}}}({\mathbf {Z}} - {\mathbf {P}}{\mathbf {Z}}) = {n^{{{ - 1} / 2}}}{\varvec{\Delta } _n} + {n^{{{ - 1} /2}}}({\mathbf {H}} - {\mathbf {P}}{\mathbf {Z}}). \end{aligned}$$

Consider the following weighted least squares problem. Let \(\varvec{\gamma }^{*}_{j}\in R^{J_n}\) be defined as \(\varvec{\gamma } _j^* = \mathop {\arg \min }\limits _{\gamma \in {R^{{J_n}}}} \sum \nolimits _{k = 1}^K {\sum \nolimits _{i = 1}^n {({{{\delta _i}} / {{{\hat{\pi }}_i}}})} } {f_i}({b_k}|{{\mathbf {Z}}_i},{U_i},{{\mathbf {X}}_i}){\left\{ {{Z_{ij}} - {\mathbf {W}}{{({U_i})}^T}\varvec{\gamma } } \right\} ^2}\). Let \({{{\hat{h}}}_j}({U_i}) = {\mathbf {W}}{({U_i})^T}{\varvec{\gamma }_j ^*}\) and notice that \({({\mathbf {P}}{\mathbf {Z}})_{ij}} = {{{\hat{h}}}_j}({U_i})\). Adapting the results of Stone (1985) for the basis functions and of Wang et al. (1997), we obtain that

$$\begin{aligned} {n^{ - 1}}{\left\| {{\mathbf {H}}-\mathbf {PZ}} \right\| ^2}= & {} {n^{ - 1}}{\lambda _{\max }}\left\{ {{{(\mathbf {H} - \mathbf {PZ})}^T}(\mathbf {H} - \mathbf{PZ})} \right\} \\\le & {} {n^{ - 1}}\mathrm{trace}\left\{ {{{(\mathbf {H} - \mathbf {PZ})}^T}(\mathbf {H}- \mathbf {PZ})} \right\} \\= & {} {n^{ - 1}}\sum \limits _{i = 1}^n {\sum \limits _{j = 1}^q {{{\left\{ {h_j^*({U_i}) - {{{\hat{h}}}_j}({U_i})} \right\} }^2}} } \\= & {} {o_p}(1). \end{aligned}$$

\(\square \)
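As a side note, the weighted least-squares projection \({\mathbf {P}}{\mathbf {Z}}\) used in this proof is straightforward to compute; the sketch below is our illustration only, with the hypothetical name `weighted_projection` and a generic weight vector `d` standing in for \(\sum \nolimits _k f_i(b_k|\cdot )\,\delta _i/{\hat{\pi }}_i\).

```python
# Sketch of the weighted least-squares projection in Lemma 2 (our illustration).
# W: n x J_n spline design, Z: n x q linear covariates, d: n-vector of weights.
import numpy as np

def weighted_projection(W, Z, d):
    """Return P Z, the weighted least-squares fit of each column of Z
    on the columns of W with weights d."""
    Wd = W * d[:, None]
    gamma = np.linalg.solve(W.T @ Wd, Wd.T @ Z)   # (W' D W)^{-1} W' D Z
    return W @ gamma
```

In the notation of the lemma, the entries of the returned matrix correspond to the fitted values \({\hat{h}}_j(U_i)\).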

Lemma 3

If Conditions 1–5 hold, then for any \(\omega >0\)

$$\begin{aligned} \Pr \left[ {\mathop {\inf }\limits _{\left\| \eta \right\| = L} \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}J_n^{ - 1}\left\{ {{Q_i}(\sqrt{{J_n}} ) - {Q_i}(0)} \right\} > 0} } \right] \ge 1 - \omega \end{aligned}$$

where \(L\) is an arbitrary finite positive constant. The proof follows arguments similar to those of Lemma 4 in Sherwood (2015) and is therefore omitted.

1.2 Proof of asymptotic normality in Theorems 1 and 2

Throughout this section we will repeatedly use Knight’s identity [presented in Koenker (2005) as a generalization of an identity in Knight (1998)]:

$$\begin{aligned} {\rho _\tau }(u - v)-\rho _\tau (u) = - v{\psi _\tau }(u) + \int _0^v {\left\{ I(u \le s) - I(u \le 0)\right\} } \,ds. \end{aligned}$$

Another useful equality is \({\rho _\tau }({\varepsilon _i} - b - c) - {\rho _\tau }({\varepsilon _i} - c) = \int _{ - c}^{ - b - c} {{\psi _\tau }({\varepsilon _i} + s)} \,ds\).
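Both identities are easy to mis-transcribe, so the short Python check below (ours, not part of the paper) verifies Knight's identity numerically at a few random points, using \(\psi _\tau (u)=\tau -I(u<0)\) as elsewhere in the appendix.

```python
# Numerical spot-check of Knight's identity (our illustration).
import numpy as np
from scipy.integrate import quad

def rho(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (u < 0))

def psi(u, tau):
    """Subgradient psi_tau(u) = tau - I(u < 0)."""
    return tau - (u < 0)

rng = np.random.default_rng(1)
for _ in range(5):
    u, v, tau = rng.normal(), rng.normal(), rng.uniform(0.1, 0.9)
    lhs = rho(u - v, tau) - rho(u, tau)
    integral, _ = quad(lambda s: float(u <= s) - float(u <= 0), 0.0, v)
    rhs = -v * psi(u, tau) + integral
    assert np.isclose(lhs, rhs, atol=1e-6), (lhs, rhs)
print("Knight's identity verified at 5 random points.")
```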

Define

$$\begin{aligned} {{\mathbf {B}}_n}= & {} \mathrm {diag}(\sum \limits _{k = 1}^K {{f_1}} ({b_k}|{{\mathbf {Z}}_1},{{\mathbf {X}}_1},{U_1}), \ldots ,\sum \limits _{k = 1}^K {{f_n}} ({b_k}|{{\mathbf {Z}}_n},{{\mathbf {X}}_n},{U_n})),\\ \varvec{\hat{\delta }}= & {} \mathrm {diag}({\delta _1}\hat{\pi } _1^{ - 1}, \ldots ,{\delta _n}\hat{\pi } _n^{ - 1}), \varvec{\psi }^{*} (\varepsilon ) = {(\psi ({\varepsilon _1}), \ldots ,\psi ({\varepsilon _n}))^T},\\ {{\varvec{{{\tilde{\eta }}}} }_1}= & {} \sqrt{n} {({{\varvec{Z^{*T}}}}{{\mathbf {B}}_n}{{\mathbf {Z}}^*})^{ - 1}}{{\varvec{Z^{*T}}}}\varvec{\hat{\delta }} \varvec{\psi }^{*} (\varepsilon ), \Delta {(B)_n} = {n^{ - 1}}\Delta _n^T{{\mathbf {B}}_n}{\Delta _n}. \end{aligned}$$

The idea is to show that \(\varvec{{\hat{\eta }}}_{1}\) is asymptotically equivalent to \(\varvec{{\tilde{\eta }}}_{1}\). The following definition is used for ease of notation

$$\begin{aligned} Q_i^*\left( {{{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2}} \right)= & {} \sum \limits _{k = 1}^K \left\{ {\rho _{{\tau _k}}}\left\{ {{\varepsilon _i} - {\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}} - \tilde{{\mathbf {W}}}{{\left( {{U_i}} \right) }^T}{\varvec{\eta } _2} - {\phi _{ni}}-b_k} \right\} \right. \nonumber \\&\left. - {\rho _{{\tau _k}}}\left\{ {{\varepsilon _i} - {\tilde{\mathbf {Z}}}_i^T{\varvec{{{\tilde{\eta }}} }_1} - \tilde{{\mathbf {W}}}{{\left( {{U_i}} \right) }^T}{\varvec{\eta } _2} - {\phi _{ni}}-b_{k}} \right\} \right\} . \end{aligned}$$
(A.2)

For positive constants M, L and C, we first show that

$$\begin{aligned} \Pr \left[ {\mathop {\inf }\limits _{\begin{array}{c} \left\| {{{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}} \right\| \ge M\\ \scriptstyle \left\| {{\varvec{\eta } _2}} \right\| \le C\sqrt{{k_n}} \end{array}} \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}Q_i^*({{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2}) > 0} } \right] \rightarrow 1, \end{aligned}$$
(A.3)

By Lemma B.4 in the supplementary material of Sherwood and Wang (2016) and Condition 2, we have

$$\begin{aligned} \mathop {\sup }\limits _{\begin{array}{c} \scriptstyle \left\| {{{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}} \right\| \le M\\ \scriptstyle \left\| {{\varvec{\eta } _2}} \right\| \le C\sqrt{{k_n}} \end{array}} \left| {\displaystyle \sum \limits _{i = 1}^n {\displaystyle \frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}\left[ {Q_i^*\big ({{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2}\big ) - {E_s}\left\{ {Q_i^*\big ({{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2}\big )} \right\} +{{\tilde{\mathbf {Z}}}_i^T\big ({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}\big )\psi ({\varepsilon _i})} } \right] } } \right| = {o_p}(1), \end{aligned}$$

where

$$\begin{aligned}&\displaystyle \sum \limits _{i = 1}^n {\displaystyle \frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}{E_s}} \left\{ {Q_i^*\big ({{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2}\big )} \right\} \\&\quad = \displaystyle \sum \limits _{i = 1}^n {{E_s}} \left[ {\int _{ - \left\{ {{\tilde{\mathbf {Z}}}_i^T{\varvec{{{\tilde{\eta }}} }_1} + {\tilde{\mathbf {W}}}{{({Z_i})}^T}{\varvec{\eta } _2} + {\phi _{ni}}} \right\} }^{ - \left\{ {{\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}} + {\tilde{\mathbf {W}}}{{({Z_i})}^T}{\varvec{\eta } _2} + {\phi _{ni}}} \right\} } {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}}\sum \limits _{k = 1}^K{\psi _{\tau _k} ({\varepsilon _i} + s){d_s}} } \right] \\&\quad = -\displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}} \int _{ - \left\{ {{\tilde{\mathbf {Z}}}_i^T{\varvec{{{\tilde{\eta }}} }_1} +\tilde{ {\mathbf {W}}}{{({Z_i})}^T}{\varvec{\eta } _2} + {\phi _{ni}}} \right\} }^{ - \left\{ {{\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}} +{\tilde{\mathbf {W}}}{{({Z_i})}^T}{\varvec{\eta } _2} + {\phi _{ni}}} \right\} }\sum \limits _{k = 1}^K {\left\{ {{F_i}\big ( - s|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}\big ) - \displaystyle {{F_i}({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i})} } \right\} {d_s}} \\&\quad = \displaystyle \frac{1}{2}\displaystyle \sum \limits _{k = 1}^K {\displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}{f_i}\big ({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}\big )\big (1 + {o_p}(1)\big )} } \\&\qquad { \times \left[ {{{\left\{ {{{\tilde{\mathbf{Z}}}}_i^T{{\varvec{\eta } _1}} +{{\tilde{\mathbf {W}}}}{{({Z_i})}^T}{\varvec{\eta } _2} + {\phi _{ni}}} \right\} }^2} - {{\left\{ {{\tilde{\mathbf {Z}}}_i^T{{{{\tilde{\varvec{\eta }}}} }_1} + {\tilde{\mathbf {W}}}{{\big ({{\mathbf {Z}}_i}\big )}^T}{\varvec{\eta } _2} + {\phi _{ni}}} \right\} }^2}} \right] }\\&\quad = \displaystyle \frac{1}{2}\displaystyle \sum \limits _{k = 1}^K {\displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}{f_i}\big ({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}\big )} } (1 + {o_p}(1))\\&\qquad { \times \left[ {{{\big ({\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}}\big )}^2} - {{\big ({\tilde{\mathbf {Z}}}_i^T{\varvec{{{\tilde{\eta }}} }_1}\big )}^2} + 2\left\{ {{\tilde{\mathbf {W}}}{{({{{\mathbf {Z}}_i}})}^T}{\varvec{\eta } _2} + {\phi _{ni}}} \right\} \big ({\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}} - {\tilde{\mathbf {Z}}}_i^T{\varvec{{{\tilde{\eta }}} }_1}\big )} \right] }\\&\quad = \displaystyle \frac{1}{2}\displaystyle \sum \limits _{k = 1}^K {\displaystyle \sum \limits _{i = 1}^n {\left( \frac{{{\delta _i}}}{{{\pi _{i0}}}} + \frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}} - \frac{{{\delta _i}}}{{{\pi _{i0}}}}\right) {f_i}\big ({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}\big )} } \\&\qquad \times {\left[ {{{({\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}})}^2} - {{\big ({\tilde{\mathbf {Z}}}_i^T{\varvec{{{\tilde{\eta }}} }_1}\big )}^2} + 2\left\{ {{\tilde{\mathbf {W}}}{{({{{\mathbf {Z}}_i}})}^T}{\varvec{\eta } _2} + {\phi _{ni}}}\right\} \big ({\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}} - {\tilde{\mathbf {Z}}}_i^T{\varvec{{{\tilde{\eta }}} }_1}\big )} \right] \times (1 + {o_p}(1))}\\&\quad =\displaystyle \frac{1}{2}\left\{ {{\varvec{\eta } _1}^T\varvec{\Delta } {{(B)}_n}{{\varvec{\eta } _1}} - \tilde{\varvec{\eta } _1}^T\varvec{\Delta } {{(B)}_n}{{\varvec{\eta } _1}}} \right\} \left\{ {1 + {o_p}(1)} \right\} \\&\qquad { + {n^{{{ - 1} / 2}}}{{\big 
({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}\big )}^T}\displaystyle \sum \limits _{k = 1}^K {\displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{\pi _{i0}}}}{f_i}\big ({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}\big ){{\mathbf {R}}_i}{\phi _{ni}}\left\{ {1 + {o_p}(1)} \right\} } } }\\&\qquad + \displaystyle \frac{1}{2}\displaystyle \sum \limits _{i = 1}^n {\left( \frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}} - \frac{{{\delta _i}}}{{{\pi _{i0}}}}\right) \left\{ {{{\big ({\tilde{\mathbf {Z}}}_i^T{{\varvec{\eta } _1}}\big )}^2} - {{\big ({\tilde{\mathbf {Z}}}_i^T{\varvec{{{\tilde{\eta }}} }_1}\big )}^2}} \right\} } \\&\qquad { + {n^{{{ - 1} / 2}}}} {\big ({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}\big )^T}\displaystyle \sum \limits _{k = 1}^K {\displaystyle \sum \limits _{i = 1}^n {\left( \frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}} - \frac{{{\delta _i}}}{{{\pi _{i0}}}}\right) } } {f_i}\big ({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}\big ){{\mathbf {R}}_i}{\phi _{ni}}\left\{ {1 + {o_p}(1)} \right\} \\&\quad =\displaystyle \frac{1}{2}\left\{ {{\varvec{\eta } _1}^T{\varvec{\Delta }} {{(B)}_n}{{\varvec{\eta } _1}} - \tilde{\varvec{\eta } _1}^T{\varvec{\Delta }} {{(B)}_n}{\varvec{{{\tilde{\eta }}} }_1}} \right\} + {o_p}(1). \end{aligned}$$

Therefore, we get

$$\begin{aligned} \mathop {\sup }\limits _{\begin{array}{c} \scriptstyle \left\| {{{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}} \right\| \le M\\ \scriptstyle \left\| {{\varvec{\eta } _2}} \right\| \le C\sqrt{{k_n}} \end{array}} \left| {\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}{E_s}\left\{ {Q_i^*({{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2})} \right\} - \frac{1}{2}\left\{ {{\varvec{\eta } _1}^T{\varvec{\Delta }} {{(B)}_n}{{\varvec{\eta } _1}} - \tilde{\varvec{\eta } _1}^T{\varvec{\Delta }} {{(B)}_n}{\varvec{{{\tilde{\eta }}} }_1}} \right\} } } \right| = {o_p}(1).\nonumber \\ \end{aligned}$$
(A.4)

Then by (A.4) and Condition 5, we have

$$\begin{aligned}&\mathop {\sup }\limits _{\begin{array}{c} \scriptstyle \left\| {{{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}} \right\| \le M\\ \scriptstyle \left\| {{\varvec{\eta } _2}} \right\| \le C\sqrt{{k_n}} \end{array}} \left| \sum \limits _{i = 1}^n \frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}\left[ \left\{ {Q_i^*({{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2}) + {\tilde{\mathbf {Z}}}_i^T({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1})\psi ({\varepsilon _i})} \right\} \right. \right. \nonumber \\&\quad \left. \left. - \frac{1}{2}\left\{ {{\varvec{\eta } _1}^T{\varvec{\Delta }} {{(B)}_n}{{\varvec{\eta } _1}} - \tilde{\varvec{\eta } _1}^T{\varvec{\Delta }} {{(B)}_n}{\varvec{{{\tilde{\eta }}} }_1}} \right\} \right] \right| = {o_p}(1). \end{aligned}$$
(A.5)

Note that

$$\begin{aligned} {\big ({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}\big )^T}\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}{{{\tilde{\mathbf {Z}}}}_i}} \psi ({\varepsilon _i}) = {\big ({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}\big )^T}{\varvec{\Delta }} {(B)_n}{\varvec{{{\tilde{\eta }}} }_1} + {o_p}(1), \end{aligned}$$
(A.6)

By combining (A.5) and (A.6) we have

$$\begin{aligned}&\mathop {\sup }\limits _{\begin{array}{c} \scriptstyle \left\| {{{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}} \right\| \le M\\ \scriptstyle \left\| {{\varvec{\eta } _2}} \right\| \le C\sqrt{{k_n}} \end{array}} \left| \displaystyle \sum \limits _{i = 1}^n \displaystyle \frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}\left[ Q_i^*({{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2}) + {{({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1})}^T}{\varvec{\Delta }} {{(B)}_n}{\varvec{{{\tilde{\eta }}} }_1}\right. \right. \\&\qquad \left. \left. - \displaystyle \frac{1}{2}\left\{ {{\varvec{\eta } _1}^T{\varvec{\Delta }} {{(B)}_n}{{\varvec{\eta } _1}} - \tilde{\varvec{\eta } _1}^T{\varvec{\Delta }} {{(B)}_n}{\varvec{{{\tilde{\eta }}} }_1}} \right\} \right] \right| = {o_p}(1)\\&\quad \Rightarrow \mathop {\sup }\limits _{\begin{array}{c} \scriptstyle \left\| {{{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}} \right\| \le M\\ \scriptstyle \left\| {{\varvec{\eta } _2}} \right\| \le C\sqrt{{k_n}} \end{array}} \left| {\displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}\left[ {Q_i^*({{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2}) - \displaystyle \frac{1}{2}{{({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1})}^T}{\varvec{\Delta }} {{(B)}_n}({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1})} \right] } } \right| = {o_p}(1). \end{aligned}$$

By Conditions 1 and 2, for any \(M>0\) and \(C>0\),

$$\begin{aligned} \frac{1}{2}{\big ({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}\big )^T}{\varvec{\Delta }} {(B)_n}\big ({{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}\big )> 0. \end{aligned}$$

So, we have

$$\begin{aligned} \mathop {\lim }\limits _{n \rightarrow \infty } \mathop {\inf }\limits _{\begin{array}{c} \scriptstyle \left\| {{{\varvec{\eta } _1}} - {\varvec{{{\tilde{\eta }}} }_1}} \right\| \ge M\\ \scriptstyle \left\| {{\varvec{\eta } _2}} \right\| \le C\sqrt{{k_n}} \end{array}} {\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{{\hat{\pi } }_i}}}Q_i^*({{\varvec{\eta } _1}},{\varvec{{{\tilde{\eta }}} }_1},{\varvec{\eta } _2})} }> 0. \end{aligned}$$

Then, by the convexity of \(Q_i^*\), (A.3) follows, so that \({{\varvec{\hat{\eta }}}_1}\) and \({\varvec{{{\tilde{\eta }}}}_1}\) are asymptotically equivalent.

Hence, we have

$$\begin{aligned}&\sqrt{n} ({{\varvec{\hat{\beta } }^{WBCQ{R_\pi }}}} - {{\varvec{\beta }_0}}) \\&\quad = {\left\{ {{n^{ - 1}}\sum \limits _{k = 1}^K {\sum \limits _{i = 1}^n {{f_i}({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}){\mathbf {Z}}_i^*{{\mathbf {Z}}_i}^{*T}} } } \right\} ^{ - 1}}{n^{{{ - 1} / 2}}}\sum \limits _{i = 1}^n {({\delta _i}/{\pi _{i0}}){\mathbf {Z}}_i^*\psi ({\varepsilon _i})} + {o_p}(1), \end{aligned}$$

Using Lemma 2 for the first equality, we get

$$\begin{aligned} \sqrt{n} ({{\varvec{\hat{\beta } }^{WBCQ{R_\pi }}}} - {{\varvec{\beta }_0}}) = {\left\{ {{{\varvec{\Sigma }} _1} + {o_p}(1)} \right\} ^{ - 1}}{n^{{{ - 1} / 2}}}\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{\pi _{i0}}}}{{\mathbf {R}}_i}\psi ({\varepsilon _i})} \left\{ {1 + {o_p}(1)} \right\} \nonumber \\ \end{aligned}$$
(A.7)

The summands in (A.7) have mean zero, so it suffices to compute their common variance.

$$\begin{aligned} \mathrm{Var}\left\{ {\frac{{{\delta _i}}}{{{\pi _{i0}}}}{{\mathbf {R}}_i}\psi ({\varepsilon _i})} \right\} = E\left\{ {\frac{{{\delta _i}}}{{\pi _{i0}^2}}{{\mathbf {R}}_i}{\mathbf {R}}_i^T\psi {{({\varepsilon _i})}^2}} \right\} = E\left\{ {\frac{1}{{{\pi _{i0}}}}{{\mathbf {R}}_i}{\mathbf {R}}_i^T\psi {{({\varepsilon _i})}^2}} \right\} = {{\varvec{\Sigma }} _2}. \end{aligned}$$

Therefore, we have

$$\begin{aligned} \sqrt{n} ({{\varvec{\hat{\beta } }^{WBCQ{R_\pi }}}} - {{\varvec{\beta }_0}}) \rightarrow N(0,{\varvec{\Sigma }} _1^{ - 1}{{\varvec{\Sigma }} _2}{\varvec{\Sigma }} _1^{ - 1}). \end{aligned}$$

This completes the proof of Part (i) of Theorem 1. Turning to Part (i) of Theorem 2, we similarly have

$$\begin{aligned}&\sqrt{n} ({\varvec{\hat{\beta } }^{WBCQ{R_{\hat{\pi } }}}} - {{\varvec{\beta }_0}}) \\&\quad = {\left\{ {{n^{ - 1}}\sum \limits _{k = 1}^K {\sum \limits _{i = 1}^n {{f_i}({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}){\mathbf {Z}}_i^*{{\mathbf {Z}}_i}^{*T}} } } \right\} ^{ - 1}}{n^{{{ - 1} / 2}}}\sum \limits _{i = 1}^n {({\delta _i}/{{\hat{\pi } }_i}){\mathbf {Z}}_i^*\psi ({\varepsilon _i})} + {o_p}(1). \end{aligned}$$

Using Lemma 2 for the first equality and Lemma 3, we get

$$\begin{aligned} \begin{aligned}&\sqrt{n}\left( \hat{\varvec{\beta }}^{W B C Q R_{\hat{\pi }}}-\varvec{\beta }_{0}\right) =\left\{ \varvec{\Sigma }_{1}+o_{p}(1)\right\} ^{-1} n^{-1 / 2} \sum _{i=1}^{n} \frac{\delta _{i}}{{\hat{\pi }}_{i}} {\mathbf {R}}_{i} \psi \left( \varepsilon _{i}\right) \left\{ 1+o_{p}(1)\right\} \end{aligned} \end{aligned}$$

Note further that

$$\begin{aligned} \displaystyle \sum _{i=1}^{n} \displaystyle \frac{\delta _{i}}{{\hat{\pi }}_{i}} {\mathbf {R}}_{i} \psi (\varepsilon _i)={\displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{\pi _{i0}}}}{{\mathbf {R}}_i}\psi ({\varepsilon _i}) - \displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i} - {\pi _{i0}}}}{{{\pi _{i0}}}}E\left\{ {{{\mathbf {R}}_i}\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} } } } . \end{aligned}$$

So

$$\begin{aligned} \sqrt{n} ({\varvec{\hat{\beta } }^{WBCQ{R_{\hat{\pi } }}}} - {{\varvec{\beta }_0}})= & {} {\left\{ {{{\varvec{\Sigma }} _1} + {o_p}(1)} \right\} ^{ - 1}}\left\{ {n^{{{ - 1} / 2}}}\displaystyle \sum \limits _{i = 1}^n \frac{{{\delta _i}}}{{{\pi _{i0}}}}{{\mathbf {R}}_i}\psi ({\varepsilon _i}) \right. \\&\left. - {n^{{{ - 1} / 2}}}\displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i} - {\pi _{i0}}}}{{{\pi _{i0}}}}E\left\{ {{{\mathbf {R}}_i}\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} } \right\} \left\{ {1 + {o_p}(1)} \right\} . \end{aligned}$$

The expected value of each of these two sums is zero. Therefore, it suffices to compute the variance of the two sums and their covariance. The variance of the first sum is \({{{\varvec{\Sigma }} _2}}\). The variance of the second sum is

$$\begin{aligned}&{\mathrm{Var}} \left\{ {\displaystyle \frac{{{\delta _i} - {\pi _{i0}}}}{{{\pi _{i0}}}}E\left\{ {{{\mathbf {R}}_i}\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} } \right\} \\&\quad = E\left\{ {\displaystyle \frac{{{{({\delta _i} - {\pi _{i0}})}^2}}}{{\pi _{i0}^2}}E\left\{ {{{\mathbf {R}}_i}\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} E\left\{ {R_i^T\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} } \right\} \\&\quad = E\left\{ {\displaystyle \frac{{1 - {\pi _{i0}}}}{{\pi _{i0}}}E\left\{ {{{\mathbf {R}}_i}\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} E\left\{ {R_i^T\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} } \right\} ={\varvec{\Sigma }}_3. \end{aligned}$$

For the covariance of the two sums, we use the fact that \(\pi _{i0}\) is a function of \( {\mathbf {V}}_{i}\) together with the law of iterated expectations to get

$$\begin{aligned}&\mathrm{Cov}\left\{ {\displaystyle \frac{{{\delta _i}}}{{{\pi _{i0}}}}{{\mathbf {R}}_i}\psi ({\varepsilon _i}),\frac{{{\delta _i} - {\pi _{i0}}}}{{{\pi _{i0}}}}E\left\{ {{{\mathbf {R}}_i}\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} } \right\} \\&\quad = E\left\{ {\displaystyle \frac{{{\delta _i}({\delta _i} - {\pi _{i0}})}}{{\pi _{i0}^2}}{{\mathbf {R}}_i}\psi ({\varepsilon _i})E\left\{ {{\mathbf {R}}_i^T\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} } \right\} \\&\quad = E\left\{ {\displaystyle \frac{{1 - {\pi _{i0}}}}{{{\pi _{i0}}}}E\left\{ {{{\mathbf {R}}_i}\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} E\left\{ {{\mathbf {R}}_i^T\psi ({\varepsilon _i})|{{{{\mathbf {V}}}_i}}} \right\} } \right\} ={\varvec{\Sigma }}_3. \end{aligned}$$

Therefore, we have

$$\begin{aligned} \sqrt{n} ({{\varvec{\hat{\beta }} }^{WBCQ{R_{{\hat{\pi }}} }}} - {{\varvec{\beta }_0}}) \rightarrow N(0,{\varvec{\Sigma }} _1^{ - 1}{{\varvec{\Sigma }} _m}{\varvec{\Sigma }} _1^{ - 1}). \end{aligned}$$

This completes the proof of Part (i) of Theorem 2.

1.3 Proof of the convergence rate of the coefficient functions in Theorems 1 and 2

It follows from Lemma 3 that

$$\begin{aligned} \left\| {{{{\textit{\textbf{W}}}_D}}({{\varvec{{\hat{a}}}}^{WBCQ{R_{\hat{\pi } }}}} - {\varvec{a_0}})} \right\| = {O_p}\left( {\sqrt{{k_n}} } \right) . \end{aligned}$$

By Schumaker (1981) it follows that

$$\begin{aligned} \max \left| {{\phi _{ni}}} \right| = {O_p}(k_n^{ - r}). \end{aligned}$$

Combining this with Condition 3, we have

$$\begin{aligned}&\displaystyle \frac{1}{n}\displaystyle \sum \limits _{k = 1}^K {\displaystyle \sum \limits _{i = 1}^n {{f_i}} } ({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}){\{ {{\hat{\alpha } }^{WBCQ{R_{\hat{\pi } }}}_{l}}({U_i}) - {\alpha _{0l}}({U_i})\} ^2}\\&\quad =\displaystyle \frac{1}{n}\displaystyle \sum \limits _{k = 1}^K {\displaystyle \sum \limits _{i = 1}^n {{f_i}} } ({b_k}|{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i}){\left\{ {{\mathbf {W}}{{({U_i})}^T}({\varvec{{\hat{a}}_l}^{WBCQ{R_{\hat{\pi } }}}} - \varvec{a_{0l}}) - {\phi _{ni}}} \right\} ^2}\\&\quad \le \displaystyle \frac{1}{n}({{\varvec{{\hat{a}}_l}}^{WBCQ{R_{\hat{\pi } }}}} - {{\textit{\textbf{a}}}_{0l}}){{\textit{\textbf{W}}}_D}^2({\varvec{{\hat{a}}_l}^{WBCQ{R_{\hat{\pi } }}}} - \varvec{a_{0l}}) + {O_p}({n^{{{ - 2r} / {\left( {2r + 1} \right) }}}})\\&\quad = {O_p}({n^{{{ - 2r} / {\left( {2r + 1} \right) }}}}) \end{aligned}$$

The proof is completed by Condition 1, which provides uniform lower and upper bounds for \({f_i}( \cdot |{{\mathbf {Z}}_i},{{\mathbf {X}}_i},{U_i})\). This completes the proof of Part (ii) of Theorem 2. The proof of Part (ii) of Theorem 1 is similar and is therefore omitted.
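For completeness, the estimated coefficient function whose convergence rate is established here is obtained by evaluating the fitted spline; the lines below are our illustration and reuse the hypothetical objects `bspline_basis`, `res`, `taus` and `Z` from the sketch given after the definitions in Sect. 1.1.

```python
# Evaluate the fitted varying coefficient on a grid (our illustration,
# reusing bspline_basis, res, taus and Z from the earlier sketch).
import numpy as np

u_grid = np.linspace(0.0, 1.0, 101)
a_hat = res.x[len(taus) + Z.shape[1]:]       # fitted spline coefficients
alpha_hat = bspline_basis(u_grid) @ a_hat    # alpha_hat(u) = w(u)' a_hat
print("alpha_hat at u = 0.25:", round(float(alpha_hat[25]), 2))
```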

1.4 Proof of Theorem 3 and Theorem 4

Let \(\sqrt{n} ({{\varvec{\hat{\beta }} }^{PWBCQ{R_{ \pi }}}} - {{\varvec{\beta }_0}}) = {{\textit{\textbf{u}}}^*}\), \(\sqrt{n} ({\hat{b}}_k^{PWBCQ{R_\pi }} - {b_k}) = v_k^*\) for \(k=1,\ldots ,K\), and \({{\varvec{\theta }} ^*} = ({{\textit{\textbf{u}}}^{*T}},v_1^*,\ldots ,v_K^*)^T\). Then \({{\varvec{\theta }} ^*}\) is the minimizer of the following criterion:

$$\begin{aligned} {Q_n}(\pi ,{{\varvec{\theta }} ^*})= & {} \displaystyle \sum \limits _{k = 1}^K {\displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{\pi _{i0}}}}} } \left[ {{\rho _{{\tau _k}}}\left( {{\varepsilon _i} - {b_k} - {\phi _{ni}} - \frac{{v_k^* + {{\mathbf {Z}}_i}^T{{\textit{\textbf{u}}}^*}}}{{\sqrt{n} }}} \right) - {\rho _{{\tau _k}}}({\varepsilon _i} - {b_k})} \right] \\&+\displaystyle \sum \limits _{j = 1}^q {{\lambda _\pi }} \frac{{\left( {\left| {{\beta _j} +\displaystyle \frac{{u_j^*}}{{\sqrt{n} }}} \right| - \left| {{\beta _j}} \right| } \right) }}{{{{\left| {{{\hat{\beta } }_{j}^{WBCQ{R_\pi }}}} \right| }^2}}} \end{aligned}$$

Similar to the proof of Theorem 4.1 in Zou and Yuan (2008), each summand of the second term above satisfies

$$\begin{aligned} \frac{{{\lambda _\pi }}}{{\sqrt{n} {{\left| {{{\hat{\beta } }_{j}^{WBCQ{R_\pi }}}} \right| }^2}}}\sqrt{n} \left( {\left| {{\beta _j} + \frac{{u_j^*}}{{\sqrt{n} }}} \right| - \left| {{\beta _j}} \right| } \right) {\mathop {\longrightarrow }\limits ^{P}}\left\{ \begin{array}{lllll} 0,&{}\quad if&{} \beta _j \ne 0,\\ 0,&{}\quad if&{}{\beta _j} = 0 &{} and &{} u_j^* = 0,\\ \infty ,&{}\quad if&{}{\beta _j} = 0&{} and &{} u_j^* \ne 0. \end{array} \right. \end{aligned}$$

Let \({{\textit{\textbf{u}}}^*} = {({\textit{\textbf{u}}}_1^{*T},{\textit{\textbf{u}}}_2^{*T})^T}\), where \({\textit{\textbf{u}}}^{*}_{1}\) collects the components of \({\textit{\textbf{u}}}^{*}\) corresponding to the nonzero coefficients \(\beta _j\). Using the same arguments as in Knight (1998) and Koenker (2005), we have \({\textit{\textbf{u}}}^{*}_{2}{\mathop {\rightarrow }\limits ^{P}} 0\) and \({\textit{\textbf{u}}}^{*}_{1}{\mathop {\rightarrow }\limits ^{d}}N(0,{[{{\varvec{\Sigma }} _1^{ - 1}{{\varvec{\Sigma }} _m}{\varvec{\Sigma }} _1^{ - 1}]}_{\Lambda \Lambda }})\). This proves the asymptotic normality part.

Next, we prove the selection consistency part. Let \({{\hat{\Lambda }}_n} = \left\{ {j:\hat{\beta } _j^{PWBCQ{R_\pi }} \ne 0} \right\} \) and \(\Lambda = \left\{ {j:{\beta _j} \ne 0} \right\} \). For every \(j \in \Lambda \), the asymptotic normality established above implies \(P(j \in {{\hat{\Lambda }}_n}){\rightarrow } 1\). It therefore suffices to show that, for every \(j \notin \Lambda \), \(P(j \in {{\hat{\Lambda }}_n}){\rightarrow } 0\). Note that

$$\begin{aligned}&(b_1^{PWBCQ{R_\pi }}, \ldots ,b_K^{PWBCQ{R_\pi }},{{\varvec{\hat{\beta }} }^{PWBCQ{R_\pi }}},{{\varvec{{\hat{a}}}}^{PWBCQ{R_\pi }}})\\&\quad = \mathop {\arg \min }\limits _{{b_1}, \ldots ,{b_K},\varvec{\beta } ,{\textit{\textbf{a}}}} \displaystyle \sum \limits _{k = 1}^K {\displaystyle \sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{\pi _{i0}}}}{\rho _{{\tau _k}}}({Y_i} - {\varvec{\beta ^T}}{{{\mathbf {Z}}_i}} - \varvec{\Pi } _i^T{\textit{\textbf{a}}} - {b_k})} } + {\lambda _\pi }\sum \limits _{j = 1}^q {\frac{{|{\beta _j}|}}{{|\hat{\beta } _j^{WBCQ{R_\pi }}{|^2}}}} \end{aligned}$$

Using the fact that

$$\begin{aligned} \left| {\frac{{{\rho _\tau }({x_1}) - {\rho _\tau }({x_2})}}{{{x_1} - {x_2}}}} \right| \le \max \left( {\tau ,1 - \tau } \right) < 1, \end{aligned}$$

it follows that, whenever \(j \in {{\hat{\Lambda }}_n}\),

$$\begin{aligned} \frac{{{\lambda _\pi }}}{{|\hat{\beta } _j^{WBCQ{R_\pi }}{|^2}}} < \sum \limits _{k = 1}^K {\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{\pi _{i0}}}}\left| {{Z_{ij}}} \right| } } = K\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{\pi _{i0}}}}\left| {{Z_{ij}}} \right| }. \end{aligned}$$

So

$$\begin{aligned} P(j \in {{\hat{\Lambda }}_n}) \le P\left( {\frac{{{\lambda _\pi }}}{{|\hat{\beta } _j^{WBCQ{R_\pi }}{|^2}}} < K\sum \limits _{i = 1}^n {\frac{{{\delta _i}}}{{{\pi _{i0}}}}\left| {{Z_{ij}}} \right| } } \right) {\rightarrow }0. \end{aligned}$$

This completes the proof of Theorem 3. The proof of Theorem 4 is similar to the proof of Theorem 3, so it is omitted here.
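To relate the penalized criterion above to the earlier unpenalized sketch, the snippet below (our illustration; the tuning value `lam` and the reuse of `wbcqr_objective`, `res`, `beta_hat`, `y`, `Z`, `BW`, `w` and `taus` from Sect. 1.1 are assumptions, not the authors' code) adds the adaptive LASSO penalty \(\lambda \sum \nolimits _j |\beta _j|/|{\hat{\beta }}_j^{WBCQR}|^2\) and re-minimizes. The weights inflate the penalty on coefficients whose pilot estimates are close to zero, which is the mechanism behind the selection consistency argument.

```python
# Sketch of the adaptive-LASSO-penalized objective of Theorems 3-4
# (our illustration; all reused names come from the earlier sketch).
import numpy as np
from scipy.optimize import minimize

def pwbcqr_objective(theta, y, Z, BW, w, taus, lam, beta_pilot):
    """Weighted composite quantile objective plus adaptive LASSO penalty."""
    K, q = len(taus), Z.shape[1]
    beta = theta[K:K + q]
    penalty = lam * np.sum(np.abs(beta) / beta_pilot ** 2)
    return wbcqr_objective(theta, y, Z, BW, w, taus) + penalty

lam = 2.0                                  # illustrative tuning parameter
beta_pilot = np.abs(beta_hat) + 1e-8       # pilot |beta_hat^{WBCQR}| weights
res_pen = minimize(pwbcqr_objective, res.x,
                   args=(y, Z, BW, w, taus, lam, beta_pilot), method="Powell")
beta_pen = res_pen.x[len(taus):len(taus) + Z.shape[1]]
print("penalized linear coefficients:", np.round(beta_pen, 2))
```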


Cite this article

Jin, J., Ma, T., Dai, J. et al. Penalized weighted composite quantile regression for partially linear varying coefficient models with missing covariates. Comput Stat 36, 541–575 (2021). https://doi.org/10.1007/s00180-020-01012-z
