Abstract
This paper proposes an empirical likelihood-based weighted (ELW) rank regression approach for estimating linear regression models when some covariates are missing at random. The proposed ELW estimator of regression parameters is computationally simple and achieves better efficiency than the inverse probability weighted (IPW) estimator if the probability of missingness is correctly specified. The covariances of the IPW and ELW estimators are estimated by using a variant of the induced smoothing method, which can bypass density estimation of the errors. Simulation results show that the ELW method works well in finite samples. A real data example is used to illustrate the proposed ELW method.
References
Adichie JN (1967) Estimates of regression parameters based on rank tests. Ann Math Stat 38:894–904
Brown BM, Wang YG (2005) Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92:149–158
Fu L, Wang YG (2012) Efficient estimation for rank regression with clustered data. Biometrics 68:1074–1082
Hettmansperger TP, McKean JW (1983) A geometric interpretation of inferences based on ranks in the linear model. J Am Stat Assoc 78:885–893
Hettmansperger TP, McKean JW (1998) Robust nonparametric statistical methods. Kendall’s library of statistics 5. Arnold/Wiley, New York
Horvitz DG, Thompson DJ (1952) A generalization of sampling without replacement from a finite universe. J Am Stat Assoc 47:663–685
Jaeckel LA (1972) Estimating regression coefficients by minimizing the dispersion of the residuals. Ann Math Stat 43:1449–1458
Jung SH, Ying Z (2003) Rank-based regression with repeated measurements data. Biometrika 90:732–740
Jureckova J (1969) Asymptotic linearity of a rank statistic in regression parameter. Ann Math Stat 40:1889–1900
Lai TL, Ying Z (1988) Stochastic integrals of empirical type processes with applications to censored regression. J Multivar Anal 27:334–358
Little RJA, Rubin DB (1987) Statistical analysis with missing data. Wiley, New York
Liu T, Yuan X (2012) Combining quasi and empirical likelihoods in generalized linear models with missing responses. J Multivar Anal 111:39–58
Liu T, Yuan X, Li Z, Li Y (2013) Empirical and weighted conditional likelihoods for matched case-control studies with missing covariates. J Multivar Anal 119:185–199
Luo S, Zhang C (2016) Nonparametric MM-type regression estimation under missing response data. Stat Pap 57:641–664
McKean JW, Hettmansperger TP (1976) Tests of hypotheses based on ranks in the general linear model. Commun Stat Theory Methods 5:693–709
Owen AB (1988) Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75:237–249
Owen AB (1990) Empirical likelihood ratio confidence regions. Ann Stat 18:90–120
Owen AB (1991) Empirical likelihood for linear models. Ann Stat 19:1725–1747
Owen AB (2001) Empirical likelihood. Chapman and Hall-CRC, New York
Pierce D (1982) The asymptotic effect of substituting estimators for parameters in a certain type of statistics. Ann Stat 10:475–478
Pollard D (1990) Empirical processes: theories and applications. Institute of Mathematical Statistics, Hayward
Purkayastha S (1998) Simple proofs of two results on convolutions of unimodal distributions. Stat Probab Lett 39:97–100
Qin J, Shao J, Zhang B (2008) Efficient and doubly robust imputation for covariate-dependent missing responses. J Am Stat Assoc 103:797–810
Qin J, Zhang B (2007) Empirical likelihood-based inference in missing response problems and its application in observational studies. J R Stat Soc B 69:101–122
Qin J, Zhang B, Leung DH (2009) Empirical likelihood in missing data problems. J Am Stat Assoc 104:1492–1503
Robins JM, Rotnitzky A, Zhao LP (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866
Wang Q, Rao JNK (2002) Empirical likelihood-based inference under imputation for missing response data. Ann Stat 30:896–924
Wang YG, Zhu M (2006) Rank-based regression for analysis of repeated measures. Biometrika 93:459–464
Yang H, Liu H (2016) Penalized weighted composite quantile estimators with missing covariates. Stat Pap 57:69–88
Yuan X, Liu T, Lin N, Zhang B (2010) Combining conditional and unconditional moment restrictions with missing responses. J Multivar Anal 101:2420–2433
Acknowledgements
Tianqing Liu was partly supported by the NSFC (No. 11201174) and the Natural Science Foundation for Young Scientists of Jilin Province, China (No. 20150520054JH); Xiaohui Yuan was partly supported by the NSFC (Nos. 11401048, 11671054) and the Natural Science Foundation for Young Scientists of Jilin Province, China (No. 20150520055JH).
Appendix
Unless mentioned otherwise, all limits are taken as \(n\rightarrow \infty \) and \(\Vert \cdot \Vert \) denotes the Euclidean norm. For notational convenience, for \(i=1,\ldots ,n\), let \(U_{Bi}=U_{Bi}(\gamma ^*)\), \(h_i=h_i(\alpha ^*,\beta ^*,\gamma ^*)\) and \(g_i=g_i(\alpha ^*,\beta ^*,\gamma ^*)\). Write
To establish all the large sample properties in this paper, we require the following conditions:
Regularity conditions
- C1:
\(\{(y_i,x_i,z_i,\delta _i)\}_{i=1}^n\) are independent.
- C2:
The parameter \(\beta ^*\) is an interior point of a compact parameter space \(\varTheta \subset \mathcal {R}^p\).
- C3:
\(w_i\) has bounded support.
- C4:
For \(i,j=1,\ldots ,n\), let \(f_{i}\) and \(f_{ij}\) denote respectively the conditional density functions of \(e_i\) and \(e_i-e_j\) given \((w_i,w_j)\). Assume that \(f_{i}'(\cdot )\) exists and is uniformly bounded, and that \(f_{ij}\) satisfies the following assumptions:
- (1)
\(f_{ij}(-u)=f_{ij}(u)\), \(u\in \mathcal {R}\).
- (2)
If \(0\le u<v\), then \(f_{ij}(u)\ge f_{ij}(v)\).
- (3)
There exists a \(\varDelta >0\) such that \(f_{ij}(u)>f_{ij}(v)\) whenever \(0\le u<v<\varDelta \).
- (4)
For fixed \(t \in \mathcal {R}\), \(\int _{-\infty }^{\infty }|u+t|f_{ij}(u)du<\infty \).
- C5:
A, \(S_\varphi \), \(S_B\), \(S_{g}\), \(S_\varphi -F_\gamma S_B^{-1}F_\gamma ^\textsf {T}\) and \(S_\varphi -F_gS_g^{-1}F_g^\textsf {T}\) are positive definite.
- C6:
(a) For all \((y_{i},z_{i})\), \(\pi (y_{i},z_{i},\gamma )\) admits all third partial derivatives \(\frac{\partial ^3\pi (y_{i},z_{i},\gamma )}{\partial \gamma _k\partial \gamma _l\partial \gamma _m}\) for all \(\gamma \) in a neighborhood of the true value \(\gamma ^*\), \(\max _{1\le i \le n }\biggr \Vert \frac{\partial ^3\pi (y_{i},z_{i},\gamma )}{\partial \gamma _k\partial \gamma _l\partial \gamma _m}\biggr \Vert \) is bounded by an integrable function for all \(\gamma \) in this neighborhood, and \(\max _{1\le i \le n }\Vert \partial \pi (y_i,z_i,\gamma )/\partial \gamma \Vert ^2\) is bounded by an integrable function for all \(\gamma \) in this neighborhood.
(b) The probability \(\pi (y,z,\gamma ^*)\) is bounded away from zero, i.e. \(\inf _{(y,z)}\pi (y,z,\gamma ^*) \ge c_0\) for some \(c_0>0.\)
- C7:
\(\max _{1\le i \le n }\Vert \xi _{i}(\mathcal {Y}_n,\mathcal {Z}_n,\eta )\Vert ^2\) is bounded by an integrable function for all \(\eta \) in a neighborhood of \(\eta ^*\) and \(\xi _{i}(\mathcal {Y}_n,\mathcal {Z}_n,\eta )\) is continuous at each \(\eta \) with probability one in this neighborhood, where \(\eta =(\alpha ^\textsf {T},\beta ^\textsf {T})^\textsf {T}\) and \(\eta ^*=(\alpha ^{*\textsf {T}},\beta ^{*\textsf {T}})^\textsf {T}\). Moreover,
$$\begin{aligned} \sup _{\Vert \eta -\eta ^*\Vert \le cn^{-1/2}}\left\| n^{-1/2}\sum _{i=1}^n\frac{\delta _i-\pi (y_i,z_i,\gamma ^*)}{\pi (y_i,z_i,\gamma ^*)}\{\xi _{i}(\mathcal {Y}_n,\mathcal {Z}_n,\eta )-\xi _{i}(\mathcal {Y}_n,\mathcal {Z}_n,\eta ^*)\}\right\| =o_p(1). \end{aligned}$$
Most of the above conditions are standard for rank regression models; the additional conditions concern the missing-data mechanism and the unconditional moment restrictions. C1 defines the structure which generates the observations. C2 is a standard assumption on the parameter space. C3–C4 impose conditions on the conditional error distributions and the covariates, both of which hold in most practical situations. C5 guarantees that the asymptotic covariance matrices of the IPW and ELW estimators are positive definite. C6(a) contains the conditions needed to establish the consistency and asymptotic normality of the binomial likelihood estimator \(\hat{\gamma }\). C6(b) implies that the covariates x cannot be missing with probability 1 anywhere in the domain of (y, z). C7 collects the conditions on \(h(t,\alpha ,\beta ,\gamma )\) under which the influence function of the multiplier \(\hat{\lambda }\) can be established.
In the following, we show that the regularity condition C4 is easily satisfied.
Lemma A.1
Consider the conditions:
- (a)
\(f_1=\ldots =f_n=f\), where f is unimodal;
- (b)
\(f_i\) is symmetric, unimodal, and has a unique mode at \(\delta \), \(i=1,\ldots ,n\);
- (c)
For fixed \(t \in \mathcal {R}\), \(\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }|u-v+t|f_{i}(u)f_{j}(v)dudv<\infty \).
Then either (a) or (b), together with (c), implies condition C4.
Proof of Lemma A.1
Without loss of generality, we take \(\delta =0\) in (b).
By Theorems 2.1 and 2.2 in Purkayastha (1998), either (a) or (b) yields parts (1) and (2) of condition C4.
Let \(J(u)=f_{ij}(u)-f_{ij}(0).\) Then J(u) is symmetric about 0 and \(J(u)\le 0\) for all u, so it remains to show that \(J(u)<0\) for \(u>0\).
If (a) is satisfied, by the Cauchy–Schwarz inequality,
and equality holds if and only if \(f(u+v)=cf(v)\) for almost all v. Since \(\int _{-\infty }^{\infty }f(u+v)dv=c \int _{-\infty }^{\infty }f(v)dv=1,\) we have \(c=1\), which is impossible for \(u>0\). Thus, \(J(u)<0\) for \(u>0\).
If (b) is satisfied, we decompose \(J(u)=J_1(u)+J_2(u),\) where
and
By the substitution \(t=-(u+v)\) in the integral, we see that
Then
By using the substitution \(t=v+\frac{u}{2}\), we obtain
For the first integral above, substitute \(t=-v\),
Thus,
Under (b), it is straightforward to show that there exists a \(\delta _1>0\) such that \(f_{i}(u)>f_{i}(v)\) whenever \(0\le u<v<\delta _1\), \(i=1,\ldots ,n\). Then (A.1) and (A.2) imply that \(J(u)<0\) for \(u>0\).
Hence, if either (a) or (b) is satisfied, \(f_{ij}(u)\) has a unique maximum at 0, which yields part (3) of condition C4. By a change of variables, we have
Then part (4) of condition C4 follows from (c). \(\square \)
Lemma A.2
Suppose that the regularity conditions C1–C4 are satisfied. Then
has the unique minimum at \(\beta ^*\).
Proof of Lemma A.2
Observe that
where \(t_{ij}=(w_i-w_j)^\textsf {T}(\beta ^*-\beta )\). Let
Note that for \(t>0\),
Similarly, for \(t<0\),
Therefore \(J^*(t)\ge 0\) for all t, with equality if and only if \(t=0.\) Lemma A.2 is then proved. \(\square \)
Proof of Theorem 1
Let \(\hat{F}(\beta ,t)=\frac{1}{n} \sum _{i=1}^n\frac{\delta _i}{\pi (y_i,z_i,\gamma ^*)} I(e_i(\beta )\le t)\), \(F_n(\beta ,t)=\frac{1}{n}\sum _{i=1}^{n}F_i(t+w_i^\textsf {T}(\beta -\beta ^*))\). Then we have
The uniform strong law of large numbers (Pollard 1990, p. 41) can be easily adapted to give
Thus \(L_{1n}(\beta )\) converges uniformly to
By Lemma A.2, it can be shown that \(L(\beta )\) has a unique minimizer \(\beta =\beta ^*\). Hence \(\hat{\beta }_{IPW}\), as the minimizer of \(L_{1n}(\beta )\), converges to \(\beta ^*\) almost surely. \(\square \)
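The IPW estimator studied in Theorems 1 and 2 minimizes an inverse-probability-weighted rank dispersion. The paper's displayed objective is not reproduced in this excerpt, so the following is only a hedged numerical sketch that minimizes a Gehan-type IPW dispersion, \(n^{-2}\sum _{i\ne j}\delta _i\delta _j|e_i(\beta )-e_j(\beta )|/\{\pi _i\pi _j\}\), on synthetic data; the function name, the logistic missingness model, and all data-generating choices are illustrative assumptions, not the authors' specification.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, p = 200, 2
beta_true = np.array([1.0, -0.5])
w = rng.normal(size=(n, p))                        # covariates (possibly missing)
y = w @ beta_true + rng.standard_t(df=3, size=n)   # heavy-tailed errors

# Hypothetical logistic missingness model: delta_i = 1 if w_i is observed,
# with selection probability pi_i depending on the always-observed y_i.
pi = 1.0 / (1.0 + np.exp(-(1.0 + 0.3 * y)))
delta = rng.binomial(1, pi)

def gehan_ipw_loss(beta):
    """Gehan-type IPW rank dispersion: pairwise absolute residual
    differences, each pair reweighted by delta_i*delta_j/(pi_i*pi_j)."""
    e = y - w @ beta
    wt = delta / pi
    return (wt[:, None] * wt[None, :] * np.abs(e[:, None] - e[None, :])).sum() / n**2

beta_ipw = minimize(gehan_ipw_loss, np.zeros(p), method="Nelder-Mead").x
```

Nelder–Mead suffices here because the Gehan dispersion is convex and piecewise linear in \(\beta \); in practice the same minimization can also be cast as a linear program.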
Proof of Theorem 2
It follows from directional differentiation that minimizing \(L_{1n}(\beta )\) is equivalent to solving \(U_{1n}(\beta )=0\), where
Define
The proof is divided into two steps. In the first step, we show that \(n^{1/2}U_n^*(\beta ^*){\mathop {\longrightarrow }\limits ^{d}}N(0,V)\). Let \(\hat{G}(\beta ,t)=\frac{1}{n} \sum _{i=1}^n\frac{\delta _i w_i}{\pi (y_i,z_i,\gamma ^*)} I(e_i(\beta )\le t)\), \(G_n(\beta ,t)=\frac{1}{n}\sum _{i=1}^{n}w_iF_i(t+w_i^\textsf {T}(\beta -\beta ^*))\). Then we have
Write
It can easily be shown that \(n^{1/2}\{\hat{F}(\beta ,\cdot )-F_n(\beta ,\cdot )\} \) converges weakly to a Gaussian process. Thus
where the last equality follows from
Combining (A.4) with (A.5) we obtain \(n^{1/2}U_n^*(\beta ^*)=n^{-1/2}\sum _{i=1}^n\psi _i^*+o_p(1)\), where
and \(\bar{w}=n^{-1}\sum _{j=1}^n w_j\). Thus, \(n^{1/2}U_n^*(\beta ^*)\) converges in distribution to N(0, V) by the multivariate central limit theorem, where
In the second step, we show that
where
with
Regarding \(\hat{F}\) and \(\hat{G}\) as sums, over i, of independent random variables, we can apply the approximations given in Lai and Ying (1988) for weighted empirical processes to obtain
where \(d_n\rightarrow 0.\) (A.3), (A.7) and (A.8) imply that
The asymptotic linearity of \(U_n^*\) in (A.9) now gives
Moreover,
where
(A.10) and (A.11) lead to the following asymptotic expansion
Then \(n^{1/2}(\hat{\beta }_{IPW}-\beta ^*)\) converges in distribution to \(N(0,\varSigma _{IPW})\), where \(\varSigma _{IPW}=A^{-1}VA^{-1}\), \(A=\lim _{n\rightarrow \infty }A_n\) and \(V=\lim _{n\rightarrow \infty }\text{ var }(n^{1/2}U_n^*(\beta ^*)+n^{1/2}F_\gamma S_B^{-1}U_B)=S_\varphi -F_\gamma S_B^{-1}F_\gamma ^\textsf {T}\). This completes the proof. \(\square \)
Lemma A.3
If \(\lambda =\lambda (\theta )\) solves
then we have \(\Vert \lambda (\theta )\Vert =O_p(n^{-1/2})\) and
uniformly over \(\theta \in B_0=\{\theta :\Vert \theta -\theta ^*\Vert \le c n^{-1/2}\}\) for some \(0< c<\infty \).
Proof of Lemma A.3
The basic idea behind this proof is outlined in Owen (1990). We first show that
From the definition of g, we only need to show \(\sup _{\Vert \theta -\theta ^*\Vert \le cn^{-1/2}}\left\| \frac{1}{n}\sum _{i=1}^n h_i(\theta )\right\| =O_p(n^{-1/2})\) and \(\sup _{\Vert \gamma -\gamma ^*\Vert \le cn^{-1/2}}\left\| \frac{1}{n}\sum _{i=1}^nU_{Bi}(\gamma )\right\| =O_p(n^{-1/2})\). On the one hand, by C7 and
where \(\bar{\gamma }\) is a point on the segment connecting \(\gamma \) and \(\gamma ^*\), \(\sup _{\Vert \gamma -\gamma ^*\Vert \le cn^{-1/2}}\left\| \frac{1}{n}\sum _{i=1}^n\right. \left. U_{Bi}(\gamma )\right\| =O_p(n^{-1/2})\) is proved. On the other hand,
Let \(U_i=\lambda ^\textsf {T} g_i(\theta )\) and \(g^*=\max _{1\le i \le n}\sup _{\theta \in B_0}\Vert g_i(\theta )\Vert \). Let \(\lambda (\theta )=\Vert \lambda (\theta )\Vert v\), \(\Vert v\Vert =1\). Substituting \(1/(1+U_i)=1-U_i/(1+U_i) \) into (A.12) and simplifying, we find that
Since every \(p_i>0\), we have \(1+U_i>0\) and therefore
Consequently,
By C6 and C7, we have \(\sup _{\Vert \theta -\theta ^*\Vert \le cn^{-1/2}}\Vert \frac{1}{n}\sum _{i=1}^ng_i(\theta )g_i^\textsf {T}(\theta )\Vert =O_p(1)\). Moreover, by Markov's inequality, \(g^*=o_p(n^{1/2})\). From these facts and (A.13), we obtain \(\lambda (\theta )=O_p(n^{-1/2})\) uniformly over \(\theta \in B_0\). Now, write
Because
we have, uniformly for \(\theta \in B_0\),
\(\square \)
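Lemma A.3 controls the Lagrange multiplier \(\lambda (\theta )\) solving the empirical-likelihood score equation \(\frac{1}{n}\sum _{i=1}^n g_i(\theta )/(1+\lambda ^\textsf {T}g_i(\theta ))=0\), with weights \(p_i=1/\{n(1+\lambda ^\textsf {T}g_i)\}\). A minimal sketch of the standard computation (Newton's method on Owen's convex dual, with step-halving to keep every \(1+\lambda ^\textsf {T}g_i\) positive); the function name and the toy mean-constraint example are our own illustrative choices, not from the paper.

```python
import numpy as np

def el_lambda(g, tol=1e-8, max_iter=100):
    """Solve (1/n) sum_i g_i / (1 + lam' g_i) = 0 for the EL multiplier lam
    by Newton's method on the convex dual f(lam) = -sum_i log(1 + lam' g_i)."""
    n, q = g.shape
    lam = np.zeros(q)
    for _ in range(max_iter):
        d = 1.0 + g @ lam
        grad = -(g / d[:, None]).sum(axis=0)        # f'(lam)
        if np.linalg.norm(grad) < tol:
            break
        hess = (g / d[:, None] ** 2).T @ g          # f''(lam), positive semidefinite
        step = np.linalg.solve(hess, -grad)
        t = 1.0
        while np.any(1.0 + g @ (lam + t * step) <= 0):
            t *= 0.5                                # stay inside the domain
        lam = lam + t * step
    return lam

# Toy example: g_i = x_i - mu enforces the mean constraint sum_i p_i g_i = 0.
rng = np.random.default_rng(1)
x = rng.normal(size=(500, 1))
g = x - x.mean() + 0.05       # deliberately off-center, so lam is nonzero
lam = el_lambda(g)
p = 1.0 / (500 * (1.0 + g @ lam))
```

At the solution the implied weights \(p_i\) are positive, sum to one automatically (substitute \(1/(1+u)=1-u/(1+u)\) into the score equation), and satisfy the moment constraint.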
Proof of Theorem 3
It follows from directional differentiation that minimizing
is equivalent to solving \(U_{2n}(\beta )=0\), where
From the asymptotic linearity of \(U_n^*\) in (A.9), we have
For fixed estimators \(\hat{\alpha }\), \(\hat{\beta }_{IPW}\) and \(\hat{\gamma }\), the Lagrange multiplier \(\hat{\lambda }\) satisfies the constraint equations \(G(\hat{\lambda },\hat{\alpha },\hat{\beta }_{IPW},\hat{\gamma })=0\). By Lemma A.3, it follows that \(\hat{\lambda }\) is \(n^{1/2}\)-consistent for 0 as \(n\rightarrow \infty \). Using Lemma A.3, C7 and the fact that \((\hat{\lambda },\hat{\alpha },\hat{\beta }_{IPW},\hat{\gamma })\) is \(n^{1/2}\)-consistent for \((0,\alpha ^*,\beta ^*,\gamma ^*)\), we have
By (A.15), we have
where
It is easy to verify that \(F_\gamma =F_g S_g^{-1}G_\gamma \). Thus, \(U_n^*(\hat{\beta }_{ELW})=F_g S_g^{-1}U_g+o_p(n^{-1/2})\). Combining this fact and (A.14), we have
A little algebra reveals that
Using this equation, we get that
Then \(n^{1/2}(\hat{\beta }_{ELW}-\beta ^*)\) converges in distribution to \(N(0,\varSigma _{ELW})\). \(\square \)
Lemma A.4
Let \( A(\beta ,\varSigma )=2n^{-2}\sum _{i\ne j}\frac{\delta _i\delta _j{\varPsi }_{ij}(\beta ,\varSigma )}{\pi (y_i,z_i,\hat{\gamma })\pi (y_j,z_j,\hat{\gamma })}(w_i-w_j)(w_i-w_j)^\textsf {T}\), where \( {\varPsi }_{ij}(\beta ,\varSigma )=\frac{1}{\sigma _{ij}(\varSigma )}\phi \left( \frac{e_i(\beta )-e_j(\beta )}{\sigma _{ij}(\varSigma )}\right) \), \(e_i(\beta )=y_i-w_{i}^\textsf {T}\beta \), \(\sigma _{ij}^2(\varSigma )=(w_i-w_j)^\textsf {T}\varSigma (w_i-w_j)/n\), \(\varSigma \) is some symmetric, positive definite matrix and \(\phi \) denotes the standard normal density function. Then
where \(A=\lim _{n\rightarrow \infty }2n^{-2}\sum _{i=1}^n\sum _{j=1}^n(w_i-w_j)(w_i-w_j)^\textsf {T}\int _{-\infty }^{\infty }f_i(u)d F_j(u)\).
Proof of Lemma A.4
Let \(A_n=2n^{-2}\sum _{i=1}^n\sum _{j=1}^n(w_i-w_j)(w_i-w_j)^\textsf {T}\int _{-\infty }^{\infty }f_i(u)d F_j(u) \). By the triangle inequality, we have
It is easy to see that \(\Vert A_n-A\Vert \rightarrow 0\) and \(\Vert {A}(\beta ,\varSigma ^*)-E\{{A}(\beta ,\varSigma ^*)\}\Vert {\mathop {\longrightarrow }\limits ^{\textstyle p}}0\). Observe that
where \(\phi '(u)\) is the derivative of \(\phi (u)\) and \(\bar{\beta }\) is a point on the segment connecting \(\beta \) and \(\beta ^*\). Note that \(\sigma _{ij}(\varSigma )=O_p(n^{-1/2})\) and \(\lim _{u\rightarrow \infty }|u\phi '(u)|=0\); hence \(\sup _{\Vert \beta -\beta ^*\Vert <cn^{-1/2}}\Vert {A}(\beta ,\varSigma )- {A}(\beta ,\varSigma ^*)\Vert {\mathop {\longrightarrow }\limits ^{\textstyle p}}0\). Next, we show that \(\Vert E\{{A}(\beta ,\varSigma ^*)\}-A_n\Vert {\mathop {\longrightarrow }\limits ^{\textstyle p}}0\). Notice that
Since
we have
where \(\xi _u\) lies between 0 and \(u\sigma _{ij}(\varSigma )\). By condition C4, there exists \(M>0\) such that \(\sup _{1\le i \le n}|f_{i}'(u)|<M\). It follows that
The desired result follows. \(\square \)
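Lemma A.4 justifies estimating the slope matrix A by the induced-smoothing quantity \(A(\beta ,\varSigma )\) defined in its statement, which replaces the nonsmooth rank score with a normal kernel and so bypasses density estimation of the errors. The following sketch computes that matrix directly from the formula; the synthetic data and the treatment of \(\pi _i\) as known constants are illustrative assumptions.

```python
import numpy as np

def smoothed_A(y, w, delta, pi, beta, Sigma):
    """A(beta, Sigma) = 2 n^{-2} sum_{i != j} delta_i delta_j Psi_ij /
    (pi_i pi_j) * (w_i - w_j)(w_i - w_j)^T, where
    Psi_ij = phi((e_i - e_j)/sigma_ij)/sigma_ij and
    sigma_ij^2 = (w_i - w_j)^T Sigma (w_i - w_j)/n  (Lemma A.4)."""
    n, p = w.shape
    e = y - w @ beta
    A = np.zeros((p, p))
    for i in range(n):
        for j in range(n):
            if i == j or delta[i] == 0 or delta[j] == 0:
                continue
            dw = w[i] - w[j]
            s = np.sqrt(dw @ Sigma @ dw / n)           # sigma_ij
            psi = np.exp(-0.5 * ((e[i] - e[j]) / s) ** 2) / (np.sqrt(2.0 * np.pi) * s)
            A += psi / (pi[i] * pi[j]) * np.outer(dw, dw)
    return 2.0 * A / n**2

rng = np.random.default_rng(2)
n, p = 80, 2
beta = np.array([1.0, -0.5])
w = rng.normal(size=(n, p))
y = w @ beta + rng.normal(size=n)
pi = np.full(n, 0.8)                                   # constant selection probability
delta = rng.binomial(1, pi)
A_hat = smoothed_A(y, w, delta, pi, beta, np.eye(p))
```

With the estimated parameters plugged in, a matrix of this form enters the sandwich covariance \(\varSigma _{IPW}=A^{-1}VA^{-1}\) from the proof of Theorem 2.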
Liu, T., Yuan, X. Empirical likelihood-based weighted rank regression with missing covariates. Stat Papers 61, 697–725 (2020). https://doi.org/10.1007/s00362-017-0957-x