Abstract
This paper addresses the problem of testing hypotheses on the response mean under various inequality constraints, in the presence of covariates, when response data are missing at random. The hypotheses considered include a single point, two points, a set of inequalities, and a two-sided set of inequalities on the response mean. The test statistic is constructed from the weighted-corrected empirical likelihood function of the response mean, based on weighted-corrected imputation of the response variable. We investigate the limiting distributions and asymptotic powers of the proposed empirical likelihood ratio test statistics with auxiliary information. The results show that the test statistic with auxiliary information is more efficient than the one without. A simulation study is undertaken to investigate the finite-sample performance of the proposed method.
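Before the formal results, the overall procedure can be sketched numerically. The following Python fragment is an illustrative sketch only, not the authors' implementation: it simulates responses missing at random, applies weighted-corrected imputation in the style of Xue (2009), and computes the empirical log-likelihood ratio for the mean without auxiliary information. For simplicity the true \(p(\mathbf{x})\) and \(m(\mathbf{x})\) are plugged in where the paper uses nonparametric estimators.

```python
import numpy as np

def el_log_ratio(z, theta, tol=1e-10, max_iter=200):
    """-2 log empirical likelihood ratio for the mean of z at theta.

    Solves sum_i (z_i - theta) / (1 + lam * (z_i - theta)) = 0 for the
    Lagrange multiplier lam by damped Newton iteration, then returns
    2 * sum_i log(1 + lam * (z_i - theta)).
    """
    u = z - theta
    lam = 0.0
    for _ in range(max_iter):
        w = 1.0 + lam * u
        g = np.sum(u / w)              # first derivative in lam
        h = -np.sum((u / w) ** 2)      # second derivative (always < 0)
        lam_new = lam - g / h
        # damp the step so all empirical likelihood weights stay positive
        while np.any(1.0 + lam_new * u <= 0):
            lam_new = (lam + lam_new) / 2.0
        if abs(lam_new - lam) < tol:
            lam = lam_new
            break
        lam = lam_new
    return 2.0 * np.sum(np.log1p(lam * u))

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)      # true response mean is 1
p = 1.0 / (1.0 + np.exp(-(0.5 + x)))        # MAR selection probability p(x)
delta = (rng.uniform(size=n) < p).astype(float)

# Weighted-corrected imputation; the true p(x) and m(x) are used here
# in place of the paper's nonparametric (kernel) estimators.
m = 1.0 + 2.0 * x
y_hat = delta * y / p + (1.0 - delta / p) * m

theta_me = y_hat.mean()              # maximum EL estimator (no auxiliary info)
t_stat = el_log_ratio(y_hat, 1.0)    # -2 log EL ratio at the true mean
```

At the maximizer the log-likelihood ratio vanishes, and at the true mean the statistic is asymptotically \(\chi^2_1\) in the unconstrained, no-auxiliary-information case.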
References
Chen L, Shi J (2011) Empirical likelihood hypothesis test on mean with inequality constraints. Sci China Math 54:1847–1857
El Barmi H (1996) Empirical likelihood ratio test for or against a set of inequality constraints. J Stat Plan Inference 55:191–204
Fan GL, Liang HY, Wang JF (2013) Empirical likelihood for heteroscedastic partially linear errors-in-variables model with \(\alpha \)-mixing errors. Stat Pap 54:85–112
Fan GL, Xu HX, Liang HY (2012) Empirical likelihood inference for partially time-varying coefficient errors-in-variables models. Electron J Stat 6:1040–1058
Hall P (1992) The bootstrap and Edgeworth expansion. Springer, New York
Hall P, La Scala B (1990) Methodology and algorithms of empirical likelihood. Int Stat Rev 58:109–127
Liang H, Wang S, Robins JM, Carroll RJ (2004) Estimation in partially linear models with missing covariates. J Am Stat Assoc 99:357–367
Owen AB (1988) Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75:237–249
Owen AB (1990) Empirical likelihood confidence regions. Ann Stat 18:90–120
Owen AB (1991) Empirical likelihood for linear models. Ann Stat 19:1725–1747
Qin YS, Rao JNK, Ren QS (2008) Confidence intervals for marginal parameters under fractional linear regression imputation for missing data. J Multivar Anal 99:1232–1259
Robins JM, Rotnitzky A, Zhao LP (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866
Tang CY, Qin YS (2012) An efficient empirical likelihood approach for estimating equations with missing data. Biometrika 99:1001–1007
Wang QH, Rao JNK (2002) Empirical likelihood-based inference under imputation for missing response data. Ann Stat 30:896–924
Xue LG (2009) Empirical likelihood confidence intervals for response mean with data missing at random. Scand J Stat 36:671–685
Zhao H, Zhao PY, Tang NS (2013) Empirical likelihood inference for mean functionals with nonignorably missing response data. Comput Stat Data Anal 66:101–116
Acknowledgments
The authors would like to thank the anonymous referees for their valuable comments and suggestions, which led to the improvement of the paper. This research was supported by the National Natural Science Foundation of China (11271286, 11226218, 11401006) and the Specialized Research Fund for the Doctoral Program of Higher Education of China (20120072110007).
Appendix: Proofs of main results
For convenience and simplicity, let \(Z\) be a standard normal random variable and \(C\) a positive constant whose value may vary from one occurrence to another.
Lemma 5.1
The empirical log-likelihood function \(\hat{l}_{AI}(\theta )\) is concave in \(\theta \).
Proof
Note that \(\psi '_i(\theta )=(\mathbf{0}^T_q,-1)^T\) where \(\mathbf{0}_q\) is the \(q\times 1\) null vector and “ \('\) ” represents the derivative with respect to \(\theta \). Then
Furthermore, from (2.5) to (2.6), we have
From Eq. (2.6), we deduce by taking the derivative with respect to \(\theta \), that
Then, it follows that
Thus, we have \(\eta '_{q+1}(\theta )<0\), and hence \(\hat{l}''_{AI}(\theta )=n\eta '_{q+1}(\theta )<0\), which completes the proof of Lemma 5.1. \(\square \)
Set \( \Gamma _{AI}=\left( \begin{array}{cc} \Gamma _{1} &amp; \Gamma _{2}\\ \Gamma ^T_{2} &amp; \Gamma _A \end{array}\right) \) with \(\Gamma _1=E\{A(\mathbf {X})A^T(\mathbf {X})\},\) \(\Gamma _2=E\{A(\mathbf {X})(m(\mathbf {X})-\theta )\}\), \(\Gamma _A=E\{\sigma ^2(\mathbf {X})/p(\mathbf {X})\}+\text{Var}(m(\mathbf {X}))\) and \(\sigma ^2(\mathbf {x})=\text{Var}(Y|\mathbf {X}=\mathbf {x})\).
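The form of \(\Gamma _A\) can be motivated by a short calculation (not in the source; it assumes known \(p(\cdot )\) and \(m(\cdot )\), whereas the paper plugs in nonparametric estimators). Writing the weighted-corrected imputed response as \(\hat{Y}=\delta Y/p(\mathbf {X})+(1-\delta /p(\mathbf {X}))m(\mathbf {X})=m(\mathbf {X})+\delta \{Y-m(\mathbf {X})\}/p(\mathbf {X})\) and using \(\delta \perp Y\mid \mathbf {X}\) under MAR,
\[
E(\hat{Y}\mid \mathbf {X})=m(\mathbf {X}),\qquad
\mathrm{Var}(\hat{Y}\mid \mathbf {X})=E\Big[\frac{\delta \{Y-m(\mathbf {X})\}^2}{p^2(\mathbf {X})}\,\Big|\,\mathbf {X}\Big]=\frac{\sigma ^2(\mathbf {X})}{p(\mathbf {X})},
\]
so that, by the law of total variance, \(\mathrm{Var}(\hat{Y})=E\{\sigma ^2(\mathbf {X})/p(\mathbf {X})\}+\mathrm{Var}(m(\mathbf {X}))=\Gamma _A\).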
Lemma 5.2
Suppose that conditions (C1)–(C8) hold. If \(\theta \) is the true mean of response \(Y\), then \( \frac{1}{\sqrt{n}}\sum _{i=1}^n\psi _i(\theta )\mathop {\rightarrow }\limits ^{d}N(0,\Gamma _{AI}). \)
Proof
The proof of Lemma 5.2 can be found in the proof of Theorem 2 of Xue (2009). \(\square \)
Lemma 5.3
Suppose that conditions (C1)–(C8) hold. If \(\theta \) is the true mean of response \(Y\), then
where \(\hat{\theta }_{ME}=\arg \max _{\theta }\hat{l}_{AI}(\theta )\) is the maximum EL estimator, \(\hat{\theta }_n=\frac{1}{n}\sum _{i=1}^n\hat{Y}_i\), and \(\Gamma \) is defined in Theorem 3.2.
Proof
Firstly, we prove \(\sqrt{n}(\hat{\theta }_{ME}-\theta )\mathop {\rightarrow }\limits ^{d}N(0,\Gamma ) \) holds. Let \(\tilde{\theta }=\hat{\theta }_{ME}\) and \(\tilde{\eta }=\tilde{\eta }(\hat{\theta }_{ME})=(\tilde{\eta }^T_1,\tilde{\eta }_2)^T\), where \(\tilde{\eta }_1\) is a \(q\)-dimensional column vector. Note that \(\tilde{\theta }\) and \(\tilde{\eta }\) satisfy the following three equations: \(Q_{kn}(\theta ,\eta )=0\) for \(k=1,2\) and \(3\), where \( Q_{1n}(\theta ,\eta )=n^{-1}\sum _{i=1}^n(\hat{Y}_i-\theta )/\{1+\eta ^T\psi _i(\theta )\}, \) \( Q_{2n}(\theta ,\eta )=n^{-1}\sum _{i=1}^nA(\mathbf {X}_i)/\{1+\eta ^T\psi _i(\theta )\} \) and \( Q_{3n}(\theta ,\eta )=n^{-1}\sum _{i=1}^n\eta _2/\{1+\eta ^T\psi _i(\theta )\}. \) Then by expanding \(Q_{kn}(\tilde{\theta },\tilde{\eta })=0\) at \((\theta ,0)\) for \(k=1,2\) and \(3\), we derive that
where \(\epsilon _n=|\tilde{\theta }-\theta |+\Vert \tilde{\eta }_1\Vert +|\tilde{\eta }_2|\). Further we find
where
Lemma 4 in Xue (2009) gives \(\frac{1}{n}\sum _{i=1}^n (\hat{Y}_i-\theta )^2\mathop {\rightarrow }\limits ^{p}\Gamma _A\). Applying the Law of Large Numbers we have \(\frac{1}{n}\sum _{i=1}^n A(\mathbf {X}_i)A^T(\mathbf {X}_i)\mathop {\rightarrow }\limits ^{p}\Gamma _1\), and from Lemmas 1 and 2 in Xue (2009), one can derive
Thus \( H_n\mathop {\rightarrow }\limits ^{p}\left( \begin{array}{ccc} -\Gamma ^T_2 &amp; -\Gamma _A &amp; -1\\ -\Gamma _1 &amp; -\Gamma _2 &amp; 0\\ 0 &amp; -1 &amp; 0 \end{array}\right) . \) Therefore, we obtain that
Note that Lemma 5.2 implies that
which together with (5.1) yields \(\sqrt{n}(\hat{\theta }_{ME}-\theta )\mathop {\rightarrow }\limits ^{d}N(0,\Gamma ).\)
Lemma 3 in Xue (2009) implies \(\sqrt{n}(\hat{\theta }_n-\theta )\mathop {\rightarrow }\limits ^{d}N(0,\Gamma _A)\), which, together with \(\sqrt{n}(\hat{\theta }_{ME}-\theta )\mathop {\rightarrow }\limits ^{d}N(0,\Gamma )\), leads to \(\hat{\theta }_{ME}=\hat{\theta }_n+O_p(n^{-1/2}).\) \(\square \)
Lemma 5.4
Assume that \(\theta ^*\) is the true mean of response \(Y\). Under the null hypothesis \(H_3\) and conditions (C1)–(C8), if \(E\Vert A(\mathbf {X})\Vert ^3<\infty \), \(\sup _{\mathbf {x}}E(|Y|^3|\mathbf {X}=\mathbf {x})<\infty \) and \(\Gamma _A>0\), then \( \frac{\max \{\widehat{\mathcal {L}}_{AI}(\theta _1),\widehat{\mathcal {L}}_{AI}(\theta _2)\}}{\widehat{\mathcal {L}}_{AI}(\theta ^*)} \mathop {\rightarrow }\limits ^{p}1. \)
Proof
We only prove the case when the true mean \(\theta ^*\) is \(\theta _1\), since the proof for the case when the true mean \(\theta ^*\) is \(\theta _2\) is similar. For any \(\theta \), denote \(\hat{l}_E(\theta )=-\log [n^n\widehat{\mathcal {L}}_{AI}(\theta )]=\sum _{i=1}^n\log \{1+\eta ^T(\theta )\psi _i(\theta )\}\) and \(\bar{\theta }=\theta _1+n^{-1/3}\).
Firstly, we establish \(\eta (\bar{\theta })=O_p(n^{-1/3}).\) Let \(\eta (\bar{\theta })=\rho u,\) where \(\rho \ge 0,u\in R^{q+1}\) and \(\Vert u\Vert =1\). Set
Denote by \(\mathrm{mineig}(S)\) the smallest eigenvalue of \(S\) and by \(\mathbf{0}_q\) the \(q\times 1\) null vector. From Lemma 5.2 and (2.6), we have
which gives
Lemma 3 in Xue (2009) implies that
Note that \(\{\psi _i^*(\bar{\theta }),1\le i\le n\}\) is i.i.d., and that \(E\Vert A(\mathbf {X})\Vert ^3<\infty \) and \(\sup _{\mathbf {x}}E(|Y|^3|\mathbf {X}=\mathbf {x})<\infty \) imply \(E\Vert \psi ^*_i(\bar{\theta })\Vert ^3<\infty \). Then from the proof of Lemma 3 in Owen (1990), one can derive
Then, from \(|u^T\bar{\psi }(\theta _1)|=O_p(n^{-1/2})\) and Lemma 5.2, we have
Since \(\Gamma _{AI}\) is a positive definite matrix and \(S\mathop {\rightarrow }\limits ^{p}\Gamma _{AI}\), we have \(\mathrm{mineig}(S)\ge C+o_p(1)\) for some constant \(C>0\). Therefore \( \rho =O_p(n^{-1/3}) \) and hence \(\eta (\bar{\theta })=O_p(n^{-1/3}).\)
From (2.6), it follows that
In view of \(\eta (\bar{\theta })=O_p(n^{-1/3})\) and \(\psi _m(\bar{\theta })=o_p(n^{1/3})\), one can conclude that
This together with (5.2), yields
By Taylor expansion, using (5.3) and the law of the iterated logarithm for \(\{\psi _i^*(\theta _1),1\le i\le n\}\), we obtain
By an argument similar to that for \(\eta (\bar{\theta })=O_p(n^{-1/3})\), it can be shown that, for the true mean \(\theta _1\), \(\eta (\theta _1)=O_p(n^{-1/2})\). Then
Then, it follows from the convexity of \(\hat{l}_E(\theta )\) that
Therefore,
Hence, Lemma 5.4 holds for \(\theta ^*=\theta _1\), which completes the proof. \(\square \)
Proof of Theorem 3.1
Note that \(\widehat{\mathcal {L}}_{AI}(\theta )\) has a unique maximum at \(\hat{\theta }_{ME}\) on \(\mathbb {R}\) by Lemma 5.1. This together with Lemma 5.3 implies that
Hence, we derive that
From Lemma 5.3 and \(\sum _{i=1}^n\tilde{p}_i(\hat{\theta }_{ME})(\hat{Y}_i-\hat{\theta }_{ME})=0\), we have \( \sum _{i=1}^n\tilde{p}_i(\hat{\theta }_{ME})\left( \hat{Y}_i-\hat{\theta }_n+O_p(n^{-1/2})\right) =0. \) Then \(\tilde{p}_i(\hat{\theta }_{ME})=n^{-1}+O_p(n^{-3/2})\), which implies
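The weight expansion can be checked directly from the usual empirical likelihood form of the weights (an assumption here: \(\tilde{p}_i(\theta )=n^{-1}\{1+\eta ^T(\theta )\psi _i(\theta )\}^{-1}\), as is standard for EL): since \(\eta (\hat{\theta }_{ME})=O_p(n^{-1/2})\) and \(\psi _i(\hat{\theta }_{ME})=O_p(1)\) for each fixed \(i\),
\[
\tilde{p}_i(\hat{\theta }_{ME})=\frac{1}{n}\big \{1-\eta ^T\psi _i(\hat{\theta }_{ME})+O_p(n^{-1})\big \}=\frac{1}{n}+O_p(n^{-3/2}).
\]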
Using similar arguments as for \(\hat{l}_E(\bar{\theta })\) in the proof of Lemma 5.4 (or the proof of Theorem 2 in Owen (1991) or the proof of Theorem 2 in Xue (2009)), under the null hypothesis \(H_0\), we have
where \( \Gamma _{n,AI}=\left( \begin{array}{cc} \Gamma _{n1} &amp; \Gamma _{n2}\\ \Gamma ^T_{n2} &amp; \Gamma _{n} \end{array}\right) \) with \(\Gamma _n=\frac{1}{n}\sum _{i=1}^n(\hat{Y}_i-\theta _0)^2,\) \(\Gamma _{n1}=\frac{1}{n}\sum _{i=1}^nA(\mathbf {X}_i)A^T(\mathbf {X}_i)\) and \(\Gamma _{n2}=\frac{1}{n}\sum _{i=1}^nA(\mathbf {X}_i)(\hat{Y}_i-\theta _0)\).
Note that \(\Gamma _{n,AI}\mathop {\rightarrow }\limits ^{p}\Gamma _{AI}\). Then, using Lemma 5.2, we find, for any \(t>0\),
Hence \(P\{T_{01}\le t\}=1-\frac{1}{2}P\{\chi ^2_{q+1}>t\}=\frac{1}{2}+\frac{1}{2}P\{\chi ^2_{q+1}\le t\}\), which leads to \(T_{01}\mathop {\rightarrow }\limits ^{d}\frac{1}{2}\chi _0^2+\frac{1}{2}\chi _{q+1}^2\), and the proof of Theorem 3.1 is completed. \(\square \)
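A practical remark (a standard consequence, not stated in the source): for \(0<\alpha <1/2\), the level-\(\alpha \) critical value \(c_\alpha \) of the limiting mixture satisfies
\[
P\{T_{01}>c_\alpha \}\rightarrow \tfrac{1}{2}P\{\chi ^2_{q+1}>c_\alpha \}=\alpha ,
\]
so \(c_\alpha \) is simply the upper \(2\alpha \) quantile of \(\chi ^2_{q+1}\); for example, with \(q=1\) and \(\alpha =0.05\), \(c_\alpha =\chi ^2_{2,\,0.10}=-2\log 0.1\approx 4.605\).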
Proof of Theorem 3.2
It is obvious that \( \frac{1}{\sqrt{n}}\sum _{i=1}^n\psi _i(\theta ^*)\mathop {\rightarrow }\limits ^{d}N(0,\Gamma _{AI})\) still holds for the true mean \(\theta ^*=\theta _0+n^{-1/2}\Gamma ^{1/2}\tau \). Then by Lemma 5.2, it can be shown that
which completes the proof of Theorem 3.2. \(\square \)
Proof of Theorem 3.3
From (5.4), we have
By using the same method as in the proof of Theorem 3.1, one can derive that
Hence \(\lim _{n\rightarrow \infty }P\{T_{12}>c_\alpha |\theta ^*=\theta _0\}=\alpha \). Moreover, for any fixed true mean \(\theta ^*>\theta _0\), by \(\sqrt{n}(\hat{\theta }_{ME}-\theta ^*)\Gamma ^{-1/2}\mathop {\rightarrow }\limits ^{d}N(0,1),\) we have
Thus the proof of Theorem 3.3 is completed. \(\square \)
Proof of Theorem 3.4
Assume that the true mean \(\theta ^*\) is \(\theta _1\). From Lemma 5.1, we have
Denote \(\mathcal {L}^*=\max \{\widehat{\mathcal {L}}_{AI}(\theta _1),\widehat{\mathcal {L}}_{AI}(\theta _2)\}\). Then, by Lemma 5.4 and (5.5), we derive
By Lemma 5.3, we have
Hence, from the proof for Theorem 3.1, it follows that
which gives the conclusion for the case of \(\theta ^*=\theta _1\). The result for \(\theta ^*=\theta _2\) can be easily obtained by using the similar approach. \(\square \)
Proof of Theorem 3.5
From (5.5), we write
For \(\theta _1<\theta ^*<\theta _2\), we have \(\sqrt{n}(\theta _1-\theta ^*)\Gamma ^{-1/2}\rightarrow -\infty \) and \(\sqrt{n}(\theta _2-\theta ^*)\Gamma ^{-1/2}\rightarrow +\infty \), and hence from Lemma 5.3, it follows that
Note that if \(\theta ^*=\theta _1\), \(P(\hat{\theta }_{ME}>\theta _2)\rightarrow 0\), and \(P(\hat{\theta }_{ME}<\theta _1)\rightarrow 0\) if \(\theta ^*=\theta _2\). Following the proof for Theorem 3.1, one can conclude that
Thus \( \lim _{n\rightarrow \infty }P\{T_{24}>c_\alpha \mid \theta ^*=\theta _1~\text{or}~\theta _2\}=\alpha . \)
Hence, the proof of Theorem 3.5 is completed. \(\square \)
Xu, HX., Fan, GL. & Liang, HY. Hypothesis test on response mean with inequality constraints under data missing when covariables are present. Stat Papers 58, 53–75 (2017). https://doi.org/10.1007/s00362-015-0687-x
Keywords
- Auxiliary information
- Empirical likelihood
- Hypothesis test
- Inequality constraint
- Missing data
- Response mean