Empirical Likelihood Test for High Dimensional Generalized Linear Models

Big and Complex Data Analysis

Part of the book series: Contributions to Statistics ((CONTRIB.STAT.))

Abstract

Technological advances allow scientists to collect high dimensional data sets in which the number of variables is much larger than the sample size. A representative example is genomics. Because they lose accuracy or power in this setting, many classic statistical methods are challenged when analyzing such data. In this chapter, we propose an empirical likelihood (EL) method to test regression coefficients in high dimensional generalized linear models. Under the null hypothesis, the EL test statistic has an asymptotic chi-squared distribution with two degrees of freedom, and this result does not depend on the number of covariates. Moreover, we extend the proposed method to test a subset of the regression coefficients in the presence of nuisance parameters. Simulation studies show that the EL tests control the type-I error rate well under moderate sample sizes and, in most scenarios, are more powerful than the direct competitor under the alternative hypothesis. The proposed tests are employed to analyze the association between rheumatoid arthritis (RA) and single nucleotide polymorphisms (SNPs) on chromosome 6. The resulting p-value is 0.019, indicating that chromosome 6 has an influence on RA. With the partial test and logistic modeling, we also find that the SNPs eliminated by the sure independence screening and Lasso methods have no significant influence on RA.

References

  1. Bai, Z.D., Saranadasa, H.: Effect of high dimension: by an example of a two sample problem. Stat. Sin. 6, 311–329 (1996)

  2. Bühlmann, P., et al.: Statistical significance in high-dimensional linear models. Bernoulli 19(4), 1212–1242 (2013)

  3. Chapman, J., Whittaker, J.: Analysis of multiple SNPs in a candidate gene or region. Genet. Epidemiol. 32, 560–566 (2008)

  4. Chapman, J.M., Cooper, J.D., Todd, J.A., Clayton, D.G.: Detecting disease associations due to linkage disequilibrium using haplotype tags: a class of tests and the determinants of statistical power. Hum. Hered. 56, 18–31 (2003)

  5. Chen, S.X., Guo, B.: Tests for high dimensional generalized linear models. arXiv preprint. arXiv:1402.4882 (2014)

  6. Chen, S.X., Hall, P.: Smoothed empirical likelihood confidence intervals for quantiles. Ann. Stat. 21, 1166–1181 (1993)

  7. Chen, S.X., Van Keilegom, I.: A review on empirical likelihood methods for regression. Test 18(3), 415–447 (2009)

  8. Chen, S.X., Peng, L., Qin, Y.L.: Effects of data dimension on empirical likelihood. Biometrika 96, 711–722 (2009)

  9. Chen, S.X., Zhang, L.X., Zhong, P.S.: Tests for high-dimensional covariance matrices. J. Am. Stat. Assoc. 105, 810–819 (2010)

  10. Donoho, D.L., et al.: High-dimensional data analysis: the curses and blessings of dimensionality. In: AMS Math Challenges Lecture, pp. 1–32 (2000)

  11. Ellinghaus, E., Stuart, P.E., Ellinghaus, D., Nair, R.P., Debrus, S., Raelson, J.V., Belouchi, M., Tejasvi, T., Li, Y., Tsoi, L.C., et al.: Genome-wide meta-analysis of psoriatic arthritis identifies susceptibility locus at REL. J. Invest. Dermatol. 132, 1133–1140 (2012)

  12. Fan, J., Song, R., et al.: Sure independence screening in generalized linear models with NP-dimensionality. Ann. Stat. 38, 3567–3604 (2010)

  13. Goeman, J.J., Van De Geer, S.A., Van Houwelingen, H.C.: Testing against a high dimensional alternative. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 68, 477–493 (2006)

  14. Huang, J., Ma, S., Zhang, C.H.: The iterated lasso for high-dimensional logistic regression. The University of Iowa Department of Statistical and Actuarial Science Technical Report (392) (2008)

  15. Kolaczyk, E.D.: Empirical likelihood for generalized linear models. Stat. Sin. 4, 199–218 (1994)

  16. Li, Q., Hu, J., Ding, J., Zheng, G.: Fisher’s method of combining dependent statistics using generalizations of the gamma distribution with applications to genetic pleiotropic associations. Biostatistics 15, 284–295 (2013)

  17. Meinshausen, N., Meier, L., Bühlmann, P.: P-values for high-dimensional regression. J. Am. Stat. Assoc. 104(488), 1671–1681 (2009)

  18. Newey, W.K., Smith, R.J.: Higher order properties of GMM and generalized empirical likelihood estimators. Econometrica 72, 219–255 (2004)

  19. Owen, A.B.: Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237–249 (1988)

  20. Owen, A.B.: Empirical likelihood for linear models. Ann. Stat. 19, 1725–1747 (1991)

  21. Owen, A.: Empirical Likelihood. Chapman and Hall/CRC, Boca Raton (2001)

  22. Peng, L., Qi, Y., Wang, R.: Empirical likelihood test for high dimensional linear models. Stat. Probab. Lett. 86, 74–79 (2014)

  23. Plenge, R.M., Seielstad, M., Padyukov, L., Lee, A.T., Remmers, E.F., Ding, B., Liew, A., Khalili, H., Chandrasekaran, A., Davies, L.R., et al.: TRAF1-C5 as a risk locus for rheumatoid arthritis: a genomewide study. N. Engl. J. Med. 357(12), 1199–1209 (2007)

  24. Qin, J., Lawless, J.: Empirical likelihood and general estimating equations. Ann. Stat. 22, 300–325 (1994)

  25. Wang, T., Elston, R.C.: Improved power by use of a weighted score test for linkage disequilibrium mapping. Am. J. Hum. Genet. 80, 353–360 (2007)

  26. Wang, R., Peng, L., Qi, Y.: Jackknife empirical likelihood test for equality of two high dimensional means. Stat. Sin. 23, 667–690 (2013)

  27. Zhang, R., Peng, L., Wang, R., et al.: Tests for covariance matrix with fixed or divergent dimension. Ann. Stat. 41, 2075–2096 (2013)

  28. Zhong, P.S., Chen, S.X.: Tests for high-dimensional regression coefficients with factorial designs. J. Am. Stat. Assoc. 106, 260–274 (2011)

Acknowledgements

We thank the organizers and participants of “The Fourth International Workshop on the Perspectives on High-dimensional Data Analysis.” Q. Zhang was partly supported by the China Postdoctoral Science Foundation (Grant No. 2014M550799) and the National Science Foundation of China (11401561). Q. Li was supported in part by the National Science Foundation of China (11371353, 61134013) and the Strategic Priority Research Program of the Chinese Academy of Sciences. S. Ma was supported by the National Social Science Foundation of China (13CTJ001, 13&ZD148).

Author information

Corresponding author

Correspondence to Shuangge Ma.

Appendix

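Recall that \(\mu _{0i} = g(X_{i}^{\top }\beta _{0})\), \(\psi (X_{i},\beta _{0}) = g^{\prime }(X_{i}^{\top }\beta _{0})/V \{g(X_{i}^{\top }\beta _{0})\}\), \(\varepsilon _{i} = Y _{i} - g(X_{i}^{\top }\beta _{0})\),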

$$\displaystyle{T_{i} = (Y _{i} -\mu _{0i})(Y _{i+m} -\mu _{0i+m})\psi (X_{i},\beta _{0})\psi (X_{i+m},\beta _{0})X_{i}^{\top }X_{i+m}}$$

and

$$\displaystyle{S_{i} = (Y _{i} -\mu _{0i})\psi (X_{i},\beta _{0})X_{i}^{\top }\alpha + (Y _{i+m} -\mu _{0i+m})\psi (X_{i+m},\beta _{0})X_{i+m}^{\top }\alpha.}$$
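To make the two quantities concrete, the following is a minimal sketch of how \(T_{i}\) and \(S_{i}\) could be computed from data. It assumes a logistic model with the canonical logit link, for which \(g^{\prime } = V \circ g\) and hence \(\psi \equiv 1\); the function names `expit`, `dot`, and `split_sample_terms` and the toy data are hypothetical and are not taken from the chapter.

```python
import math


def expit(t):
    """Logistic mean function g(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def split_sample_terms(X, y, beta0, alpha):
    """Return the lists (T, S) built from the pairs (i, i + m), m = n // 2.

    Assumption: logistic regression with the logit link, so
    psi(X, beta0) = g'(X'beta0) / V{g(X'beta0)} = 1 and drops out.
    """
    m = len(y) // 2
    T, S = [], []
    for i in range(m):
        j = i + m
        r_i = y[i] - expit(dot(X[i], beta0))  # Y_i - mu_{0i}
        r_j = y[j] - expit(dot(X[j], beta0))  # Y_{i+m} - mu_{0,i+m}
        T.append(r_i * r_j * dot(X[i], X[j]))
        S.append(r_i * dot(X[i], alpha) + r_j * dot(X[j], alpha))
    return T, S


# Toy data: n = 4 observations, p = 2 covariates, null value beta0 = 0.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
y = [1, 0, 1, 0]
T, S = split_sample_terms(X, y, beta0=[0.0, 0.0], alpha=[1.0, 1.0])
print(T, S)
```

For a non-canonical link the factor \(\psi \) would have to be multiplied back in; the sketch leaves that out on purpose to stay short.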

Without loss of generality, we assume \(\mu _{0i} = 0\). Now we prove Theorem 1.

Proof

According to Theorem 3.2 in [21], it suffices to prove that under the assumptions of Theorem 1, Conditions 1–2, and the null hypothesis, we have that as \(n \rightarrow \infty \),

$$\displaystyle{ \frac{1} {\sqrt{m}}\left (\sum \limits _{i=1}^{m}T_{ i}/\sigma _{1},\ \sum \limits _{i=1}^{m}S_{ i}/\sigma _{2}\right )^{\top }\stackrel{d}{\rightarrow }N(0,I_{ 2}), }$$
(12)

and

$$\displaystyle{ \frac{\sum _{i=1}^{m}T_{i}^{2}} {m\sigma _{1}^{2}} \stackrel{p}{\rightarrow }1, \frac{\sum _{i=1}^{m}S_{i}^{2}} {m\sigma _{2}^{2}} \stackrel{p}{\rightarrow }1, \frac{\sum _{i=1}^{m}T_{i}S_{i}} {m\sigma _{1}\sigma _{2}} \stackrel{p}{\rightarrow }0, }$$
(13)

where

$$\displaystyle{\sigma _{1}^{2} = \text{tr}\left \{[E(V (g(X_{ 1}^{\top }\beta _{ 0}))\psi ^{2}(X_{ 1},\beta _{0})X_{1}X_{1}^{\top })]^{2}\right \}}$$

and

$$\displaystyle{\sigma _{2}^{2} = 2E\left (V (g(X_{ 1}^{\top }\beta _{ 0}))\psi ^{2}(X_{ 1},\beta _{0})\alpha ^{\top }X_{ 1}X_{1}^{\top }\alpha \right ).}$$

Notice that

$$\displaystyle\begin{array}{rcl} & & \frac{E\left \vert X_{i}^{\top }X_{i+m}\psi (X_{i},\beta _{0})\psi (X_{i+m},\beta _{0})\varepsilon _{i}\varepsilon _{i+m}\right \vert ^{2+\delta }} {\sigma _{1}^{(2+\delta )}} {}\\ & & \quad = \frac{E\left \{E\left \vert X_{i}^{\top }X_{i+m}\psi (X_{i},\beta _{0})\psi (X_{i+m},\beta _{0})\varepsilon _{i}\varepsilon _{i+m}\right \vert ^{2+\delta }\mid X_{i} = x_{i},X_{i+m} = x_{i+m}\right \}} {\sigma _{1}^{(2+\delta )}} {}\\ & & \quad = \frac{E\left \{\left \vert x_{i}^{\top }x_{i+m}\psi (x_{i},\beta _{0})\psi (x_{i+m},\beta _{0})\right \vert ^{2+\delta }E(\left \vert \varepsilon _{i}\right \vert ^{2+\delta }\vert X_{i} = x_{i})E(\left \vert \varepsilon _{i+m}\right \vert ^{2+\delta }\vert X_{i+m} = x_{i+m})\right \}} {\sigma _{1}^{(2+\delta )}}.{}\\ \end{array}$$

According to Conditions 1–2 and (5), we have

$$\displaystyle\begin{array}{rcl} \frac{E\left \vert X_{i}^{\top }X_{i+m}\psi (X_{i},\beta _{0})\psi (X_{i+m},\beta _{0})\varepsilon _{i}\varepsilon _{i+m}\right \vert ^{2+\delta }} {\sigma _{1}^{(2+\delta )}} = o(m^{ \frac{\delta }{ 2} }).& & {}\\ \end{array}$$

Based on the Lyapunov central limit theorem, we can immediately get \(\sum _{i=1}^{m}T_{i}/\sqrt{m}\sigma _{1}\stackrel{\mathrm{d}}{\rightarrow }N(0,1)\). Similarly we can obtain \(\sum _{i=1}^{m}S_{i}/\sqrt{m}\sigma _{2}\stackrel{\mathrm{d}}{\rightarrow }N(0,1)\). To show (12), we still need to prove that for any constants a and b,

$$\displaystyle{ a\frac{\sum _{i=1}^{m}T_{i}} {\sqrt{m}\sigma _{1}} + b\frac{\sum _{i=1}^{m}S_{i}} {\sqrt{m}\sigma _{2}} \stackrel{\mathrm{d}}{\rightarrow }N\left (0,a^{2} + b^{2}\right ). }$$
(14)

Notice that under the null hypothesis,

$$\displaystyle\begin{array}{rcl} & & a\frac{\sum _{i=1}^{m}T_{i}} {\sqrt{m}\sigma _{1}} + b\frac{\sum _{i=1}^{m}S_{i}} {\sqrt{m}\sigma _{2}} {}\\ & & \quad = \frac{a} {\sqrt{m}\sigma _{1}}\sum _{i=1}^{m}\psi (X_{ i},\beta _{0})\psi (X_{i+m},\beta _{0})X_{i}^{\top }X_{ i+m}\varepsilon _{i}\varepsilon _{i+m} {}\\ & & \qquad + \frac{b} {\sqrt{m}\sigma _{2}}\sum _{i=1}^{m}[\psi (X_{ i},\beta _{0})X_{i}^{\top }\alpha \varepsilon _{ i} +\psi (X_{i+m},\beta _{0})X_{i+m}^{\top }\alpha \varepsilon _{ i+m}]. {}\\ \end{array}$$

Then it is easy to obtain that

$$\displaystyle\begin{array}{rcl} E\left \{a\frac{\sum _{i=1}^{m}T_{i}} {\sqrt{m}\sigma _{1}} + b\frac{\sum _{i=1}^{m}S_{i}} {\sqrt{m}\sigma _{2}} \right \} = 0,\ \text{var}\left \{a\frac{\sum _{i=1}^{m}T_{i}} {\sqrt{m}\sigma _{1}} + b\frac{\sum _{i=1}^{m}S_{i}} {\sqrt{m}\sigma _{2}} \right \} = a^{2} + b^{2}.& & {}\\ \end{array}$$
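The moment calculation behind this display can be spelled out as follows. This is a sketch that assumes the usual GLM variance relation \(\text{var}(\varepsilon _{i}\mid X_{i}) = V \{g(X_{i}^{\top }\beta _{0})\}\) and independence of the observations; the shorthand \(V _{i} = V \{g(X_{i}^{\top }\beta _{0})\}\) and \(\psi _{i} =\psi (X_{i},\beta _{0})\) is introduced here only for brevity. Since \(E(\varepsilon _{i}\mid X_{i}) = 0\), both \(T_{i}\) and \(S_{i}\) have mean zero, and

$$\begin{aligned} \text{var}(T_{i}) &= E\left \{V _{i}\psi _{i}^{2}\,V _{i+m}\psi _{i+m}^{2}\,(X_{i}^{\top }X_{i+m})^{2}\right \} = \text{tr}\left \{[E(V _{1}\psi _{1}^{2}X_{1}X_{1}^{\top })]^{2}\right \} =\sigma _{1}^{2}, \\ \text{var}(S_{i}) &= 2E\left \{V _{1}\psi _{1}^{2}(\alpha ^{\top }X_{1})^{2}\right \} =\sigma _{2}^{2}, \\ \text{cov}(T_{i},S_{i}) &= E\left \{\varepsilon _{i}^{2}\varepsilon _{i+m}\psi _{i}^{2}\psi _{i+m}X_{i}^{\top }X_{i+m}X_{i}^{\top }\alpha \right \} + E\left \{\varepsilon _{i}\varepsilon _{i+m}^{2}\psi _{i}\psi _{i+m}^{2}X_{i}^{\top }X_{i+m}X_{i+m}^{\top }\alpha \right \} = 0. \end{aligned}$$

The covariance vanishes because each term contains one residual to the first power, whose conditional mean is zero given everything else in that term. As the pairs \((T_{i},S_{i})\) are independent across \(i\), the variance of the linear combination is \(a^{2}\sigma _{1}^{2}/\sigma _{1}^{2} + b^{2}\sigma _{2}^{2}/\sigma _{2}^{2} = a^{2} + b^{2}\).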

By the Lyapunov central limit theorem, we conclude that (14) holds. That is, we prove (12).

To show the first result in (13), note that

$$\displaystyle\begin{array}{rcl} \frac{\sum _{i=1}^{m}T_{i}^{2}} {m} = \frac{1} {m}\sum _{i=1}^{m}[\psi (X_{ i},\beta _{0})\psi (X_{i+m},\beta _{0})X_{i}^{\top }X_{ i+m}\varepsilon _{i}\varepsilon _{i+m}]^{2}\stackrel{\mathrm{p}}{\rightarrow }\sigma _{ 1}^{2}.& & {}\\ \end{array}$$

Therefore the first result in (13) holds. Similarly, we can obtain the remaining two results in (13). □
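Results (12) and (13) are what allow an empirical likelihood ratio built from the pairs \((T_{i},S_{i})\) to be compared with a chi-squared distribution with two degrees of freedom. The sketch below shows one standard way to compute such a statistic for the hypothesis that a two-dimensional mean is zero, by Newton iteration on the Lagrange multiplier. It is an illustration under stated assumptions, not the chapter's implementation: the names `el_statistic` and `chi2_2_pvalue`, the step-halving rule, and the toy data are hypothetical.

```python
import math


def el_statistic(Z, max_iter=100, tol=1e-10):
    """-2 log empirical likelihood ratio for H0: E[Z] = 0.

    Z is a list of (t, s) pairs. The multiplier lam minimises the convex
    function -sum(log(1 + lam'z_i)). Assumes 0 lies inside the convex hull
    of the points and that the points are not collinear (det != 0).
    """
    def weights(lam):
        return [1.0 + lam[0] * t + lam[1] * s for t, s in Z]

    def objective(lam):
        w = weights(lam)
        if min(w) <= 0.0:
            return math.inf  # outside the domain of the log
        return -sum(math.log(v) for v in w)

    lam = [0.0, 0.0]
    for _ in range(max_iter):
        w = weights(lam)
        # Estimating equation sum z_i / (1 + lam'z_i), zero at the solution.
        g0 = sum(t / v for (t, _s), v in zip(Z, w))
        g1 = sum(s / v for (_t, s), v in zip(Z, w))
        if abs(g0) + abs(g1) < tol:
            break
        # Hessian of the objective: sum z_i z_i' / (1 + lam'z_i)^2.
        h00 = sum(t * t / v ** 2 for (t, _s), v in zip(Z, w))
        h01 = sum(t * s / v ** 2 for (t, s), v in zip(Z, w))
        h11 = sum(s * s / v ** 2 for (_t, s), v in zip(Z, w))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det  # Newton direction H^{-1} g
        d1 = (h00 * g1 - h01 * g0) / det
        f_old, step, moved = objective(lam), 1.0, False
        while step > 1e-12:  # halve the step until the objective decreases
            cand = [lam[0] + step * d0, lam[1] + step * d1]
            if objective(cand) < f_old:
                lam, moved = cand, True
                break
            step /= 2.0
        if not moved:
            break
    return 2.0 * sum(math.log(v) for v in weights(lam))


def chi2_2_pvalue(stat):
    """Upper tail probability of a chi-squared(2) variable: exp(-stat / 2)."""
    return math.exp(-stat / 2.0)


# Toy input standing in for the pairs (T_i, S_i).
Z = [(2.0, 1.0), (-1.0, 0.0), (0.0, -1.0), (1.0, 2.0), (-1.0, -1.0)]
stat = el_statistic(Z)
print(stat, chi2_2_pvalue(stat))
```

The closed-form tail probability is specific to two degrees of freedom, which is why no statistics library is needed here.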

To prove Theorem 2, we first establish Lemma 1.

Lemma 1

For any δ > 0,

$$\displaystyle{ E\vert X_{1}^{\top }X_{ 1+m}\vert ^{2+\delta } \leq p^{\delta }\left (\sum _{ j=1}^{p}E\vert X_{ 1j}\vert ^{2+\delta }\right )^{2} }$$
(15)

and

$$\displaystyle{ E\vert \alpha ^{\top }(X_{ 1} + X_{1+m})\vert ^{2+\delta } \leq 2^{4+\delta }\vert \vert \alpha \vert \vert ^{2+\delta }p^{\delta /2}\sum _{ j=1}^{p}E\vert X_{ 1j}\vert ^{2+\delta }. }$$
(16)

Proof

The proof of Lemma 1 is similar to that of Lemma 6 in [26]. □

Proof

[Proof of Theorem 2] It suffices to verify that (5) and (6) in Theorem 1 hold. Consider Example 1. Let \(Q_{1} = O\varSigma ^{-1/2}X_{1}\) and \(Q_{1+m} = O\varSigma ^{-1/2}X_{1+m}\), where \(O\) is an orthogonal matrix such that \(O\varSigma O^{\top }\) is diagonal. Then \(X_{1}^{\top }X_{1+m} = Q_{1}^{\top }O\varSigma O^{\top }Q_{1+m} =\sum _{j=1}^{p}\phi _{j}Q_{1j}Q_{1+m,j}\), where the \(\phi _{j}\) are the eigenvalues of \(\varSigma \). Therefore

$$\displaystyle\begin{array}{rcl} E\left [(X_{1}^{\top }X_{ 1+m})^{4}\right ] = E\left [\left (\sum _{ j=1}^{p}\phi _{ j}Q_{1j}Q_{1+m,j}\right )^{4}\right ] \leq 9\left (\sum _{ j=1}^{p}\phi _{ j}^{2}\right )^{2} = 9[\text{tr}\{\varSigma ^{2}\}]^{2}.& & {}\\ \end{array}$$

Thus \(E[(X_{1}^{\top }X_{1+m})^{4}]/[\text{tr}\{\varSigma ^{2}\}]^{2} = O(1)\), bounded uniformly in \(p\), i.e., (5) holds. Equation (6) can be verified in the same way.
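The constant 9 in the bound can be traced as follows. This is a sketch under the assumption that Example 1 is the Gaussian case, so that \(Q_{1}\) and \(Q_{1+m}\) are independent standard normal vectors; the shorthand \(W_{j} = Q_{1j}Q_{1+m,j}\) is introduced here. The \(W_{j}\) are then independent with \(E(W_{j}) = E(W_{j}^{3}) = 0\), \(E(W_{j}^{2}) = 1\), and \(E(W_{j}^{4}) = 3 \cdot 3 = 9\), so only the terms in which every index appears an even number of times survive:

$$\begin{aligned} E\left [\left (\sum _{j=1}^{p}\phi _{j}W_{j}\right )^{4}\right ] &= 9\sum _{j=1}^{p}\phi _{j}^{4} + 3\sum _{j\neq k}\phi _{j}^{2}\phi _{k}^{2} \\ &\leq 9\left (\sum _{j=1}^{p}\phi _{j}^{2}\right )^{2} = 9[\text{tr}\{\varSigma ^{2}\}]^{2}. \end{aligned}$$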

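As for Example 2, we define \(\varSigma ^{\prime } =\varGamma ^{\top }\varGamma = (\sigma _{i,j}^{\prime })_{1\leq i,j\leq m}\) and \(\alpha ^{\top }\varGamma = (a_{1},\ldots,a_{m})\). Since \(X_{i} =\varGamma F_{i}\),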

$$\displaystyle\begin{array}{rcl} X_{1}^{\top }X_{ 1+m} =\sum _{ j,j^{{\prime}}=1}^{s}\sigma _{ j,j^{{\prime}}}^{{\prime}}F_{ 1j}F_{(1+m)j^{{\prime}}},& & {}\\ \end{array}$$

where \(F_{(1+m)j}\) denotes the \(j\)th element of \(F_{1+m}\), and

$$\displaystyle\begin{array}{rcl} \alpha ^{\top }(X_{ 1} + X_{1+m}) =\sum _{ j=1}^{m}a_{ j}(F_{1j} + F_{(1+m)j}).& & {}\\ \end{array}$$

Denote \(\delta _{j_{1},\ldots,j_{8}} = E\left (\prod _{k=1}^{8}F_{1j_{k}}\right )\). The other cases of \(\sum _{v=1}^{d}l_{v} \leq 8\) can be proved in the same way. Notice that

$$\displaystyle\begin{array}{rcl} E(X_{1}^{\top }X_{ 1+m})^{8} =\sum _{ j_{1},\ldots,j_{8}=1}^{s}\sum _{ j_{1}^{{\prime}},\ldots,j_{8}^{{\prime}}=1}^{s}\prod _{ k=1}^{8}\sigma _{ j_{k},j_{k}^{{\prime}}}^{{\prime}}\delta _{ j_{1},\ldots,j_{8}}\delta _{j_{1}^{{\prime}},\ldots,j_{8}^{{\prime}}}.& & {}\\ & & {}\\ \end{array}$$

\(\delta _{j_{1},\ldots,j_{8}}\neq 0\) only when \(\{j_{1},\ldots,j_{8}\}\) form pairs of integers. Denote by \(\sum ^{\ast }\) the summation over the cases in which \(\delta _{j_{1},\ldots,j_{8}}\delta _{j_{1}^{{\prime}},\ldots,j_{8}^{{\prime}}}\neq 0\). By Lemma 1 we have

$$\displaystyle\begin{array}{rcl} E(X_{1}^{\top }X_{ 1+m})^{8}& =& O\left (\sum ^{{\ast}}\prod _{ k=1}^{8}\sigma _{ j_{k},j_{k}^{{\prime}}}^{{\prime}}\right ) {}\\ & =& O\left (\text{tr}\{\varSigma ^{{\prime}8}\}\right ) = O\left ([\text{tr}\{\varSigma ^{{\prime}2}\}]^{4}\right ). {}\\ \end{array}$$

Similarly we have

$$\displaystyle\begin{array}{rcl} & & E(\alpha ^{T}(X_{ 1} + X_{1+m}))^{8} \leq 2^{8}E\left (\sum _{ j=1}^{s}a_{ j}F_{1,j}\right )^{8} {}\\ & & \quad = O\left (\sum _{j}a_{j}^{8}\right ) + O\left (\sum _{ j,j^{{\prime}}}a_{j}^{6}a_{ j^{{\prime}}}^{2}\right ) + O\left (\sum _{ j,j^{{\prime}}}a_{j}^{4}a_{ j^{{\prime}}}^{4}\right ) + O\left (\sum _{ j,j^{{\prime}},j^{{\prime\prime}}}a_{j}^{4}a_{ j^{{\prime}}}^{2}a_{ j^{{\prime\prime}}}^{2}\right ) {}\\ & & \qquad + O\left (\sum _{j,j^{{\prime}},j^{{\prime\prime}},j^{{\prime\prime\prime}}}a_{j}^{2}a_{ j^{{\prime}}}^{2}a_{ j^{{\prime\prime}}}^{2}a_{ j^{{\prime\prime\prime}}}^{2}\right ) {}\\ & & \quad = O\left (\left (\sum _{j}a_{j}^{2}\right )^{4}\right ) = O\left (\left (\alpha ^{T}\varGamma \varGamma ^{T}\alpha \right )^{4}\right ). {}\\ \end{array}$$

Then according to Theorem 1, we can prove Theorem 2. □

Proof

[Proof of Theorem 3] Similar to the proof of Theorem 1, we only need to show that under Conditions 1, 3–5, and the null hypothesis, as \(n \rightarrow \infty \),

$$\displaystyle{ \frac{1} {\sqrt{m}}\left (\sum \limits _{i=1}^{m}\tilde{T}_{ i}/\tilde{\sigma }_{1},\ \sum \limits _{i=1}^{m}\tilde{S}_{ i}/\tilde{\sigma }_{2}\right )^{\top }\stackrel{d}{\rightarrow }N(0,I_{ 2}) }$$
(17)

and

$$\displaystyle{ \frac{\sum _{i=1}^{m}\tilde{T}_{i}^{2}} {m\tilde{\sigma }_{1}^{2}} \stackrel{p}{\rightarrow }1, \frac{\sum _{i=1}^{m}\tilde{S}_{i}^{2}} {m\tilde{\sigma }_{2}^{2}} \stackrel{p}{\rightarrow }1, \frac{\sum _{i=1}^{m}\tilde{T}_{i}\tilde{S}_{i}} {m\tilde{\sigma }_{1}\tilde{\sigma }_{2}} \stackrel{p}{\rightarrow }0, }$$
(18)

where

$$\displaystyle\begin{array}{rcl} \tilde{\sigma }_{1}^{2} = \text{tr}\left \{[E(V (g(X_{ 1}^{\top }\beta _{ 0}))\psi ^{2}(X_{ 1},\beta _{0})X_{1}^{(2)}X_{ 1}^{(2)\top })]^{2}\right \},& & {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} \tilde{\sigma }_{2}^{2} = 2E\left (V (g(X_{ 1}^{\top }\beta _{ 0}))\psi ^{2}(X_{ 1},\beta _{0})\alpha ^{\top }X_{ 1}^{(2)}X_{ 1}^{(2)\top }\alpha \right ).& & {}\\ \end{array}$$

To prove (17), it suffices to prove the following three asymptotic results:

$$\displaystyle\begin{array}{rcl} & & \frac{\sum _{i=1}^{m}\tilde{T}_{i}} {\sqrt{m}\tilde{\sigma }_{1}} \stackrel{\mathrm{d}}{\rightarrow }N(0,1), {}\\ & & \frac{\sum _{i=1}^{m}\tilde{S}_{i}} {\sqrt{m}\tilde{\sigma }_{2}} \stackrel{\mathrm{d}}{\rightarrow }N(0,1),\ \text{and}\ a\frac{\sum _{i=1}^{m}\tilde{T}_{i}} {\sqrt{m}\tilde{\sigma }_{1}} + b\frac{\sum _{i=1}^{m}\tilde{S}_{i}} {\sqrt{m}\tilde{\sigma }_{2}} \stackrel{\mathrm{d}}{\rightarrow }N(0,a^{2} + b^{2}). {}\\ \end{array}$$

Notice that under the null hypothesis \(\tilde{H}_{0}\), we have

$$\displaystyle\begin{array}{rcl} \frac{\sum _{i=1}^{m}\tilde{T}_{i}} {\sqrt{m}\tilde{\sigma }_{1}} & =& \frac{1} {\sqrt{m}\tilde{\sigma }_{1}}\sum _{i=1}^{m}h_{ 1i}(\hat{\beta }_{0}) {}\\ & =& \frac{1} {\sqrt{m}\tilde{\sigma }_{1}}\sum _{i=1}^{m}h_{ 1i}(\beta _{0}) + \frac{1} {\sqrt{m}\tilde{\sigma }_{1}}\sum _{i=1}^{m}(h_{ 1i}(\hat{\beta }_{0}) - h_{1i}(\beta _{0})), {}\\ \end{array}$$

where

$$\displaystyle\begin{array}{rcl} h_{1i}(\beta ) =\psi (X_{i},\beta _{0})\psi (X_{i+m},\beta _{0})X_{i}^{(2)\top }X_{ i+m}^{(2)}\left (y_{ i} - g(X_{i}^{\top }\beta )\right )\left (y_{ i+m} - g(X_{i+m}^{\top }\beta )\right ).& & {}\\ \end{array}$$

Through proper calculation and according to Conditions 3–5, we have

$$\displaystyle\begin{array}{rcl} E\left ( \frac{1} {\sqrt{m}\tilde{\sigma }_{1}}\sum _{i=1}^{m}(h_{ 1i}(\hat{\beta }_{0}) - h_{1i}(\beta _{0}))\right )^{2} = o(1).& & {}\\ \end{array}$$

Then by applying the Markov inequality, we have

$$\displaystyle\begin{array}{rcl} \frac{1} {\sqrt{m}\tilde{\sigma }_{1}}\sum _{i=1}^{m}(h_{ 1i}(\hat{\beta }_{0}) - h_{1i}(\beta _{0})) = o_{p}(1).& & {}\\ \end{array}$$
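The step from the second-moment bound to the \(o_{p}(1)\) statement is the following one-line argument, written here with the shorthand \(R_{m} = \frac{1} {\sqrt{m}\tilde{\sigma }_{1}} \sum _{i=1}^{m}(h_{1i}(\hat{\beta }_{0}) - h_{1i}(\beta _{0}))\), a symbol introduced only for this remark. For any fixed \(\epsilon> 0\),

$$P\left (\vert R_{m}\vert>\epsilon \right ) = P\left (R_{m}^{2}>\epsilon ^{2}\right ) \leq \frac{E(R_{m}^{2})} {\epsilon ^{2}} = o(1),$$

which is exactly convergence of \(R_{m}\) to zero in probability.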

Therefore \(\frac{\sum _{i=1}^{m}\tilde{T}_{ i}} {\sqrt{m}\tilde{\sigma }_{1}}\) can be written as the sum of independent statistics and an \(o_{p}(1)\) term, namely

$$\displaystyle\begin{array}{rcl} \frac{\sum _{i=1}^{m}\tilde{T}_{i}} {\sqrt{m}\tilde{\sigma }_{1}} = \frac{1} {\sqrt{m}\tilde{\sigma }_{1}}\sum _{i=1}^{m}h_{ 1i}(\beta _{0}) + o_{p}(1).& & {}\\ \end{array}$$

Therefore similar to the proof of (12) in Theorem 1, we can prove (17).

To show the first result in (18), it is obvious that

$$\displaystyle\begin{array}{rcl} \frac{1} {m\tilde{\sigma }_{1}^{2}}\sum _{i=1}^{m}\tilde{T}_{ i}^{2} = \frac{1} {m\tilde{\sigma }_{1}^{2}}\sum _{i=1}^{m}h_{ 1i}^{2}(\hat{\beta }_{ 0}) = \frac{1} {m\tilde{\sigma }_{1}^{2}}\sum _{i=1}^{m}h_{ 1i}^{2}(\beta _{ 0}) + \frac{1} {m\tilde{\sigma }_{1}^{2}}\sum _{i=1}^{m}(h_{ 1i}^{2}(\hat{\beta }_{ 0}) - h_{1i}^{2}(\beta _{ 0})).& & {}\\ \end{array}$$

By applying Conditions 3–5 and with proper computation, we can obtain

$$\displaystyle\begin{array}{rcl} E\left ( \frac{1} {m\tilde{\sigma }_{1}^{2}}\sum _{i=1}^{m}(h_{ 1i}^{2}(\hat{\beta }_{ 0}) - h_{1i}^{2}(\beta _{ 0}))\right )^{2} = o(1).& & {}\\ \end{array}$$

According to the Markov inequality, we obtain \(\frac{1} {m\tilde{\sigma }_{1}^{2}} \sum _{i=1}^{m}(h_{1i}^{2}(\hat{\beta }_{0}) - h_{1i}^{2}(\beta _{0})) = o_{p}(1)\). Therefore we have

$$\displaystyle\begin{array}{rcl} \frac{\sum _{i=1}^{m}\tilde{T}_{i}^{2}} {m\tilde{\sigma }_{1}^{2}} = \frac{1} {m\tilde{\sigma }_{1}^{2}}\sum _{i=1}^{m}h_{ 1i}^{2}(\beta _{ 0}) + o_{p}(1).& &{}\end{array}$$
(19)

By an argument similar to the proof of (13) in Theorem 1, we obtain the first result in (18). Similarly, we can prove the other two results in (18). □

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Zang, Y., Zhang, Q., Zhang, S., Li, Q., Ma, S. (2017). Empirical Likelihood Test for High Dimensional Generalized Linear Models. In: Ahmed, S. (eds) Big and Complex Data Analysis. Contributions to Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-41573-4_2
