Calibration of the empirical likelihood for high-dimensional data

Abstract

This article is concerned with the calibration of the empirical likelihood (EL) for high-dimensional data, where the data dimension may increase with the sample size. We analyze the asymptotic behavior of the EL under a general multivariate model and provide weak conditions under which the best rate for the asymptotic normality of the empirical likelihood ratio (ELR) is achieved. In addition, there is usually substantial lack of fit when the ELR is calibrated by the usual normal approximation in high dimensions, producing tests with type I errors much larger than the nominal levels. We find that this is mainly due to the underestimation of the centralized and normalized quantities of the ELR. By examining the connection between the ELR and the classical Hotelling’s \(T\)-square statistic, we propose an effective calibration method that works much better in most situations.

Keywords

Asymptotic normality, Coverage accuracy, High-dimensional data, Hotelling’s \(T\)-square statistic

Notes

Acknowledgments

The authors would like to thank the Editor, Associate Editor and two anonymous referees for their many helpful comments, which have resulted in significant improvements in the article. In particular, we are grateful to the Associate Editor for pointing out the necessity of Lemma 8. This research was supported by the NNSF of China Grants 11001138, 11071128, 11131002, 11101306, 11001083, the RFDP of China Grant 20110031110002 and the Fundamental Research Funds for the Central Universities. Zou acknowledges the support of the PAPD of Jiangsu Higher Education Institutions, and the National Center for Theoretical Sciences, Math Division.

Copyright information

© The Institute of Statistical Mathematics, Tokyo 2012

Authors and Affiliations

  1. School of Statistics and Finance, East China Normal University, Shanghai, China
  2. School of Mathematical Sciences, Nankai University, Tianjin, China
