A combined p-value test for the mean difference of high-dimensional data

Science China Mathematics

Abstract

This paper proposes a new method for testing the equality of high-dimensional means using a multiple hypothesis testing approach. The test statistic is the maximum of standardized partial sums of logarithmic p-values. Numerical studies show that the method performs well for both normal and non-normal data and has good power under both dense and sparse alternatives. A real data analysis illustrates the method.
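
The abstract names the statistic but not its details, so the following Python sketch is only one possible reading, not the authors' procedure. It assumes (all choices here are illustrative): marginal p-values from coordinate-wise two-sample z-statistics with a normal approximation, partial sums of -log p over the ordered p-values, centering and scaling estimated by simulating independent uniform p-values, and calibration by permuting group labels. The function names are hypothetical.

```python
import math
import random
from statistics import mean, variance


def marginal_p_values(x, y):
    """Two-sided p-values, one per coordinate (normal approximation, an assumption)."""
    pvals = []
    for j in range(len(x[0])):
        xj = [row[j] for row in x]
        yj = [row[j] for row in y]
        se = math.sqrt(variance(xj) / len(xj) + variance(yj) / len(yj))
        z = (mean(xj) - mean(yj)) / se
        pvals.append(math.erfc(abs(z) / math.sqrt(2)))
    return pvals


def partial_sums(pvals):
    """Cumulative sums of -log p over the p-values sorted in increasing order."""
    total, sums = 0.0, []
    for p in sorted(pvals):
        total += -math.log(max(p, 1e-300))  # clamp to avoid log(0)
        sums.append(total)
    return sums


def null_moments(m, n_sim, rng):
    """Monte Carlo mean and sd of each partial sum for m independent uniform p-values."""
    sims = [partial_sums([rng.random() for _ in range(m)]) for _ in range(n_sim)]
    mu = [mean(col) for col in zip(*sims)]
    sd = [math.sqrt(variance(col)) for col in zip(*sims)]
    return mu, sd


def max_standardized_sum(pvals, mu, sd):
    """Maximum over k of the standardized k-th partial sum."""
    return max((s - m) / d for s, m, d in zip(partial_sums(pvals), mu, sd))


def combined_test(x, y, n_perm=200, n_sim=500, seed=0):
    """Return (statistic, permutation p-value) for two samples given as lists of rows."""
    rng = random.Random(seed)
    mu, sd = null_moments(len(x[0]), n_sim, rng)
    observed = max_standardized_sum(marginal_p_values(x, y), mu, sd)
    pooled = x + y  # new list; the input lists are not modified
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel observations at random
        perm_p = marginal_p_values(pooled[:len(x)], pooled[len(x):])
        exceed += max_standardized_sum(perm_p, mu, sd) >= observed
    return observed, (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Sparse alternative: only the first 2 of 20 coordinates differ in mean.
    x = [[data_rng.gauss(0, 1) for _ in range(20)] for _ in range(15)]
    y = [[data_rng.gauss(3 if j < 2 else 0, 1) for j in range(20)] for _ in range(15)]
    stat, p_value = combined_test(x, y)
    print(f"statistic = {stat:.2f}, permutation p-value = {p_value:.3f}")
```

Taking the maximum over partial sums is what lets a single statistic react to both a few very small p-values (sparse alternatives) and many moderately small ones (dense alternatives). Note that the permutation step assumes the two samples are exchangeable under the null, which is stronger than equality of means alone.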

Acknowledgments

This work was supported by a grant from the University Grants Council of Hong Kong, the National Natural Science Foundation of China (Grant No. 11471335), the Ministry of Education project of Key Research Institute of Humanities and Social Sciences at Universities (Grant No. 16JJD910002), and the Fund for Building World-Class Universities (Disciplines) of Renmin University of China. The authors thank two referees for their constructive comments, which led to an improvement of an earlier version of the article.

Author information

Corresponding author

Correspondence to Lixing Zhu.

About this article

Cite this article

Yu, W., Xu, W. & Zhu, L. A combined p-value test for the mean difference of high-dimensional data. Sci. China Math. 62, 961–978 (2019). https://doi.org/10.1007/s11425-017-9190-6

  • DOI: https://doi.org/10.1007/s11425-017-9190-6
