High-Dimensional Quadratic Classifiers in Non-sparse Settings

Abstract

In this paper, we consider high-dimensional quadratic classifiers in non-sparse settings. The quadratic classifiers proposed in this paper draw information about heterogeneity effectively from differences in both the mean vectors and the covariance matrices as the dimension grows. We show that they enjoy a consistency property: their misclassification rates tend to zero as the dimension goes to infinity under non-sparse settings. We also propose a quadratic classifier after feature selection based on differences in both the mean vectors and the covariance matrices. We examine the performance of the classifiers in numerical simulations and real data analyses. Finally, we give concluding remarks on the choice of classifier for high-dimensional, non-sparse data.
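The idea of a quadratic rule that exploits both mean and covariance differences can be illustrated with a toy sketch. This is not the paper's actual classifier, only a minimal diagonal-Gaussian quadratic rule (all function names and simulation parameters here are illustrative assumptions) showing how class differences in both means and variances contribute to classification in a large-p, small-n setting:

```python
import numpy as np

def fit_diag_quadratic(X0, X1):
    """Estimate per-class means and diagonal variances.

    A diagonal covariance estimate is used purely for illustration;
    the paper's classifiers are constructed more carefully for the
    p >> n regime.
    """
    params = []
    for X in (X0, X1):
        mu = X.mean(axis=0)
        var = X.var(axis=0, ddof=1) + 1e-8  # small ridge for stability
        params.append((mu, var))
    return params

def predict(params, X):
    """Assign each row to the class with the larger diagonal-Gaussian
    log-likelihood. The rule is quadratic in X and depends on class
    differences in both the means and the variances."""
    scores = []
    for mu, var in params:
        ll = -0.5 * (((X - mu) ** 2) / var + np.log(var)).sum(axis=1)
        scores.append(ll)
    return (scores[1] > scores[0]).astype(int)

rng = np.random.default_rng(0)
p, n = 200, 25  # many more features than samples per class
X0 = rng.normal(0.0, 1.0, size=(n, p))   # class 0
X1 = rng.normal(0.3, 1.5, size=(n, p))   # class 1: shifted mean, larger variance
params = fit_diag_quadratic(X0, X1)

Xtest = np.vstack([rng.normal(0.0, 1.0, size=(50, p)),
                   rng.normal(0.3, 1.5, size=(50, p))])
ytest = np.array([0] * 50 + [1] * 50)
err = (predict(params, Xtest) != ytest).mean()
```

Even with only 25 training samples per class against 200 dimensions, the test misclassification rate is small here, because heterogeneity information accumulates across coordinates as p grows — the phenomenon underlying the consistency property described in the abstract.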



Acknowledgements

We would like to thank the reviewers for their constructive comments. The research of the first author was partially supported by Grants-in-Aid for Scientific Research (A) and Challenging Exploratory Research, Japan Society for the Promotion of Science (JSPS), under Contract Numbers 15H01678 and 26540010. The research of the second author was partially supported by Grant-in-Aid for Young Scientists (B), JSPS, under Contract Number 26800078.

Author information

Corresponding author

Correspondence to Makoto Aoshima.

Electronic supplementary material

The online version contains supplementary material (PDF, 140 KB).


About this article

Cite this article

Aoshima, M., Yata, K. High-Dimensional Quadratic Classifiers in Non-sparse Settings. Methodol Comput Appl Probab 21, 663–682 (2019). https://doi.org/10.1007/s11009-018-9646-z

Keywords

  • Asymptotic normality
  • Bayes error rate
  • Feature selection
  • Heterogeneity
  • Large p small n

Mathematics Subject Classification (2010)

  • 62H30
  • 62H10