In this paper, we consider high-dimensional quadratic classifiers in non-sparse settings. The proposed quadratic classifiers draw information about heterogeneity effectively through the differences of both mean vectors and covariance matrices, which may grow with the dimension. We show that they enjoy a consistency property in which the misclassification rates tend to zero as the dimension goes to infinity under non-sparse settings. We also propose a quadratic classifier after feature selection based on the differences of both mean vectors and covariance matrices. We examine the performance of the classifiers in numerical simulations and real data analyses. Finally, we give concluding remarks on the choice of classifier for high-dimensional, non-sparse data.
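To illustrate the underlying idea of a quadratic classifier that uses both mean and covariance differences, here is a minimal sketch of classical quadratic discriminant scoring with plain plug-in parameters. This is only a rough illustration under Gaussian assumptions, not the authors' high-dimensional estimator, which involves corrections designed for the case where the dimension far exceeds the sample size:

```python
import numpy as np

def qda_score(x, mean, cov):
    """Quadratic discriminant score (smaller = closer to the class):
    (x - mean)' cov^{-1} (x - mean) + log det(cov)."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return diff @ np.linalg.solve(cov, diff) + logdet

def classify(x, params):
    """Assign x to the class with the smallest quadratic score.
    `params` is a list of (mean, cov) pairs, one per class."""
    return int(np.argmin([qda_score(x, m, S) for m, S in params]))

# Toy example: two classes differing in both mean and covariance.
p = 5
params = [(np.zeros(p), np.eye(p)),            # class 0
          (np.full(p, 2.0), 3.0 * np.eye(p))]  # class 1

print(classify(np.zeros(p), params))      # a point at the class-0 mean
print(classify(np.full(p, 2.0), params))  # a point at the class-1 mean
```

Because the score involves both the Mahalanobis distance to each class mean and the log-determinant of each class covariance, the rule separates populations that differ in either location or scale, which is the heterogeneity the abstract refers to.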
We would like to thank the reviewers for their constructive comments. The research of the first author was partially supported by Grants-in-Aid for Scientific Research (A) and Challenging Exploratory Research, Japan Society for the Promotion of Science (JSPS), under Contract Numbers 15H01678 and 26540010. The research of the second author was partially supported by Grant-in-Aid for Young Scientists (B), JSPS, under Contract Number 26800078.
Aoshima, M., Yata, K. High-Dimensional Quadratic Classifiers in Non-sparse Settings. Methodol Comput Appl Probab 21, 663–682 (2019). https://doi.org/10.1007/s11009-018-9646-z
- Asymptotic normality
- Bayes error rate
- Feature selection
- Large p small n