A projection pursuit index for large p small n data
In high-dimensional data, one often seeks a few interesting low-dimensional projections that reveal important aspects of the data. Projection pursuit for classification finds projections that reveal differences between classes. Although projection pursuit is used to bypass the curse of dimensionality, most indexes do not work well when the number of observations is small relative to the number of variables, a situation known as the large p (dimension), small n (sample size) problem. This paper discusses how sample size and dimensionality together affect classification and proposes a new projection pursuit index that overcomes the small-sample-size problem in exploratory classification.
Keywords: The curse of dimensionality; Gene expression data analysis; Multivariate data; Penalized discriminant analysis
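The large p, small n difficulty described in the abstract can be illustrated with a small numerical sketch. This is not the index proposed in the paper. It assumes a generic LDA-style index of the form 1 - |A'WA| / |A'(W+B)A|, where W and B are the within-class and between-class sums-of-squares matrices, and it uses a plain ridge term as a stand-in for the penalization idea named in the keywords. The function names, the ridge value, and the simulated data are all hypothetical. With p > n, W is rank deficient, so even pure noise admits a projection whose unpenalized index is essentially 1; adding the ridge term pulls the index back below 1.

```python
import numpy as np

def within_between(X, y):
    """Return within-class (W) and between-class (B) sums-of-squares matrices."""
    overall_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for label in np.unique(y):
        Xg = X[y == label]
        mean_g = Xg.mean(axis=0)
        centered = Xg - mean_g
        W += centered.T @ centered
        diff = (mean_g - overall_mean)[:, None]
        B += Xg.shape[0] * (diff @ diff.T)
    return W, B

def lda_style_index(A, W, B, ridge=0.0):
    """Illustrative index 1 - |A'(W + ridge*I)A| / |A'(W + B + ridge*I)A|.

    A is a p x k projection matrix. The ridge term is an assumption made
    for this sketch, not the penalty defined in the paper.
    """
    p = W.shape[0]
    Wp = W + ridge * np.eye(p)
    num = np.linalg.det(A.T @ Wp @ A)
    den = np.linalg.det(A.T @ (Wp + B) @ A)
    return 1.0 - num / den

# Simulated pure-noise data with more variables than observations.
rng = np.random.default_rng(0)
n, p = 20, 50
X = rng.standard_normal((n, p))
y = np.repeat([0, 1], n // 2)   # two arbitrary classes of equal size

W, B = within_between(X, y)
rank_W = np.linalg.matrix_rank(W)   # at most n - (number of classes)

# Project the class-mean difference onto the null space of W: along this
# direction the within-class spread is zero although the classes are noise.
eigvals, eigvecs = np.linalg.eigh(W)
null_basis = eigvecs[:, eigvals < 1e-8 * eigvals.max()]
d = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
a = null_basis @ (null_basis.T @ d)
a = (a / np.linalg.norm(a))[:, None]   # p x 1 projection

unpenalized = lda_style_index(a, W, B)
penalized = lda_style_index(a, W, B, ridge=float(n))

print("rank of W:", rank_W, "out of p =", p)
print("unpenalized index on noise:", unpenalized)
print("ridge-penalized index on noise:", penalized)
```

The point of the sketch is only that a singular W lets an unpenalized index report near-perfect class separation where none exists; the choice of ridge strength here is arbitrary and would need tuning in practice.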