Statistics and Computing, Volume 20, Issue 3, pp 381–392

A projection pursuit index for large p small n data

Abstract

In high-dimensional data, one often seeks a few interesting low-dimensional projections that reveal important aspects of the data. Projection pursuit for classification finds projections that reveal differences between classes. Although projection pursuit is used to bypass the curse of dimensionality, most indexes do not work well when the number of observations is small relative to the number of variables, known as the large p (dimension), small n (sample size) problem. This paper discusses how sample size and dimensionality together affect classification and proposes a new projection pursuit index that overcomes the problem of small sample size for exploratory classification.
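The small-sample difficulty described in the abstract can be illustrated with a minimal sketch. It is not the index proposed in the paper. It assumes a simple one-dimensional LDA-type index, 1 minus the ratio of within-class to total sum of squares of the projected data, and applies it to pure noise with arbitrary two-class labels. The function names (`lda_index_1d`, `least_squares_direction`, `noise_index`) and the simulation settings are hypothetical, chosen only for illustration. For two classes, the least-squares direction is proportional to Fisher's discriminant direction, so it is used here as the direction that maximizes the index.

```python
import numpy as np


def lda_index_1d(X, y, a):
    """Assumed 1-D LDA-type index: 1 - within-class SS / total SS of X @ a."""
    z = X @ a
    total = np.sum((z - z.mean()) ** 2)
    within = sum(
        np.sum((z[y == g] - z[y == g].mean()) ** 2) for g in np.unique(y)
    )
    return 1.0 - within / total


def least_squares_direction(X, y):
    """Regress centered labels on centered data (minimum-norm solution if n < p)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    a, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    return a


def noise_index(n, p, seed=0):
    """Index of the best 1-D projection of pure noise with two arbitrary classes."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))   # no real class structure
    y = np.repeat([0, 1], n // 2)     # n is assumed to be even
    a = least_squares_direction(X, y)
    return lda_index_1d(X, y, a)


small_n = noise_index(n=20, p=50)     # fewer observations than variables
large_n = noise_index(n=2000, p=50)   # many more observations than variables
print(f"index with n=20,   p=50: {small_n:.4f}")
print(f"index with n=2000, p=50: {large_n:.4f}")
```

When n is smaller than p, a projection exists that separates the two arbitrary classes perfectly even though the data are noise, so an unpenalized index of this kind reports spurious structure. With n much larger than p, the same procedure finds almost no separation.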

Keywords

The curse of dimensionality; Gene expression data analysis; Multivariate data; Penalized discriminant analysis

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Department of Statistics, EWHA Womans University, Seoul, Korea
  2. Department of Statistics, Iowa State University, Ames, USA
