Projection pursuit via white noise matrices
 Guodong Hui,
 Bruce G. Lindsay
Abstract
Projection pursuit is a technique for locating projections from high- to low-dimensional space that reveal interesting nonlinear features of a data set, such as clustering and outliers. The two key components of projection pursuit are the chosen measure of interesting features (the projection index) and the algorithm used to optimize it. In this paper, a white noise matrix based on the Fisher information matrix is proposed for use as the projection index. This matrix index is easily estimated by the kernel method. The eigenanalysis of the estimated matrix index provides a set of solution projections that are most similar to white noise. Application to simulated and real data sets shows that our algorithm successfully reveals interesting features in fairly high dimensions with a practical sample size and low computational effort.
 Journal

Sankhya B
Volume 72, Issue 2, pp 123-153
 Cover Date
 2010-11-01
 DOI
 10.1007/s13571-011-0008-x
 Print ISSN
 0976-8386
 Online ISSN
 0976-8394
 Publisher
 Springer-Verlag
 Keywords

 Projection pursuit
 Fisher information matrix
 Eigenanalysis
 Authors

 Guodong Hui (1)
 Bruce G. Lindsay (2)
 Author Affiliations

 1. Genzyme Corporation, Framingham, MA, 01702, USA
 2. Department of Statistics, Pennsylvania State University, University Park, PA, 16802, USA