In PCA, the most outlying data points determine the direction of the PCs, because they contribute most to the variance. This often results in score plots showing a large group of points close to the centre. Any local structure within that group is hard to recognize, even when zooming in, because the points near the centre carry little weight in the determination of the PCs. One approach is to select the rows of the data matrix corresponding to these points and to perform a separate PCA on them. Apart from the obvious difficulties in deciding which points to leave out and which to include, this leads to a cumbersome and hard-to-interpret two-step approach. It would be better if a projection could be found that does show structure, even within groups of very similar points.
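The effect described above can be illustrated with a small sketch on synthetic two-dimensional data. Everything in it is an assumption made for illustration, not taken from the chapter: the helper names `first_pc` and `mean_score_gap`, the two tight groups that differ only along the second variable, and the four hand-placed outliers along the first variable. The first PC is obtained from the closed-form eigendecomposition of the 2 x 2 covariance matrix, so only the standard library is needed.

```python
import math
import random


def first_pc(points):
    """Return (unit direction, fraction of variance explained) of PC1 for 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Largest eigenvalue of the covariance matrix [[sxx, sxy], [sxy, syy]].
    trace = sxx + syy
    det = sxx * syy - sxy ** 2
    lam = trace / 2 + math.sqrt(max(trace * trace / 4 - det, 0.0))
    # Matching eigenvector; fall back to an axis if the variables are uncorrelated.
    if abs(sxy) > 1e-12:
        vx, vy = sxy, lam - sxx
    else:
        vx, vy = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm), lam / trace


def mean_score_gap(group_a, group_b, direction):
    """Absolute difference between the groups' mean scores along `direction`."""
    def mean_score(group):
        return sum(x * direction[0] + y * direction[1] for x, y in group) / len(group)
    return abs(mean_score(group_a) - mean_score(group_b))


random.seed(0)
# Hypothetical data: two tight groups near the centre, separated only along y.
group_a = [(random.gauss(0, 0.1), random.gauss(-0.5, 0.1)) for _ in range(50)]
group_b = [(random.gauss(0, 0.1), random.gauss(0.5, 0.1)) for _ in range(50)]
# A handful of outlying points far away along x.
outliers = [(10.0, 0.2), (-10.0, -0.1), (9.0, 0.0), (-11.0, 0.1)]

central = group_a + group_b
dir_all, frac_all = first_pc(central + outliers)   # PCA on all points
dir_central, frac_central = first_pc(central)      # separate PCA on central points

gap_all = mean_score_gap(group_a, group_b, dir_all)
gap_central = mean_score_gap(group_a, group_b, dir_central)

print("PC1 direction, all points:    ", dir_all)
print("PC1 direction, central points:", dir_central)
print("group gap along PC1 (all / central):", gap_all, gap_central)
```

With the outliers included, the first PC should align almost entirely with the first variable, and the two central groups become nearly indistinguishable in their PC1 scores. Repeating the PCA on the central points alone should turn the first PC towards the second variable and pull the groups apart. This is the two-step approach mentioned above, and it already relies on knowing which points to leave out.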
Keywords: Outlying Data Point, Batch Algorithm, Codebook Vector, Training Progress, Sammon Mapping