Complementary View on Multivariate Data Structure Based on Kohonen’s SOM, Parallel Coordinates and t-SNE Methods
Modern condition monitoring applications often require describing an acquired signal by a set of parameters, which leads directly to multidimensional (mD) diagnostic data. Before starting the proper analysis of the recorded data, it is advisable to look at the data globally to get an idea of what they really represent. Visualization of mD data is a challenging problem, and there is probably no ideal method that accounts for all aspects of data that are high-dimensional, nonlinear, redundant, and so on. For this purpose we propose the joint use of three multivariate visualization methods: self-organizing maps (SOM), parallel coordinate plots and t-distributed stochastic neighbor embedding (t-SNE). The methods draw on concepts of machine learning, simple geometry and probabilistic modeling, respectively, to derive indices of distance or similarity between the data vectors, which are represented as data points in the multivariate data space. They make it possible to display the data points in a plane while preserving, as far as possible, their mutual distances in the multidimensional data space. The three proposed methods are complementary and supplement each other. The considerations are illustrated using a data matrix X of size (\(1000 \times 15\)) containing gearbox diagnostic data structured into 4 (sub)groups. The three applied (unsupervised) methods do give insight into the 15-dimensional data space and show that data points belonging to different subgroups of X occupy different geometrical locations. However, the employed methods give no indication of how to reduce the dimensionality (number of variables) of the considered data.
Keywords: Vibration signal, Gearbox diagnostics, Visualization of multivariate data
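As a rough illustration of the first of the three methods, the sketch below trains a very small self-organizing map with NumPy only and maps each data point to its best-matching map node. It is not the authors' implementation (the work itself refers to the SOM Toolbox for Matlab), and it does not use the gearbox data: the matrix `X` is a synthetic stand-in with the same shape (1000 × 15) and four groups. The function names, the 8 × 8 map size, the linear learning-rate and neighborhood schedules, and all numeric settings are assumptions chosen for illustration.

```python
import numpy as np


def make_synthetic_data(n_per_group=250, n_vars=15, n_groups=4, seed=0):
    """Synthetic stand-in for the data matrix X (NOT the gearbox data)."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=5.0, size=(n_groups, n_vars))
    X = np.vstack([c + rng.normal(size=(n_per_group, n_vars)) for c in centers])
    labels = np.repeat(np.arange(n_groups), n_per_group)
    return X, labels


def train_som(X, rows=8, cols=8, n_iter=5000, lr0=0.5, sigma0=3.0, seed=0):
    """Online SOM training on a rectangular grid (assumed settings)."""
    rng = np.random.default_rng(seed)
    # Initialise the codebook vectors from randomly chosen data points.
    W = X[rng.choice(len(X), rows * cols, replace=False)].astype(float)
    # Grid position (row, col) of every map node.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5     # neighborhood radius shrinks
        x = X[rng.integers(len(X))]             # one randomly drawn data vector
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))   # best-matching unit
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)    # squared grid distance to BMU
        h = np.exp(-d2 / (2.0 * sigma ** 2))          # Gaussian neighborhood weights
        W += lr * h[:, None] * (x - W)          # pull neighboring nodes towards x
    return W, grid


def best_matching_units(X, W):
    """Index of the nearest codebook vector for every row of X."""
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


X, labels = make_synthetic_data()
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardise each variable
W, grid = train_som(X)
bmu = best_matching_units(X, W)

# Report how many distinct map nodes each synthetic group occupies.
for g in np.unique(labels):
    print(f"group {g}: {len(np.unique(bmu[labels == g]))} distinct map nodes")
```

Plotting the hit counts per node, colored by group, on the 8 × 8 grid would give the kind of planar view described above. A parallel coordinate plot or a t-SNE embedding of the same `X` could then be compared against it.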