# Introducing the Concept of Interaction Model for Interactive Dimensionality Reduction and Data Visualization


## Abstract

This letter formally introduces the concept of interaction model (IM), which has been used either directly or tangentially in previous works but never defined. Broadly speaking, an IM consists of the use of a mixture of dimensionality reduction (DR) techniques within an interactive data visualization framework. The rationale for creating an IM is the need to simultaneously harness the benefits of several DR approaches so as to reach a data representation that is intelligible and/or fitted to a user's criterion. As a remarkable advantage, an IM naturally provides a generalized framework for designing both interactive DR approaches and ready-to-use data visualization interfaces. In addition to a comprehensive overview of the basics of data representation and dimensionality reduction, the main contribution of this manuscript is the elegant definition of the concept of IM in mathematical terms.

## Keywords

Dimensionality reduction · Interaction model · Kernel functions · Data visualization

## 1 Introduction

Very often, dimensionality reduction (DR) is an essential building block for designing both machine learning systems and information visualization interfaces [1, 2]. In simple terms, DR consists of finding a low-dimensional representation of the original (so-called high-dimensional) data by following a criterion of either data structure preservation or class-separability enhancement. Recent analysis has shown that DR should attempt to reach two goals: first, to ensure that data points that are neighbors in the original space remain neighbors in the embedded space; second, to guarantee that two data points are shown as neighbors in the embedded space only if they are neighbors in the original space. In the context of information retrieval, these two goals can be seen as recall and precision measures, respectively. In spite of being clearly conflicting, the compromise between precision and recall defines the DR method's performance. Furthermore, since DR methods are often developed under predetermined design parameters and a pre-established optimization criterion, they still lack properties such as user interaction and controllability. These properties are characteristic of information visualization procedures. The field of data visualization (DataVis) is aimed at developing graphical ways of representing data so that information can be more usable and intelligible for the user [3]. Thus, one can intuit that DR can be improved by importing some properties of DataVis methods. This is, in fact, the premise on which this research is based.

This emergent research area can be referred to as interactive dimensionality reduction for visualization. Its main goal is to link the field of DR with that of DataVis, in order to harness the special properties of the latter within DR frameworks. In particular, the properties of controllability and interactivity are of great interest, as they should make the DR outcomes significantly more understandable and tractable for the (not-necessarily-expert) user. These two properties give the user the freedom to select the best way of representing data. In other words, it can be said that the goal of this research is to develop a DR framework that facilitates an interactive and quick visualization of the data representation, so as to make the DR outcomes more intelligible and to allow users to modify the views of data according to their needs in an affordable fashion.

In this connection, this letter formally introduces the concept of interaction model (IM) as a key tool for both interactive DR and DataVis. Even though the term interaction model has been used directly or tangentially in previous works [4, 5, 6, 7, 8], it has not been formally defined. This paper aims to fill that void. In general terms, the concept of IM refers to a mixture of dimensionality reduction (DR) techniques within an interactive data visualization framework. The rationale for creating an IM is the need to simultaneously harness the benefits of several DR approaches so as to reach a data representation that is intelligible and/or fitted to a user's criterion. As a remarkable advantage, an IM naturally provides a generalized framework for designing both interactive DR approaches and ready-to-use data visualization interfaces. That said, the main contribution of this manuscript is the elegant definition of the concept of IM in mathematical terms. Also, it overviews some basics of data representation and dimensionality reduction from a matrix algebra point of view. A special interest is given to spectral and kernel-based DR methods, which can be generalized by kernel principal component analysis (KPCA) and readily incorporated into a linear IM.

The remainder of this manuscript is organized as follows: Sect. 2 states the mathematical notation for the main variables and operators. Section 3 presents a short overview of basic concepts and introductory formulations for DR, with a special interest in KPCA. In Sect. 4, we formally define the concept of IM as well as its particular linear version; the use of DR-based DataVis is also outlined. Finally, some concluding remarks are gathered in Sect. 5.

## 2 Mathematical Notation

Let the input data matrix be denoted by \({\textit{\textbf{X}}} \in \mathbb {R}^{N \times D}\), holding *N* data points or samples described by a *D*-dimensional feature set, such that \({\textit{\textbf{X}}} = ({\textit{\textbf{x}}}_1, \ldots , {\textit{\textbf{x}}}_N)^\top \), where \({\textit{\textbf{x}}}_i \in \mathbb {R}^{D}\) is the *i*-th data point, \(i \in \{1, \ldots , N\}\), and \(x_{il}\) denotes its *l*-th feature, \(l \in \{1, \ldots , D\}\). Likewise, let us consider a lower-dimensional matrix, so-named embedded space, \({\textit{\textbf{Y}}} \in \mathbb {R}^{d \times N}\) with \(d < D\), being \({\textit{\textbf{y}}}_i \in \mathbb {R}^{d}\) its *i*-th column vector. Additionally, let \({\varvec{\varPhi }} \in \mathbb {R}^{D_h \times N}\) be a high-dimensional representation of the input data, whose *i*-th column is the image \(\phi ({\textit{\textbf{x}}}_i)\) of the *i*-th data point under a mapping \(\phi (\cdot )\) into a \(D_h\)-dimensional space.

## 3 Overview on Dimensionality Reduction and Data Visualization

### 3.1 Data Representation

Data representation is a wide-meaning term coined by some authors to refer, by and large, to either data transformation or feature extraction. The former consists of transforming data into a new version intended to fulfill a specific goal [9]. The latter, meanwhile, is a kind of data remaking in which input data undergo a morphological deformation or projection (also called rotation) following a certain transformation criterion [10]. Data representation may also refer to yielding a new representation of the data, that is, a new data matrix serving as an alternative to the original one; for instance, a dissimilarity-based representation of the input data [11]. An exhaustive review on data representation is presented in [12].
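The dissimilarity-based representation mentioned above can be illustrated with a short sketch (the function name and the choice of Euclidean distance are our illustrative assumptions, not prescriptions from this letter):

```python
import numpy as np

def dissimilarity_representation(X):
    """Represent data by pairwise Euclidean distances.

    X: (N, D) data matrix. Returns an (N, N) symmetric
    dissimilarity matrix with zeros on the diagonal.
    """
    # Squared-norms trick: ||xi - xj||^2 = ||xi||^2 + ||xj||^2 - 2 xi.xj
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off

X = np.array([[0.0, 0.0], [3.0, 4.0]])
D = dissimilarity_representation(X)  # D[0, 1] == 5.0
```

The resulting matrix can then replace the original data matrix as input to any method that only needs pairwise comparisons.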

### 3.2 Dimensionality Reduction

**DR Approaches:** For instance, pioneering approaches such as principal component analysis (PCA) and classical multidimensional scaling (CMDS) optimize the reduction in terms of variance and distance preservation criteria, respectively [14]. More sophisticated methods attempt to capture the data topology through a non-directed, weighted, data-driven graph, formed by nodes located at the geometric coordinates of the data points and a non-negative similarity (also called affinity, or Gram) matrix holding the pairwise edge weights. Such data-topology-based criteria have been addressed by both spectral [15] and divergence-based methods [16]. The similarity matrix can represent either the weighting factor for pairwise distances, as happens in Laplacian eigenmaps and locally linear embedding [17, 18], or a probability distribution, as is the case of methods based on divergences such as stochastic neighbour embedding [19]. In this letter, we give special interest to the spectral approaches, more specifically to the so-named kernel PCA.
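A minimal sketch of such a data-driven graph, assuming Gaussian edge weights restricted to the k nearest neighbors (our illustrative choices, not prescribed by the letter):

```python
import numpy as np

def knn_affinity(X, k=2, sigma=1.0):
    """Symmetric affinity matrix of a k-NN data-driven graph.

    Edges get Gaussian weights exp(-||xi - xj||^2 / (2 sigma^2));
    non-neighbors get weight 0. Symmetrized so the graph is undirected.
    """
    N = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)          # exclude self-links
    W = np.zeros((N, N))
    for i in range(N):
        nn = np.argsort(d2[i])[:k]        # k nearest neighbors of point i
        W[i, nn] = np.exp(-d2[i, nn] / (2.0 * sigma ** 2))
    return np.maximum(W, W.T)             # make the graph undirected

X = np.random.RandomState(0).randn(10, 3)
W = knn_affinity(X, k=3)                  # (10, 10) non-negative, symmetric
```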

**Kernel PCA (KPCA):** Now, let us recall the high-dimensional space \({\varvec{\varPhi }}\) defined in Sect. 2, and let us also consider a lower-rank reconstructed matrix \(\widehat{{\varvec{\varPhi }}}\) and a dissimilarity function \(\delta (\cdot , \cdot )\). Matrix \(\widehat{{\varvec{\varPhi }}}\) is obtained from a lower-rank base whose full-rank version spans \({\varvec{\varPhi }}\), and minimizes \(\delta ({\varvec{\varPhi }},\widehat{{\varvec{\varPhi }}})\). To keep all the KPCA conditions, the following developments are done under the assumption that \({\varvec{\varPhi }}\) is centered. Let us define a *d*-dimensional base \({\textit{\textbf{W}}} \in \mathbb {R}^{D_h \times d}\), such that \({\textit{\textbf{W}}} = ({\textit{\textbf{w}}}^{(1)}, \ldots , {\textit{\textbf{w}}}^{(d)})\) and \({\textit{\textbf{W}}}^\top {\textit{\textbf{W}}} = {\textit{\textbf{I}}}_d\), where \({\textit{\textbf{w}}}^{(\ell )} \in \mathbb {R}^{D_h}\), \(\ell \in \{1, \ldots , d\}\), and \({\textit{\textbf{I}}}_d\) is the *d*-dimensional identity matrix. Since \(d < D_h\), we can say that the base \({\textit{\textbf{W}}}\) is lower-rank. Given this, the low-dimensional space can be calculated by means of a linear projection, as follows: \({\textit{\textbf{Y}}} = {\textit{\textbf{W}}}^\top {\varvec{\varPhi }}\). The optimal base \({\textit{\textbf{W}}}\) and the embedded space \({\textit{\textbf{Y}}}\) are given by the eigenvectors associated with the *d* largest eigenvalues of \({\varvec{\varPhi }}{\varvec{\varPhi }}^\top \) and \({\varvec{\varPhi }}^\top {\varvec{\varPhi }}\), respectively. This result is known as the optimal low-rank representation theorem, widely discussed and demonstrated in [15].

**Kernel Trick:** Furthermore, by following Mercer's condition, or the so-called kernel trick, we can introduce a kernel function \(k(\cdot , \cdot )\), which estimates the inner product \(\phi ({\textit{\textbf{x}}}_i)^\top \phi ({\textit{\textbf{x}}}_j) = k({\textit{\textbf{x}}}_i, {\textit{\textbf{x}}}_j)\). By gathering all the pairwise kernel values, we can write a kernel matrix \({\textit{\textbf{K}}} = [k_{ij}]\) as \({\textit{\textbf{K}}} = {\varvec{\varPhi }}^\top {\varvec{\varPhi }}\), with \(k_{ij} = k({\textit{\textbf{x}}}_i, {\textit{\textbf{x}}}_j)\).
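Putting the two previous paragraphs together, a didactic sketch of KPCA on a kernel matrix could look as follows (the Gaussian kernel and the explicit double-centering are our assumptions for illustration, not the letter's prescription):

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """K[i, j] = k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kpca(K, d=2):
    """Embed via the top-d eigenvectors of the centered kernel matrix."""
    N = K.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N   # centering matrix
    Kc = H @ K @ H                        # centers Phi implicitly
    vals, vecs = np.linalg.eigh(Kc)       # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:d]      # indices of the d largest
    # Scale eigenvectors by sqrt(eigenvalue) to get the embedding.
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

X = np.random.RandomState(1).randn(30, 5)
Y = kpca(rbf_kernel(X), d=2)              # (30, 2) embedding
```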

### 3.3 DataVis via Interactive DR

Quite intuitively, one can infer that the premise underlying the use of DR for DataVis purposes is to make the information of a high-dimensional dataset directly intelligible by displaying it in a representation of three or fewer dimensions.

Besides, the incorporation of interactivity into the DR technique itself or DR-based DataVis interfaces enables the users (even non-expert ones) to select a method or tune parameters thereof in an intuitive fashion.

## 4 Concept of Interaction Model (IM) for DR

### 4.1 Definition of IM

Herein, interaction is considered as the ability to readily incorporate the user's criterion into the stages of the data exploration process; in this case, DR is the stage of interest. Particularly, we refer to interactivity as the possibility of tuning parameters or selecting methods within an interactive interface. As traditionally done in previous works [8], the interaction consists of a mixture of functions or elements representing DR techniques. In the following, we formally define the concept of IM:

### Definition 1

**(Interaction model (IM)).** Under a certain topological space \(\mathcal {V}\), and given a set of *M* functions or elements \(f_m \in \mathcal {V}\) representing *M* different dimensionality reduction techniques, \({\textit{\textbf{f}}} = \{f_1, \ldots , f_M\}\), together with a set of weighting factors \({\varvec{\alpha }} = \{\alpha _1, \ldots , \alpha _M\}\), an IM is defined as any possible mixture \(\tilde{f} \in \mathcal {V}\) of such a set.

### 4.2 Linear IM

As a particular case of the Definition 1, we can define the linear IM as follows:

### Definition 2

**(Linear IM (LIM)).** Under a certain vector space \(\mathbb {E}\), and given a set of *M* functions or elements \(f_m \in \mathbb {E}\) representing *M* different dimensionality reduction techniques, \({\textit{\textbf{f}}} = \{f_1, \ldots , f_M\}\), together with a set of weighting factors \({\varvec{\alpha }} = \{\alpha _1, \ldots , \alpha _M\}\), a LIM is defined as the weighted sum \(\tilde{f} \in \mathbb {E}\) in the form: \(\tilde{f} = \sum _{m=1}^{M} \alpha _m f_m\).
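When the elements \(f_m\) are kernel matrices (as in Sect. 4.3), a LIM reduces to a weighted sum of matrices. A minimal sketch (`linear_im` is a hypothetical helper name of ours):

```python
import numpy as np

def linear_im(kernels, alphas):
    """Linear interaction model over kernel matrices:
    K_tilde = sum_m alphas[m] * kernels[m]."""
    assert len(kernels) == len(alphas)
    return sum(a * K for a, K in zip(alphas, kernels))

# Two toy "DR kernels" mixed with user-chosen weights.
K1 = np.eye(3)
K2 = np.ones((3, 3))
K = linear_im([K1, K2], [0.25, 0.75])  # K[0, 0] == 1.0, K[0, 1] == 0.75
```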

### 4.3 DR-techniques Representation

As explained in [20], spectral DR methods can be represented as kernel matrices. Also, it is demonstrated in [15] that, when incorporated into a KPCA algorithm, such kernel matrices reach the same low-dimensional spaces as those obtained by the original DR methods. Let us consider the following kernel representations for three well-known spectral DR approaches:

**Classical Multidimensional Scaling (CMDS):** The CMDS kernel can be expressed as the double-centered, squared Euclidean distance matrix \({\textit{\textbf{D}}} \in \mathbb {R}^{N \times N}\), so that \({\textit{\textbf{K}}}_{\text {CMDS}} = -\frac{1}{2} \left( {\textit{\textbf{I}}}_N - \frac{1}{N}{\textit{\textbf{1}}}_N{\textit{\textbf{1}}}_N^\top \right) {\textit{\textbf{D}}} \left( {\textit{\textbf{I}}}_N - \frac{1}{N}{\textit{\textbf{1}}}_N{\textit{\textbf{1}}}_N^\top \right) \), where the *ij* entry of \({\textit{\textbf{D}}}\) is given by \(d_{ij} = ||{\textit{\textbf{x}}}_i - {\textit{\textbf{x}}}_j||_2^2\).
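As a quick sanity check of this expression, the sketch below builds the CMDS kernel by double centering and verifies that, for Euclidean distances, it coincides with the Gram matrix of the centered data (an illustration of ours, not code from the paper):

```python
import numpy as np

def cmds_kernel(X):
    """K = -(1/2) H D H, with D the squared Euclidean distance matrix
    and H = I - (1/N) 1 1^T the centering matrix."""
    N = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    H = np.eye(N) - np.ones((N, N)) / N
    return -0.5 * H @ D @ H

X = np.random.RandomState(2).randn(8, 3)
Xc = X - X.mean(axis=0)
# For Euclidean distances, the CMDS kernel equals the centered Gram matrix.
assert np.allclose(cmds_kernel(X), Xc @ Xc.T)
```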

**Laplacian Eigenmaps (LE):** Since KPCA is a maximization of the high-dimensional covariance represented by a kernel, LE can be represented as the pseudo-inverse of the graph Laplacian \({\textit{\textbf{L}}}\): \({\textit{\textbf{K}}}_{\text {LE}} = {\textit{\textbf{L}}}^{\dagger }\).
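A sketch of this kernel representation, assuming the unnormalized Laplacian built from a given affinity matrix (our illustrative choice):

```python
import numpy as np

def le_kernel(W):
    """LE kernel: Moore-Penrose pseudo-inverse of the graph Laplacian
    L = S - W, with S the diagonal degree matrix of the affinities W."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.pinv(L)

# Toy undirected graph: a triangle with unit edge weights.
W = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
K = le_kernel(W)  # symmetric, since L is symmetric
```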

**Locally Linear Embedding (LLE):** A kernel for LLE can be approximated from a quadratic form in terms of the matrix \({\varvec{\mathcal {W}}}\) holding the linear coefficients that sum to 1 and optimally reconstruct the original data matrix \({\textit{\textbf{X}}}\). Define a matrix \({\textit{\textbf{M}}} \in \mathbb {R}^{N \times N}\) as \({\textit{\textbf{M}}} = ({\textit{\textbf{I}}}_N - {\varvec{\mathcal {W}}})({\textit{\textbf{I}}}_N - {\varvec{\mathcal {W}}}^\top )\), and let \(\lambda _{max}\) be the largest eigenvalue of \({\textit{\textbf{M}}}\). The kernel matrix for LLE is in the form \({\textit{\textbf{K}}}_{\text {LLE}} = \lambda _{max} {\textit{\textbf{I}}}_N - {\textit{\textbf{M}}}\).
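The following sketch computes this kernel from a toy reconstruction-weight matrix (the weights are made up for illustration; in practice they come from the LLE fitting step):

```python
import numpy as np

def lle_kernel(W):
    """LLE kernel: K = lambda_max * I - M, with M = (I - W)(I - W^T)."""
    N = W.shape[0]
    M = (np.eye(N) - W) @ (np.eye(N) - W.T)   # symmetric positive semidefinite
    lam_max = np.linalg.eigvalsh(M).max()     # largest eigenvalue of M
    return lam_max * np.eye(N) - M

# Toy reconstruction weights: each row sums to 1 (each point reconstructed
# from its neighbors).
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
K = lle_kernel(W)  # symmetric and positive semidefinite by construction
```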

### 4.4 Use of IM in Interactive DR

Once the spectral DR techniques of interest are represented as kernel matrices \({\textit{\textbf{K}}}^{(1)}, \ldots , {\textit{\textbf{K}}}^{(M)}\), a LIM can be applied to obtain a mixture kernel \(\widetilde{{\textit{\textbf{K}}}} = \sum _{m=1}^{M} \alpha _m {\textit{\textbf{K}}}^{(m)}\), where the weighting factors can be readily tuned by the user through an interactive interface^{1}, as those reviewed in [8]. Finally, the low-dimensional space \({\textit{\textbf{Y}}}\) is calculated by applying KPCA on \(\widetilde{{\textit{\textbf{K}}}}\).
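The whole pipeline, kernels, linear mixture and KPCA, can be sketched end to end as follows (all concrete choices, data, affinities and weighting factors, are illustrative assumptions of ours):

```python
import numpy as np

rng = np.random.RandomState(3)
X = rng.randn(40, 6)                           # toy high-dimensional data
N = X.shape[0]
H = np.eye(N) - np.ones((N, N)) / N            # centering matrix

# Pairwise squared Euclidean distances.
sq = np.sum(X ** 2, axis=1)
D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)

# Kernel representations of two spectral DR methods.
K_cmds = -0.5 * H @ D @ H                      # CMDS kernel
W = np.exp(-D); np.fill_diagonal(W, 0.0)       # Gaussian affinities
K_le = np.linalg.pinv(np.diag(W.sum(1)) - W)   # LE kernel: pinv of Laplacian

# Linear IM: weighting factors as a user might set them, e.g. via sliders.
alphas = [0.6, 0.4]
K_mix = alphas[0] * K_cmds + alphas[1] * K_le

# KPCA on the mixture kernel: top-d eigenvectors of the centered kernel.
Kc = H @ K_mix @ H
vals, vecs = np.linalg.eigh(Kc)
idx = np.argsort(vals)[::-1][:2]
Y = vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))   # (40, 2) view
```

Changing `alphas` and recomputing `Y` is exactly the kind of loop an interactive interface would drive.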

## 5 Final Remarks

In this work, we have formally defined the concept of the so-named interaction model (IM). Such a definition opens the possibility of more formally developing new interactive data visualization approaches based on mixtures of dimensionality reduction techniques.

In future works, we will explore and/or develop novel kernel representations arising from other dimensionality reduction methods as well as IM approaches enabling users to readily incorporate their knowledge and expertise into data exploration and visualization.

## Footnotes

- 1.
Some IM-based interfaces are available at https://sdas-group.com/gallery/.

## Notes

### Acknowledgment

The authors acknowledge the research project “Desarrollo de una metodología de visualización interactiva y eficaz de información en Big Data”, supported by Agreement No. 180 (November 1st, 2016) of the VIPRI from Universidad de Nariño.

As well, authors thank the valuable support given by the SDAS Research Group (www.sdas-group.com).

## References

- 1. Gou, J., Yang, Y., Yi, Z., Lv, J., Mao, Q., Zhan, Y.: Discriminative globality and locality preserving graph embedding for dimensionality reduction. Expert Syst. Appl. **144**, 113079 (2020)
- 2. Lee, J.A., Peluffo-Ordóñez, D.H., Verleysen, M.: Multi-scale similarities in stochastic neighbour embedding: reducing dimensionality while preserving both local and global structure. Neurocomputing **169**, 246–261 (2015)
- 3. Ward, M.O., Grinstein, G., Keim, D.: Interactive Data Visualization: Foundations, Techniques, and Applications. CRC Press (2010)
- 4. Peluffo-Ordóñez, D.H., Alvarado-Pérez, J.C., Lee, J.A., Verleysen, M., et al.: Geometrical homotopy for data visualization. In: European Symposium on Artificial Neural Networks (ESANN 2015), Computational Intelligence and Machine Learning (2015)
- 5. Salazar-Castro, J., Rosas-Narváez, Y., Pantoja, A., Alvarado-Pérez, J.C., Peluffo-Ordóñez, D.H.: Interactive interface for efficient data visualization via a geometric approach. In: 2015 20th Symposium on Signal Processing, Images and Computer Vision (STSIVA), pp. 1–6. IEEE (2015)
- 6. Rosero-Montalvo, P., et al.: Interactive data visualization using dimensionality reduction and similarity-based representations. In: Beltrán-Castañón, C., Nyström, I., Famili, F. (eds.) CIARP 2016. LNCS, vol. 10125, pp. 334–342. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-52277-7_41
- 7. Rosero-Montalvo, P.D., Peña-Unigarro, D.F., Peluffo, D.H., Castro-Silva, J.A., Umaquinga, A., Rosero-Rosero, E.A.: Data visualization using interactive dimensionality reduction and improved color-based interaction model. In: Ferrández Vicente, J.M., Álvarez-Sánchez, J.R., de la Paz López, F., Toledo Moreo, J., Adeli, H. (eds.) IWINAC 2017. LNCS, vol. 10338, pp. 289–298. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59773-7_30
- 8. Umaquinga-Criollo, A.C., Peluffo-Ordóñez, D.H., Rosero-Montalvo, P.D., Godoy-Trujillo, P.E., Benítez-Pereira, H.: Interactive visualization interfaces for big data analysis using combination of dimensionality reduction methods: a brief review. In: Basantes-Andrade, A., Naranjo-Toro, M., Zambrano Vizuete, M., Botto-Tobar, M. (eds.) TSIE 2019. AISC, vol. 1110, pp. 193–203. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-37221-7_17
- 9. Amin, A., et al.: Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods. Int. J. Inf. Manag. **46**, 304–319 (2019)
- 10. Peluffo, D., Lee, J., Verleysen, M., Rodríguez-Sotelo, J., Castellanos-Domínguez, G.: Unsupervised relevance analysis for feature extraction and selection: a distance-based approach for feature relevance. In: International Conference on Pattern Recognition, Applications and Methods (ICPRAM) (2014)
- 11. Cao, H., Bernard, S., Heutte, L., Sabourin, R.: Dissimilarity-based representation for radiomics applications. CoRR abs/1803.04460 (2018)
- 12. Zhong, G., Wang, L.N., Ling, X., Dong, J.: An overview on data representation learning: from traditional feature learning to recent deep learning. J. Finance Data Sci. **2**(4), 265–278 (2016)
- 13. Lee, J.A., Verleysen, M.: Nonlinear Dimensionality Reduction. Springer, Heidelberg (2007). https://doi.org/10.1007/978-0-387-39351-3
- 14. Borg, I., Groenen, P.J.: Modern Multidimensional Scaling: Theory and Applications. Springer, Heidelberg (2005). https://doi.org/10.1007/0-387-28981-X
- 15. Peluffo-Ordóñez, D.H., Lee, J.A., Verleysen, M.: Generalized kernel framework for unsupervised spectral methods of dimensionality reduction. In: 2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), pp. 171–177. IEEE (2014)
- 16. Peluffo-Ordóñez, D.H., Lee, J.A., Verleysen, M.: Short review of dimensionality reduction methods based on stochastic neighbour embedding. In: Villmann, T., Schleif, F.-M., Kaden, M., Lange, M. (eds.) Advances in Self-Organizing Maps and Learning Vector Quantization. AISC, vol. 295, pp. 65–74. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07695-9_6
- 17. Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. **15**(6), 1373–1396 (2003)
- 18. Zhang, Z., Wang, J.: MLLE: modified locally linear embedding using multiple weights. In: Advances in Neural Information Processing Systems, pp. 1593–1600 (2007)
- 19. Hinton, G.E., Roweis, S.T.: Stochastic neighbor embedding. In: Advances in Neural Information Processing Systems, pp. 857–864 (2003)
- 20. Ham, J., Lee, D.D., Mika, S., Schölkopf, B.: A kernel view of the dimensionality reduction of manifolds. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 47. ACM (2004)