Abstract
This paper proposes a novel variable learning rate for the conventional Self-Organizing Map (SOM), termed VLRSOM, to address two of its main challenges: achieving high accuracy with fast convergence, and keeping the topological error low. We show empirically that the proposed method converges faster. It is also more robust in topology preservation, as it maintains an optimal topology until the maximum number of iterations is reached. Since the learning rate adaptation and the misadjustment parameter depend on the calculated error, VLRSOM avoids undesired results by exploiting the error response during the weight update. The weights are randomly initialized at the beginning of the training process, after which the learning rate is updated adaptively. Experimental results show that the method eliminates the tradeoff between the rate of convergence and accuracy while maintaining the data's topological relationship. Extensive experiments were conducted on different types of datasets to evaluate the performance of the proposed method. First, we experimented with synthetic data and handwritten digits. For each dataset, two experiments with different numbers of iterations (200 and 500) were performed to test the stability of the network. The proposed method was further evaluated using four benchmark datasets: Balance, Wisconsin Breast, Dermatology, and Ionosphere. In addition, a comprehensive comparative analysis was performed between the proposed method and three other SOM techniques, namely conventional SOM, the parameterless self-organizing map (PLSOM2), and RASOM, in terms of accuracy, quantization error (QE), and topology error (TE). The results indicate that the proposed approach produced superior results to the other three methods.
Introduction
Kohonen's Self-Organizing Map (SOM) [1, 2] is an artificial neural network that maps high-dimensional inputs to a lower-dimensional lattice of artificial neurons [3]. Learning is usually unsupervised, as no input labels are required to train the model. SOMs have proven effective for solving various problems, including but not limited to clustering [4, 5], dimensionality reduction [6, 7], anomaly detection [8, 9], feature selection [10], speaker recognition [11], non-stationary real-world agents [12], and remote sensing [13]. In addition, SOM has the potential to preserve the topological relationship of the input space, which is essential for producing consistent results.
The SOM architecture consists of artificial neurons arranged in layers to map the input to the desired output. The neurons are connected via weight vectors. These weight vectors keep updating during the training process to learn the patterns in the input data. Obtaining the optimal values for the weight vectors that result in high accuracy is the main objective of the learner.
Generally, the SOM algorithm has two main stages: competition and adaptation. In the competition stage, the algorithm selects a winner neuron from the set of neurons. In the adaptation stage, the weights are updated for the winner neuron and the other nodes in its vicinity. The conventional SOM algorithm has two main challenges that hinder its performance: weight initialization and topology preservation [14]. The conventional SOM uses fixed weights for initialization, and a predefined topology is selected. Using fixed weights may affect both the algorithm's accuracy and robustness. In addition, its performance is adversely affected in the case of non-stationary datasets [7]. Topology preservation is crucial for producing consistent results. The literature shows that preserving topology is difficult with conventional SOM, as high topology error has been reported on various datasets [7, 15, 16, 17].
Over the last few decades, researchers have proposed various versions of the conventional SOM algorithm to exploit its potential fully. For instance, Growing SOM (GSOM) [18] is a hierarchical clustering approach that improves the conventional SOM's accuracy. In GSOM, the number of neurons is gradually increased to produce the final maps, and a spreading factor controls the size of the maps being generated. Similarly, the authors in [19] propose an asymmetric neighborhood function, adopted into the GSOM algorithm, to further reduce the topology error and improve the accuracy. Grow-When-Required (GWR) is another variant of SOM [20] that learns the prototypical representation of the interaction between human and object in an unsupervised manner. The GWRSOM showed superior performance for clustering human motion patterns. A common limiting factor for achieving faster convergence in conventional SOM is its sequential execution of tasks. To achieve high-speed processing, a fully parallel architecture of SOM is proposed in [21]. The experimental results showed that it achieved 8.91 times faster processing speed than the sequential SOM algorithm. The authors in [14] proposed a semi-supervised learning technique called the semi-supervised Growing Self-Organizing Map (SSGSOM). Like other variants, SSGSOM also resulted in higher accuracy. In addition, it is faster than conventional SOM, which is exploited to quickly visualize higher-dimensional data on a 2D feature map. An improved version of SOM for clustering time series data is proposed by Jayanth et al. [22]. In particular, the prototype vectors are initialized using farthest neighbors, in contrast to the random initialization in SOM. Moreover, dynamic time warping is employed as a metric for measuring the similarity between signals. Together, these two changes produced high-quality clusters for time series data.
The results indicated that the proposed SOM not only performed better than Agglomerative Clustering but is also more scalable in the processing time needed to compute clusters. Some research studies propose a combination of machine learning techniques, which leads to better clustering and interpretability. For instance, Mateusz et al. [23] first performed feature selection using PCA and then applied a combined model of a self-organizing map (SOM) neural network and K-means clustering, which resulted in higher classification accuracy. A novel unsupervised cross-modal retrieval framework based on associative learning has been proposed in [24], where two traditional SOMs are trained separately for images and collateral text and are then associated using a Hebbian learning network to facilitate the cross-modal retrieval process. In [25], the authors proposed a novel SOM approach termed unsupervised borderline SOM (UBSOM) to solve two main challenges: class imbalance and high dimensionality. Generally, industrial processes produce datasets that are both high-dimensional and imbalanced. UBSOM uses a small number of nodes to represent the normal samples and also helps highlight the borderline areas. Highly accurate results were obtained with the proposed method on datasets for fault detection in an industrial process. In another study [26], an ensemble method based on SOM and a support vector machine is proposed for survival risk prediction of cancer patients.
In [27], the authors proposed the Growing Hierarchical Self-Organizing Map (GHSOM) algorithm, a dynamic variant of the original SOM. GHSOM has two main issues: a slow learning rate and an inability to process categorical data. To solve these two issues, SparkGHSOM is proposed in [28]. SparkGHSOM integrates the Spark platform to enable the processing of large amounts of data. It also introduces a new cost function that can handle both numerical and categorical data. In [29], least squares support vector machines (LS-SVMs) with multiple kernels are used to reduce data redundancy and improve the conventional SOM's performance. In [30], the half-quadratic (HQ) approach is adopted for parameter selection in a semi-supervised growing self-organizing map (SSGSOM). Another work [31] uses an adaptable variable learning rate to obtain optimal weight vectors for the winner neurons. The authors in [20] proposed self-organizing map-based oversampling (SOMO) to deal with class imbalance issues. SOM is first used to transform the input data to a lower-dimensional space, and then within-cluster and between-cluster synthetic data are derived. The proposed method was tested on synthetic data, where it showed promising results. Zhang et al. [32] proposed a Biomimetic SLAM Algorithm Based on a Growing Self-Organizing Map (GSOMBSLAM), which uses self-motion-aware information to obtain the activation response, to overcome uncertainty in location identification. In [33], a method based on a combination of case-based reasoning, semiotic concepts, and self-organizing maps is used to add automatic interpretations for better decision making. Moreover, a novel data-driven sign deconstruction mechanism is introduced to the problem domain.
Preserving the topological order is vital for obtaining consistent results for both the feature maps and the clusters. Since optimization algorithms require running the experiment several times over a number of iterations, the algorithm must assign the data to the closest neuron [6]. Moreover, the presence of outliers in the data may result in suboptimal performance of any machine-learning algorithm. Some SOM variants have also focused on this issue. For instance, a smoothed SOM (SSOM) is proposed in [16] that can deal with outliers without affecting the model's performance. Specifically, a new learning rule is introduced, which helps smooth out the representation of outlying vectors. Similarly, in [34], the authors exploit the Neighbor Entropy Local Outlier Factor (NELOF) to identify and remove outliers. The initial clustering is done with SOM, followed by a refinement step that uses the entropy of the K-relative neighborhood to redefine the local outlier factor (LOF). The LOF is then used to identify the outliers and remove them in subsequent iterations.
Recently, deep-learning-based methods have been widely used for pattern recognition problems. Although deeper networks have achieved promising performance on large datasets, they need a large amount of labeled data, and training them is difficult due to their highly complex structures. The literature shows that SOM techniques have also been used within deep-learning-based frameworks for various problems. For instance, deep convolutional self-organizing maps (DCSOM) [35] is an unsupervised technique for visual feature learning that learns invariant image representations from unlabeled data. It consists of a cascade of convolutional SOM layers that extract features at multiple levels. Similarly, an extended version of deep SOM (DSOM) is proposed in [36]. It modifies the learning algorithm, introduces unsupervised learning into the model, and changes the architecture to learn features at different resolutions. The results indicated that classification performance was significantly improved. In [37], the authors proposed a denoising autoencoder self-organizing map (DASOM) that integrates denoising autoencoders into a hierarchically organized hybrid model. This arrangement helps learn the model parameters in an unsupervised fashion while maintaining the clustering properties.
Deep learning techniques have demonstrated their ability to solve various supervised learning problems. Recently, deep neural networks have also been combined with representation learning for data clustering tasks, and specific regularization techniques have been introduced to learn data representations that improve clustering. Florent et al. [38] proposed a deep embedded self-organizing map (DESOM) that combines representation learning and clustering as a joint task. The model consists of an autoencoder and a SOM layer, which are trained jointly to learn SOM-friendly representations. Experimental results on various benchmark datasets showed that DESOM improved the quality of quantization and topology in the latent space.
This paper presents a novel variation of the conventional SOM with a variable learning rate parameter, called VLRSOM. The method can obtain optimal weights that produce higher accuracy and reduce the topological error. Experiments are performed to evaluate the effect of the variable learning rate on the accuracy and topological error of the network. The results indicate that the proposed VLRSOM achieves high accuracy, low quantization error (QE), and low topology error (TE) compared to conventional SOM and some of its popular variants.
The rest of the paper is organized as follows. “Motivation and contributions” summarizes the motivation and contributions of this paper. The related work and an overview of the relevant SOM variants are presented in “Related work”. “Proposed method” describes the proposed method. The experimental results are presented in “Experimental results”. Finally, the paper is completed with concluding remarks in “Conclusion”.
Motivation and contributions
As mentioned earlier, SOM is an unsupervised learning approach that can maintain the topological relationship among the input data. Since SOM can extract the latent representation of the input space, it is highly useful for applications such as data clustering and visualization. However, SOM suffers from some basic limitations and may produce undesirable results. In this section, we highlight some issues and motivate the need to improve the accuracy of the SOM algorithm further. Some main issues and desirable properties of the SOM are presented as follows:
The choice of learning rate is crucial for training the SOM model. One issue with conventional SOM is that it uses a constant learning rate, which may lead to convergence issues. A small learning rate produces a low error, but convergence is slow. In contrast, a large learning rate converges faster but may result in a high error. Several versions of SOM were introduced to handle this issue [16, 25, 30, 31, 39, 40, 41, 42]. However, their dependence on weight initialization and the presence of outliers in the input space can adversely affect their performance.
The SOM algorithm also expects a balanced class distribution. However, in some situations obtaining balanced data can be difficult. Applying conventional SOM to such class-imbalanced data may produce undesirable results. Although various methods have been proposed to tackle this issue by generating artificial data to achieve a balanced distribution, the generated data may contain noise, which makes precise prediction challenging [40]. The constant learning rate may not adapt well to imbalanced data and may fail to converge due to the high error. Generally, a lower learning rate is expected against a higher error value and vice versa.
SOM suffers from issues such as oversensitivity to outliers and high dependence on the weight initialization. In addition, the computational cost of the proposed SOM variations is high. This motivates new methods that overcome these issues with less computational overhead.
We propose an algorithm that can deal with the above-mentioned problems. The main features of the proposed work are summarized below:

We introduce a novel variable learning rate to improve both the accuracy and the convergence behavior of the algorithm.

The proposed algorithm is more robust in terms of TE as it produced the optimal topology and maintained it until the end of the iterations. It can reach the steady error state faster than other variants of SOM in the training process.

The presence of outliers or class imbalance does not affect its performance.

It can deal with multiclass clustering problems with high accuracy.

We have performed detailed experiments on various benchmark and synthetic datasets.
Related work
This section summarizes the well-known algorithms most relevant to this work; the proposed VLRSOM algorithm is described in detail in the following section.
Improved parameterless self-organizing map (PLSOM2) algorithm
The original PLSOM algorithm [42] proposed an effective solution to the main issues encountered in conventional SOM when dealing with some specific types of mapping tasks. However, PLSOM is oversensitive to outliers and highly dependent on weight initialization. PLSOM2 [43] was introduced as an improvement of PLSOM to address these problems. PLSOM2 is robust against outliers, resulting in improved accuracy compared to the PLSOM algorithm. Moreover, PLSOM2 is not computationally expensive and does not require prior knowledge of the input data.
PLSOM2 overcomes the problem in PLSOM [42] by using the range of the inputs to scale the weight update during training, whereas in PLSOM the weight update is scaled by the size of the error relative to the maximum error [43]. In PLSOM2, the scaling variable of PLSOM is calculated as follows [42]:
where \(\epsilon \left( t \right)\) is the scaling variable which is considered as a normalized Euclidean distance between input vector \(x\left( t \right)\) at time \(t\) and the closest weight vector \(w_{{\text{c}}} \left( t \right)\) given as
and \(S\left( t \right)\) is calculated as
A large value of \(\epsilon\) indicates that the output map has fit the input space poorly, while a small value of \(\epsilon\) indicates that the map fitting is acceptable. A large readjustment is required for large \(\epsilon\), which may require more iterations, while no adjustment is needed for a small value of \(\epsilon\) at time t. The PLSOM algorithm updates the weight vectors of the winner neurons as follows:
where \(h_{{{\text{c}},i}}\) is the Gaussian neighborhood function given as
\(\Theta \left( {\epsilon \left( t \right)} \right)\) is used as a scaling factor.
Moreover, another way to calculate \(\Theta \left( {\epsilon \left( t \right)} \right)\) is provided in (10) and (11), where \(\beta = {\text{constant}}\,\forall t\), and \(\theta_{\min }\) is constant.
where \({\text{ln}}\) (.) is the natural logarithm and \(e\) is the Euler number.
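The role of the scaling variable can be illustrated with a minimal Python sketch. It assumes the forms commonly given for PLSOM-style maps: \(\epsilon\) is the BMU distance divided by a normalizer (`scale`, standing in for \(S(t)\)), \(\Theta\) grows from \(\theta_{\min}\) to \(\beta\) either linearly or logarithmically in \(\epsilon\), and the neighborhood is a Gaussian of the lattice distance with width \(\Theta\). These forms, the clamping of \(\epsilon\) to 1, and all function names are assumptions for illustration and may differ from the exact expressions used in [42, 43].

```python
import math


def fit_error(x, w_c, scale):
    """Normalized distance between input x and its closest weight vector w_c.

    `scale` is a stand-in for the normalizer S(t); how it is estimated is
    not specified here (assumption).
    """
    if scale <= 0:
        return 0.0
    # Clamp to 1 so the scaling variable stays normalized (assumption).
    return min(math.dist(x, w_c) / scale, 1.0)


def theta_linear(eps, beta, theta_min):
    """Neighborhood size growing linearly from theta_min (eps=0) to beta (eps=1)."""
    return (beta - theta_min) * eps + theta_min


def theta_log(eps, beta, theta_min):
    """Logarithmic alternative with the same end points as theta_linear."""
    return (beta - theta_min) * math.log(1.0 + eps * (math.e - 1.0)) + theta_min


def neighborhood(grid_dist, theta):
    """Gaussian neighborhood value for a neuron at lattice distance grid_dist."""
    return math.exp(-(grid_dist ** 2) / (theta ** 2))
```

With these definitions, a poorly fitted input (large \(\epsilon\)) widens the neighborhood and scales up the update, while a well-fitted input (small \(\epsilon\)) leaves the map almost unchanged.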
RASOM algorithm
It has been shown that selecting a higher learning rate may result in faster convergence but leads to higher topology error. In contrast, a small learning rate may produce more accurate results, yet it requires a higher number of iterations to obtain a lower QE, which may be impractical with large amounts of data. This problem is alleviated by an adaptive technique termed robust adaptive SOM (RASOM) [44]. The algorithm starts with a relatively large learning rate and then gradually reduces it over the iterations. This technique resulted in lower QE and faster convergence. The adaptive learning rate can be updated according to the following equation [44]:
where \({w}_{i}(t+1)\) is the new weight updated in the next iteration while \(\alpha (t)\) represents the adaptive learning rate which is defined as
Equation (9) can be updated using Eq. (10) as
It adopts a similar approach for weight initialization as in the conventional SOM, where weights are first randomly initialized. However, unlike conventional SOM, RASOM is able to control the weight update via \({\beta }^{t}\). Initially, a larger value of \(\beta \) is selected and \(t\) is small, which makes the term \(1 - {\beta }^{t}\) small and \(\alpha \left(t\right)\) large. In subsequent iterations, as \(t\) increases, \(\alpha \left(t\right)\) gradually decreases.
Proposed method
This section provides the mathematical details of the SOM algorithm, followed by the proposed version, VLRSOM. The original SOM consists of a group of neurons that gradually adjust to the input data points, generating a set of ordered neurons that maintains the topology of the mapped data. A similarity measure such as the Euclidean distance is defined to guide the adaptation of these neurons. The weights of the winner neurons are then updated in each iteration during training.
In Kohonen's SOM algorithm [1], features from a high \(n\)-dimensional input space \(x=\{ {x}_{1},{x}_{2},\dots ,{x}_{n}\}\) are mapped to a lower-dimensional output space using the connection weights \({w}_{i}=\{{w}_{n1}, {w}_{n2}, \dots , {w}_{nm}\}\). It uses the Euclidean distance with the winner-takes-all rule, which is given in the following equation [1, 2]:
where \(c\) is the best-matching unit (BMU) neuron on the output map and \(i=1, 2,\dots , k\le n\times m\). This indicates that a high-dimensional input corresponds to the most suitable unit \(i\) at position \(c\). For all inputs and randomly initialized weights, a competitive learning rule is applied such that input data with similar features retain a similar topological output map [1, 2]:
where
where \({w}_{i}\left(t+1\right)\) represents the weight of the \(i\)th neuron at iteration \((t+1)\), \(x\left(t\right)\) is the training input taken at time \(t\), \(\mu \) is the learning rate, \({h}_{c,i}\) is the Gaussian neighborhood function, \(\left\| {r_{c} - r_{i} } \right\|\) is the Euclidean distance between the winning neuron \(c\) and the \(i\)th neuron in the grid, and \(\sigma (t)\) is the neighborhood size, which is set to a constant value. During training, the value of \({h}_{c,i}(t)\) decreases according to the annealing scheme used in the algorithm.
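The competition and adaptation stages described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the neighborhood takes the common Gaussian form \(\exp(-\| r_c - r_i \|^2 / (2\sigma^2))\), the update is \(w_i \leftarrow w_i + \mu\, h_{c,i}\,(x - w_i)\), and the names `best_matching_unit`, `gaussian_neighborhood`, and `som_step` are hypothetical.

```python
import math


def best_matching_unit(x, weights):
    """Index of the neuron whose weight vector is closest to x (winner-takes-all)."""
    return min(range(len(weights)), key=lambda i: math.dist(x, weights[i]))


def gaussian_neighborhood(r_c, r_i, sigma):
    """Gaussian of the lattice distance between the winner position r_c and r_i."""
    return math.exp(-math.dist(r_c, r_i) ** 2 / (2.0 * sigma ** 2))


def som_step(x, weights, positions, mu, sigma):
    """Apply one update w_i <- w_i + mu * h_ci * (x - w_i) in place.

    `weights` is a list of weight vectors, `positions` the matching list of
    lattice coordinates. Returns the index of the BMU.
    """
    c = best_matching_unit(x, weights)  # competition stage
    for i, w in enumerate(weights):     # adaptation stage
        h = gaussian_neighborhood(positions[c], positions[i], sigma)
        weights[i] = [w_k + mu * h * (x_k - w_k) for w_k, x_k in zip(w, x)]
    return c
```

The winner receives the full step (its neighborhood value is 1), and the step shrinks for neurons farther away on the lattice, which is what ties the ordering of the map to the ordering of the inputs.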
The weights are then updated during the training as given by the following equation
where \(\alpha \left(t\right)\) is the learning rate. In GFSOM, a Gaussian function is adopted instead of the learning rate. Equation (15) can then be updated using the Gaussian function as
where \({h}_{c,i}(t)\) represents the Gaussian function, defined as follows
In the above equation, \(\left\| {r_{c} - r_{i} } \right\|\) represents the Euclidean distance between the \(i\)th neuron and the selected winner neuron. Moreover, in the conventional SOM, the error function \(J\left(t\right)\) is defined as
where \({w}_{c}\) is the BMU of \({x}_{i}, i = 1, 2, \dots , n\).
The weight update rule of the SOM corresponds to a gradient descent step in minimizing the above error function.
where \(\mu >0\) is the fixed learning rate, a preset small parameter that controls the convergence and steady-state behavior of the SOM algorithm. Similar to [45], we introduce a variable learning rate into the SOM to further enhance the accuracy and decrease both TE and QE. In VLRSOM, the learning rate is adapted according to the following equation:
In this equation, the learning rate is updated under two conditions, \(0<\alpha <1\) and \(\gamma >0\). \(\mu '\left(t+1\right)\) is bounded by \([\mu_{\min }, \mu_{\max }]\). We then introduce the following condition for \(\mu \left( t+1 \right)\).
As suggested in [45], a good choice for the initial learning rate could be \({\mu }_{\mathrm{max}}\). To provide maximum convergence speed, \({\mu }_{\mathrm{max}}\) is normally selected near the point of instability of the conventional SOM algorithm. The value of \({\mu }_{\mathrm{min}}\) is selected as a tradeoff between the desired steady-state misadjustment and the algorithm's required tracking capabilities (convergence behavior). From Eq. (20), it is clear that the learning rate is always positive and is controlled by \(\alpha \), \(\gamma \), and the prediction error \(J\left(t\right)\). The parameter \(\gamma \) controls both the convergence rate and the level of steady-state misadjustment of the algorithm. This technique has shown improved performance compared to the fixed-learning-rate SOM. When training starts, the randomly initialized weights produce a high prediction error, so a large learning rate is selected. As training continues, the prediction error gradually decreases and the learning rate is reduced with it, which yields a smaller misadjustment near the optimum. Hence, the steady-state misadjustment is reduced. The value of \({\mu }_{\mathrm{max}}\) should be chosen to guarantee a bounded error [45], as given below
where \(R\) is the expected value of the autocorrelation matrix of the input vector, and \(\mathrm{tr}\) denotes the trace of the matrix \(R\).
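The adaptation described in this section can be sketched in Python as follows. The sketch assumes that the update has the form familiar from variable step-size LMS filters, \(\mu '(t+1) = \alpha \,\mu (t) + \gamma \,J(t)^{2}\), followed by clipping to \([\mu_{\min }, \mu_{\max }]\). This form, the function names, and any parameter values are assumptions for illustration rather than the exact VLRSOM rule.

```python
def next_learning_rate(mu, error, alpha, gamma, mu_min, mu_max):
    """Return mu(t+1) from mu(t) and the current prediction error J(t)."""
    if not (0.0 < alpha < 1.0) or gamma <= 0.0:
        raise ValueError("expected 0 < alpha < 1 and gamma > 0")
    # Candidate rate mu'(t+1): decayed previous rate plus an error-driven term
    # (assumed form, see the lead-in).
    candidate = alpha * mu + gamma * error ** 2
    # Keep the rate inside [mu_min, mu_max].
    return max(mu_min, min(mu_max, candidate))


def learning_rate_trace(errors, mu0, alpha, gamma, mu_min, mu_max):
    """Learning rates produced by feeding a sequence of errors through the rule."""
    trace = []
    mu = mu0
    for err in errors:
        mu = next_learning_rate(mu, err, alpha, gamma, mu_min, mu_max)
        trace.append(mu)
    return trace
```

Under this assumed form, a large error early in training pushes the rate toward \(\mu_{\max }\), and as the error shrinks the rate decays geometrically toward \(\mu_{\min }\), which matches the qualitative behavior described above.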
Experimental results
This section describes in detail the results obtained for the proposed method. A comparative analysis is also performed with the conventional SOM and two of its well-known variants, PLSOM2 and RASOM. We evaluated the performance of the proposed method in terms of QE, TE, accuracy, and convergence time.
Two different datasets were used to test the efficiency and robustness of the proposed method: synthetically generated data and the MNIST handwritten characters dataset [46]. For each dataset, two separate experiments were performed with a different number of iterations (200 and 500) to test the ability of the algorithms to reach the steady error state. Technical details are presented in the following subsections.
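For reference, QE and TE can be computed as in the following Python sketch. It assumes commonly used definitions: QE is the mean Euclidean distance between each sample and its BMU, and TE is the fraction of samples whose best and second-best matching units are not adjacent on the lattice (here, adjacency means a lattice distance of at most 1). The exact definitions used in the experiments may differ, and the function names are hypothetical.

```python
import math


def two_best_units(x, weights):
    """Indices of the closest and second-closest neurons to x (needs >= 2 neurons)."""
    order = sorted(range(len(weights)), key=lambda i: math.dist(x, weights[i]))
    return order[0], order[1]


def quantization_error(data, weights):
    """Mean distance between each sample and the weight vector of its BMU."""
    total = 0.0
    for x in data:
        bmu, _ = two_best_units(x, weights)
        total += math.dist(x, weights[bmu])
    return total / len(data)


def topographic_error(data, weights, positions):
    """Fraction of samples whose two best units are not lattice neighbors."""
    violations = 0
    for x in data:
        first, second = two_best_units(x, weights)
        # Units are treated as neighbors if they are at most one lattice step apart.
        if math.dist(positions[first], positions[second]) > 1.0 + 1e-9:
            violations += 1
    return violations / len(data)
```

A map can have a low QE and still a high TE if neurons that are close in the input space end up far apart on the lattice, which is why both metrics are reported together.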
Results for synthetic data
The synthetic data were generated randomly in a two-dimensional (2D) feature space in the range \([0, 1]\). Such data are widely used to validate proposed approaches [19, 42, 47, 48, 49, 50]. Because 2D data are easy to visualize and analyze, they were a natural choice for the current study.
For this study, 1000 2D samples were generated to validate the proposed VLRSOM architecture. Let the 2D data be represented as \({x}_{j,2}\ \forall j,\ 1\le j\le 1000\). The map is established by 100 neurons in a \(10\times 10\) lattice, trained for \(t=200\) and \(t=500\) iterations. Arranging the 2D data in a grid as coordinates can also confirm the effectiveness of the asymmetric neighborhood function. Similar units on the map are connected with each other. The weights associated with the input vector are randomly initialized, which ensures that the positions of neighboring units on the map do not initially match the input data. The weights of the map are gradually learned as training proceeds. Finally, the algorithm is able to obtain the optimal maps.
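A setup of this kind can be reproduced with a short Python sketch. It assumes that the samples are drawn uniformly at random and that the initial weights are drawn the same way; the distribution, the seeds, and the names `make_uniform_2d` and `init_lattice` are assumptions, since only the range, the sample count, and the lattice size are stated.

```python
import random


def make_uniform_2d(n_samples, seed=0):
    """Draw n_samples points uniformly from the unit square (assumed distribution)."""
    rng = random.Random(seed)
    return [[rng.random(), rng.random()] for _ in range(n_samples)]


def init_lattice(rows, cols, dim, seed=1):
    """Lattice coordinates plus randomly initialized weight vectors, one per neuron."""
    rng = random.Random(seed)
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in positions]
    return positions, weights


data = make_uniform_2d(1000)                      # 1000 samples in 2D
positions, weights = init_lattice(10, 10, dim=2)  # 100 neurons on a 10 x 10 lattice
```

Fixing the seeds makes runs with 200 and 500 iterations start from the same data and the same initial map, so differences in QE and TE can be attributed to the training length and the learning rule.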
Experiment A
This experiment was carried out for a maximum of 200 iterations. We performed a detailed analysis to evaluate the performance of each model in terms of QE and TE.
Like any other machine learning algorithm, SOM requires optimal values for its parameters, which should be obtained before training or testing the model. Therefore, a comprehensive search was performed to find the set of parameters that generates the optimal maps. Table 1 summarizes the optimal parameters obtained on the 2D synthetic data. These parameters were then used for further training and evaluation of the models.
Table 2 summarizes the quantitative results obtained for the proposed method and its comparison with the other three SOMs (conventional SOM, PLSOM2, and RASOM). VLRSOM was superior to the other three models in terms of both QE (\(4.6 \times 10^{-4}\)) and TE (\(1.0 \times 10^{-4}\)). The other models were relatively close in terms of QE: conventional SOM, PLSOM2, and RASOM resulted in \(8.1 \times 10^{-4}\), \(5.6 \times 10^{-4}\), and \(4.9 \times 10^{-4}\), respectively. In terms of TE, RASOM was the next best (\(1.1 \times 10^{-3}\)), while PLSOM2 and conventional SOM resulted in higher TE, with scores of \(3.57 \times 10^{-2}\) and \(2.760 \times 10^{-1}\), respectively. A lower TE indicates more consistent results in maintaining the topology of the network. The lower TE of VLRSOM and RASOM compared to conventional SOM and PLSOM2 indicates that these models were able to exploit the relationships in the data, which plays an important role in producing consistent results and maintaining the topology.
To better understand the results, we visualized the input data and the corresponding output of the models to assess the consistency of the generated maps (Fig. 1). All maps were randomly initialized before training each model. Figure 1a shows the generated synthetic 2D data. Figure 1b shows the topology adaptation results for the conventional SOM. The maps are highly random and vary considerably, indicating that the model could not reach a steady error state within the given number of iterations. This is confirmed by the quantitative results, where the higher TE indicates that the algorithm suffers from low stability even after all 200 iterations. The topological maps generated by PLSOM2 (Fig. 1c) were more consistent than those of conventional SOM, indicating that it is better in terms of TE, as it reached a steady error state within the given maximum number of iterations.
Figure 2 shows the behavior of each model in terms of QE and TE over the maximum number of iterations. In Fig. 2a, we can see that the proposed method converges faster than the other models throughout training. In addition, its convergence is smoother and more stable at every iteration. The variations are higher for SOM and PLSOM2, and their convergence is much slower than that of RASOM and VLRSOM. RASOM is more stable, and its results are comparable with those of the proposed VLRSOM. However, the proposed method still reaches the steady state faster than RASOM. The better stability of the proposed method is shown in Fig. 2a, b in the zoomed region for iterations 48–56, which shows that VLRSOM has a slight edge over RASOM in speed and in reaching a low error state. Conventional SOM produced the highest QE throughout. For the other models, QE was initially high but gradually decreased as training continued and ultimately reached a stable state, showing their effectiveness in achieving the steady error state.
The behavior of the models in terms of TE was similar to that of QE (Fig. 2b). Conventional SOM was highly unstable; although it showed a slight improvement in TE toward the end, it remained far behind the other models. Interestingly, the TE of PLSOM2 was also low and dropped quickly in the early iterations, but it remained unstable, with more variations. The performance of RASOM and VLRSOM was very similar, and both produced the best overall results. On closer inspection (zoomed region for iterations 10–20 in Fig. 2b), VLRSOM was slightly better than RASOM. These results indicate that PLSOM2, RASOM, and VLRSOM are highly suitable for processing 2D data with high accuracy.
Experiment B
In this experiment, the models were evaluated for 500 iterations to test their behavior over a large number of iterations. Like experiment A, we randomly generated the 2D data and then calculated the optimal parameter values for each model.
Table 3 shows the results obtained for the proposed method and its comparison with the other three models. The overall behavior of the models was similar to experiment A. Yet, there were some interesting points to note. Increasing the number of iterations also resulted in lower QE for all models. Interestingly, the TE was very low for all the models after training over 500 iterations. It shows that, given many iterations, the models can reach a steady error state. However, a model with a faster convergence rate is better since fewer resources will be utilized to reach the steady error state.
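One simple way to quantify how quickly a model reaches a steady error state is to find the first iteration after which the error stays within a small band. The helper below is a hypothetical sketch of this idea; the tolerance `tol` and the window length `window` are assumptions and are not taken from the experiments.

```python
def first_steady_iteration(errors, tol=1e-4, window=10):
    """Return the first index from which the error varies by at most `tol`
    over the next `window` iterations, or None if no such index exists."""
    for start in range(len(errors) - window):
        segment = errors[start:start + window + 1]
        if max(segment) - min(segment) <= tol:
            return start
    return None
```

Applied to the per-iteration QE or TE curve of each model, a smaller returned index corresponds to faster convergence in the sense used above.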
We can see that among all models, VLRSOM resulted in the lowest QE (1.5 × 10^{–3}), followed by RASOM (1.6 × 10^{–3}), PLSOM2 (2.0 × 10^{–3}), and conventional SOM (2.2 × 10^{–3}). These results indicate that all models were able to converge when trained over 500 iterations. The overall behavior of each model in terms of TE was also similar as only marginal differences in TE were noted; VLRSOM produced TE as 1.37 × 10^{–6}, RASOM 1.75 × 10^{–6}, PLSOM2 7.4 × 10^{–5}, and conventional SOM as 4.3 × 10^{–5}.
Figure 3a–e shows the visual results obtained for 2D synthetic data for each model run over 500 iterations. (a) shows the original randomly generated data, and the connections between the data points indicate the weights of the topology. The resulting topology for conventional SOM (b) is distorted (shapeless) even after 200 iterations. It indicates higher instability in the topology resulting in a higher topology error as validated by the quantitative results. However, the stability of the topology for both PLSOM2 (c) and RASOM (d) was much better than conventional SOM, which can be evidenced from figures that there is less twisting and deformation in the topology produced by reaching the 500 iterations. The grid shape obtained for VLRSOM (e) is more consistent, and it can adapt the asymmetric neighborhood function better than other algorithms.
Figure 4a, b compares the performance of each model in terms of QE and TE, respectively. The curves agree with the quantitative results. The proposed VLRSOM algorithm consistently showed lower QE over all iterations except in very few cases. RASOM's performance was very similar to that of VLRSOM at every iteration, indicating that it can also reach a low QE early in training. However, as shown in the zoomed area from the 460th to the 500th iteration, the proposed VLRSOM has a slight edge over RASOM in reaching a low error state. In contrast, both PLSOM2 and conventional SOM required a significant number of iterations to reach a steady error state.
In the case of TE, the proposed method and RASOM are comparatively close, as both achieved low TE and greater stability early in the training process (Fig. 4b). However, the zoomed region for iterations 1–100 shows that VLRSOM reached the lower TE earlier in training than all other models. Conventional SOM was highly unstable until it reached the 460th iteration, which shows that it requires more time to reach a steady error state. PLSOM2 was better than conventional SOM as it consistently produced lower TE, but it also needed many iterations (350) to reach a steady error state. These results show that both RASOM and VLRSOM can reach a stable error state quite early in training, making them suitable for processing 2D data with high accuracy.
In terms of CPU time, the comparisons between the proposed method and the other SOM variants for the two iteration settings on the synthetic data are summarized in Table 4. Time is measured in seconds for all algorithms. The CPU time for PLSOM2 was relatively longer than that of the other algorithms. In contrast, conventional SOM took less execution time than the other models.
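The timing procedure itself is not described in this section. As an illustration only, CPU time in seconds for a training run could be collected with a small wrapper such as the one below; `cpu_seconds` and the `train_fn` argument are hypothetical names, and `time.process_time` from the Python standard library is one possible clock, not necessarily the one used for Table 4.

```python
import time

def cpu_seconds(train_fn, *args, **kwargs):
    """Run train_fn and return (its result, CPU seconds consumed by this process)."""
    start = time.process_time()          # CPU time of the process, excludes sleep
    result = train_fn(*args, **kwargs)
    elapsed = time.process_time() - start
    return result, elapsed
```

Calling the wrapper once per algorithm and per iteration setting, with identical data and initial weights, gives numbers that are comparable across models.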
Results for handwritten characters
The second experiment was conducted for handwritten character recognition using the MNIST dataset. It consists of a total of 70,000 handwritten characters. First, we divided the data into training (60,000) and testing (10,000) sets. Like the previous experiments with the synthetic dataset, we carried out two separate experiments, with 200 and 500 iterations. The following sections provide details of each experiment performed.
Experiment A
This experiment was carried out for 200 iterations. For the handwritten dataset, we first obtained the optimal set of parameters for each model. Table 5 summarizes the parameters obtained for each algorithm. The quantitative results obtained for this experiment are shown in Table 6. The proposed VLRSOM outperformed the other three models, with an accuracy of 83.00%, a QE of 5.8450, and a TE of 0.0024. RASOM also produced highly satisfactory results: accuracy = 81.1%, QE = 6.6566, and TE = 0.0711. The conventional SOM algorithm was also in line with state-of-the-art models, with an accuracy of 80.00%, a QE of 6.8660, and a TE of 0.3377. Surprisingly, PLSOM2 showed the lowest accuracy (73.33%) and the highest QE (7.3240) and TE (0.644).
The visual results obtained for some sample characters using 200 iterations for each algorithm are shown in Fig. 5. Panel (a) shows the results for conventional SOM. They are visually consistent, as the constructed characters are recognizable except in a few cases. In the case of PLSOM2 (b), the visual results indicate that its performance is suboptimal for character recognition; in some cases its output is difficult to comprehend, which indicates low applicability to character recognition tasks. The visual results obtained for RASOM (c) and VLRSOM (d) were highly accurate, as both were able to correctly estimate the shape of the original characters.
Figure 5e, f shows QE and TE for each model over all iterations. The overall behavior of all models for QE was the same: initially, the QE was high due to random initialization, and it then decreased as the models began to learn with the increasing number of iterations. However, the proposed VLRSOM algorithm converged faster, and its QE was consistently lower at each iteration up to the maximum number of iterations (200). This was also validated by the quantitative results. In the case of TE (f), the behavior of each model was different, as high variations were noted during the initial stage of map generation for each model. Conventional SOM, PLSOM2, and RASOM show high variations until they reach the maximum number of iterations. On the other hand, although the initial behavior of VLRSOM was similar to that of the other models, its variations gradually reduced as training progressed. This indicates that VLRSOM reaches a more stable state in constructing the maps, which is crucial for producing consistent results.
Experiment B
We again ran the experiment for handwritten character recognition using 500 iterations to test the behavior of the models over a higher number of iterations. The quantitative results obtained using 500 iterations are summarized in Table 7. The proposed VLRSOM produced the highest accuracy (88.89%). Interestingly, conventional SOM and RASOM produced higher accuracies of 87.66% and 87.77%, respectively, compared to PLSOM2, which reached 64.44% accuracy. In terms of QE, VLRSOM, RASOM, and conventional SOM were highly effective, as they produced 6.421, 6.461, and 6.887, respectively. However, the performance of PLSOM2 was suboptimal, as it resulted in QE = 7.279. The TE for VLRSOM was the lowest (8.10 × 10^{−3}), as expected given its consistent results. Similarly, RASOM and conventional SOM were also efficient in preserving topology for handwritten character recognition, as they produced TE values of 8.9 × 10^{−3} and 3.02 × 10^{−2}, respectively. The performance of PLSOM2 was again the poorest in terms of TE (6.40 × 10^{−1}) for this dataset.
The visual results obtained for each model for the handwritten character recognition dataset are shown in Fig. 6. The models showed performance similar to that seen in experiment A. However, as Fig. 6e, f (QE and TE, respectively) shows, the models perform better over a larger number of iterations. Naturally, the models can gain more insight into the data as they go over many iterations. The output characters produced by each model are more consistent than the previous experiment's output. It is also worth mentioning that both QE and TE for VLRSOM were consistently lower and reached steady error compared to the other two methods.
Table 8 shows the execution time taken by each algorithm for the handwritten character dataset. Compared to the synthetic data, the time taken was longer for this dataset. Similar to the synthetic data, the CPU time for PLSOM2 was relatively longer than that of the other algorithms. In contrast, conventional SOM took less execution time than the other models.
Experiments with UCI benchmark datasets
Additional experiments were performed on four benchmark datasets to test the applicability of the proposed method. These datasets were obtained from the University of California, Irvine (https://archive.ics.uci.edu). Four datasets were considered: Balance, Wisconsin Breast, Dermatology, and Ionosphere. Table 9 summarizes the datasets used. As we can see, the number of samples, features, and classes varies across the datasets. Generally, models face challenges when the numbers of features and classes are greater than 2.
Before executing the experiments, we divided the data into training (80%) and testing (20%) subsets. The number of iterations was empirically set to 50 as the models converge before reaching these maximum iterations. In addition, weights were generated in the same fashion for all algorithms to achieve fairness in the evaluation.
The algorithms have many parameters that need fine-tuning before training, as their performance depends on the values of these parameters. Therefore, the optimal values for those parameters were first obtained by applying the grid search method. Since the numerical values vary significantly between datasets, optimal parameter values were obtained separately for each dataset. Table 10 summarizes the fine-tuned parameter values obtained for each algorithm. The values were then used in subsequent training of the models.
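The grid search step can be sketched as an exhaustive loop over candidate values. The sketch below is generic: the parameter names (`learning_rate`, `sigma`), the candidate values, and the toy objective standing in for "train a model and return its QE" are assumptions made for illustration and do not reproduce the settings in Table 10.

```python
import itertools
import math

def grid_search(train_and_score, param_grid):
    """Try every combination in param_grid and keep the lowest score."""
    names = list(param_grid)
    best_params, best_score = None, math.inf
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(**params)   # e.g. QE on held-out data
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy objective (hypothetical): minimal at learning_rate=0.5, sigma=2.0.
def toy_objective(learning_rate, sigma):
    return (learning_rate - 0.5) ** 2 + (sigma - 2.0) ** 2

best, score = grid_search(
    toy_objective,
    {"learning_rate": [0.1, 0.5, 0.9], "sigma": [1.0, 2.0, 3.0]},
)
```

Running the search once per dataset, as described above, simply means calling `grid_search` with a scoring function bound to that dataset.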
The quantitative results in terms of accuracy and QE obtained for all the algorithms applied on the four datasets are summarized in Table 11. We can see that the proposed VLRSOM produced the highest accuracy and lowest QE for all data sets. Generally, all models produce good classification accuracy on the Wisconsin Breast dataset. Both PLSOM2 and RASOM proved better than conventional SOM but were suboptimal compared to the proposed VLRSOM.
In the case of the Balance dataset, VLRSOM obtained the highest accuracy (76.47%) and the lowest QE (0.206). The performance of RASOM was comparable, with an accuracy of 75.94% and a QE of 0.208. In contrast, the performance of both conventional SOM and PLSOM2 was not satisfactory. Conventional SOM resulted in 63.10% accuracy and 0.242 QE. Similarly, PLSOM2 produced an accuracy of 63.10% and a QE of 0.222.
All models produced high accuracies for the Wisconsin Breast dataset. Both RASOM and VLRSOM produced 100% accuracy, while conventional SOM and PLSOM2 resulted in 99.02% accuracy. Similarly, the QE for conventional SOM, PLSOM2, RASOM and VLRSOM was 0.148, 0.152, 0.147 and 0.142, respectively. The main reason behind the high accuracy can be ascribed to a relatively lower number of distinct features and the number of classes.
VLRSOM outperformed the other three models for the Dermatology dataset as it produced the highest accuracy (80.91%). The accuracy obtained for the other three models was similar; conventional SOM produced 68.18%, while both PLSOM2 and RASOM produced 70.91% accuracy. However, in terms of QE, all classifiers responded similarly. For the Ionosphere dataset, the proposed method obtained an accuracy of 82.41% with a QE of 0.099. For this dataset, conventional SOM resulted in 68.52% accuracy and 0.100 QE, PLSOM2 produced 68.52% accuracy and 0.110 QE, and RASOM resulted in 80.56% accuracy and 0.100 QE.
These results indicated that the proposed VLRSOM is superior in terms of accuracy compared to the other three algorithms. In addition, in terms of QE, it also produced optimal results on all datasets. This indicates that the proposed VLRSOM is more robust against noise and outliers.
We performed further analysis to investigate the learning behavior of the proposed method and compared it with the conventional SOM, PLSOM2, and RASOM algorithms. Figure 7 shows the visual results obtained for all algorithms on the four datasets. Figure 7a shows the learning behavior of the four models on the Balance dataset. We observe that the proposed algorithm adapts much faster, as it reached a steady-state error quite early in the training process (8th iteration). The learning behavior of RASOM was similar to that of the proposed method. However, it started with a higher error, and at the end of the maximum iterations its QE was still higher than that of the proposed method. On the other hand, the learning behavior of conventional SOM and PLSOM2 was different from that of RASOM and VLRSOM. Conventional SOM reached a stable error state but not a lower QE: its QE increased after reaching a lower error early in training. PLSOM2 also resulted in higher QE and did not converge to a lower QE even after reaching the maximum iterations. In terms of TE, the proposed method showed low variations throughout the iterations compared to the other methods.
Figure 7b shows the learning behavior for the Wisconsin dataset. The models behaved similarly to each other. All models had high variations at the beginning due to the random initialization of weights and then gradually reached a lower QE. After the 10th iteration, the models reached a lower QE, thus attaining a steady error state. All models also showed similar behavior for both the Dermatology (Fig. 7c) and Ionosphere (Fig. 7d) datasets. It can also be observed that the models produced higher TE at the beginning of the iterations and then gradually reached lower TE as the training continued for a higher number of iterations.
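The iteration at which a curve settles (for instance, the 8th iteration on Balance or the 10th on Wisconsin) is read from the plots here. The paper does not state a numerical criterion for the steady error state, so the following is only one possible way to extract such an iteration from a recorded error curve; the tolerance `tol` and the window length `window` are hypothetical choices.

```python
import numpy as np

def steady_state_iteration(errors, tol=1e-3, window=5):
    """Return the first 1-based iteration after which the error changes by
    less than `tol` for `window` consecutive steps, or None if it never does."""
    errors = np.asarray(errors, dtype=float)
    diffs = np.abs(np.diff(errors))          # change between successive iterations
    for start in range(len(diffs) - window + 1):
        if np.all(diffs[start:start + window] < tol):
            return start + 1
    return None
```

Applying the same function to the QE and TE curves of each model gives a consistent way to compare how early each one settles.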
Table 12 shows the execution time taken by each algorithm for the UCI benchmark datasets. For all datasets, the CPU time taken by all algorithms is very similar, except for the conventional SOM, which took less execution time on the Ionosphere dataset. This indicates that the proposed variable learning rate does not cost much processing time compared to the other methods while obtaining higher accuracy.
Time complexity of the VLRSOM algorithm
This section provides an insight into the time complexity of the proposed VLRSOM algorithm. In general, the SOM algorithm suffers more from memory complexity than from time complexity. Vesanto et al. [51] showed that the time complexity of the SOM is \(\mathcal{O}\left({N}^{2}\right)\), where \(N\) is the total number of neurons/prototypes in the SOM lattice; that is, the complexity is quadratic in \(N\). This indicates that the algorithm becomes less efficient in terms of memory complexity as the map size increases. In contrast, the processing time for the SOM algorithm seems to be much less of a burden than the amount of memory it consumes during processing [39]. The processing time complexity from the input to the output layer is of order \(\mathcal{O}(NM)\) [51]. This shows that the time complexity is linear in the number of nodes in the output (\(M\)) and the input \((N)\) layers of the SOM model.
Each sample is passed to the SOM model during training, which calculates the distance between the input and the weight vector (Eq. 12). The time required for calculating this distance for each sample can be estimated to be \(O(NM)\) as the initial clustering is performed only once. According to Eq. 12, the time complexity required for calculating the winning neuron in the output layer is \(O(N)\). According to Eq. 13, the total time required for weight updating for each pattern is \(O\left(NM\right)\). Therefore, the total time complexity for SOM can be calculated as [52]:
\[{\mathrm{TC}}_{\mathrm{SOM}} = t \cdot s \cdot \left(O(NM) + O(N) + O(NM)\right)\]
where \({\mathrm{TC}}_{\mathrm{SOM}}\) is the time complexity of SOM, \(t\) represents the number of iterations, and \(s\) is the number of patterns. Therefore, the asymptotic time complexity is \(O(\mathrm{ts}NM)\).
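The origin of the \(O(\mathrm{ts}NM)\) bound can be seen in a minimal sketch of online SOM training. The code below is a generic textbook formulation written for illustration, not the implementation evaluated in this paper; it takes \(N\) as the input dimension and \(M\) as the number of output neurons, and the Gaussian neighbourhood and the names `som_step` and `train` are assumptions.

```python
import numpy as np

def som_step(weights, grid, x, lr, sigma):
    """One online update for a single pattern x.

    weights: (M, N) float codebook, grid: (M, 2) lattice coordinates, x: (N,).
    """
    # Distance from x to every neuron: M neurons times N features -> O(NM).
    dists = np.linalg.norm(weights - x, axis=1)
    # Winner search: a linear scan, dominated by the O(NM) terms.
    bmu = int(np.argmin(dists))
    # Gaussian neighbourhood around the winner on the lattice (assumed form).
    h = np.exp(-np.sum((grid - grid[bmu]) ** 2, axis=1) / (2.0 * sigma ** 2))
    # The update touches every entry of the (M, N) codebook -> O(NM).
    weights += lr * h[:, None] * (x - weights)
    return bmu

def train(weights, grid, data, iterations, lr=0.5, sigma=1.0):
    # t iterations times s patterns times O(NM) per step -> O(t s N M).
    for _ in range(iterations):
        for x in data:
            som_step(weights, grid, x, lr, sigma)
    return weights
```

Replacing the fixed `lr` with a value recomputed at each step adds only a constant amount of work per pattern, which is why an adaptive learning rate leaves the asymptotic order unchanged.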
The proposed approach does not affect the architecture or the layers of the conventional SOM algorithm. The underlying principles of the SOM algorithm therefore remain intact, except for the introduction of an adaptive learning rate instead of a fixed one, as shown in Eqs. (20) and (21). Therefore, the time complexity remains \(\mathcal{O}(NM)\), which is the same as for the conventional SOM. The "big \(\mathcal{O}\) notation" does not consider constants and minor terms, so when such terms are dropped, it leads to the same time complexity of \(\mathcal{O}(NM)\) for the proposed approach. In terms of memory consumption, the proposed method has slightly higher memory complexity than the conventional SOM due to the additional, more complex calculations. Yet, after simplification in the "big \(\mathcal{O}\) notation", the resulting expression for memory usage remains \(\mathcal{O}\left({N}^{2}\right)\), which is the same as for the conventional SOM algorithm.
According to [53], the time complexity of the conventional SOM algorithm can be calculated as \(O(\mathrm{ts}(3N+3))\). In the case of VLRSOM, the proposed steps are also executed in each iteration. From Eqs. (20) and (21), we can deduce that each iteration of VLRSOM approximately takes \(O(\mathrm{ts}(5N+3))\) units of time. Therefore, the total time for \(n\) iterations can be calculated as:
The correctness of these results is confirmed by comparing the results of the CPU time shown in Tables 4, 8, and 12 for the conventional SOM, PLSOM2, RASOM, and VLRSOM algorithms. It is interesting to note that each iteration of the VLRSOM algorithm takes slightly longer than the conventional SOM. Yet, the convergence speed of the VLRSOM compensates for it in such a way that VLRSOM reaches an acceptable level of quantization error much faster than the conventional SOM.
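As a rough illustration of the per-iteration overhead, and taking the operation counts \(O(\mathrm{ts}(3N+3))\) and \(O(\mathrm{ts}(5N+3))\) quoted above at face value, the ratio of the two costs is

\[\frac{\mathrm{ts}\,(5N+3)}{\mathrm{ts}\,(3N+3)} = \frac{5N+3}{3N+3} \;\longrightarrow\; \frac{5}{3} \quad \text{as } N \to \infty .\]

For example, \(N = 100\) gives \(503/303 \approx 1.66\). Under these counts, a VLRSOM iteration costs at most a constant factor more than a conventional SOM iteration, so both remain \(O(\mathrm{ts}N)\); the measured gap depends on the implementation and may be smaller.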
Conclusion
The main objective of this paper was to improve the accuracy and topology preservation capability of the SOM algorithm. The improvement in accuracy is achieved by introducing a new variable learning rate (VLR) parameter. The adaptive learning rate helps improve the accuracy of the SOM technique by reducing the steady-state misadjustment. The VLR adaptively adjusts itself to the error (increase or decrease), allowing the SOM model to track changes in the training data, which results in a small steady-state error. The VLR adjustment is controlled by the estimated error. Moreover, the VLR leads to faster convergence and robustness in the steady-state behavior. The goal is to make a large adjustment to the VLR for a large estimation error, for faster tracking, and a small adjustment for a small estimation error. Hence, the VLR controls the amount of misadjustment needed to produce optimal maps.
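The idea of an error-driven learning rate can be illustrated with a short sketch. The exact VLRSOM update is given in Eqs. (20) and (21), which are not repeated in this section; the code below instead uses the variable step-size rule of Kwong and Johnston [45] as a stand-in, with hypothetical constants `alpha`, `gamma`, `lr_min`, and `lr_max`.

```python
def update_learning_rate(lr, error, alpha=0.97, gamma=0.01, lr_min=1e-4, lr_max=1.0):
    """Kwong-Johnston style step: lr <- alpha * lr + gamma * error**2, then clip.

    A large error pushes the learning rate up (faster tracking); a small
    error lets it decay (lower steady-state misadjustment).
    """
    new_lr = alpha * lr + gamma * error ** 2
    return min(max(new_lr, lr_min), lr_max)

lr = 0.5
after_large_error = update_learning_rate(lr, error=5.0)    # increases
after_small_error = update_learning_rate(lr, error=0.05)   # decays
```

In an SOM, `error` would be replaced by the error measure used during the weight update, and the returned value would be used as the learning rate for the next update.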
Detailed experiments were performed to evaluate the accuracy and robustness of the proposed VLRSOM algorithm. Two different datasets were used, and for each dataset, two independent experiments were performed with different numbers of iterations to test the speed of convergence with high accuracy and the ability to preserve the topology. The results confirmed the capability of the proposed method, as it produced highly satisfactory results. Moreover, VLRSOM was also compared with conventional SOM, the parameterless self-organizing map (PLSOM2), and RASOM in terms of accuracy, quantization error (QE), and topology error (TE). The proposed method proved superior to all three other techniques in all experiments.
We want to focus on Markov Blanket to make the SOM algorithms more efficient in future work. Moreover, we would like to integrate the proposed algorithm with a deep neural network to find the optimal set of parameters for classification tasks. The greedy search algorithm will help improve the efficiency of deep neural networks by selecting the optimal set of parameters needed to perform the classification task. The method can further be improved by adopting a parallel implementation of the proposed algorithm. In addition, the theoretical aspects of the algorithm will be explored to prove the working of the new algorithm.
References
Kohonen T (1990) The selforganizing map. Proc IEEE 78:1464–1480. https://doi.org/10.1109/5.58325
Kohonen T (1982) Selforganized formation of topologically correct feature maps. Biol Cybern 43:59–69. https://doi.org/10.1007/BF00337288
Huang DW, Gentili RJ, Reggia JA (2015) Selforganizing maps based on limit cycle attractors. Neural Netw 63:208–222. https://doi.org/10.1016/j.neunet.2014.12.003
Chaudhary V, Bhatia RS, Ahlawat AK (2014) A novel SelfOrganizing Map (SOM) learning algorithm with nearest and farthest neurons. Alex Eng J 53:827–831. https://doi.org/10.1016/j.aej.2014.09.007
Ghaseminezhad MH, Karami A (2011) A novel selforganizing map (SOM) neural network for discrete groups of data clustering. Appl Soft Comput J 11:3771–3778. https://doi.org/10.1016/j.asoc.2011.02.009
Chaudhary V, Bhatia RS, Ahlawat AK (2015) A constant learning rate selforganizing map (CLRSOM) learning algorithm. J Inf Sci Eng 31:387–397. https://doi.org/10.6688/JISE.2015.31.2.2
Vasighi M, Amini H (2017) A directed batch growing approach to enhance the topology preservation of selforganizing map. Appl Soft Comput 55:424–435. https://doi.org/10.1016/j.asoc.2017.02.015
Licen S, Di Gilio A, Palmisani J et al (2020) Pattern recognition and anomaly detection by selforganizing maps in a multi month Enose survey at an industrial site. Sensors 20:1887. https://doi.org/10.3390/s20071887
Ijaz A, Choi J (2018) Anomaly detection of electromyographic signals. IEEE Trans Neural Syst Rehabil Eng 26:770–779. https://doi.org/10.1109/TNSRE.2018.2813421
Shan P, Li Z, Wang Q et al (2021) Selforganizing mapsbased generalized feature set selection for model adaption without reference data for batch process. Anal Chim Acta 1188:339205. https://doi.org/10.1016/j.aca.2021.339205
Jia Y, Chen X, Yu J et al (2021) Speaker recognition based on characteristic spectrograms and an improved selforganizing feature map neural network. Complex Intell Syst 7:1749–1757. https://doi.org/10.1007/s40747020001721
Liang W, Wang J, Bao W et al (2021) Continuous selfadaptive optimization to learn multitask multiagent. Complex Intell Syst. https://doi.org/10.1007/s40747021005918
Li H, Qu K, Zhou J (2021) Reconstructing sound speed profile from remote sensing data: nonlinear inversion based on selforganizing map. IEEE Access 9:109754–109762. https://doi.org/10.1109/ACCESS.2021.3102608
Uriarte EA, Martín FD (2005) Topology preservation in SOM. Int J Appl Math Comput Sci 1:19–22
Chen Y, Ashizawa N, Yeo CK et al (2021) Multiscale selforganizing map assisted deep autoencoding gaussian mixture model for unsupervised intrusion detection. Knowl Based Syst 224:107086. https://doi.org/10.1016/j.knosys.2021.107086
D’Urso P, De Giovanni L, Massari R (2020) Smoothed selforganizing map for robust clustering. Inf Sci (NY) 512:381–401. https://doi.org/10.1016/j.ins.2019.06.038
Kirk JS, Zurada JM (2000) A twostage algorithm for improved topography preservation in selforganizing maps. In: SMC 2000 conference proceedings. 2000 IEEE international conference on systems, man and cybernetics. “Cybernetics evolving to systems, humans, organizations, and their complex interactions” (Cat. No.00CH37166). IEEE, pp 2527–2532
Alahakoon D, Halgamuge SK, Srinivasan B (2000) Dynamic selforganizing maps with controlled growth for knowledge discovery. IEEE Trans Neural Netw 11:601–614. https://doi.org/10.1109/72.846732
Kuremoto T, Otani T, Obayashi M et al (2016) A hand shape instruction recognition and learning system using growing SOM with asymmetric neighborhood function. Neurocomputing 188:31–41. https://doi.org/10.1016/j.neucom.2014.10.108
Mici L, Parisi GI, Wermter S (2018) A selforganizing neural network architecture for learning humanobject interactions. Neurocomputing 307:14–24. https://doi.org/10.1016/j.neucom.2018.04.015
Dias LA, Damasceno AMP, Gaura E, Fernandes MAC (2021) A fullparallel implementation of SelfOrganizing Maps on hardware. Neural Netw 143:818–827. https://doi.org/10.1016/j.neunet.2021.05.021
Jayanth Krishnan K, Mitra K (2022) A modified Kohonen map algorithm for clustering time series data. Expert Syst Appl 201:117249. https://doi.org/10.1016/j.eswa.2022.117249
Troka M, Wojnicz W, Szepietowska K et al (2022) Towards classification of patients based on surface EMG data of temporomandibular joint muscles using selforganising maps. Biomed Signal Process Control 72:103322. https://doi.org/10.1016/j.bspc.2021.103322
Kaur P, Malhi AK, Pannu HS (2022) Hybrid SOM based crossmodal retrieval exploiting Hebbian learning. Knowl Based Syst 239:108014. https://doi.org/10.1016/j.knosys.2021.108014
Jang J, Kim CO (2022) Unstructured borderline selforganizing map: learning highly imbalanced, highdimensional datasets for fault detection. Expert Syst Appl 188:116028. https://doi.org/10.1016/j.eswa.2021.116028
Sun J, Yang Y, Wang Y et al (2020) Survival risk prediction of esophageal cancer based on selforganizing maps clustering and support vector machine ensembles. IEEE Access 8:131449–131460. https://doi.org/10.1109/ACCESS.2020.3007785
Dittenbach M, Merkl D, Rauber A (2000) The growing hierarchical selforganizing map. In: Proceedings of the IEEEINNSENNS international joint conference on neural networks. IJCNN 2000. Neural computing: new challenges and perspectives for the New Millennium, vol 6. IEEE, pp 15–19
Malondkar A, Corizzo R, Kiringa I et al (2019) SparkGHSOM: growing hierarchical selforganizing map for large scale mixed attribute datasets. Inf Sci (NY) 496:572–591. https://doi.org/10.1016/j.ins.2018.12.007
Liu C, Tang L, Liu J (2019) Least squares support vector machine with selforganizing multiple kernel learning and sparsity. Neurocomputing 331:493–504. https://doi.org/10.1016/j.neucom.2018.11.067
Mehrizi A, SadoghiYazdi H, Taherinia AH (2018) Robust semisupervised growing selforganizing map. Expert Syst Appl 105:23–33. https://doi.org/10.1016/j.eswa.2018.03.046
Ali Hameed A, Karlik B, Salman MS, Eleyan G (2019) Robust adaptive learning approach to selforganizing maps. Knowl Based Syst 171:25–36. https://doi.org/10.1016/j.knosys.2019.01.011
Zhang Y, Chen M, Tian D, Ding L (2021) Biomimetic slam algorithm based on growing selforganizing map. IEEE Access 9:134660–134671. https://doi.org/10.1109/ACCESS.2021.3113311
Martins DML, De Lima Neto FB (2020) Hybrid intelligent decision support using a semiotic casebased reasoning and selforganizing maps. IEEE Trans Syst Man Cybern Syst 50:863–870. https://doi.org/10.1109/TSMC.2017.2749281
Yang P, Wang D, Wei Z et al (2019) An outlier detection approach based on improved selforganizing feature map clustering algorithm. IEEE Access 7:115914–115925. https://doi.org/10.1109/access.2019.2922004
Aly S, Almotairi S (2020) Deep convolutional selforganizing map network for robust handwritten digit recognition. IEEE Access 8:107035–107045. https://doi.org/10.1109/ACCESS.2020.3000829
Wickramasinghe CS, Amarasinghe K, Manic M (2019) Deep selforganizing maps for unsupervised image classification. IEEE Trans Ind Inform 15:5837–5845. https://doi.org/10.1109/TII.2019.2906083
Ferles C, Papanikolaou Y, Naidoo KJ (2018) Denoising autoencoder selforganizing map (DASOM). Neural Netw 105:112–131. https://doi.org/10.1016/j.neunet.2018.04.016
Forest F, Lebbah M, Azzag H, Lacaille J (2021) Deep embedded selforganizing maps for joint representation learning and topologypreserving clustering. Neural Comput Appl 33:17439–17469. https://doi.org/10.1007/s0052102106331w
Olszewski D (2021) A datascatteringpreserving adaptive selforganizing map. Eng Appl Artif Intell 105:104420. https://doi.org/10.1016/j.engappai.2021.104420
Douzas G, Bacao F (2017) Selforganizing map oversampling (SOMO) for imbalanced data set learning. Expert Syst Appl 82:40–52. https://doi.org/10.1016/j.eswa.2017.03.073
Dozono H, Niina G, Araki S (2016) Convolutional self organizing map. In: 2016 international conference on computational science and computational intelligence (CSCI). IEEE, pp 767–771
Berglund E (2010) Improved PLSOM algorithm. Appl Intell 32:122–130. https://doi.org/10.1007/s1048900801387
ChushigMuzo D, SogueroRuiz C, Engelbrecht AP et al (2020) Datadriven visual characterization of patient healthstatus using electronic health records and selforganizing maps. IEEE Access 8:137019–137031. https://doi.org/10.1109/ACCESS.2020.3012082
Hameed AA, Ajlouni N, Karlik B (2020) Robust adaptive SOMs challenges in a varied datasets analytics. In: Vellido A, Gibert K, Angulo C, Martín Guerrero JD (eds) Advances in selforganizing maps, learning vector quantization, clustering and data visualization. WSOM 2019. Advances in intelligent systems and computing. Springer International Publishing, Cham, pp 110–119
Kwong R, Johnston E (1992) A variable step size LMS adaptive algorithm. IEEE Trans Signal Process 40
LeCun Y, Cortes C, Burges C MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/. Accessed 5 Aug 2021
Berglund E, Sitte J (2006) The parameterless selforganizing map algorithm. IEEE Trans Neural Netw 17:305–316. https://doi.org/10.1109/TNN.2006.871720
Dias LA, Damasceno AMP, Gaura E, Fernandes MAC (2021) A fullparallel implementation of SelfOrganizing Maps on hardware. Neural Netw. https://doi.org/10.1016/j.neunet.2021.05.021
Regadío A, García Tejedor JI, Ayuso S et al (2020) Trajectory determination of muons using scintillators and a novel selforganizative map. Nucl Instrum Methods Phys Res Sect A Accel Spectrom Detect Assoc Equip 973:164166. https://doi.org/10.1016/j.nima.2020.164166
Girau B, TorresHuitzil C (2020) Fault tolerance of selforganizing maps. Neural Comput Appl 32:17977–17993. https://doi.org/10.1007/s0052101837696
Vesanto Juha, Himberg Johan, Alhoniemi Esa PJ (1999) Selforganizing map in matlab: SOM toolbox. In: Proceedings of the Matlab DSP conference. pp 16–17
Ganivada A, Ray SS, Pal SK (2012) Fuzzy rough granular selforganizing map and fuzzy rough entropy. Theor Comput Sci 466:37–63. https://doi.org/10.1016/j.tcs.2012.08.021
ShahHosseini H, Safabakhsh R (2003) TASOM: a new time adaptive selforganizing map. IEEE Trans Syst Man Cybern Part B Cybern 33:271–282. https://doi.org/10.1109/TSMCB.2003.810442
Funding
The authors received no specific funding for this study.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest to report regarding the present study.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Jamil, A., Hameed, A.A. & Orman, Z. A faster dynamic convergency approach for selforganizing maps. Complex Intell. Syst. 9, 677–696 (2023). https://doi.org/10.1007/s40747022008262
DOI: https://doi.org/10.1007/s40747022008262
Keywords
 Selforganizing maps
 Variable learning rate SOM
 Quantization error
 Clustering
 Dimensionality reduction