1 Introduction

In [1], Axelrod used a directed graph to describe the connections between the political elites. This modelling technique was extended by Kosko [23, 24], who introduced Fuzzy Cognitive Maps (FCMs) by representing the strength of the causal connections with values from the \([-1,1]\) interval. The nodes of the graph represent the main subsystems, or system variables, while the weighted, directed edges express the causal knowledge [39]. Over the years, this modelling method has proved very efficient in representing complex multicomponent systems, especially when the exact mathematical description is unknown, too complicated to deal with, or influenced by uncertain information. FCMs have been applied successfully in a wide range of fields, including, but not limited to, social sciences [5], economic problems [9, 26], educational applications [18], various decision-making problems and risk analysis [10, 34, 37], waste management [4, 12], medical problems [33], and time series modelling and analysis [16]. This diversity demonstrates the flexibility and performance of the modelling paradigm.

In the FCM terminology, the nodes of the weighted, directed graph are usually called ‘concepts’. These represent special characteristics or subsystems of the modelled system. Activation values are numbers from the unit interval (sometimes the \([-1,1]\) interval is used instead) assigned to the concepts to describe their state. The initial activation vector usually changes rapidly in the simulation. A simulation always ends with one of three possible outcomes [41]: (i) the activation vector stabilizes, i.e. the iteration arrives at a fixed point (FP); (ii) the activation vectors recur in a specific order, which is called a limit cycle; or (iii) the system shows no stable or regular behaviour, which is usually called chaotic in the FCM literature.

In most decision-making applications ‘what-if’ questions are answered with the help of FCMs and simulations [20]. The simulation is started with a specific scenario (expressing the assumed, studied circumstances), and in the best case the simulation leads to a FP. In these cases the effect and usefulness of a decision can be easily analysed. Limit cycles express a continuously changing state of the system, but at least these states are known and can be examined. Chaotic behaviour, however, should be avoided in most application areas.

FCM models may have a single FP or several FPs. In some systems, it is important to know all possible stable states, e.g. in the case of a safety-critical system, where significant investments may be damaged or people may even be injured. The FPs of an FCM are usually explored empirically by a series of simulations [15] started from many different scenarios. Unfortunately, the minimum number of scenarios required is not known. As Dickerson and Kosko pointed out in [8], the state space of an FCM contains attractor regions, and all simulations started from any point of the same region lead to the same FP. Thus, theoretically, one scenario per FP region would be enough, but the boundaries of those regions are initially unknown to the decision makers. In order to explore all FPs with high confidence, a tremendous number of scenarios has to be evaluated. Here, a practical problem arises: although the computational power required for a single simulation is not significant (each step contains only a matrix multiplication and a relatively simple threshold function evaluation; moreover, FCMs usually converge to FPs quickly), a large number of repeated simulations may need very long execution times. The situation is no better if one wants to find scenarios leading to a specific stable state (e.g. to be aware of which initial system configurations should be avoided). If this goal-oriented decision support problem is solved with a population-based evolutionary algorithm, as in [19], it also requires many repeated simulations and long running times. Despite these efforts, the dynamic behaviour of FCMs can never be mapped with 100% reliability by such empirical approaches; therefore, there is an obvious need for a faster and more reliable analytical method.

Unfortunately, it is not easy to provide an analytical method to determine the FPs themselves, or at least their number, or to analyse the stability of FCMs in general, even if they are similar to neural networks [6]. Conditions expressed by the weights of connections between FCM concepts that guarantee the existence and uniqueness of the FPs, were first introduced by Boutalis et al. [2, 3] for a special case when the steepness parameter value of the so-called sigmoid threshold function is \(\lambda = 1\). In [13], the authors generalized the results of [3] for arbitrary steepness parameters.

Some aspects of the problem of fixed points were also discussed by Knight et al. [21]. Stability issues of FCMs were also investigated by Lee and Kwon [28], where the authors presented an analytical condition for global exponential stability of FCMs based on the Lyapunov method. Moreover, they applied the theoretical results in clinical decision making in [27]. Luo et al. [29] studied the algebraic dynamics of k-valued fuzzy cognitive maps.

This paper addresses the same problem: the global stability of fuzzy cognitive maps. Naturally, the question of the significance and applicability of global asymptotic stability arises. If the FCM is globally asymptotically stable, then any initial stimulus leads to the same, unique fixed point. There are some applications where this property is useful. For example, in Section 4 of [21], the authors investigated a large and diverse industrial area called the Humber region (UK). Sixteen key concepts (bio-based energy production, by-products, competitiveness, etc.) and 27 weighted, directed connections were considered in the model, based on the stakeholders’ opinion. The ranking of the factors by importance was based on the activation values at the fixed point of the corresponding FCM. For the ranking to be unique, the authors required that the FCM have a unique, stable fixed point. To ensure the global stability of the fixed point, they used the mathematical results presented in the same article (in Sect. 5, we compare their mathematical findings with the results presented in this paper).

Nevertheless, in the majority of applications, a unique fixed point is not the desired scenario. Although global stability is not a preferred feature of the FCM in these cases, it is important to know which parameter sets lead to this disadvantageous property. The situation is somewhat similar to diabetes or high blood pressure: we want to avoid them, so we have to know everything (or at least a lot) about their causes. To summarize, we do not state that global stability is always an advantageous feature. Although it may be a curse rather than a blessing in certain cases, this feature potentially comes with FCM models, so we have to explore its causes, if only to be able to avoid it.

In this paper, we present some novel results on the global stability of FCMs. Moreover, we show that these results are better than the previous stability bounds known from the literature. Additionally, we show that the weight-independent condition for global stability reported earlier in the literature can be improved using our results. This new weight-independent condition is not only better than the previous one, but is also extremely simple. Finally, as a side result, we point out that a recent result regarding the existence and uniqueness of fixed points of FCMs is not valid.

The paper is organized as follows. In Sect. 2, we review the basic notions of fuzzy cognitive maps. Section 3 summarizes the basic mathematical concepts applied for the investigation of the problem of fixed points. Section 4 presents new theoretical achievements for globally asymptotically stable fixed points of FCMs, providing a better upper bound for the parameter of the threshold function. Moreover, conditions related to the structure of the FCM are also presented. In Sect. 5, the comparison of the result of the current paper and other authors’ findings is presented, pointing out that the approach of Sect. 4 gives a better upper bound for the parameter of the threshold function. The results of the paper are summarized in Sect. 6. In Appendix 1, we present the proof of Theorem 6, in Appendix 2 we discuss the validity of a recent result.

2 Basic notions of fuzzy cognitive maps

From the mathematical point of view, an FCM contains the following components: a weighted, directed graph expressing the causal relations between the concepts; and the updating rule, including the transformation function, which squashes the weighted sum of activation values into the allowed range (it is usually [0, 1], but sometimes \([-1,1]\)) [41]. In graph theory, the adjacency matrix contains the whole information about the connections in the graph. If we deal with a weighted, directed graph, then matrix W containing the weights (\(w_{ij} \in W\)) of the connections (and zeros, if there are no causal connections) stores the causalities of the model. The nonnegative number \(|w_{ij}|\) describes the strength of influence of concept \(C_j\) on concept \(C_i\); moreover, if \(w_{ij}>0\), then a positive change in the activation value of \(C_j\) causes a positive change in the activation value of \(C_i\); if \(w_{ij}<0\), then positive change causes negative change. The weighted sum of the incoming activation values is transformed into the required range. The transformation is computed by a threshold function. Well-known discrete threshold functions are the bivalent and trivalent functions, while in continuous case we find various sigmoid-like functions. In most of the cases, FCM users choose the sigmoid function, see Eq. (1).

$$\begin{aligned} f(x) = \frac{1}{1+e^{-\lambda x}} \end{aligned}$$
(1)

The steepness parameter \(\lambda >0\) controls the speed of transition from low values (close to zero) to high values (close to one). If \(\lambda\) is small, then the function is close to a linear function; if \(\lambda\) is large, then the function is similar to the Heaviside function.

FCM simulation starts with a vector of initial activation values \(A(0)=[A_1(0),\ldots , A_n(0)]^{\mathrm{T}}\). In each simulation step, the activation vector is re-calculated according to the updating rule. The simulation ends when (i) the activation vector has stabilized, or (ii) the number of iteration steps reaches the prescribed maximum. In some applications, the updating rule contains self-feedback, but in other cases self-feedback is not preferred. The general form of the updating rule is

$$\begin{aligned} A_i(k) = f_i\left( \sum _{j=1,j\ne i}^n w_{ij}A_j(k-1) + d_i A_i(k-1)\right) . \end{aligned}$$
(2)

Here, \(A_i(k)\) is the activation value of concept \(C_i\) at simulation step k, \(f_i\) is the threshold function applied at concept \(C_i\), \(w_{ij}\) is the weight of causal edge from \(C_j\) to \(C_i\), \(d_i\) is the strength of the self-feedback. If \(d_i=0\), then there is no self-feedback, as it was used in the first FCM models. Although self-feedbacks were not allowed in Kosko’s original FCMs and are also avoided in some applications, they may be useful in specific cases. Without self-feedbacks, the activation value of a concept is defined by other concepts only. It is not realistic in some cases, however. It is easy to imagine a car, where the speed of the car not only depends on, e.g. the current position of the gas pedal, but on the speed of the car at the previous moment as well. In this example, the current velocity is not independent of the speed measured at the previous time step and the driver can also influence the speed by pushing the gas pedal. Many other, similar examples can be given, where a concept has some kind of ‘memory’. The intensity of that memory can be expressed by the weight of the self-feedback (\(d_i\)). The theoretical background of self-feedbacks is already laid [11, 40], and several real-life examples can be found for their application [7, 22, 38] as well.

Self-feedback can be built into the weight matrix (\(d_i\)s into the diagonal, i.e. \(w_{ii}=d_i\)), then the updating rule turns into the following:

$$\begin{aligned} A_i(k) = f_i \left( \sum _{j=1}^n w_{ij}A_j(k-1) \right) . \end{aligned}$$
(3)

In the present paper, we use this type of W, so in our terminology the weight matrix already contains the possible self-feedback, i.e. if self-feedback is applied then the diagonal of W contains the weights of the feedback (\(d_i\)s), if not, then the diagonal of W contains zeros.
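The simulation loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3-concept weight matrix, the initial vector, the tolerance `tol` and the step limit `max_steps` are hypothetical choices, and the helper names `sigmoid` and `simulate_fcm` do not come from the paper. The code applies Eq. (3) with the sigmoid of Eq. (1) and stops on either of the two termination criteria.

```python
import numpy as np

def sigmoid(x, lam=1.0):
    """Sigmoid threshold function of Eq. (1) with steepness parameter lam."""
    return 1.0 / (1.0 + np.exp(-lam * x))

def simulate_fcm(W, A0, lam=1.0, tol=1e-9, max_steps=1000):
    """Iterate A(k) = f(W A(k-1)), i.e. Eq. (3), with the same f for all concepts.

    Stops when the largest change of an activation value drops below tol
    (stabilization) or when max_steps is reached.
    Returns (last activation vector, number of steps, converged flag).
    """
    A = np.asarray(A0, dtype=float)
    for step in range(1, max_steps + 1):
        A_next = sigmoid(W @ A, lam)
        if np.max(np.abs(A_next - A)) < tol:
            return A_next, step, True
        A = A_next
    return A, max_steps, False

# Hypothetical 3-concept map: W[i, j] is the weight of the edge C_j -> C_i,
# and the zero diagonal means that no self-feedback is used.
W = np.array([[0.0, 0.6, -0.4],
              [0.5, 0.0, 0.3],
              [-0.7, 0.2, 0.0]])
A_final, steps, converged = simulate_fcm(W, [0.2, 0.9, 0.5], lam=1.0)
```

Self-feedback would be modelled here simply by writing the values \(d_i\) into the diagonal of `W`, in line with the convention adopted in this paper.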

3 Mathematical tools

In this section, we briefly summarize the mathematical notions and tools applied in Sects. 4 and 5. For more detailed and precise information about fixed points and fixed point theorems we refer to [35], while for linear algebra and matrix analysis, see [17]. A fixed point of a function G is a point of the state space that G maps to itself: \(G(x^*)=x^*\). A fixed point \(x^*\) is locally asymptotically stable if the iteration converges to \(x^*\) whenever it is started at a point close enough to \(x^*\). If the iteration converges to \(x^*\) for every initial value, then \(x^*\) is a globally asymptotically stable fixed point.

According to Brouwer’s theorem (see [35], pp. 296–299), every continuous function, which maps a convex, bounded and closed set \(K \subset \mathbb {R}^n\) to itself has (at least one) fixed point. Consequently, this theorem ensures the existence of at least one fixed point for any fuzzy cognitive map with continuous threshold function. Since the FCM reasoning is based on the limit of the iteration, we may wonder under what conditions this limit does exist. If by the application of the iteration rule the activation vectors get closer and closer to each other and their difference goes to the zero vector, then the iteration will converge to a certain point. In this case, points of the state space get closer to each other by applying a certain function. This property can be formalized as follows (see [36], page 220):

‘Let \((X,d)\) be a metric space, with metric d. If \(\varphi\) maps X into X and if there is a number \(c<1\) such that

$$\begin{aligned} d\left( \varphi (x),\varphi (y)\right) \le c d(x,y) \end{aligned}$$
(4)

for all \(x,y \in X\), then \(\varphi\) is said to be a contraction of X into X.’

The famous contraction mapping theorem (a.k.a. Banach’s fixed point theorem, see [36], pp. 220–221 or [35], pp. 236–237) states that if a mapping is a contraction over a nonempty complete metric space, then it has exactly one fixed point. The proof of this statement tells more than the theorem: this fixed point can be found as a limit of the iteration \(x_{n+1} = G( x_{n})\), starting from an arbitrary point in the state space. Since the iteration converges to this unique equilibrium point from any initial values, this fixed point is asymptotically stable in the global sense.
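As a small illustration of the theorem (not taken from the paper), consider the scalar map \(\varphi (x)=0.5\cos x\) on the real line. Its derivative is bounded by 0.5 in absolute value, so it is a contraction with \(c=0.5\). The sketch below, with the hypothetical helper `iterate_to_fixed_point`, starts the iteration from several arbitrary points and collects the limits.

```python
import math

def iterate_to_fixed_point(phi, x0, tol=1e-12, max_steps=10_000):
    """Repeat x <- phi(x) until two consecutive iterates differ by less than tol."""
    x = x0
    for _ in range(max_steps):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_steps")

def phi(x):
    # |phi'(x)| = 0.5 * |sin(x)| <= 0.5 < 1, so phi is a contraction on the real line.
    return 0.5 * math.cos(x)

# Arbitrary starting points; the theorem predicts the same limit for all of them.
limits = [iterate_to_fixed_point(phi, x0) for x0 in (-10.0, 0.0, 3.0, 100.0)]
```

All entries of `limits` agree up to the chosen tolerance, which is the behaviour the theorem guarantees for every contraction on a complete metric space.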

In the results presented in Sect. 4, we prove the global asymptotic stability of the unique fixed point using the contraction mapping theorem. The theorems and proofs require basic knowledge of some notions from linear algebra, such as matrix norms, the spectral radius, and the relations and inequalities between them. For these facts, we refer to [17]. However, two theorems applied in Sect. 4 should be mentioned here. (Below, \(\rho (M)\) denotes the spectral radius of matrix M):

Theorem 1

(see [17], page 349) Let \(M \in \mathbb {R}^{n\times n}\) and \(\varepsilon >0\) be given. There is matrix norm \(\Vert \cdot \Vert\), such that \(\rho (M) \le \Vert M \Vert \le \rho (M)+\varepsilon\).

Theorem 2

(see [17], page 373) If \(\Vert *\Vert _m\) is a matrix norm, then there is a vector norm \(\Vert * \Vert _v\) that is compatible with it (i.e. \(\Vert Mx \Vert _v \le \Vert M \Vert _m \cdot \Vert x \Vert _v\)).

4 Conditions for the global stability of fuzzy cognitive maps

In this section, we prove two theorems for the global asymptotical stability of fixed points of FCMs. Consider again the updating rule of an FCM:

$$\begin{aligned} A_i(k) = f_i \left( \sum _{j=1,j\ne i}^n w_{ij}A_j(k-1) + d_i A_i(k-1)\right) \end{aligned}$$
(5)

Let us introduce the mapping \(G :\mathbb {R}^n \rightarrow \mathbb {R}^n\) that generates the next concept vector from the preceding one. In coordinates, the mapping reads:

$$\begin{aligned} A(k+1)= \left[ \begin{array}{c} A_1(k+1) \\ \vdots \\ A_n(k+1) \end{array} \right] = \left[ \begin{array}{c} f_1(w_1A(k)) \\ \vdots \\ f_n(w_n A(k)) \end{array} \right] = G(A(k)), \end{aligned}$$
(6)

where \(f_i\) is the transformation function assigned to the ith concept and \(w_i=(w_{i1},\ldots ,w_{in})\), \(w_{ij} \in W\). We know from Banach’s theorem that a contraction has a unique fixed point and it can be determined by an iteration starting from any point of the space. Since the FCM reasoning is based on an iteration, the application of this theorem is straightforward. Although we cannot compute the unique fixed point analytically from the given parameters (W and \(f_i\)s), we are able to state some conditions, for which the examined mapping G is a contraction. As we have seen from its definition, the notion of contraction requires a distance metric. This distance metric can be generated by a vector norm \(\Vert \cdot \Vert _v\) (we do not specify this norm, it can be an arbitrary vector norm), the distance of two concept vectors is defined as the norm of their difference. Using this vector norm, we can define a matrix norm (a.k.a. induced matrix norm or natural matrix norm) as \({ \Vert M \Vert _* = \sup \left\{ \frac{ \Vert M x \Vert _v}{ \Vert x \Vert _v} :x \in \mathbb {R}^n, \, x \ne \underline{0} \right\} }\). Besides matrix and vector norms, we are going to use the maximum value of the derivative of the threshold function. In general, a sigmoid-like threshold function is a monotone increasing, differentiable function, with finite limits at negative and positive infinity. Let f(x) be a threshold function of this type, then the maximal value of its derivative is finite, let us denote it by K. Then this K can be considered as a Lipschitz constant: \(\left| f(x)-f(y) \right| \le K \cdot \left| x-y \right|\) (Fig. 1).

Fig. 1 The sigmoid threshold function with parameters \(\lambda =5\) and \(\lambda =3\) (top) and their bell-shaped derivatives (bottom). The maximum value of the derivative is \(\lambda /4\), i.e. 1.25 and 0.75, respectively
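The value \(\lambda /4\) quoted in the caption follows from a short, standard computation for the sigmoid function of Eq. (1), sketched here for convenience:

$$\begin{aligned} f'(x) = \frac{\lambda e^{-\lambda x}}{\left( 1+e^{-\lambda x}\right) ^2} = \lambda f(x)\left( 1-f(x)\right) \le \frac{\lambda }{4}. \end{aligned}$$

The inequality holds because \(f(x)\in (0,1)\) and \(t(1-t)\le 1/4\) for every real t, with equality exactly at \(t=1/2\), i.e. at \(x=0\). Hence the maximum of the derivative is \(f'(0)=\lambda /4\).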

Theorem 3

Consider a fuzzy cognitive map (FCM) with weight matrix W. Moreover, let \(K_i\) be the maximum of the derivative of the threshold function \(f_i\) applied at the ith concept. If the inequality

$$\begin{aligned} \left\| {\mathrm {diag}}(K_i)\cdot W \right\| _* <1 \end{aligned}$$
(7)

holds with an induced matrix norm \(\Vert \cdot \Vert _*\), then the fuzzy cognitive map has the same fixed point for every initial activation vector.

Proof

Let

$$\begin{aligned} G(A)=\left[ f_1(w_1A),f_2(w_2A),\ldots ,f_n(w_nA)\right] ^{\mathrm{T}} \end{aligned}$$
(8)

where \(w_i =(w_{i1},\ldots ,w_{in})\). Moreover, let \(\Vert \cdot \Vert _v\) be the vector norm generating matrix norm \(\Vert \cdot \Vert _*\). In the following, we give an upper estimation of the value of \(\left\| G(A)-G(A^{\prime })\right\| _v\):

$$\begin{aligned}&\left\| G(A)-G(A^{\prime })\right\| _v \\&\quad =\left\| \left[ f_1(w_1A) - f_1(w_1A^{\prime }), \ldots , f_n(w_nA) - f_n(w_nA^{\prime })\right] ^{\mathrm{T}} \right\| _v \end{aligned}$$
(9)

From the Mean Value Theorem, we have

$$\begin{aligned} \left| f_i(w_iA)-f_i(w_iA^{\prime }) \right| \le K_i\left| w_iA-w_iA^{\prime }\right| . \end{aligned}$$
(10)

So we have

$$\begin{aligned}&\left\| G(A)-G(A^{\prime })\right\| _v \end{aligned}$$
(11)
$$\begin{aligned}&\quad \le \left\| \left[ K_1 (w_1A-w_1A^{\prime }), \ldots , K_n( w_nA-w_nA^{\prime }) \right] ^T \right\| _v \end{aligned}$$
(12)
$$\begin{aligned}&\quad = \left\| {\mathrm {diag}}(K_i)W(A-A^{\prime }) \right\| _v \end{aligned}$$
(13)
$$\begin{aligned}&\quad = \frac{ \left\| {\mathrm {diag}}(K_i) W(A-A^{\prime }) \right\| _v}{ \left\| A-A^{\prime } \right\| _v} \cdot \left\| A-A^{\prime } \right\| _v \end{aligned}$$
(14)
$$\begin{aligned}&\quad \le \left\| {\mathrm {diag}}(K_i) W \right\| _* \cdot \left\| A-A^{\prime } \right\| _v, \end{aligned}$$
(15)

where the last row is the consequence of the definition of matrix norm \(\Vert \cdot \Vert _*\). If \(\left\| {\mathrm {diag}}(K_i) W \right\| _* <1\), then G is a contraction mapping. It means that starting from any arbitrary initial values, the repetitive application of the FCM updating rule leads to the same equilibrium point. \(\square\)

In Theorem 3, we specified neither the matrix norm (it can be, for example, \(\Vert \cdot \Vert _2\)) nor the threshold function \(f_i\) or the maximum value \(K_i\) of its derivative. If we choose them appropriately, we get some additional interesting results:

  • If the threshold function is the most widely used one (i.e. \(f_i(x)=\frac{1}{1+e^{-\lambda _i x}}\)), then \(K_i = \frac{\lambda _i}{4}\) and the condition turns to \(\left\| {\mathrm {diag}}(\lambda _i) W \right\| _* <4\).

  • If the threshold function is the same sigmoid function for all the concepts, i.e. \(f_i(x)=f(x)=\frac{1}{1+e^{-\lambda x}}\), then \(K_i = \frac{\lambda }{4}\), so the condition reduces to \(\Vert W \Vert _* < \frac{4}{\lambda }\). In other words, if parameter \(\lambda <4/\Vert W \Vert _*\), then every initial stimulus leads to the unique fixed point. It is clear that if \(\Vert W \Vert _*\) is smaller, then global stability is ensured for a larger set of possible values of \(\lambda\).

  • Weighted in-degree and weighted out-degree are widely used descriptive measures of networks. If the matrix norm is the 1-norm or the \(\infty\)-norm, then the condition for global stability can be expressed by the weighted in-degree and weighted out-degree, similarly to what was done for input-output FCMs in [14]. Nevertheless, although weighted in-degree and out-degree are very useful for the descriptive analysis of networks, the convergence conditions expressed by them can easily be outperformed by conditions based on other matrix norms.
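The norm-based conditions listed above can be evaluated numerically. The following sketch assumes a common steepness parameter \(\lambda\) for all concepts and a hypothetical 3-concept weight matrix; degrees are taken with absolute weights. With the convention of Sect. 2 (\(w_{ij}\) is the weight of the edge from \(C_j\) to \(C_i\)), absolute row sums are weighted in-degrees and absolute column sums are weighted out-degrees.

```python
import numpy as np

# Hypothetical weight matrix; W[i, j] is the weight of the edge C_j -> C_i.
W = np.array([[0.0, 0.6, -0.4],
              [0.5, 0.0, 0.3],
              [-0.7, 0.2, 0.0]])

# Largest absolute row sum = largest weighted in-degree = infinity norm.
max_in_degree = np.abs(W).sum(axis=1).max()
# Largest absolute column sum = largest weighted out-degree = 1-norm.
max_out_degree = np.abs(W).sum(axis=0).max()
# The 2-norm (largest singular value) for comparison.
spectral_norm = np.linalg.norm(W, 2)

# Each induced norm gives a sufficient condition lambda < 4 / ||W||.
lambda_bounds = {
    "inf-norm (in-degree)": 4.0 / max_in_degree,
    "1-norm (out-degree)": 4.0 / max_out_degree,
    "2-norm": 4.0 / spectral_norm,
}
```

Any \(\lambda\) below the largest entry of `lambda_bounds` satisfies the condition of Theorem 3 for at least one induced norm, which is already sufficient for global stability.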

Based on the relation between spectral radius and induced matrix norms, we show a better condition for global asymptotical stability of FCMs (\(\rho (\cdot )\) denotes the spectral radius of the matrix).

Theorem 4

Consider a fuzzy cognitive map (FCM) with weight matrix W. Moreover, let \(K_i\) be the maximum of the derivative of the threshold function \(f_i\) applied at the ith concept. If

$$\begin{aligned} \rho \left( {\mathrm {diag}}(K_i)\cdot W \right) <1, \end{aligned}$$
(16)

then the fuzzy cognitive map has the same fixed point for every initial activation vector.

Proof

We have seen already that if \(\left\| {\mathrm {diag}}(K_i)\cdot W \right\| _* <1\) with a matrix norm, then the mapping generating the iteration is a contraction. Moreover, if

$$\begin{aligned} \rho ({\mathrm {diag}}(K_i)\cdot W) <1 \end{aligned}$$
(17)

then Theorem 1 ensures the existence of a matrix norm (let us denote it by \(\Vert \cdot \Vert _M\)) such that

$$\begin{aligned} \left\| {\mathrm {diag}}(K_i)W \right\| _M <1 \end{aligned}$$
(18)

Given this matrix norm, Theorem 2 ensures the existence of a compatible vector norm to this matrix norm (let us denote it by \(\Vert \cdot \Vert _v\)). If we measure the distance of the concept vectors with this norm, then we have

$$\begin{aligned} \left\| G(A)-G(A^{\prime })\right\| _v&\le \left\| {\mathrm {diag}}(K_i)W (A-A^{\prime }) \right\| _v \end{aligned}$$
(19)
$$\begin{aligned}&\quad\quad\quad\le \left\| {\mathrm {diag}}(K_i)W \right\| _M \cdot \left\| A-A^{\prime } \right\| _v \end{aligned}$$
(20)

According to Eq. (18), the coefficient of \(\left\| A-A^{\prime } \right\| _v\) is less than one. It means that using distance metric \(d(x,y)=\left\| x-y \right\| _v\), mapping G is a contraction. Similar to the previous theorem, it means that starting the iteration from anywhere in the state space, the iterative FCM updating leads to the same equilibrium point. \(\square\)

Since the spectral radius is the infimum of the induced matrix norms, the condition using \(\rho (W)\) is better than the conditions provided by matrix norms.

In a special case, when the threshold function is the sigmoid function and the slope parameter is the same for every concept (i.e. \(\lambda _1=\lambda _2=\ldots =\lambda _n=\lambda\)), the condition simplifies to

$$\begin{aligned} \rho \left( W \right) <\frac{4}{\lambda } \end{aligned}$$
(21)
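A hedged numerical check of condition (21) is sketched below. The weight matrix, the value \(\lambda =2\), the number of random initial vectors and the fixed number of iteration steps are illustrative assumptions, and the helpers `spectral_radius` and `run_fcm` are hypothetical names. The script first tests the condition and then iterates the FCM from several random initial activation vectors to compare the limits.

```python
import numpy as np

def spectral_radius(M):
    """Largest absolute value of the eigenvalues of M."""
    return np.max(np.abs(np.linalg.eigvals(M)))

def run_fcm(W, A0, lam, steps=2000):
    """Apply the sigmoid updating rule of Eq. (3) a fixed number of times."""
    A = np.asarray(A0, dtype=float)
    for _ in range(steps):
        A = 1.0 / (1.0 + np.exp(-lam * (W @ A)))
    return A

# Hypothetical weight matrix and steepness parameter.
W = np.array([[0.0, 0.6, -0.4],
              [0.5, 0.0, 0.3],
              [-0.7, 0.2, 0.0]])
lam = 2.0

rho = spectral_radius(W)
condition_holds = rho < 4.0 / lam  # condition (21)

# Start from 20 random initial activation vectors in [0, 1)^3.
rng = np.random.default_rng(0)
limits = [run_fcm(W, rng.random(3), lam) for _ in range(20)]
# Largest coordinate-wise deviation between the limits.
spread = max(np.max(np.abs(a - limits[0])) for a in limits)
```

For this matrix the condition holds, and `spread` is numerically negligible, in agreement with the statement that every initial stimulus leads to the same fixed point.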

Remark 1

In Theorems 3 and 4, we assumed the differentiability of the continuous threshold function. Differentiability is an advantageous property in learning the weights of the FCM; nevertheless, it is not a necessity. Theoretically, one may choose another continuous threshold function, for example a continuous piecewise linear function. On the other hand, one may recognize that in the proof of Theorem 3, we used the maximal value of the derivative as a Lipschitz constant. Consequently, Theorems 3 and 4 are valid for every Lipschitz continuous threshold function, with the modification that \(K_i\) is the Lipschitz constant belonging to threshold function \(f_i\).

5 Comparison of the results with previous results

In this section, we briefly summarize previous theoretical research on the problem of fixed points of FCMs, and compare these results with the results presented in the previous section.

5.1 Comparison with the results of Boutalis et al.

To the best of our knowledge, the first theoretical study discussing the existence and uniqueness of fixed points of FCMs was given by Boutalis et al. [2, 3]. They investigated the case when the transformation function is \(f(x)=1/(1+e^{-x})\), i.e. parameter \(\lambda\) equals one. They arrived at the conclusion that if the following inequality

$$\begin{aligned} \left( \sum _{i=1}^n \Vert w_i\Vert ^2 \right) ^{1/2} <4 \end{aligned}$$
(22)

holds, then the FCM has a unique fixed point and the iteration starting from an arbitrary initial activation vector eventually converges to this point (on the left, \(\Vert w_i\Vert = \sqrt{ w_{i1}^2 + w_{i2}^2 +\ldots + w_{in}^2}\)). Note that the expression in Eq. (22) is just the Frobenius norm of weight matrix W. Their findings were generalized for sigmoid FCMs equipped with arbitrary positive parameter \(\lambda\) in [13]. Namely, it was proved that if \(\Vert W \Vert _F < 4/\lambda\), then the iteration process of the FCM leads to the unique equilibrium point, independently from the initial activation values. The well-known inequalities between different matrix norms (and spectral radius) provide an easy way for comparison of results above and the results presented in Sect. 4. We can find a matrix norm \(\Vert \cdot \Vert _*\), such that \(\Vert W \Vert _* \le \Vert W \Vert _F\), thus the global convergence to a unique fixed point is proved for a larger set of possible values of parameter \(\lambda\). For example, we may choose the operator norm (\(\Vert \cdot \Vert _2\)), or in some cases the infinity norm (\(\Vert W \Vert _\infty\)) or the taxicab norm (\(\Vert W \Vert _1\)). Furthermore, we may take the spectral radius, since \(\rho (W) \le \Vert W \Vert _F\), thus \(4/\rho (W) \ge 4/\Vert W \Vert _F\). In Kottas et al. [25], the authors attempted to extend the results of [3]. We discuss this issue in Appendix 2.
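The ordering \(\rho (W) \le \Vert W \Vert _2 \le \Vert W \Vert _F\) and the resulting upper bounds on \(\lambda\) can be inspected numerically. The sketch below uses a hypothetical weight matrix and assumes a common \(\lambda\) for all concepts; it is an illustration of the comparison, not a computation from the cited works.

```python
import numpy as np

# Hypothetical weight matrix; W[i, j] is the weight of the edge C_j -> C_i.
W = np.array([[0.0, 0.6, -0.4],
              [0.5, 0.0, 0.3],
              [-0.7, 0.2, 0.0]])

fro = np.linalg.norm(W, "fro")              # Frobenius norm, as in [3] and [13]
two = np.linalg.norm(W, 2)                  # operator (spectral) norm
rho = np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius

# Upper bounds on lambda guaranteeing a unique, globally stable fixed point.
bounds = {
    "Frobenius norm": 4.0 / fro,
    "2-norm": 4.0 / two,
    "spectral radius": 4.0 / rho,
}
```

Because the three quantities are ordered, the bounds are ordered in the opposite direction: the spectral radius never gives a smaller admissible range for \(\lambda\) than the 2-norm, and the 2-norm never gives a smaller one than the Frobenius norm.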

5.2 Comparison with the findings of Knight et al.

In [21] Knight, Lloyd and Penn stated two theorems (Theorem 3.1 and 3.2) regarding the possible number of fixed points of fuzzy cognitive maps. Theorem 3.1 of [21] states that if parameter \(\lambda \ge 0\) of the sigmoid function is small enough then there is a unique fixed point, that is linearly stable. Conversely, the theorem also states that if \(\lambda \ge 0\) is large enough there can be multiple fixed points. In Theorem 3.2 of [21], they clarify the notion of small enough. We cite this theorem literally:

Theorem 3.2 of [21]: ‘For \(W \in \mathbb {R}^{n\times n}\) given, the sigmoid FCM has a unique fixed point for all \(\lambda\) such that \(0 \le \lambda \le \overline{\lambda }(n)\), this fix point is stable. \(\overline{\lambda }(n)\) satisfies

$$\begin{aligned} \left( 1- \frac{\overline{\lambda }(n)}{4} \right) ^n-\sum _{i=1}^n b_i C_i^n \left( \frac{\overline{\lambda }(n)}{4} \right) ^i=0 \end{aligned}$$
(23)

where \(C_i^n\) are the binomial coefficients, and \(b_i\) is given by the recursion relation \(b_i=ib_{i-1}+(-1)^i, \quad b_0=1\)’.

Both of the theorems of [21] deal with FCMs equipped with the same parameter of \(\lambda\) for all of the concepts. Moreover, the weight matrix W was not taken into consideration in the theorems (it is not a fault, but a possible loss of information). From our results it follows that Theorem 3.2 of [21] can be improved. First we prove that for \(\lambda < 4/n\) (n is the number of concepts of the FCM), the FCM has exactly one fixed point. Then we show that this extremely simple bound (4/n) is better than the bound provided by Eq. (23).

Theorem 5

Consider a sigmoid FCM with weight matrix \(W\in \mathbb {R}^{n \times n}\) and sigmoid parameter \(\lambda\). If \(0 \le \lambda < \frac{4}{n}\) then the FCM has a unique, globally asymptotically stable fixed point.

Proof

The statement below is an immediate consequence of Theorem 3. If

$$\begin{aligned} \left\| {\mathrm {diag}}(\lambda _i)\cdot W \right\| _2 <4 \end{aligned}$$
(24)

then there is exactly one fixed point. Since in the current case \(\lambda _1=\lambda _2=\ldots =\lambda _n=\lambda\), this becomes

$$\begin{aligned} \lambda \cdot \left\| W \right\| _2 <4 \end{aligned}$$
(25)

which implies

$$\begin{aligned} \lambda < \frac{4}{\left\| W \right\| _2} \end{aligned}$$
(26)

Moreover, we do know that \(\Vert \cdot \Vert _2 \le \Vert \cdot \Vert _F\). Thus, if \(\lambda < \frac{4}{\left\| W \right\| _F}\) holds, then \(\lambda < \frac{4}{\left\| W \right\| _2}\) holds, too. In weight matrix W, all of the entries are between \(-1\) and 1. Furthermore, by its definition, the Frobenius norm is \(\Vert W \Vert _F= \sqrt{ \sum _{i,j} w_{ij}^2 }\), so an upper estimation on the Frobenius norm of the weight matrix W is:

$$\begin{aligned} \left\| W \right\| _F \le n \end{aligned}$$
(27)

Since \(\frac{4}{n} \le \frac{4}{\left\| W \right\| _F}\), this completes the proof. \(\square\)

If self-feedbacks are not allowed, then we have \({ \left\| W \right\| _F \le \sqrt{n(n-1)} }\), resulting in the bound \(\lambda < \dfrac{4}{\sqrt{n(n-1)}}\), which is slightly higher than \(\dfrac{4}{n}\).

One may think that a better upper estimation can be given if we use another norm or the spectral radius of W, but this is not the case. Since the entries of W are between \(-1\) and 1 (and in this case we have no further information about the weights), the inequality \(\rho (W) \le n\) holds, with equality in the extreme case when \(w_{ij}=1\) for every i and j. The spectral radius therefore leads to the same upper bound, and consequently we cannot get a better bound with any matrix norm.

If we have more information about the weights, then this upper bound \(\frac{4}{n}\) can be further increased. For example, if \(W \in \mathbb {R}^{n\times n}\) has exactly k nonzero elements (i.e. the FCM has exactly k connections with nonzero weights), then \(\Vert W\Vert _F \le \sqrt{k} \le n\), and \(\frac{4}{\sqrt{k}} \ge \frac{4}{n}\).
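The three weight-independent bounds discussed above depend only on the number of concepts n and, optionally, on the number k of nonzero weights. A minimal sketch follows; the function name `weight_independent_bounds` is hypothetical, and the example uses the sizes of the Humber region model mentioned in the Introduction (16 concepts, 27 connections), assuming \(n \ge 2\).

```python
import math

def weight_independent_bounds(n, k=None):
    """Upper bounds on lambda that need no knowledge of the actual weights.

    n: number of concepts (n >= 2 assumed), k: number of nonzero weights.
    """
    bounds = {
        "self-feedback allowed": 4.0 / n,                         # ||W||_F <= n
        "no self-feedback": 4.0 / math.sqrt(n * (n - 1)),         # ||W||_F <= sqrt(n(n-1))
    }
    if k is not None and k > 0:
        bounds["k nonzero weights"] = 4.0 / math.sqrt(k)          # ||W||_F <= sqrt(k)
    return bounds

# Sizes of the Humber region model: 16 concepts and 27 connections.
b = weight_independent_bounds(n=16, k=27)
```

With these sizes, the bound 4/n equals 0.25, while the bound based on the number of connections is roughly three times larger, which shows how much additional structural information can help.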

Theorem 5 improves Theorem 3.2 of [21], since it ensures the uniqueness of fixed points for a larger set of values of \(\lambda\). This can be observed in Fig. 2, and the next theorem shows that the inequality \(\overline{\lambda }(n) \le \frac{4}{n}\) holds for every FCM (with equality for \(n=1,2\)).

Fig. 2 Proven upper bounds on parameter \(\lambda\) for global stability, \(\overline{\lambda }(n)\) (blue bullet) and 4/n (orange cross), versus the number of concepts (n). The values of 4/n are slightly higher than the values of \(\overline{\lambda }(n)\) (color figure online)

Theorem 6

Let \(\overline{\lambda }(n)\) be defined as in Eq. 23. Then the inequality

$$\begin{aligned} \overline{\lambda }(n) \le \frac{4}{n} \end{aligned}$$
(28)

holds for every \(n\ge 1\).

Proof

See Appendix 1. \(\square\)

5.3 Comparison with the result of Lee and Kwon

Lee and Kwon examined the problem of equilibrium points via Lyapunov stability analysis [28]. In their approach, the inference rule is described by the following equation (we changed the notations for convenience):

$$\begin{aligned} A_i(k)=f\left( r_1 A_i(k-1) + r_2\sum _{j=1,j \ne i}^n w_{ij}A_j(k-1) \right) \end{aligned}$$
(29)

Here, f is the sigmoid function and its parameter \(\lambda\) is the same for every concept. They proved the following criterion for global exponential stability (see [28]):

‘If the inequality

$$\begin{aligned} 0< \lambda < \frac{4}{r_1 + r_2 \Vert W^* \Vert _2} \end{aligned}$$
(30)

holds, then the equilibrium point of the corresponding FCM is globally exponentially stable’.

We note that in their approach the weight matrix does not contain the self-feedback; instead, the self-feedback is expressed by the term \(r_1\). For this reason, their weight matrix is denoted here by \(W^*\).

To compare their result with the findings of Sect. 4, we start with a lower bound of the denominator of Eq. (30):

$$\begin{aligned} r_1 + r_2 \left\| W^* \right\| _2&= \left\| {\mathrm {diag}}(r_1) \right\| _2 + r_2 \Vert W^* \Vert _2 \end{aligned}$$
(31)
$$\begin{aligned}&\quad\quad\quad=r_2\left\| {\mathrm {diag}}\left( r_1/r_2\right) \right\| _2 + r_2 \Vert W^* \Vert _2 \end{aligned}$$
(32)
$$\begin{aligned}&\quad\quad\quad\ge r_2\left\| {\mathrm {diag}}\left( r_1/r_2\right) + W^* \right\| _2 \end{aligned}$$
(33)

In our approach, \(r_2=1\) and \(r_1=d_i\), so we get

$$\begin{aligned} r_2\Vert W^* + {\mathrm {diag}}(r_1/r_2) \Vert _2&=\Vert W^* + {\mathrm {diag}}(d_i) \Vert _2 = \Vert W \Vert _2 \\&\le \Vert W^* \Vert _2 + \Vert {\mathrm {diag}}(d_i) \Vert _2 \end{aligned}$$
(34)

where W stands for the weight matrix including self-feedback (the \(d_i\) values on the diagonal). Based on the inequality above, we get that

$$\begin{aligned} \frac{4}{ r_1 + r_2 \left\| W^* \right\| _2 } \le \frac{4 }{ \Vert W \Vert _2} \end{aligned}$$
(35)

The last inequality ensures that the bound for parameter \(\lambda\) provided in Sect. 4 is at least as good as (i.e. never smaller than) the bound given in [28].
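The comparison in Eq. (35) can be checked numerically. The sketch below is only illustrative: the matrix `W_star`, the common self-feedback weight `d` and the two helper functions are hypothetical choices made here, with \(r_2=1\) and \(r_1=d\) as in the derivation above.

```python
import numpy as np


def lee_kwon_bound(W_star, r1, r2):
    """Bound of Eq. (30): 4 / (r1 + r2 * ||W*||_2)."""
    return 4.0 / (r1 + r2 * np.linalg.norm(W_star, 2))


def spectral_norm_bound(W):
    """Bound of Eq. (26): 4 / ||W||_2, with self-feedback on the diagonal of W."""
    return 4.0 / np.linalg.norm(W, 2)


# Hypothetical off-diagonal weights W* and a common self-feedback weight d.
W_star = np.array([
    [0.0, 0.8, -0.4],
    [-0.6, 0.0, 0.5],
    [0.3, -0.9, 0.0],
])
d = 0.5  # plays the role of r1, with r2 = 1

W_full = W_star + d * np.eye(3)  # W = W* + diag(d)
bound_lk = lee_kwon_bound(W_star, r1=d, r2=1.0)
bound_norm = spectral_norm_bound(W_full)
print(f"Lee-Kwon bound:    {bound_lk:.4f}")
print(f"4 / ||W||_2 bound: {bound_norm:.4f}")
```

By the triangle inequality used in Eq. (34), the first printed value cannot exceed the second, whatever admissible weights are chosen.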

5.4 Example and ordering of the bounds

As we have seen, various approaches provide different bounds with the property that if the steepness parameter (\(\lambda\)) of the sigmoid function is less than a number computed from some parameters of the model, then the FCM iteration rule produces the same equilibrium point for every initial activation vector. Based on the properties of matrix norms and the spectral radius, a simple ordering of the proven bounds can be given (\(\overline{\lambda }\) refers to the bound by Knight et al.):

$$\begin{aligned} \overline{\lambda } \le \frac{4}{n} \le \frac{4}{ \Vert W \Vert _F} \le \frac{4}{ \Vert W \Vert _2} \le \frac{4}{ \rho (W)} \end{aligned}$$
(36)

The following toy example illustrates how much these bounds can differ in practice. Consider a fuzzy cognitive map with weight matrix W:

$$\begin{aligned} W=\left( \begin{array}{rrrrr} 0 &{} -1 &{} 0.5 &{} 0 &{} 0 \\ 0 &{} 0 &{} -0.5 &{} 0.5 &{} -0.5 \\ -1 &{} 1 &{} 0 &{} -0.5 &{} 0 \\ 0 &{} 1 &{} -1 &{} 0 &{} -0.5 \\ -1 &{} 0 &{} 1 &{} -0.5 &{} 0 \end{array} \right) \end{aligned}$$
(37)

The threshold function is the same for all the concepts, \({ f(x)=\frac{1}{1+e^{-\lambda x}} }\). Table 1 shows the different upper bounds on parameter \(\lambda\), with proven global stability, provided by different methods (i.e. if the value of \(\lambda\) is less than the given number, then the FCM is globally asymptotically stable). We can observe that the method applying the spectral radius gives the largest, and hence the best, bound. Numerical experiments show that the unique fixed point loses its global stability at about \(\lambda \approx 3.6708.\)

Table 1 Comparison of different bounds on \(\lambda\), using weight matrix W (Eq. 37)
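The norm-based bounds of Eq. (36) can be recomputed for the weight matrix of Eq. (37) with standard numerical routines. The following is a minimal sketch using NumPy; the function name `lambda_bounds` and the dictionary keys are assumptions made here, and the bound \(\overline{\lambda }\) of Knight et al. is omitted because its definition (Eq. 23) lies outside this section.

```python
import math

import numpy as np

# Weight matrix W of Eq. (37).
W = np.array([
    [0.0, -1.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, -0.5, 0.5, -0.5],
    [-1.0, 1.0, 0.0, -0.5, 0.0],
    [0.0, 1.0, -1.0, 0.0, -0.5],
    [-1.0, 0.0, 1.0, -0.5, 0.0],
])


def lambda_bounds(W):
    """Return the bounds 4/n, 4/||W||_F, 4/||W||_2 and 4/rho(W)."""
    n = W.shape[0]
    frobenius = np.linalg.norm(W, "fro")
    spectral_norm = np.linalg.norm(W, 2)  # largest singular value
    spectral_radius = np.max(np.abs(np.linalg.eigvals(W)))
    return {
        "4/n": 4.0 / n,
        "frobenius": 4.0 / frobenius,
        "spectral_norm": 4.0 / spectral_norm,
        "spectral_radius": 4.0 / spectral_radius,
    }


bounds = lambda_bounds(W)
for name, value in bounds.items():
    print(f"{name:>16}: {value:.4f}")
```

For this matrix \(n=5\) and \(\Vert W \Vert _F=\sqrt{8.75}\), so the first two entries are 0.8 and approximately 1.352; the last two depend on the numerically computed singular values and eigenvalues.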

6 Summary

Fuzzy cognitive map-based reasoning relies on the behaviour of repeated application of the updating rule, i.e. it depends on the behaviour of an iteration. This iteration may or may not converge to an equilibrium point (fixed point). If the iteration converges to a fixed point, then this fixed point (and the whole FCM model) may or may not be globally asymptotically stable. Moreover, if the model is globally asymptotically stable, then this fixed point is unique and the iteration arrives at this point from every initial state. In other words, if the model is globally asymptotically stable, then the system reaches the same equilibrium point, regardless of the initial stimulus.
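Global stability can be illustrated by running the iteration from several initial states. The sketch below assumes the update rule \(A(k+1)=f(W A(k))\) with the sigmoid threshold function; the helper names, the choice \(\lambda =1\) and the random initial vectors are assumptions made here. The matrix is the one from Eq. (37), for which \(\lambda =1\) lies below \(4/\Vert W \Vert _F\) and therefore below \(4/\Vert W \Vert _2\).

```python
import numpy as np


def sigmoid(x, lam):
    """Sigmoid threshold function with steepness parameter lam."""
    return 1.0 / (1.0 + np.exp(-lam * x))


def iterate_fcm(W, a0, lam, steps=500):
    """Apply the assumed update rule A(k+1) = f(W A(k)) a fixed number of times."""
    a = np.asarray(a0, dtype=float)
    for _ in range(steps):
        a = sigmoid(W @ a, lam)
    return a


# Weight matrix W of Eq. (37).
W = np.array([
    [0.0, -1.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, -0.5, 0.5, -0.5],
    [-1.0, 1.0, 0.0, -0.5, 0.0],
    [0.0, 1.0, -1.0, 0.0, -0.5],
    [-1.0, 0.0, 1.0, -0.5, 0.0],
])
lam = 1.0  # illustrative value below 4 / ||W||_2

rng = np.random.default_rng(0)
starts = rng.random((10, 5))  # ten random initial activation vectors in [0, 1]^5
finals = np.array([iterate_fcm(W, a0, lam) for a0 in starts])

# Largest componentwise difference between the final states of all runs.
spread = np.max(np.abs(finals - finals[0]))
print("fixed point:", np.round(finals[0], 4))
print("largest deviation between runs:", spread)
```

Because the mapping is a contraction for this value of \(\lambda\), all ten runs should end at the same activation vector up to rounding error. For larger values of \(\lambda\) no such agreement is guaranteed by the bounds.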

It has been previously known from the literature that, in the case of sigmoidal threshold functions, this property is related to the value of the steepness parameter \(\lambda\). Namely, if the value of \(\lambda\) is small (i.e. the transition from close to zero to close to one is not so drastic), then the FCM has the global stability property. Moreover, it has also been clear that this property is influenced by the structure of the weighted connections of the network (i.e. weight matrix W).

In this paper, several novel analytical conditions have been presented for the global stability of fuzzy cognitive maps. These conditions involve the usual parameters of the model, namely the weight matrix and the parameter of the threshold function. Compared with the existing results in the literature, these conditions are simpler and give a larger upper bound on the parameter of the threshold function. Moreover, the corresponding matrix norms and spectral radius can be easily determined by free mathematical software.

The results presented in this paper can be used in at least two different ways. In some applications, a unique fixed point is a required property of the model, which means that different initial stimuli should lead to the same equilibrium state. On the other hand, there are applications (for example pattern recognition) where the FCM should have more than one equilibrium point. In other words, in the first case, global stability is a required property, while in the second case we should avoid globally stable models. The simple analytical results help FCM users to choose some model parameters before evaluating the full model, decreasing the number of trial-and-error simulations.