Introduction

Uncertainty is an inherent feature of the objective world, and as a result the human cognitive processes of information processing and knowledge acquisition are themselves characterized by significant uncertainty. In addition, from the viewpoint of visual perceptual topology and functional hierarchy, human cognition is characterized by “overall priority”, i.e. global cognition takes precedence over local features [1]. Currently, the demand for analysing and mining large amounts of data is increasing dramatically. Therefore, studying the expression and processing of uncertain knowledge, finding patterns in it, and allowing machines to simulate human cognitive processes have become popular research topics in the field of artificial intelligence (AI) [2, 3]. Many models, such as probabilistic models [4], fuzzy set models [5] and rough set models [6], have been used to address uncertain information, and it is common to do so through a membership function. However, thus far, the essence of the membership function has never been given a convincing explanation. In the human cognitive process, humans are more sensitive to qualitative concepts than to quantitative data. For example, people do not have an exact concept of 38-degree weather, but they will feel hot in such weather. This means that data without special semantics are meaningless [7]. Therefore, the uncertainty of human cognition needs to be studied so that AI algorithms can form the same understanding and judgement as humans [8,9,10].

Although AI algorithms exhibit excellent performance in some respects, the mechanisms of the methods they use are not fully understood. To emulate the human cognitive process, in the early 1990s, Deyi Li, an academician from China, proposed a model that combines fuzziness and randomness: the cloud model [11]. This model utilizes a forward cloud generator and a backward cloud generator to convert between qualitative concepts and quantitative data. The cloud model also performs bidirectional cognition of the extension and connotation of concepts, which simulates the human cognitive process and provides a good solution for representing and processing uncertainty [12]. In addition to representing uncertainty during data processing, cloud models can largely provide methods for analysing qualitative concepts in a way that is closer to human cognition [13]. With continuous research, cloud model theory has become general and complete. Subsequently, similarity algorithms for cloud models have gradually become an essential part of cloud model theory research and have important applications in data mining [14], collaborative filtering for recommendation [15], system evaluation [16,17,18], and time series prediction [19].

The cloud model reflects the vagueness and randomness of concepts in human knowledge and provides a new method for studying uncertain AI. The emergence of fuzzy mathematics enabled a great leap from determinism through randomness to fuzziness; this approach is closer to real life and can be used to solve many practical problems. Transportation problems (TPs) and assignment problems (APs) are widely studied and optimized cases in the field of fuzzy mathematics, and Senthil Kumar et al. have made great contributions to this field [20,21,22]. They classified TPs in mixed deterministic and uncertain environments and formulated TPs using crisp, triangular fuzzy and trapezoidal fuzzy numbers. To find the mixed optimal solution, the P. Senthil Kumar (PSK) method [21, 23], which is very simple and easy to understand, was proposed. This method has been generalized to solve intuitionistic fuzzy solid transportation problems [24], hybrid and type-4 intuitionistic fuzzy solid transportation problems (IFSTPs) [25], and many other engineering examples. In addition, Ali Mohammadi et al. [26,27,28] have made many contributions in the field of intelligent optimization, including extensive reviews and analyses of the inclined planes system optimization (IPO) algorithm, and have proposed an improved version of the IPO algorithm (IIPO) [26]. The IIPO algorithm offers high accuracy in coefficient estimation, convergence, adaptability, stability and reliability. This line of algorithm improvement and comparison provides a good reference for this paper.

The innovations of this paper are as follows:

  1. From the perspective of fuzzy mathematics, the triangular cloud model, an extension of the normal cloud model, is taken as the research object because of its fuzzy and random nature.

  2. By introducing the \({D}_{\text{T}}\) distance of symmetric triangular fuzzy numbers, the concept of exponential closeness is proposed. In this paper, exponential closeness based on triangular fuzzy numbers is used to describe the similarity of the distances between two sets of cloud models.

  3. The DDTSTCM similarity algorithm is proposed, which uses the variance of the cloud model to measure the shape similarity of the cloud model and combines the distance similarity with the shape similarity.

  4. Three groups of experiments are set up to verify and summarize the advantages and disadvantages of the existing algorithms and the proposed cloud model similarity algorithm against four evaluation indices. Engineering case studies verify that the algorithm proposed in this paper has practical engineering application value.

The contributions of this paper are as follows:

  1. From the perspective of fuzzy mathematics, triangular fuzzy numbers are utilized to study the distance similarity of cloud models, which provides a new research idea.

  2. This paper avoids calculating and optimizing similarity based on a single feature and considers the degree of similarity of cloud models from the two perspectives of distance and shape, which yields better theoretical interpretability.

  3. The proposed algorithm is compared with seven other algorithms in terms of discriminability, efficiency, stability and theoretical interpretability, and corresponding simulation experiments are set up to verify its superiority; this provides a reference for scholars comparing similarity algorithms and designing specific simulation experiments in the future.

  4. Finally, engineering case studies with actual operational data are analysed and validated to evaluate and test the proposed algorithm in the field of engineering projects.

The rest of the paper is organized as follows. The second section compiles and summarizes the existing cloud model similarity algorithms and provides a literature review. The third section introduces the definitions and theorems involved in the algorithm of this paper and briefly introduces the concept of the cloud model and the definition of triangular fuzzy number closeness. The fourth section proposes a comprehensive measure of the similarity of triangular cloud models from the perspective of distance similarity and shape similarity. In the fifth section, three groups of experiments are designed based on the four algorithm evaluation indices to analyse the advantages and practical application effects of DDTSTCM, and the practical engineering application value of the algorithm is discussed for specific engineering examples. The sixth section summarizes the advantages and disadvantages of DDTSTCM and discusses the shortcomings of the research in this paper and the outlook for the future.

Literature review

Currently, researchers have proposed many similarity algorithms for various problems [29,30,31]. The existing cloud model similarity algorithms can be classified into three main categories, within which many scholars have summarized and developed novel similarity algorithms, particularly in terms of cloud model characteristic curves.

  1. Conceptual extension. Zhang et al. [32] first proposed calculating the average distance of an extension to judge whether two cloud models are similar. Cai Fang and Zhao [33] proposed the similarity cloud measurement (SCM) and improved an interval-based cloud model similarity measure. Wang et al. [34] proposed fuzzy distance-based similarity (FDCM) based on α-cuts. Dai et al. [35] proposed the envelope area of the contribution based on the cloud model (EACCM). However, conceptual extension algorithms calculate similarity via stochastic simulation, which increases the computational complexity.

  2. Numerical features. Zhang et al. [36] characterized similarity by the cosine angle likeness comparison method based on a cloud model (LICM). Zhang et al. [37] proposed multidimensional cloud models based on fuzzy similarity (MSCM). These algorithms combine the digital features to calculate the similarity of cloud models numerically, but they fail to address the intrinsic connection between the three parameters of the cloud model well, and there is still room for further improvement.

  3. Characteristic curves. Li et al. [38] proposed the area proportionality algorithm based on the expectation curve (ECM), in which the intersection area bounded by the expectation curves of two cloud models and the horizontal axis represents the similar components and is taken as the similarity of the cloud models. Inspired by the relationship between the Gaussian distribution and the Gaussian cloud model (GCM), researchers have utilized distances between probability distributions, i.e. the Kullback–Leibler divergence (KLD) [39], the earth mover's distance (EMD) [40], and the square root of the Jensen–Shannon divergence [41], to describe concept drift, which is reflected by the distance between two cloud models (EMDCM). The overlap-based expectation curve of the cloud model (OECM) was used in [42] as a measure of cloud model similarity; in this algorithm, the overlap degree describes the overlapping part of two clouds, and the overlapping area yields the similarity of the cloud models using the membership degree of the “\(3En\)” boundary and the intersection of the two clouds. Luo et al. [43] proposed a maximum boundary based on the cloud model (MCM) for structural damage identification. Lin [44] constructed five types of similarity between cloud concepts to reflect the various similarities that may exist in uncertain concepts; the overall similarity based on \({Ex}\), \({En}\) and \({He}\) (SimEEH) is used to characterize the similarity between cloud models, and the effectiveness of the method is verified by a conceptual cognitive offset experiment. Algorithms based on characteristic curves can solve the instability problem to a certain extent, but the calculation of intersection points and multiple integration operations makes the computational process cumbersome.

In addition, the calculation of cloud model similarity can be approached from different perspectives. Wang et al. [45] first combined shape and distance to establish a comprehensive similarity algorithm for cloud model (PDCM) and achieved good results. Yu et al. [46] established a new algorithm of location and shape based on the cloud model (LSCM) following the algorithms of Wang et al. and demonstrated certain advantages over the PDCM algorithm in application. Yao et al. [47] established a multidimensional shape-position similarity cloud model (MSPSCM), which accurately evaluates the water quality of lakes by considering the shape and position similarity between a sample cloud and a horizontal cloud. Yang et al. [48] proposed a model based on a triangular fuzzy number EW-type closeness based on the triangular cloud model (EPTCM), taking into account the shape and distance similarity of the existing cloud models.

Reference [49] revealed that fuzzy numbers have good application potential in distance measurement. Fuzzy numbers include triangular fuzzy numbers, trapezoidal fuzzy numbers, and intuitionistic fuzzy numbers [20, 50]. Triangular fuzzy numbers are the limiting case of trapezoidal fuzzy numbers. Intuitionistic fuzzy numbers are not considered here because this paper only needs the membership value, not the non-membership value or hesitation index [51]. According to the definitions of the two types of fuzzy numbers in the literature [23, 50], the membership degree of a triangular fuzzy number first increases and then decreases, which is suitable for describing a well-defined neighbourhood range: the membership degree attains its maximum at a well-defined place, but only a single point is allowed to have the maximum membership degree. The membership degree of a trapezoidal fuzzy number first rises, holds its maximum over an interval, and then falls; a trapezoidal fuzzy number is equal to a triangular fuzzy number when the interval of maximum membership is short enough. Triangular fuzzy numbers are chosen in this paper for two reasons. First, when describing the uncertainty of the object of study in this paper, the membership degree should be represented by a single value rather than an interval. Second, the original intention of the algorithm in this paper was to solve multiple-decision problems, to facilitate the later integration of the algorithm with evaluation algorithms that solve for the weight values of each factor, and to facilitate the ranking process [20, 23, 50, 51].

This section also lists the four categories of representative cloud model similarity algorithms and analyses the shortcomings of the algorithms, as shown in Table 1. Inspired by the above literature, this paper takes the triangular fuzzy number as the basis for establishing the cloud model similarity calculation model from the perspective of distance and shape and finally combines the two similarity algorithms.

Table 1 Summary of cloud model similarity algorithms

Definitions and lemmas

Definition 1

Let \({{U}}\) be a nonempty infinite set expressed by an accurate numerical value, and let \({{C}}\) be a qualitative concept on \({{U}}\). If there is an accurate numerical value \(x \in U\) and the mapping \(y = \mu_{C} ( x ) \in [0,1]\) of \({{x}}\) to \({{C}}\) is a random number with a stable pattern, then the distribution of \((x, y)\) on the universe \({{U}}\) is called a cloud. Each \((x, y)\) is called a cloud drop [52].

Definition 2

The three characteristic parameters (Ex, En, He) of the cloud model are the quantitative embodiment of its qualitative concept. The expectation (\(Ex\)) is the representation of the expectation value of the cloud. Additionally, it is the centre of gravity corresponding to the maximum value of the membership degree \(Y\). The entropy (\(En\)) is a measure of the uncertainty of the cloud model. This reflects the expected dispersion of cloud drops and the fuzziness of the cloud model data. The hyper-entropy (\(He\)) is the entropy of \(En\). It is a measure of the uncertainty of the cloud model entropy. Its value can represent the thickness of the cloud, reflecting the randomness of cloud model data [34].

Definition 3

If the random variable \(x\) satisfies \(x \sim N({Ex}, {En}^{\prime 2})\), where \({En}^{\prime} \sim N({En}, {He}^{2})\), and the certainty of \(x\) with respect to the qualitative concept satisfies

$$ \mu_{C} (x) = \exp \left( { - \frac{{(x - {Ex})^{2} }}{{2({En}^{\prime})^{2} }}} \right), $$
(1)

then the distribution of x on the nonempty infinite set U is a normal cloud [34, 38, 52].

Definition 4

If the random variable \(x\) satisfies \(x \sim N({Ex}, {En}^{\prime 2})\), where \({En}^{\prime} \sim N({En}, {He}^{2})\), and the certainty of \(x\) with respect to the qualitative concept satisfies [48]

$$ \mu_{C} (x) = \left\{ {\begin{array}{*{20}c} {\frac{{x - \left( {{Ex} - 3{En}{\prime} } \right)}}{{3{En}{\prime} }}, } & {x < {Ex}} \\ {1 - \frac{{x - {Ex} }}{{3{En}{\prime} }}, } & {x \ge {Ex}} \\ \end{array} } \right., $$
(2)

then, the distribution of x on the nonempty infinite set U is a triangular cloud.
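To make Definitions 3 and 4 concrete, the following is a minimal MATLAB sketch of a triangular forward cloud generator (the function and variable names are ours, not from the original code): it samples \({En}^{\prime} \sim N({En}, {He}^{2})\), then \(x \sim N({Ex}, {En}^{\prime 2})\), and evaluates the certainty in Eq. (2) for each drop.

```matlab
function [x, mu] = triangular_forward_cloud(Ex, En, He, n)
% Sketch of a triangular forward cloud generator (Definitions 3 and 4).
% Returns n cloud drops (x, mu) of the triangular cloud C(Ex, En, He).
    Enp = En + He .* randn(n, 1);          % En' ~ N(En, He^2)
    x   = Ex + abs(Enp) .* randn(n, 1);    % x ~ N(Ex, En'^2) for each drop
    mu  = zeros(n, 1);
    left      = x < Ex;
    mu(left)  = (x(left) - (Ex - 3*Enp(left))) ./ (3*Enp(left));  % Eq. (2), x < Ex
    mu(~left) = 1 - (x(~left) - Ex) ./ (3*Enp(~left));            % Eq. (2), x >= Ex
    mu = max(min(mu, 1), 0);               % keep certainties inside [0, 1]
end
```

For example, plotting scatter(x, mu) for [x, mu] = triangular_forward_cloud(1, 0.1, 0.01, 2000) produces a cloud diagram of the kind shown later in Fig. 2.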

Definition 5

If the random variable \(x\) satisfies \(x \sim N({Ex}, {En}^{\prime 2})\), where \({En}^{\prime} \sim N({En}, {He}^{2})\) and \({En} \ne 0\), then the following equation holds [48]:

$$ y = \left\{ {\begin{array}{*{20}c} {\frac{{x - \left( {{Ex} - 3{En} } \right)}}{{3{En}}},} & { x < {Ex}} \\ {1 - \frac{{x - {Ex}}}{{3{En}}},} & {x \ge {Ex}} \\ \end{array} } \right. , $$
(3)

where \(y \) is called the expectation curve of the triangular cloud. The expectation curve can intuitively describe the shape characteristics of the triangular cloud, and all cloud drops fluctuate randomly around the expectation curve.

Definition 6

Fuzzy numbers are convex fuzzy sets defined on real numbers R [53]. For a certain fuzzy number, its membership degree satisfies:

$$ F\left( x \right){ = }\left\{ {\begin{array}{*{20}c} {\frac{{x - r^{l} }}{{r^{m} - r^{l} }} ,} & {r^{l} \le x \le r^{m} } \\ {\frac{{x - r^{n} }}{{r^{m} - r^{n} }} ,} & {r^{m} \le x \le r^{n} } \\ {{0},} & {{\text{otherwise}}} \\ \end{array} } \right., $$
(4)

where \(\tilde{r} = (r^{l}, r^{m}, r^{n})\) is called a triangular fuzzy number. The membership function of \(\tilde{r}\) is \(F(x): R \to [0,1]\), where \(x \in R\) and \(R\) is the real number field. \(r^{l}, r^{m}, r^{n}\) are the lower bound, median, and upper bound of the triangular fuzzy number, respectively, and \(r^{l} \le r^{m} \le r^{n}\). When they are all equal, \(\tilde{r}\) degenerates into a real value. Figure 1 shows an example of the triangular fuzzy number (0.7, 1, 1.3), where \(\beta = r^{m} - r^{l}\), \(\gamma = r^{n} - r^{m}\), and \(t = r^{m}\). \(\beta\) and \(\gamma\) represent the left width and right width of the triangular fuzzy number, respectively, and \(t\) represents its expectation.

Fig. 1

Cloud diagram of a triangular fuzzy number (0.7, 1, 1.3). When the left and right widths \(\beta\) and \(\gamma\) are equal, the cloud diagram has symmetry
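As a small numerical illustration of Eq. (4) (the anonymous function below is ours), the membership function of the triangular fuzzy number (0.7, 1, 1.3) in Fig. 1 can be evaluated as:

```matlab
% Membership function of a triangular fuzzy number (r_l, r_m, r_n), Eq. (4);
% the min/max form is equivalent to the piecewise definition for r_l < r_m < r_n.
F = @(x, rl, rm, rn) max(min((x - rl)./(rm - rl), (rn - x)./(rn - rm)), 0);

F(0.85, 0.7, 1.0, 1.3)   % 0.5000: halfway up the left branch
F(1.00, 0.7, 1.0, 1.3)   % 1.0000: maximum membership at the median r_m
```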

Definition 7

Setting Eq. (3) equal to 0 [48] gives

$$ y = \left\{ {\begin{array}{*{20}c} {\frac{{x - \left( {Ex - 3{En}} \right)}}{{3{En}}} = 0, } & {x < {Ex}} \\ {1 - \frac{{x - {Ex}}}{{3{En}}} = 0,} & {x \ge {Ex}} \\ \end{array} } \right. \Rightarrow \left\{ {\begin{array}{*{20}c} {x_{1} = {Ex} - 3{En}} \\ {x_{2} = {Ex} + 3{En}} \\ \end{array} } \right., $$
(5)

From Eq. (5), the intersection points of the cloud model expectation curve and the horizontal axis are \(x_{1}\) and \(x_{2}\). The cloud map of the triangular cloud model \(C({Ex}, {En}, {He})\) is generated using the triangular forward cloud generator, as shown in Fig. 2. Approximately 99% of the cloud drops fall within the range \([{Ex} - 3{En}, {Ex} + 3{En}]\); therefore, in practical applications, cloud drops outside \([{Ex} - 3{En}, {Ex} + 3{En}]\) can be ignored. This is the “\(3En\)” rule of the triangular cloud model.

Fig. 2

Schematic diagram of the triangular cloud model. The cloud drops are concentrated in the upper part of the triangular cloud and scattered in the lower part

Lemma 1

If the random variable \(x\) satisfies \(x \sim N({Ex}, {En}^{\prime 2})\), where \({En}^{\prime} \sim N({En}, {He}^{2})\) and \({En} \ne 0\) [52], then

$${{D}}(x) = {En}^{2}+ {He}^{2}$$
(6)

where \({D}(x)\) is the variance of the cloud drops. \(He\) determines the thickness of the cloud, and \(En\) determines the dispersion degree of the cloud drops. The larger the differences in \(He\) and \(En\) between two cloud models are, the smaller their shape similarity is.

Lemma 2

Suppose there exist two triangular fuzzy numbers \(A = \left( {r_{a}^{l} ,r_{a}^{m} ,r_{a}^{n} } \right)\) and \(B = (r_{b}^{l} ,r_{b}^{m} ,r_{b}^{n} )\). In Ref. [49], the distance \(D_{{\text{T}}} (A,B)\) between the triangular fuzzy numbers is defined as:

$$\begin{aligned} D_{T} \left( {A,B} \right)& = \left[ {\left( {t_{a} - t_{b} } \right)^{2} + (t_{a} - t_{b} )}\right. \\ & \quad\times\left( {\frac{1}{3}\left( {1 - \lambda } \right)\left( {\gamma_{a} - \gamma_{b} } \right) - \frac{1}{3}\lambda \left( {\beta_{a} - \beta_{b} } \right)} \right)\\ & \quad \left.{ + \frac{\lambda }{9}\left( {\beta_{a} - \beta_{b} } \right)^{2} + \frac{{\left( {1 - \lambda } \right)}}{9}(\gamma_{a} - \gamma_{b} )^{2} } \right]^{\frac{1}{2}} \;\\ &\quad \lambda \in [0,1]. \end{aligned}$$
(7)
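A direct MATLAB transcription of Eq. (7) is sketched below (the function name and argument layout are ours); each triangular fuzzy number is passed as a triple \((r^{l}, r^{m}, r^{n})\), from which \(t\), \(\beta\) and \(\gamma\) are derived as in Definition 6.

```matlab
function d = dt_distance(a, b, lambda)
% D_T distance between triangular fuzzy numbers a and b, Eq. (7).
% a, b: triples [r_l, r_m, r_n]; lambda: weighting parameter in [0, 1].
    ta = a(2);          tb = b(2);           % expectations t = r_m
    ba = a(2) - a(1);   bb = b(2) - b(1);    % left widths beta
    ga = a(3) - a(2);   gb = b(3) - b(2);    % right widths gamma
    d = sqrt((ta - tb)^2 ...
        + (ta - tb) * ((1 - lambda)*(ga - gb) - lambda*(ba - bb)) / 3 ...
        + lambda*(ba - bb)^2 / 9 + (1 - lambda)*(ga - gb)^2 / 9);
end
```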

Materials and methods

In this section, the similarity calculation starts from the distance and shape of the cloud model. First, exponential closeness is defined for triangular fuzzy numbers and combined with the \({D}_{\text{T}}\) distance; a proof that it satisfies the closeness conditions is given, and it is used to calculate the distance similarity of the cloud model. Then, the shape similarity of the cloud model is considered in terms of its appearance. Finally, the distance similarity and shape similarity are combined to define DDTSTCM.

Distance similarity of the triangular cloud model based on exponential closeness

Fuzzy mathematics studies problems of uncertainty. In practical daily life, uncertainty appears in many aspects, such as transportation cost and transportation mode [24]. Fuzzy closeness is an important concept in fuzzy mathematics [54] and is commonly used to determine the degree of similarity between two fuzzy sets. To measure the similarity of two fuzzy sets more accurately, a fuzzy closeness measure must have a good degree of differentiation. Commonly used fuzzy closeness measures include the Hamming closeness [55], Euclidean closeness [56], and measurement closeness. However, these measures often cannot distinguish the degree of closeness between fuzzy numbers well in practical applications. Inspired by the exponential fuzzy number membership degree [57], this paper defines an exponential closeness based on the \({D}_{\text{T}}\) distance formula.

According to the “3En” rule of the triangular cloud model in Definition 7, cloud drops outside the range of [Ex − 3En, Ex + 3En] are not considered. Let the cloud model C(Ex, En, He) satisfy

$$ y_{1} (x) = \left\{ {\begin{array}{*{20}c} {\frac{{x - ({Ex} - 3{En})}}{{{Ex} - ({Ex} - 3{En})}},} & {({Ex} - 3{En}) \le x \le {Ex}} \\ {1,} & {x = {Ex}} \\ {\frac{{({Ex} + 3{En}) - x}}{{{Ex} + 3{En} - {Ex}}},} & {{Ex} \le x \le ({Ex} + 3{En})} \\ {0,} & {{\text{otherwise}}} \\ \end{array} } \right., $$
(8)

where \(y_{1}\) is a triangular fuzzy number, denoted as \(y_{1} = \langle {Ex} - 3{En}, {Ex}, {Ex} + 3{En} \rangle\). Since \({Ex} - ({Ex} - 3{En}) = ({Ex} + 3{En}) - {Ex} = 3{En}\), \(y_{1}\) is a symmetric triangular fuzzy number, denoted as \(\tilde{y} = ({Ex}, 3{En})_{\text{T}}\). Here, \({Ex}\) and \(3{En}\) are the expectation and the blurring degree (also called the width) of the triangular fuzzy number, respectively.

Consider two cloud models \(C_{1}({Ex}_{1}, {En}_{1}, {He}_{1})\) and \(C_{2}({Ex}_{2}, {En}_{2}, {He}_{2})\) as two symmetric triangular fuzzy numbers \(\tilde{C}_{1} = ({Ex}_{1}, 3{En}_{1})_{\text{T}}\) and \(\tilde{C}_{2} = ({Ex}_{2}, 3{En}_{2})_{\text{T}}\). Define \(\Delta {Ex} = {Ex}_{1} - {Ex}_{2}\) and \(\Delta {En} = {En}_{1} - {En}_{2}\). From Ref. [49], \(d(\tilde{C}_{1} ,\tilde{C}_{2} )\) reduces to:

$$\begin{aligned} d(\tilde{C}_{1} ,\tilde{C}_{2} ) &= D_{{\text{T}}} \left( {\tilde{C}_{1} ,\tilde{C}_{2} } \right)\\ & = \left[ {\left( {\Delta {Ex}} \right)^{{2}} { + }\left( {1 - 2\lambda } \right)\Delta {Ex}\Delta {En + }\left( {\Delta {En}} \right)^{{2}} } \right]^{{\frac{{1}}{{2}}}} .\end{aligned} $$
(9)

This distance describes the relationship between the positions of two triangular cloud models. It is jointly determined by the expectation of the symmetric triangular fuzzy number and the fuzziness.

According to the \(D_{{\text{T}}}\) distance \(d(\tilde{C}_{1} ,\tilde{C}_{2} )\), the closeness of \(C_{1}\) and \({{C}}_{{2}}\) can be expressed as follows:

$$ q\left( {\tilde{C}_{1} ,\tilde{C}_{2} } \right) = e^{{ - \left[ {d(\tilde{C}_{1} ,\tilde{C}_{2} )} \right]^{2} }} $$
(10)

The following is the process of proving that Eq. (10) satisfies the closeness condition:

$$\begin{aligned} 1. \quad \because d(\tilde{C}_{1} ,\tilde{C}_{2} ) &= \left[ {\left( {{Ex}_{1} - {Ex}_{2} } \right)^{2} + \left( {1 - 2\lambda } \right)({Ex}_{1} - {Ex}_{2} )}\right. \\ & \quad \times\left.{({En}_{1} - {En}_{2} ) + \left( {{En}_{1} - {En}_{2} } \right)^{2} } \right]^{\frac{1}{2}}\\ &= { }\left[ {\left( {{Ex}_{{2}} - {Ex}_{{1}} } \right)^{{2}} { + }\left( {1 - 2\lambda } \right){(Ex}_{{2}} - {Ex}_{{1}} {\text{)}}}\right. \\ & \quad \times\left.{{(En}_{{2}} - {En}_{{1}} {) + }\left( {{En}_{{2}} - {En}_{{1}} } \right)^{{2}} } \right]^{{\frac{{1}}{{2}}}} { }\\ &= d\left( {\tilde{C}_{2} ,\tilde{C}_{1} } \right) \end{aligned}$$
$$ \therefore q\left( {\tilde{C}_{1} ,\tilde{C}_{2} } \right) = e^{{ - \left[ {d(\tilde{C}_{1} ,\tilde{C}_{2} )} \right]^{2} }} = e^{{ - \left[ {d(\tilde{C}_{2} ,\tilde{C}_{1} )} \right]^{2} }} = q\left( {\tilde{C}_{2} ,\tilde{C}_{1} } \right). $$

\(2. \quad \because q\left( {\tilde{C}_{1} ,\tilde{C}_{2} } \right) = 1 \Leftrightarrow e^{{ - \left[ {d(\tilde{C}_{1} ,\tilde{C}_{2} )} \right]^{2} }} = 1 \Leftrightarrow d\left( { \tilde{C}_{1} ,\tilde{C}_{2} } \right) = 0\)

$$ \Leftrightarrow \left[ {\left( {\Delta {Ex}} \right)^{2} + \left( {1 - 2\lambda } \right)\Delta {Ex}\Delta {En} + \left( {\Delta {En}} \right)^{2} } \right]^{\frac{1}{2}} = 0 $$
$$ \Leftrightarrow \left( {\Delta {Ex} + \Delta {En}} \right)^{2} { = }\left( {2\lambda + 1} \right)\left( {\Delta {Ex}\Delta {En}} \right) $$
$$ \because \left( {\Delta {Ex} + \Delta {En}} \right)^{2} \ge 0{\text{ and }}2\lambda + 1 > 0 \therefore \Delta {Ex}\Delta {En} \ge 0 $$

When \(\lambda \in [0,0.5]\), it is easy to see that \(\left[ {\left( {\Delta Ex} \right)^{2} + \left( {1 - 2\lambda } \right)\Delta Ex\Delta En + \left( {\Delta En} \right)^{2} } \right]^{\frac{1}{2}} = 0\) holds when \(\left\{ {\begin{array}{*{20}c} {\Delta {Ex} = 0} \\ {\Delta {En} = 0} \\ \end{array} } \right.\).

When \(\lambda \in [0.5,1]\), \(\left( {\Delta {Ex + }\Delta {En}} \right)^{2} { = }\left( {2\lambda + 1} \right)\left( {\Delta {Ex}\Delta {En}} \right)\) \(\Leftrightarrow \left( {\Delta {Ex}} \right)^{2} { + }\left( {\Delta {En}} \right)^{2} { = }\left( {2\lambda - 1} \right)\Delta {Ex}\Delta {En}\).

From the basic inequality, it follows that \(\left( {\Delta {Ex}} \right)^{2} { + }\left( {\Delta {En}} \right)^{2} \ge 2\Delta {Ex}\Delta {En}\).

$$ \therefore \left( {2\lambda - 1} \right)\Delta {Ex}\Delta {En} \ge 2\Delta {Ex}\Delta {En} \Leftrightarrow \left( {2\lambda - 3} \right)\Delta {Ex}\Delta {En} \ge 0 $$

\(\because \lambda \in [0,1] \therefore (2\lambda - 3) \in [ - 3, - 1], \)and the inequality holds only when \(\left\{ {\begin{array}{*{20}c} {\Delta {Ex} = {0}} \\ {\Delta {En} = {0}} \\ \end{array} } \right.\).

$$ \therefore d\left( {\tilde{C}_{1} ,\tilde{C}_{2} } \right) = 0 \Leftrightarrow \left\{ {\begin{array}{*{20}c} {\Delta {Ex} = 0} \\ {\Delta {En} = 0} \\ \end{array} } \right. \;\therefore q\left( {\tilde{C}_{1} ,\tilde{C}_{2} } \right) = 1 \Leftrightarrow \tilde{C}_{1} = \tilde{C}_{2} . $$

3. When \(\tilde{C}_{1} \subseteq \tilde{C}_{2} \subseteq \tilde{C}_{3}\), \({Ex}_{{1}} = {Ex}_{{2}} = {Ex}_{{3}}\), and \({3En}_{1} \le {3En}_{2} \le {3En}_{3}\), we have:

$$ \left\{ {\begin{array}{*{20}l} {\left[ {\left( {{Ex}_{1} - {Ex}_{3} } \right)^{2} + \left( {1 - 2\lambda } \right)({Ex}_{1} - {Ex}_{3} )({En}_{1} - {En}_{3} ) + \left( {{En}_{1} - {En}_{3} } \right)^{2} } \right]^{\frac{1}{2}} } \\ {\quad \ge \left[ {\left( {{Ex}_{1} - {Ex}_{2} } \right)^{2} + \left( {1 - 2\lambda } \right)({Ex}_{1} - {Ex}_{2} )({En}_{1} - {En}_{2} ) + \left( {{En}_{1} - {En}_{2} } \right)^{2} } \right]^{\frac{1}{2}} ,} \\ {\left[ {\left( {{Ex}_{1} - {Ex}_{3} } \right)^{2} + \left( {1 - 2\lambda } \right)({Ex}_{1} - {Ex}_{3} )({En}_{1} - {En}_{3} ) + \left( {{En}_{1} - {En}_{3} } \right)^{2} } \right]^{\frac{1}{2}} } \\ {\quad \ge \left[ {\left( {{Ex}_{3} - {Ex}_{2} } \right)^{2} + \left( {1 - 2\lambda } \right)({Ex}_{3} - {Ex}_{2} )({En}_{3} - {En}_{2} ) + \left( {{En}_{3} - {En}_{2} } \right)^{2} } \right]^{\frac{1}{2}} } \\ \end{array} } \right. $$

that is, we have \(\left\{ {\begin{array}{*{20}c} {d\left( { \tilde{C}_{1} ,\tilde{C}_{3} } \right) \ge d\left( { \tilde{C}_{1} ,\tilde{C}_{2} } \right)} \\ {d\left( { \tilde{C}_{1} ,\tilde{C}_{3} } \right) \ge d\left( { \tilde{C}_{2} ,\tilde{C}_{3} } \right)} \\ \end{array} } \right. \Leftrightarrow \left\{ {\begin{array}{*{20}c} {q\left( {\tilde{C}_{1} ,\tilde{C}_{3} } \right) \ge q\left( {\tilde{C}_{1} ,\tilde{C}_{2} } \right)} \\ {q\left( {\tilde{C}_{1} ,\tilde{C}_{3} } \right) \ge q\left( {\tilde{C}_{2} ,\tilde{C}_{3} } \right)} \\ \end{array} } \right.\).

Additionally, as a function of \(d(\tilde{C}_{1} ,\tilde{C}_{2} )\), \(q(\tilde{C}_{1} ,\tilde{C}_{2} ): [0, +\infty) \to (0, 1]\) is strictly monotonically decreasing. \(q(\tilde{C}_{1} ,\tilde{C}_{2} ) = 1\) when \(d(\tilde{C}_{1} ,\tilde{C}_{2} ) = 0\), and \(\mathop {\lim }\nolimits_{{d( {\tilde{C}_{1} ,\tilde{C}_{2} } ) \to + \infty }} q( {\tilde{C}_{1} ,\tilde{C}_{2} } ) = 0\).

In summary, \(q(\tilde{C}_{1} ,\tilde{C}_{2} )\) is the closeness of the cloud models \(\tilde{C}_{1}\) and \(\tilde{C}_{2}\). This closeness is determined by the distance between the two cloud models: the smaller the distance is, the greater the closeness. Therefore, this exponential closeness can be used to characterize the distance similarity between two cloud models. Under the “\(3En\)” rule, when the two cloud models have no intersection, i.e. when their supports \([{Ex} \pm 3{En}]\) do not overlap and thus \(\left| {\Delta {Ex}} \right| > 3({En}_{1} + {En}_{2})\), their similarity is 0. Therefore, the distance similarity can be expressed as follows:

$$ {\text{Sim}}_{{1}} \left( {C_{{1}} {,}C_{{2}} } \right){ = }\left\{ {\begin{array}{*{20}c} {q\left( {\tilde{C}_{1} ,\tilde{C}_{2} } \right){, }} & {\left| {\Delta {Ex}} \right| \le 3\left( {{En}_{1} + {En}_{2} } \right)} \\ {0,} & {\left| {\Delta {Ex}} \right| > 3\left( {{En}_{1} + {En}_{2} } \right)} \\ \end{array} } \right., $$
(11)

Equation (11) is the distance similarity derived from the \({{D}}_{\text{T}}\) distance.
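Combining Eqs. (9)-(11), the distance similarity depends only on the \(({Ex}, {En})\) pairs of the two cloud models. A minimal MATLAB sketch follows (the naming is ours, and the no-intersection test is written as non-overlap of the \([{Ex} \pm 3{En}]\) supports, following the verbal description above):

```matlab
function s = sim1(C1, C2, lambda)
% Distance similarity of two triangular cloud models, Eqs. (9)-(11).
% C1, C2: triples [Ex, En, He]; lambda: parameter in [0, 1].
    dEx = C1(1) - C2(1);
    dEn = C1(2) - C2(2);
    if abs(dEx) > 3*(C1(2) + C2(2))   % "3En" supports do not intersect
        s = 0;
    else
        d = sqrt(dEx^2 + (1 - 2*lambda)*dEx*dEn + dEn^2);   % Eq. (9)
        s = exp(-d^2);                                      % Eq. (10)
    end
end
```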

In fact, not only triangular fuzzy numbers but also intuitionistic fuzzy numbers and trapezoidal fuzzy numbers can be converted in this way. It is only necessary to modify the membership functions of intuitionistic and trapezoidal fuzzy numbers into the corresponding intuitionistic and trapezoidal clouds and then combine them with the specific fuzzy closeness measure to obtain the corresponding distance similarity algorithm.

Shape similarity of the triangular cloud model based on the cloud drop variance

The distance similarity calculation alone considers only \({Ex}\) and \({En}\) of the cloud model and does not consider \({He}\); this leads to some errors in the results. To reduce the experimental error, \({He}\) must be properly introduced into the calculation.

The similarity of cloud models essentially includes distance similarity and shape similarity. Shape similarity is a measure of how similar the uncertainty of two cloud models is and is reflected in their geometry [45]. According to Lemma 1, the cloud drop variance determines the morphology of the cloud model and describes its shape characteristics. If the position relationship is not considered, the more similar two cloud models are in appearance, the greater their similarity. Therefore, this paper constructs the shape similarity of triangular cloud models from the cloud drop variance. For two triangular cloud models \(C_{i}({Ex}_{i}, {En}_{i}, {He}_{i})\) and \(C_{j}({Ex}_{j}, {En}_{j}, {He}_{j})\), their shape similarity is expressed as follows:

$$ {\text{Sim}}_{2} (C_{i} ,C_{j} ) = \frac{{\min \left( {\sqrt {D(x)_{i} } ,\sqrt {D(x)_{j} } } \right)}}{{\max \left( {\sqrt {D(x)_{i} } ,\sqrt {D(x)_{j} } } \right)}}, $$
(12)

where \({D}(x)\) is related only to the parameters \({En}\) and \({He}\); a change in the parameter \({Ex}\) does not affect the shape similarity. For example, as shown in Fig. 3, although the \({Ex}\) values of the two triangular cloud models \(C_{2}\) and \(C_{3}\) are different, their \({En}\) and \({He}\) values are exactly the same, so \({\text{Sim}}_{2}(C_{2}, C_{3}) = 1\). Therefore, from the perspective of shape similarity, although the distance between \(C_{1}\) and \(C_{2}\) is smaller than that between \(C_{2}\) and \(C_{3}\), \({\text{Sim}}_{2}(C_{2}, C_{3}) > {\text{Sim}}_{2}(C_{1}, C_{2})\).

Fig. 3

Shape similarity example. Cloud model similarity is considered in terms of appearance; the more similar two clouds appear, the greater the similarity value
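Per Lemma 1 and Eq. (12), the shape similarity is simply the ratio of the smaller to the larger cloud drop standard deviation; a MATLAB sketch (our naming):

```matlab
function s = sim2(C1, C2)
% Shape similarity of two cloud models based on cloud drop variance, Eq. (12).
% C1, C2: triples [Ex, En, He]; note that Ex does not enter the calculation.
    sd1 = sqrt(C1(2)^2 + C1(3)^2);   % sqrt(D(x)) = sqrt(En^2 + He^2), Eq. (6)
    sd2 = sqrt(C2(2)^2 + C2(3)^2);
    s = min(sd1, sd2) / max(sd1, sd2);
end
```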

The integrated similarity measurement of the triangular cloud model

The distance similarity based on the exponential closeness of the \({D}_{\text{T}}\) distance is integrated with the shape similarity based on the cloud drop variance. The synthesized similarity algorithm considers both the distance and the shape difference of the cloud models and is therefore more in line with human cognitive perception. DDTSTCM is expressed as:

$$ {\text{DD}}_{{\text{T}}} {\text{STCM(}}C_{{1}} {,}C_{{2}} {\text{) = Sim}}_{{1}} \left( {C_{{1}} {,}C_{{2}} } \right) \times {\text{Sim}}_{{2}} {(}C_{{1}} {,}C_{{2}} {)}{\text{.}} $$
(13)

The following is the code for the MATLAB-based DDTSTCM:

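The published listing is rendered as an image; the following is a minimal reconstruction of it (our naming, assuming the sim1 and sim2 sketches given above are on the path) that implements Eq. (13).

```matlab
function s = ddtstcm(C1, C2, lambda)
% DDTSTCM: integrated similarity of two triangular cloud models, Eq. (13).
% C1, C2: triples [Ex, En, He]; lambda: parameter in [0, 1].
    s = sim1(C1, C2, lambda) * sim2(C1, C2);
end
```

For example, ddtstcm([10, 2, 0.3], [11, 3.5, 0.4], 0.5) multiplies the exponential closeness of these two hypothetical clouds by the ratio of their cloud drop standard deviations.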

A cloud model similarity algorithm based on exponential closeness and cloud drop variance

Experiments and results

Experimental setup

For the evaluation of cloud model similarity algorithms, the commonly used evaluation indices are discriminability, efficiency, stability, and theoretical interpretability [58, 59]. To facilitate understanding, the research ideas of this section are briefly introduced. In “Experimental setup”, the evaluation indices of the cloud models are determined, and appropriate comparison algorithms are selected. “Equipment security system capability evaluation experiment” presents a capability evaluation experiment for an equipment security system to verify the rationality of DDTSTCM. In “Cloud model differentiation simulation experiment”, a simulation experiment on cloud model differentiation is formulated to examine the discriminability of the comparison algorithms in extreme cases. In “Time series classification accuracy experiment”, a time series classification accuracy experiment is formulated to examine the stability and operational efficiency of the comparison algorithms. “Comparative analysis and summary” synthesizes the performance of the eight algorithms in the experimental process, compares the theoretical interpretability of the different algorithms, and summarizes the results. In “Engineering case study”, the triangular cloud model similarity algorithm proposed in this paper is applied to an engineering example, the risk assessment of an island microgrid in the Yangtze River Delta region, and the value of the DDTSTCM algorithm in engineering applications is verified.

The comparison algorithms were selected from the perspective of algorithm novelty and relative advantages. Seven algorithms, FDCM, SimEEH, MCM, OECM, EPTCM, MSPSCM, and LSCM, were ultimately selected for the comparison experiments. All the simulation experiments in this section are based on the MATLAB R2022a platform, and the data and code related to the experiments can be accessed through the links in the Data Availability section.

Equipment security system capability evaluation experiment

This section evaluates the capability of an actual equipment security system and verifies that the algorithm's results agree with human cognition, thereby evaluating the rationality and effectiveness of the algorithm. The results of evaluating the capability of an equipment security system, which was evaluated in several related studies [60, 61], were used to establish the evaluation target cloud \(C_{\text{T}}(84.77, 4.0, 0.4)\). To ensure that the experimental results are objective and realistic, this paper follows the method of dividing the universe of discourse and the corresponding numerical intervals and cloud model parameters of each evaluation grade in the literature [61], as shown in Table 2. The grades in Table 2 are denoted I–V, where a higher grade indicates better evaluated equipment performance.

Table 2 The numerical ranges and cloud models corresponding to the evaluation grades

Experiment 1 utilizes DDTSTCM and seven comparison algorithms to calculate the similarity between the grade cloud and the target cloud, and the results are shown in Table 3. To assist in analysing the results, the visualized diagrams of the hierarchical cloud and the target cloud are shown in Fig. 4.

Table 3 Similarity between the target cloud and evaluation grade cloud
Fig. 4

Cloud map of the grade clouds and target cloud. The blue clouds represent the five rank evaluation clouds, and the red cloud represents the evaluation target cloud \(C_{{\text{T}}}\)

As shown in Table 3, except for SimEEH, the conclusions obtained by all algorithms indicate that the target cloud is similar to rank clouds IV and V, but ambiguity arises regarding which rank cloud it is most similar to. The specific analysis is as follows:

FDCM and SimEEH exhibit poor discriminability. From Fig. 4, the target cloud does not intersect with rank clouds I, II and III. However, these two algorithms determined that the target cloud has high similarity with rank clouds II and III, which is obviously not in line with human cognition.

MCM and OECM are algorithms based on the characteristic curves of the cloud model. From Table 3, the results of these two algorithms are the same: the target cloud is most similar to rank cloud V. From Fig. 4, it is easy to see that the target cloud has the largest overlap area with rank cloud V, so from a methodological point of view, the results accord with the assumptions of these algorithms. However, Fig. 4 also shows that the shapes of the target cloud and rank cloud V differ considerably, as reflected in the difference between the parameters \({En}\) and \({He}\).

LSCM, EPTCM, and DDTSTCM all start from the two perspectives of distance and shape, in contrast to the algorithms based on characteristic curves. All three algorithms judge that the target cloud is most similar to rank cloud IV. MSPSCM also starts from these two perspectives, but Yao et al. [47] used a random weighting method to randomize the data to reduce the effect of uncertainty on the accuracy. Therefore, MSPSCM determines that the target cloud is more similar to rank cloud V because it reduces the effect of \({He}\).

Cloud model differentiation simulation experiment

In this section, a set of four cloud models generated in Ref. [36] is used for the differentiation experiments via a collaborative filtering algorithm. This set of cloud models has been cited in several studies [36, 46]; it is authentic and reliable and provides a good test of the strengths and weaknesses of an algorithm. The four cloud models are shown below.

$$\left[\begin{array}{c}{{C}}_{1}\left(\text{1.5, 0.62666, 0.33900}\right)\\ {{C}}_{2}\left(\text{4.6, 0.60159, 0.30862}\right)\\ {{C}}_{3}\left(\text{4.4, 0.75199, 0.27676}\right)\\ {{C}}_{4}\left(\text{1.6, 0.60159, 0.30862}\right)\end{array}\right].$$

A visualization of these four sets of cloud models is shown in Fig. 5.

Fig. 5

Four groups of cloud model diagrams. The two pairs of clouds that are close to each other are shown in the same colours: \(C_{1}\) and \(C_{4}\) are in green, and \(C_{2}\) and \(C_{3}\) are in orange

Figure 5 shows that \(C_{1}\) and \(C_{4}\) are very similar, as are \(C_{2}\) and \(C_{3}\). Therefore, the four cloud models are divided into two groups, where \(C_{1}\) and \(C_{4}\) form group A and \(C_{2}\) and \(C_{3}\) form group B. The differences between groups A and B are relatively large, and the differences within each group are relatively small. To compare the discriminative effect of the algorithms in extreme cases, the similarity difference \({\text{Dis}}\) between groups A and B is defined:

$${\text{Dis}} = \left|{\text{Sim}}\text{(}{{C}}_{1}\text{,}{{C}}_{4}\text{)}-{\text{Sim}}\text{(}{{C}}_{2}\text{,}{{C}}_{3}\text{)}\right|.$$
(14)

The larger \({\text{Dis}}\) is, i.e. the greater the intergroup variability between groups A and B, the stronger the discriminability of the algorithm. Experiment 2 used the eight algorithms to test the two groups of cloud models separately, and the calculated results are shown in Table 4. The similarity order of the cloud models calculated by the proposed DDTSTCM is \((C_{1}, C_{4}) > (C_{2}, C_{3}) > (C_{3}, C_{4}) > (C_{1}, C_{3}) > (C_{2}, C_{4}) > (C_{1}, C_{2})\), as shown in Table 4. This is consistent with the visual impression given by Fig. 5.

Table 4 Measurement results of the similarity of the cloud models of eight algorithms

To compare the results more visually, a histogram of the discriminability results of each algorithm is constructed, as shown in Fig. 6.

Fig. 6

Comparison histogram of the discriminability results of the eight algorithms. The similarity difference \({\text{Dis}}\) of the first four algorithms is significantly lower than the values of the last four algorithms. DDTSTCM has the highest value of \({\text{Dis}}\), and FDCM has the lowest value of \({\text{Dis}}\)

From Fig. 6, the discriminability of DDTSTCM is the best, followed by that of LSCM and MSPSCM. Taken together, the results show that algorithms based on both distance and shape similarity are reasonable and advantageous and have better discriminability.

FDCM produces the erroneous value NaN in Table 4, which stems from its calculation formula: FDCM loses its effectiveness when dealing with two cloud models with identical \({En}\) and \({He}\) values. MCM itself uses \(3{He}\), which over-exaggerates the role of \({He}\), leading to poor discriminability. SimEEH integrates the similarities of \({Ex}\), \({En}\), and \({He}\) and exaggerates the roles of all three parameters at the same time when judging two groups of completely unrelated clouds, resulting in poor discriminability.

Time series classification accuracy experiment

To verify the stability and CPU operation efficiency of the proposed algorithm, time series simulation experiments are used in this section. A time series is a sequence formed by arranging the values of the same indicator in the order of their occurrence in time; because it reflects dynamic changes over time, it is also called a dynamic series. A time series generally includes two elements: time and the corresponding indicator values. Together, these two elements reflect how an object changes, describing its state of development and its results; studying the trend and speed of this development and exploring its pattern of change is the basis of time series forecasting.

Introduction of the time series classification dataset

Time series data have a strict temporal order, which is important in time series data analysis and data mining [62]. To verify the performance of DDTSTCM and the seven comparison algorithms, the experiments in this section use the synthetic control chart dataset (SYNDATA), a time series dataset [63] from the UCI Knowledge Discovery Database (UCI KDD), for algorithm testing. The SYNDATA dataset contains a variety of volatile and complex trends and is therefore a good test of the accuracy of a similarity algorithm.

This dataset includes 600 examples of synthetic control charts generated by Alcock and Manolopoulos in 1999; the charts are divided into six categories, each of which contains 100 time series of length 60 [64]. The SYNDATA dataset thus contains six types of time series patterns with different trends. The synthetic control dataset, which is used in statistical process control, exhibits six main pattern types: normal, cyclic, increasing, decreasing, upwards shifting, and downwards shifting. The specific performance metrics of the dataset are shown in Table 5.

Table 5 SYNDATA: the composition and content details of the dataset

Time series classification algorithm based on cloud model similarity

SYNDATA with labels [65] is commonly represented in the form of a matrix \(D_{m \times n}\) (m = 600, n = 6); the 600 rows of data contain one class for every 100 rows, for a total of 6 classes. Amongst machine learning algorithms, the K-nearest neighbours (KNN) algorithm [66, 67] is very easy and convenient to implement and is often used to analyse the association information between data values and to determine the intrinsic connections between variables. The traditional KNN algorithm first determines the number \(k\) of nearest neighbours, filters out the \(k\) closest feature values by measuring the distances between the target feature value and the other feature values, and then categorizes the target. In this paper, instead of measuring distances between feature values, DDTSTCM is used: each record is treated as a query sequence individually, i.e. for any record, cloud model similarity calculations are performed against the remaining 599 records. The top \(k\) maxima are selected according to the similarity ranking, and the record is categorized according to the groups to which these \(k\) neighbours belong. The flowchart of the time series classification algorithm is shown in Fig. 7, and a code sketch is given below it.

Fig. 7

Algorithm flowchart. The flowchart succinctly expresses the basics of the time series experiment and the basic steps of the algorithm
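The classification step described above can be sketched in MATLAB as follows (the matrix layout, names and similarity call are ours; each record is assumed to have already been summarized as a cloud model triple [Ex, En, He] by a backward cloud generator):

```matlab
function pred = knn_ddtstcm(clouds, labels, k, lambda)
% KNN classification with DDTSTCM as the similarity measure (sketch).
% clouds: m-by-3 matrix of [Ex, En, He] per record; labels: m-by-1 classes.
    m = size(clouds, 1);
    pred = zeros(m, 1);
    for i = 1:m
        s = zeros(m, 1);
        for j = 1:m
            if j ~= i                 % compare with the remaining records
                s(j) = ddtstcm(clouds(i,:), clouds(j,:), lambda);
            end
        end
        [~, idx] = maxk(s, k);        % k most similar records
        pred(i) = mode(labels(idx));  % majority vote amongst the k neighbours
    end
end
```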

The SYNDATA experiments place high demands on both correctness and time complexity. Therefore, the time series classification experiments examine both the correctness of the results and the time complexity of the algorithms. The computational results of the comparison algorithms and DDTSTCM in the time series classification accuracy experiments are compared. To facilitate the analysis of the results, the average accuracy is used as a reference index, defined as follows:

$$ {\text{Average Accuracy}}\left( {\text{Group}}\;X \right) = \frac{\sum_{i=1}^{10} P_{X} (k = i)}{10}, \quad X = A, B, C, $$
(15)

where \(P_{X}(k = i)\) denotes the classification accuracy of training group \(X\) when the number of neighbours is \(k = i\). The classification accuracy results for the three training sets A, B and C are shown in Fig. 8. From Fig. 8, it is easy to see that when DDTSTCM is applied in the KNN algorithm, the classification accuracy remains high in all three training sets, which is especially obvious in sets A and C. As \(k\) changes, the classification accuracy maintains good stability.

Fig. 8

Classification results for the three training sets. a Shows the classification results for group A, b shows the classification results for group B, and c shows the classification results for group C. In the bottom right corner is the legend of the corresponding line shapes of the eight algorithms

Regarding the time complexity of the algorithms, FDCM, EPTCM, and MSPSCM require multiple integration operations due to the characteristics of the algorithms themselves, which greatly increases the CPU running time. Therefore, this paper does not compare the CPU running times of these three algorithms. The CPU running times of the remaining five similarity algorithms trained on the groups are shown in Fig. 9.

Fig. 9

Comparison graph of CPU time costs. This figure shows the CPU running time of the algorithm processing corresponding to different training groups

From Fig. 9, amongst the remaining five algorithms, the CPU running times of four are basically the same, whilst that of MCM is greater. The CPU running time of DDTSTCM is on the order of milliseconds, so it maintains high efficiency.

Based on the time series classification results, the classification accuracies of the eight algorithms are shown in Table 6.

Table 6 Average classification accuracy of the eight algorithms

Table 6 shows that DDTSTCM achieves strictly higher classification accuracy than the remaining seven algorithms and, overall, guarantees an average classification accuracy of more than 91.78%. SimEEH has the next highest average classification accuracy, at 89.00%, and OECM has the lowest, at only 58.89%. Across the eight algorithms, using DDTSTCM increases the classification accuracy by a minimum of 2.78% and a maximum of 32.89%.

To explore the influence of the parameter \(\lambda\) on the experimental results more clearly, we traverse the values of \(\lambda\) and record the resulting accuracies, as shown in Table 7. As Table 7 shows, the average classification accuracy changes with \(\lambda\) but remains high. When \(\lambda = 0\), the average classification accuracy is the least satisfactory, at 89.61%, but it is still higher than 89.00%, the maximum amongst the remaining seven comparison algorithms in Table 6, which reflects the superiority of the proposed algorithm.

Table 7 Classification accuracy results of time series corresponding to different \(\lambda\) values

A change in \(k\) causes a change in the classification accuracy of an algorithm; therefore, even for the same algorithm, the degree of variation between different groups is not exactly the same. To further illustrate the stability of the algorithms under different \(k\) values, box plots of the variances under the eight algorithms were made, as shown in Fig. 10. In this figure, each algorithm has three groups of variances, A, B and C. In this paper, the average variance of the three groups is used to represent the variance of each algorithm.

Fig. 10

Box plot comparison of the algorithm variances. The horizontal coordinates represent the algorithms, and the vertical coordinates represent the variance values. In the middle of each box plot, the average variance value is shown

As shown in Fig. 10, the average variance of DDTSTCM is \(6.39 \times 10^{ - 4}\), the lowest amongst the eight algorithms. This means that the data in groups A, B, and C under this algorithm deviate from their respective means by the smallest distance; the data are the most stable, the errors are the smallest, and the results are the most accurate. From Fig. 10, the average variance of LSCM is also very small, at \(8.46 \times 10^{ - 4}\); that is, the classification stability of its groups A, B and C is also very high, and the stability of its group C is even greater than that of DDTSTCM. From an overall perspective, however, DDTSTCM is better than LSCM.

Comparative analysis and summary

The above experiments verify the performance of DDTSTCM in three respects, namely discriminability, stability and efficiency; the algorithms are now analysed from the perspective of theoretical interpretability.

DDTSTCM starts from both shape and location similarity, and a change in \(\lambda\) can adjust the allocation of its parameters to obtain better and more reasonable similarity results. DDTSTCM can assign the feature parameters in a more human-like way and has good theoretical interpretability; MSPSCM, LSCM, and EPTCM have the same starting point as DDTSTCM and also have good theoretical interpretability. \(3{He}\) is introduced in MCM, which exaggerates the role of \({He}\) to a certain extent, so its theoretical interpretability is moderate. Table 4 shows that FDCM produces the erroneous value NaN in the experiment, and its theoretical interpretability is poor due to the limitations of its own formulation. The basic idea of SimEEH is to determine the similarity between the three parameters of the cloud model, which reflects the overall difference between two uncertain concepts; its theoretical interpretability is moderate.

The comparison results of the eight algorithms are summarized in Table 8, which shows that the proposed similarity algorithm based on exponential closeness and cloud drop variance scores highly on all four evaluation indices. This indicates that DDTSTCM has strong comprehensive performance and good application ability.

Table 8 Comparison of cloud model similarity algorithms

Engineering case study

To further validate the engineering application value of DDTSTCM, the algorithm is applied to the assessment of island microgrids in the Yangtze River Delta region. In this section, an engineering case study of the proposed cloud model similarity algorithm is presented, taking cloud model-based risk assessment of island microgrids as the example. Wu et al. [68] adopted a holistic view based on a three-dimensional model to identify risks covering the four categories of technology, economy, environment, and society and constructed a reasonably applicable risk assessment framework for island microgrids, which is shown in Fig. 1 of the supplementary file. Wu et al. [68] also utilized a hesitant fuzzy linguistic term set to evaluate the indices and transformed the evaluation information using the cloud model. Their final conclusion is that the overall risk level of China's island microgrids is “slightly high”.

In this section, the modelling and analysis of the results are based on the evaluation system established by Wu et al. and the experimental data collected. First, from the perspective of overall analysis, the relationship between specific indices and the assessment results of the island microgrid is analysed qualitatively by drawing the relevant cloud diagrams. Then, similarity theory is used to calculate the relationship and degree of association between each index and the corresponding evaluation level, and the overall risk level of the island microgrid is analysed from a quantitative perspective. The conclusions drawn by Wu et al. are verified by cross-validation: if the conclusions based on the method of this paper are the same as those of Wu et al., the algorithm of this paper has the same engineering application value.

Overall analysis results

Wu et al. [68] first established a risk assessment index system that contains 4 first-class indices and 14 second-class indices. A hesitant fuzzy linguistic term set (HFLTS) is applied to evaluate the risk level of each index. Standard intervals are set for seven scales (\(s_{0}, s_{1}, s_{2}, s_{3}, s_{4}, s_{5}, s_{6}\)). The seven criteria intervals correspond to the evaluation statuses of extremely low (EL), low (L), slightly low (SL), general (G), slightly high (SH), high (H), and extremely high (EH). Using the forward cloud generator, seven HFLTS-scale clouds, named standard clouds, are generated. The cloud parameters of the standard clouds are shown in Table 1 of the supplementary file.

Each risk level is then evaluated according to the evaluation clouds given by the experts. The experts' evaluation language is converted into the three parameters of the cloud model through Eq. (16) [68].

$$ \left\{ \begin{array}{l} Ex_{j} = \frac{1}{n}\sum\limits_{i = 1}^{n} z_{ij} \\ En_{j} = \frac{1}{n}\sqrt{\frac{\pi}{2}} \sum\limits_{i = 1}^{n} \left| z_{ij} - Ex_{j} \right| \\ S_{j}^{2} = \frac{1}{n - 1}\sum\limits_{i = 1}^{n} \left( z_{ij} - Ex_{j} \right)^{2} \\ He_{j} = \sqrt{\left| S_{j}^{2} - En_{j}^{2} \right|} \end{array} \right. $$
(16)
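As a concreteness check, the following is a minimal Python sketch of Eq. (16); the array of expert scores `z` is hypothetical and serves only to illustrate the computation, not to reproduce the paper's data.

```python
import numpy as np

def backward_cloud(z):
    """Estimate the cloud parameters (Ex, En, He) of one index from
    expert scores z, following Eq. (16)."""
    ex = z.mean()                                    # Ex_j: sample mean
    en = np.sqrt(np.pi / 2) * np.abs(z - ex).mean()  # En_j: scaled first absolute moment
    s2 = z.var(ddof=1)                               # S_j^2: unbiased sample variance
    he = np.sqrt(abs(s2 - en ** 2))                  # He_j: hyper-entropy
    return ex, en, he

# Hypothetical expert scores for a single index:
z = np.array([55.0, 58.0, 60.0, 57.0, 56.0])
print(backward_cloud(z))
```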

The best-worst method (BWM) was used to calculate the weight of each index, and the results are shown in Table 2 of the supplementary file. The main idea of the BWM is to select the best and the worst index from amongst all the evaluation indices; these two indices then serve as reference standards against which the other indices are compared and scored on a 1–9 scale. The cloud model parameters of each first- and second-class index can be obtained by applying Eq. (17) repeatedly [68], and the results are shown in Tables 3 and 4 of the supplementary file.

$$ \left\{ \begin{array}{l} Ex = \frac{Ex_{1}\omega_{1} + Ex_{2}\omega_{2} + \cdots + Ex_{n}\omega_{n}}{\sum\nolimits_{i = 1}^{n} \omega_{i}} \\ En = \frac{En_{1}\omega_{1}^{2} + En_{2}\omega_{2}^{2} + \cdots + En_{n}\omega_{n}^{2}}{\sum\nolimits_{i = 1}^{n} \omega_{i}^{2}} \\ He = \frac{He_{1}\omega_{1}^{2} + He_{2}\omega_{2}^{2} + \cdots + He_{n}\omega_{n}^{2}}{\sum\nolimits_{i = 1}^{n} \omega_{i}^{2}} \end{array} \right. $$
(17)
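Eq. (17) can be read as a pair of weighted averages over the child-index clouds. Below is a minimal sketch; the child cloud parameters and BWM weights are hypothetical, chosen only to show the mechanics.

```python
import numpy as np

def aggregate_clouds(ex, en, he, w):
    """Aggregate child-index clouds into a parent cloud via Eq. (17).
    ex, en, he, w are arrays of the children's parameters and weights."""
    ex_p = np.dot(ex, w) / w.sum()              # expectation: weight-averaged
    en_p = np.dot(en, w ** 2) / (w ** 2).sum()  # entropy: squared-weight-averaged
    he_p = np.dot(he, w ** 2) / (w ** 2).sum()  # hyper-entropy: squared-weight-averaged
    return ex_p, en_p, he_p

# Hypothetical child clouds and weights (not the paper's data):
ex = np.array([55.0, 60.0, 52.0])
en = np.array([2.0, 1.5, 2.5])
he = np.array([0.05, 0.03, 0.04])
w  = np.array([0.5, 0.3, 0.2])
print(aggregate_clouds(ex, en, he, w))
```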

The final solution yields the overall risk assessment cloud \(R(57.6605, 2.0631, 0.0370)\), named the integrated cloud. The integrated cloud and the standard clouds are drawn together as a cloud diagram, shown in Fig. 11. Similarly, the cloud diagram of the integrated and standard clouds is drawn in this paper according to the triangular cloud generation method, and the result is shown in Fig. 12. Figures 11 and 12 show that most of the integrated cloud is covered by the "slightly high" cloud, so the overall risk level of the island microgrid is judged to be slightly high, and the conclusions of the two methods are consistent. Moreover, cloud diagrams comparing the four first-class indices with the standard clouds are constructed in this paper, and the results are shown in Figs. 2, 3, 4, and 5 of the supplementary file. These conclusions are consistent with those of Wu et al. (see Fig. 6 of the supplementary file): "Technology risk" ranks first amongst the first-class indices, "Economic risk" second, "Social risk" third, and "Environmental risk" fourth. The judgement of this paper's algorithm is consistent with the conclusions of Wu et al. [68], Nazir et al. [69] and Khodaei et al. [70].
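Cloud diagrams such as Fig. 11 are produced by sampling cloud drops. As an illustration, the following is a minimal sketch of the standard forward normal cloud generator applied to the integrated cloud \(R\) reported above; the drop count and random seed are arbitrary choices, and the triangular variant used for Fig. 12 would substitute this paper's triangular generation method.

```python
import numpy as np

def forward_cloud(ex, en, he, n_drops=2000, seed=0):
    """Standard forward normal cloud generator: returns n_drops
    (x, membership) pairs for the cloud (Ex, En, He)."""
    rng = np.random.default_rng(seed)
    en_i = rng.normal(en, he, n_drops)             # En'_i ~ N(En, He^2)
    x = rng.normal(ex, np.abs(en_i))               # x_i ~ N(Ex, En'_i^2)
    mu = np.exp(-(x - ex) ** 2 / (2 * en_i ** 2))  # certainty degree of each drop
    return x, mu

# Integrated cloud R from the case study:
x, mu = forward_cloud(57.6605, 2.0631, 0.0370)
```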

Fig. 11

Overall risk assessment results (normal cloud model). The blue clouds in the cloud diagram indicate the standard clouds \(s_{0} \sim s_{6}\), and the red cloud indicates the integrated cloud \(R\). This cloud diagram is drawn by the algorithm of Wu et al.

Fig. 12

Overall risk assessment results (triangular cloud model). The meaning is the same as in Fig. 11; this cloud diagram is drawn by the algorithm of this paper

Comparison of evaluation methods

Wu et al. also chose the matter-element extension method to verify the correctness of the results. Its basic idea is to determine the correlation between the cloud to be evaluated at each level and the standard cloud of each level. A correlation coefficient \(K\), calculated by Eq. (18) [68], measures the correlation between the matter elements of the cloud to be evaluated and a level standard cloud. The correlation degrees (CDs) of the indices to be evaluated with the standard clouds of all levels are then determined, and the correlation coefficients of the indices are obtained from the correlation degrees of the evaluated objects.

$$ \left\{ \begin{array}{l} K = \frac{\left| N \right|}{\left| M \right|} \\ N = \left( Ex_{1} - 3En_{1}^{\prime},\; Ex_{1} + 3En_{1}^{\prime} \right) \cap \left( Ex_{2} - 3En_{2}^{\prime},\; Ex_{2} + 3En_{2}^{\prime} \right) \\ M = \left( Ex_{1} - 3En_{1}^{\prime},\; Ex_{1} + 3En_{1}^{\prime} \right) \cup \left( Ex_{2} - 3En_{2}^{\prime},\; Ex_{2} + 3En_{2}^{\prime} \right) \end{array} \right. $$
(18)

\(En^{\prime}\) is a random number generated by the cloud model (in the forward cloud generator, \(En^{\prime}\) is drawn from the normal distribution \(N(En, He^{2})\)). The correlation coefficients between the evaluation clouds and the standard clouds are shown in Table 5 of the supplementary file.
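For intuition, Eq. (18) reduces to the ratio of the overlap of the two clouds' \(3En^{\prime}\) intervals to the length of their union. A minimal sketch follows; the \(Ex\) and \(En^{\prime}\) values passed in at the bottom are illustrative, not the paper's data.

```python
def correlation_k(ex1, en1p, ex2, en2p):
    """Correlation coefficient K of Eq. (18): overlap of the two clouds'
    3*En' intervals divided by the measure of their union."""
    a = (ex1 - 3 * en1p, ex1 + 3 * en1p)
    b = (ex2 - 3 * en2p, ex2 + 3 * en2p)
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))      # |N|: intersection length
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter            # |M|: union length
    return inter / union

# Hypothetical evaluation cloud vs. "slightly high" standard cloud:
print(correlation_k(57.66, 2.1, 58.3, 2.0))
```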

In this section, following the idea of Wu et al., the method of this paper is applied to calculate the similarity between the standard clouds and the evaluation clouds at all levels, and the results are compared with the correlations calculated by Wu et al.; the results are shown in Table 9. Because there are many evaluation indices, only one index is selected from each of the four major categories, namely the index with the largest weight within its group.

Table 9 Correlation results

According to the principle of maximum affiliation, Table 9 shows that the overall risk level of the island microgrids can be judged as slightly high, and the results of this paper are consistent with those of Wu et al. and with the evaluation results in "Overall analysis results". This section thus verifies the feasibility of the proposed algorithm through the example of island power grid evaluation in the Yangtze River Delta region and shows that DDTSTCM has engineering application value.
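The principle of maximum affiliation amounts to an argmax over the per-level scores. A minimal sketch, with purely illustrative scores (not the paper's Table 9 values):

```python
import numpy as np

# Hypothetical similarity scores of the integrated cloud R against the
# seven standard clouds s0..s6 (illustrative only):
levels = ["EL", "L", "SL", "G", "SH", "H", "EH"]
scores = np.array([0.01, 0.03, 0.08, 0.22, 0.61, 0.18, 0.02])
print(levels[int(np.argmax(scores))])  # maximum affiliation -> "SH"
```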

Summary and discussion

To address the defects of existing cloud model similarity algorithms, this paper presents a cloud model similarity algorithm based on the exponential closeness of triangular fuzzy numbers and cloud drop variance. Unlike previous studies, this paper takes the triangular cloud model, an extension of the normal cloud model, as the research object and constructs the algorithm from two perspectives: cloud model distance and cloud model shape. A new exponential closeness of triangular fuzzy numbers is defined as the cloud model distance similarity, the cloud drop variance is used to calculate the cloud model shape similarity, and the two are combined to define the DDTSTCM similarity algorithm.

The advantages and performance of the algorithm are evaluated via three sets of experiments. The equipment guarantee system capability evaluation experiment shows that the results of DDTSTCM differ from those of previous algorithms because the algorithms are constructed from different perspectives. The cloud model discriminability simulation experiments show that DDTSTCM has better discriminability than the other seven algorithms. In the time series classification experiment, DDTSTCM achieves a combined classification accuracy of 91.87%, the highest amongst the eight algorithms. In terms of efficiency, owing to its theoretical construction, the CPU running time of DDTSTCM is on the order of milliseconds, which greatly improves operational efficiency. The metric results of DDTSTCM are also stable: the average variance of the classification accuracy is only \(6.39 \times 10^{ - 4}\). Taken together, the experimental results show that the proposed similarity algorithm based on exponential closeness and cloud drop variance outperforms the other seven commonly used algorithms in terms of the four metrics of discriminability, efficiency, theoretical interpretability, and stability.

This paper also applies DDTSTCM to a specific engineering example, assessing the potential risks to China's island microgrid industry, and the assessment results are consistent with those in the literature. This shows that DDTSTCM has good engineering application value and can be used to assess and analyse practical engineering problems to reduce risk. Finally, this paper verifies that an algorithm constructed bidirectionally from cloud model distance and shape is closer to human cognition than the previous concept extension, feature curve, and numerical feature algorithms.

However, DDTSTCM also has several shortcomings, which we hope to address in future work.

  1.

    The magnitude of the shape similarity depends only on the parameters \({En}\) and \({He}\) of the cloud model. Therefore, even when two cloud models are very far apart, as long as these two parameters match, the shape similarity is still 1, which obviously does not accord with actual situations. Modifying the algorithm to make it more reasonable is our next step.

  2.

    The distance similarity depends not only on the three parameters of the cloud model but also on the value of \(\lambda\) in the \(D_{{\text{T}}}\) distance formula. In this paper, the optimal value of \(\lambda\) is found by a step-size search during the simulation experiments. However, the amount of data in practical engineering applications is often large, so our next step is to determine how to find the optimal \(\lambda\) more quickly or to reduce the number of parameters input into the algorithm.