Introduction

Evolutionary computation is an important field of numerical optimization. Because evolutionary algorithms are insensitive to the characteristics of the objective function, they are considered effective for solving nonconvex optimization and NP-hard problems. The estimation of distribution algorithm (EDA) [1], one of the traditional evolutionary computation techniques, has received considerable research attention over the past two decades. In contrast to traditional evolutionary algorithms that use crossover, mutation and selection mechanisms, the EDA has a unique evolutionary process [2]: it estimates a probability distribution model from selected solutions and evolves the whole population iteratively. The Gaussian distribution model is typically used with the EDA to solve problems in the continuous domain. According to the structure of the Gaussian probability model and the relationships among variables, EDAs can be classified as univariate [3], bivariate [4], or multivariate [5] models. The multivariate Gaussian EDA and its variants are competitive in various real-world applications [6,7,8,9,10].

The EDA extracts the features of the population location from a macro perspective, and its offspring generation relies heavily on the estimated distribution model. However, the variances along the fitness descent directions shrink rapidly in the later stages of the search [11, 12]. Moreover, the basic EDA has no local search mechanism to enhance population diversity, so it is easily affected by ill-shaped distributions. In particular, when addressing multimodal problems, the EDA cannot capture the appropriate characteristics of a problem effectively through a unimodal Gaussian distribution model. Thus, the traditional EDA suffers from premature convergence and is easily trapped in stagnation. To address these drawbacks, many studies have been performed in recent decades, as reviewed in the next section. However, according to the results of the single-objective optimization competitions organized by the IEEE Congress on Evolutionary Computation (IEEE CEC) in recent years, EDA extensions have been unable to achieve top performance on those complex benchmarks. Nevertheless, due to its unique model-based features, the EDA can serve as a useful supplement to other algorithms. Note that all of the recent champion algorithms utilized the covariance matrix adaptation technique, such as HSES [13] and ELSHADE-SPACMA [14] (in CEC 2018), EBOwithCMAR [15] (in CEC 2017), and UMOEAsII [16] (in CEC 2016). In these promising hybrid methods, the EDA provides a unique search framework to capture problem features, while the other search mechanisms favorably supplement the EDA by enriching population diversity and performing efficient local searches.

However, the EDA's potential is not limited to its application in hybrid algorithms. The distribution characteristics of the population have not been fully exploited. First, the role of historical distribution information in correcting an abnormal population distribution has not been developed efficiently. Second, the use of the best, or leader, solutions to direct the search toward dominant regions has not received widespread attention. In addition, the function of inferior solutions in enhancing population diversity is often neglected. These areas of potential improvement leave room to further explore the characteristics of the EDA and make full use of the population distribution information to enhance algorithm performance.

In this study, we focus on improving the EDA distribution model from the above aspects and develop a new, improved EDA for solving single-objective optimization problems using three modification strategies. The main innovations can be summarized as follows:

  1. Archive-based population updating (APU). We propose a distribution model estimation strategy based on historical distribution information. An archive is designed to store successive generations of solutions, and promising individuals are selected from this archive to form a new population and estimate the distribution model. This operator, called archive-based population updating (APU), makes full use of historical population information. Through APU, excellent individuals selected from different generations can expand the variances of the search scope. Meanwhile, updating the archive continuously on a first-in first-out basis prevents a few best solutions from being reused excessively, which would otherwise cause the distribution model to overfit and the population diversity to be lost.

  2. Multileader-based search diversification (MSD). We develop a multileader-based search diversification (MSD) strategy that exploits the location differences within the population to diversify the search scope. The MSD contains two different search behaviors. In the first, the mean point is shifted using a leader solution randomly selected from a set that preserves several of the most promising solutions. In the second, the candidate is generated around a disturbed mean point. Using the MSD strategy, the advantage of leader solutions in directing the evolution is fully utilized. Furthermore, the sampling scopes are diversified with different mean points, which effectively enhances the algorithm's exploration behavior and prevents an ill-shaped unimodal distribution model from misleading the population into stagnation.

  3. Triggered distribution shrinkage (TDS). We design a triggered distribution shrinkage (TDS) strategy to scale the search scope when the algorithm falls into stagnation. This mechanism decreases the search scope so that the algorithm focuses on local exploitation, improving the convergence performance.

We assemble the above three search strategies into the EDA to obtain E3-EDA. The algorithm focuses on single-objective optimization, which is the basic task for a new evolutionary algorithm and the foundation of more complex extensions, such as multi-objective, constrained, and parallel algorithms. To verify the effectiveness of our proposal, the well-known CEC 2014 and CEC 2018 test suites are used for benchmarking. Both tests are carried out with 10-dimensional, 30-dimensional, 50-dimensional and 100-dimensional (10D, 30D, 50D and 100D) functions. Additionally, the top methods in the CEC 2014 and CEC 2018 competitions, including LSHADE-EpSin [17], UMOEAsII [16], L-SHADE [18], HSES [13], LSHADE-RSP [19], ELSHADE-SPACMA [14], and EBOwithCMAR [15], in addition to two promising EDA variants, MLS-EDA [20] and ACSEDA [21], are employed as competitors to E3-EDA.

This study is organized as follows. In Sect. “Related advances in EDA research”, the basic EDA is presented, and related achievements are reviewed. In Sect. “E3-EDA search behavior description”, the search behaviors of E3-EDA are described mathematically. Section “Experimental study using modern IEEE CEC benchmarks” details the experimental research. First, a parametric study is carried out to determine the optimal parameter settings of E3-EDA. Then, the influence of the different search mechanisms on the algorithm's convergence performance is revealed. Finally, the performance of E3-EDA is evaluated using the CEC 2014 and CEC 2018 benchmarks, and nonparametric tests are adopted to compare E3-EDA with the top algorithms. The findings of this study are summarized in Sect. “Conclusions”.

Related advances in EDA research

The basic EDA evolves the search scope by learning the distribution characteristics of the selected superior solutions. The mean value μ and covariance matrix C of the Gaussian distribution model are determined using maximum likelihood estimation (MLE) as follows:

$$ \boldsymbol{\mu} = \frac{1}{|\boldsymbol{S}|}\sum_{i = 1}^{|\boldsymbol{S}|} \boldsymbol{x}_{i}, \quad \boldsymbol{x}_{i} \in \boldsymbol{S} \ \text{and} \ \boldsymbol{S} \subset \boldsymbol{X}, $$
(1)
$$ \boldsymbol{C} = \frac{1}{|\boldsymbol{S}|}\sum_{i = 1}^{|\boldsymbol{S}|} \left( \boldsymbol{x}_{i} - \boldsymbol{\mu} \right)\left( \boldsymbol{x}_{i} - \boldsymbol{\mu} \right)^{\mathrm{T}}, \quad \boldsymbol{x}_{i} \in \boldsymbol{S} \ \text{and} \ \boldsymbol{S} \subset \boldsymbol{X}, $$
(2)

where the symbol S is a set containing the selected superior solutions and |S| is its cardinality. Then, population sampling is carried out using the mean point and covariance matrix as follows:

$$ \boldsymbol{x}_{i} = \boldsymbol{\mu} + \boldsymbol{y}_{i}, \quad \boldsymbol{y}_{i} \sim N\left( 0, \boldsymbol{C} \right). $$
(3)
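For concreteness, the following is a minimal sketch of one basic-EDA generation as described by Eqs. (1)–(3); the NumPy implementation, the truncation ratio and the toy sphere objective are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def eda_generation(X, fitness, tau=0.5, rng=None):
    """One basic-EDA step: select, estimate the Gaussian model, resample.

    X: (NP, D) population; fitness: (NP,) objective values (minimization);
    tau: truncation ratio defining the superior set S (an assumption).
    """
    rng = np.random.default_rng() if rng is None else rng
    NP, D = X.shape
    S = X[np.argsort(fitness)[: max(2, int(tau * NP))]]    # superior solutions
    mu = S.mean(axis=0)                                     # Eq. (1)
    C = (S - mu).T @ (S - mu) / len(S)                      # Eq. (2)
    return rng.multivariate_normal(mu, C, size=NP)          # Eq. (3)

# Toy usage on a 10-D sphere function.
rng = np.random.default_rng(0)
X = rng.uniform(-5.0, 5.0, size=(180, 10))
for _ in range(100):
    X = eda_generation(X, np.sum(X ** 2, axis=1), rng=rng)
print(np.sum(X ** 2, axis=1).min())
```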

The estimated distribution model is crucial to algorithm performance. When the population distribution becomes abnormal, the basic EDA has no mechanism to maintain population diversity; the main search direction gradually becomes perpendicular to the descent direction of the objective function value, causing the algorithm to fall into local stagnation. This deficiency was first noted in [22], where Cai et al. studied an adaptive variance scaling (AVS) method. In [23, 24], Grahl et al. proposed a correlation-triggered AVS (CT-AVS) mechanism and another trigger technique called SDR. On this basis, a well-known EDA named AMaLGaM was proposed in 2013 [25]. However, this method scales all variances uniformly, so the search scope cannot be readily directed toward the fitness descent directions. Ren et al. equipped the EDA with anisotropic AVS (AAVS-EDA) [11], and the advantages of AAVS-EDA over AMaLGaM were supported by their experimental study. Although AAVS-EDA can anisotropically scale the variances based on fitness landscape detection, one concern is that the trial points are not restricted to the input domain, which can make the detection unreliable. Their other work, EDA2, employs an archive to preserve the superior solutions of successive generations [12]. This mechanism uses the distribution differences among distinct generations to increase the variance along the fitness descent direction and effectively avoid a problematic distribution. Similarly, Yang et al. [21] proposed ACSEDA with an adaptive covariance scaling strategy for covariance estimation.

However, in these archive-based methods, all elements in the archive participate in the distribution estimation, which incurs additional time consumption. The most widely accepted EDA-related variant is the covariance matrix adaptation evolution strategy (CMA-ES) [26]. CMA-ES utilizes the “rank-1” and “rank-μ” updates to estimate the covariance matrix and adopts cumulative step-length adaptation to scale the distribution scope adaptively. Its promising variants, IPOP-CMA-ES [27] and NBiPOP-aCMA-ES [28], were the top methods in the CEC 2005 and CEC 2013 competitions, respectively. However, they incur a greater computational burden than other approaches due to their complex search frameworks.
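To illustrate the idea referred to above, the following schematic shows how the rank-1 and rank-μ terms are combined in the CMA-ES covariance update (step-size adaptation and the evolution-path recursion are omitted); the learning rates c1 and cmu here are placeholders, not the tuned constants of [26].

```python
import numpy as np

def cma_covariance_update(C, p_c, y_selected, weights, c1=0.1, cmu=0.3):
    """C: (D, D) covariance; p_c: (D,) evolution path;
    y_selected: (mu, D) selected mutation steps (x_i - m_old) / sigma;
    weights: (mu,) positive recombination weights summing to one."""
    rank_one = np.outer(p_c, p_c)                              # information from the path
    rank_mu = (y_selected * weights[:, None]).T @ y_selected   # from the selected steps
    return (1.0 - c1 - cmu) * C + c1 * rank_one + cmu * rank_mu
```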

In addition to modifications to the distribution model, many other search operators have been adopted in EDAs to enrich the population diversity. In our previous work [20], the MLS-EDA utilizes a multi-leader search strategy to enhance population diversity and avoid premature convergence. In [29], the EDA was equipped with a simulated annealing technique to enhance local exploration performance. Miquélez et al. [30] employed a Bayesian classifier method to estimate the probability model. Their proposal exhibited competitive performance in solving continuous optimization problems. The effectiveness of a regularized learning method in EDA modification was reported by Karshenas [31]. The copula theory [32] and a probabilistic graphical model [33] were also employed in the EDA for sampling. In RWGEDA, random walk strategies were utilized to strengthen the EDA’s exploration performance [34]. Moreover, the effects of the promising area detection technique and niching method on EDA performance improvement have been addressed by various studies [5, 35].

Combining the EDA with other optimization algorithms to fully leverage their respective advantages is regarded as another effective way to improve search performance. A hybrid of the EDA with PSO was explored by Qi et al. [36] to address real-world optimization problems. Furthermore, several studies have combined DE and the EDA [37, 38], and the statistical results demonstrated that these proposals outperformed other approaches. Moreover, in Sun's work [39], a hybrid technique combining the EDA and CS was implemented to solve a scheduling problem.

The above review indicates that EDA development has progressed along several lines. In the EDA variants discussed above, the information carried by individual solutions is not fully utilized. For example, inferior solutions that are not involved in model construction are simply discarded, and the potential of elite solutions to steer the search toward different dominant regions has not been explored. The remaining EDA studies employ additional search mechanisms or algorithms to compensate for the lack of population diversity in the basic EDA; these developments are accompanied by complex implementation frameworks and additional free parameters, which reduce the algorithm's efficiency and robustness. In contrast to previous studies, our E3-EDA utilizes the APU, MSD and TDS strategies to modify the distribution model rather than employing other search mechanisms or sub-algorithms. The proposed modifications make full use of historical and current population information to enhance the optimization and convergence performance of the basic EDA.

E3-EDA search behavior description

The E3-EDA search framework integrates APU, MSD and TDS with the basic EDA, as illustrated in Fig. 1. In each generation, the NP best-performing solutions in the archive are selected to participate in distribution model estimation. TDS is then activated to shrink the variances of the covariance matrix and strengthen the exploitation capacity, but only if the algorithm is determined to be in stagnation. In MSD, two different search behaviors are employed to modify the mean point and diversify the search scope. After sampling, the APU strategy updates the new population based on the archive to avoid the overuse of a few best solutions. The detailed descriptions of E3-EDA are given below.

Fig. 1
figure 1

Workflow diagram of E3-EDA

APU strategy

In the APU strategy, n consecutive generations of population information are retained in an archive, i.e., \(A^{t} = X^{t} \cup X^{t - 1} \cup \cdots \cup X^{t - n + 1}\). The NP best-performing individuals are selected from the archive \(\boldsymbol{A}^{t}\) as the new population to estimate the Gaussian distribution model as follows:

$$ \left\{ \begin{aligned} \boldsymbol{\mu} &= \sum_{i = 1}^{NP} \omega_{i} \boldsymbol{x}_{i} \\ \boldsymbol{C} &= \frac{1}{NP}\sum_{i = 1}^{NP} \left( \boldsymbol{x}_{i} - \boldsymbol{\mu} \right)\left( \boldsymbol{x}_{i} - \boldsymbol{\mu} \right)^{\mathrm{T}} \end{aligned} \right. , \quad \boldsymbol{x}_{i} \in \boldsymbol{A}_{\left( 1:NP \right)}. $$
(4)

In (4), ωi is the weight coefficient of the weighted maximum likelihood estimation used to calculate the mean value, which is denoted as follows:

$$ \omega_{i} = \frac{\ln \left( NP + 1 \right) - \ln \left( i \right)}{\sum_{k = 1}^{NP} \left( \ln \left( NP + 1 \right) - \ln \left( k \right) \right)}. $$
(5)

If the ith superior solution has a better fitness value than the kth, i.e., f(xi) < f(xk), then their weight coefficients satisfy ωi > ωk. Compared with the truncation selection in CMA-ES, the gaps between the weight coefficients in APU are smaller. The purpose of this setting is to avoid overreliance on the better solutions during the distribution estimation. Moreover, selecting more solutions from different generations can rebuild the search scope and thus correct an overfitted model. The new candidates are then generated from the newly estimated Gaussian distribution model as follows:

$$ \boldsymbol{x}_{i} = \boldsymbol{\mu} + \boldsymbol{y}_{i}, \quad \boldsymbol{y}_{i} \sim N\left( 0, \boldsymbol{C} \right). $$
(6)

The newly generated population is denoted as Xt+1. The archive is then updated as \(A^{t + 1} = X^{t + 1} \cup X^{t} \cup \cdots \cup X^{t - n + 2}\), meaning that the oldest population information is abandoned, and the new population is chosen from At+1 to evolve the algorithm iteratively. In this way, the oldest population is discarded during archive updating even if it contains some excellent individuals. This principle is intended to avoid the premature convergence inherent in the unimodal distribution model, as illustrated in Fig. 2. During the iterations, reusing old high-quality solutions too many times gradually makes the long axis of the distribution model perpendicular to the descent direction of the fitness value, which leads the algorithm into stagnation. This phenomenon can be avoided by properly removing these old high-quality solutions, as presented in Fig. 3. After cutting off several old candidates, the distribution of the new high-quality solutions has good diversity, which increases the sampling range.

Fig. 2
figure 2

Reusing old high-quality solutions too many times decreases the variance along the descent direction and leads to a loss of population diversity

Fig. 3
figure 3

Through the APU strategy, old solutions are abandoned, and the enlarged search scope promotes searching in different promising regions

Moreover, the distribution model for offspring sampling is calculated from the parent population, which is composed of the NP best historical solutions in the updated archive A. These high-quality solutions from different generations may be distributed in multiple dominant regions, as illustrated in Fig. 4. The new high-quality solutions in the population increase the sampling range of the distribution model, which enhances the algorithm's exploration ability and population diversity.

Fig. 4
figure 4

The locations of the high-quality solutions in the archive are diverse
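The APU step can be summarized by the short sketch below, which keeps n consecutive generations in a FIFO archive, selects the NP best members, and fits the weighted Gaussian model of Eqs. (4)–(6); the helper names are assumptions, the archive is re-evaluated here for brevity, and the weights are normalized to sum to one as in Eq. (5).

```python
import numpy as np
from collections import deque

def apu_generation(archive, fitness_fn, NP, rng):
    """archive: deque of (NP, D) arrays with maxlen = n (newest appended last)."""
    A = np.vstack(archive)                           # A^t = X^t U ... U X^{t-n+1}
    best = A[np.argsort(fitness_fn(A))[:NP]]         # NP best members (re-evaluation is a simplification)
    ranks = np.arange(1, NP + 1)
    w = np.log(NP + 1) - np.log(ranks)               # Eq. (5): log-rank weights
    w /= w.sum()
    mu = w @ best                                    # weighted mean, Eq. (4)
    C = (best - mu).T @ (best - mu) / NP             # covariance, Eq. (4)
    X_new = rng.multivariate_normal(mu, C, size=NP)  # sampling, Eq. (6)
    archive.append(X_new)                            # FIFO update: the oldest generation drops out
    return X_new

# Usage: archive = deque([X0], maxlen=3); X = apu_generation(archive, f, NP, rng)
```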

MSD strategy

In our previous work [20], we utilized the individual information within groups for local search. On this basis, the MSD strategy is developed to diversify the distribution scope by using the current best-performing solutions (called leader solutions) or the location differences within the population to modify the mean point. It provides two different search behaviors. The first utilizes multiple leader solutions to direct the mean point shift, as indicated below:

$$ \boldsymbol{\mu}_{\text{MSD}i} = \left( \boldsymbol{\mu} + \boldsymbol{L}_{i} \right)/2, $$
(7)

where Li is a randomly selected member of set L, which contains several top leader solutions. Set L has a variable capacity: initially, only the current best solution is stored in L; once the algorithm stagnates, the capacity expands to store the best two solutions, and it continues to expand upon further stagnation until it reaches its predefined maximum value, after which it no longer changes.

It is difficult for the basic EDA to capture the characteristics of a multimodal problem effectively through a unimodal distribution model, which makes the algorithm susceptible to falling into a local optimum. In this behavior, the leader solutions may be scattered across different dominant regions, so the diversified shifted mean points perform a broader global exploration of the search area; this beneficially increases the sampling range and helps avoid stagnation. Since the population of each generation is updated from archive A, the elements in set L change over time, which effectively maintains the diversity of population sampling.

In the other search behavior of MSD, the mean point of the ith solution is shifted on each dimension using its individual location information, as described below:

$$ \boldsymbol{\mu}_{\text{MSD}i} = \left( \boldsymbol{\mu} + \boldsymbol{x}_{i} \right)/2 + \left( \boldsymbol{\mu} - \boldsymbol{x}_{i} \right) \cdot \boldsymbol{B} \cdot \boldsymbol{r}_{1 \times D} \cdot \boldsymbol{B}^{\mathrm{T}}, \quad r_{j} \sim U\left( 0,1 \right), $$
(8)

where r is a 1 × D vector and D represents the problem dimensionality. B is the eigenvector matrix obtained from the decomposition of covariance matrix C.

$$ \boldsymbol{C} = \left( \boldsymbol{B} \cdot \boldsymbol{D} \right) \cdot \left( \boldsymbol{B} \cdot \boldsymbol{D} \right)^{\mathrm{T}}. $$
(9)

The μMSDi is the result of a disturbance around the estimated mean point. When solving a multimodal problem, it is difficult to determine the promising search region; a currently inferior solution may be located near the global optimum and may become a superior solution later. Thus, we utilize the location information of all individuals to disturb the mean point. The modification along the axes of the probability density ellipsoid is achieved using the eigenvector matrix, which helps shift the mean point along the descent direction of the fitness value. The advantage of using eigen coordinates has been argued in [40, 41].

After using Eq. (7) or (8) to obtain the improved mean point, the new individual is sampled as follows:

$$ \boldsymbol{x}_{i} = \boldsymbol{\mu}_{\text{MSD}i} + \boldsymbol{y}_{i}, \quad \boldsymbol{y}_{i} \sim N\left( 0, \boldsymbol{C} \right). $$
(10)
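A compact sketch of the two MSD behaviors in Eqs. (7)–(10) is given below; the leader set is passed in as an array, and the elementwise random vector r of Eq. (8) is interpreted as a diagonal matrix in the eigen basis, which is an assumption about the intended matrix product.

```python
import numpy as np

def msd_sample(mu, C, x_i, leaders, use_leader, rng):
    """Return one candidate sampled around a shifted mean point.

    mu: (D,) mean; C: (D, D) covariance; x_i: (D,) current individual;
    leaders: (k, D) array holding the set L of top solutions."""
    if use_leader:
        # Behavior 1, Eq. (7): move the mean toward a randomly chosen leader.
        L_i = leaders[rng.integers(len(leaders))]
        mu_msd = (mu + L_i) / 2.0
    else:
        # Behavior 2, Eq. (8): disturb the mean along the eigen coordinates of C.
        _, B = np.linalg.eigh(C)                    # C = B * diag(eigvals) * B^T
        r = rng.uniform(0.0, 1.0, size=mu.shape[0])
        mu_msd = (mu + x_i) / 2.0 + (mu - x_i) @ (B @ np.diag(r) @ B.T)
    y = rng.multivariate_normal(np.zeros_like(mu), C)
    return mu_msd + y                               # Eq. (10)
```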

The substantial difference between the two mean-point behaviors in MSD, as presented in Eqs. (7) and (8), lies in the direction of the mean point modification. In the first behavior, the search center moves toward the determined best-performing solutions. When the unimodal Gaussian model can capture the descent direction of the fitness value, shifting the mean point toward the elite individuals amounts to promoting the distribution model along the descent direction, which benefits the algorithm's convergence performance. In the second behavior, the search center is obtained using individual information. Since the members of the population are selected from the archive A, the solutions derived from different generations are scattered; using various solutions to disturb the mean point diversifies the distribution model, which considerably enhances the ability to explore the solution space. In other words, Eq. (7) is advantageous when the distribution model has captured the dominant evolution direction, while Eq. (8) helps the algorithm escape local stagnation when dealing with multimodal problems. However, it is difficult to determine which behavior is better during algorithm execution. Therefore, we present a mechanism that adaptively adjusts the selection probabilities of the two behaviors in MSD according to the proportion of high-quality offspring generated by Eq. (7) or (8).

Let P1 denote the probability that an individual is sampled using Eqs. (7) and (10); the probability of choosing the other behavior is P2, with P1 + P2 = 1 (initially, P1 = P2 = 0.5). After population renewal, the number of offspring generated by Eqs. (7) and (10) is NP1, the number of these offspring superior to their parents is SNP1, and the ratio of dominant offspring is SR1:

$$ {\text{SR}}_{1} = {\text{SNP}}_{1} /{\text{NP}}_{1} . $$
(11)

In the same way, SR2 is the ratio of the promising offspring generated using Eqs. (8) and (10):

$$ {\text{SR}}_{2} = {\text{SNP}}_{2} /{\text{NP}}_{2} . $$
(12)

If SR1 is greater than SR2, the first search behavior of MSD is more effective than the second. In the next generation, the value of P1 should be increased appropriately, and the selection probabilities of the two behaviors are adjusted as follows:

$$ \left\{ \begin{aligned} P_{1} &= \frac{P_{1} + \left( 1 - P_{1} \right) \cdot \text{SR}_{1} / \left( \text{SR}_{1} + \text{SR}_{2} \right)}{1 + \left( 1 - P_{1} \right) \cdot \text{SR}_{1} / \left( \text{SR}_{1} + \text{SR}_{2} \right)} \\ P_{2} &= 1 - P_{1} \end{aligned} \right. , \quad \text{if} \ \text{SR}_{1} > \text{SR}_{2}. $$
(13)

In the same way, if SR1 is smaller than SR2, sampling using Eqs. (8) and (10) is more effective. In the next generation, the value of P2 should be increased appropriately, and the selection probabilities of the two behaviors are adjusted as follows:

$$ \left\{ \begin{aligned} P_{2} &= \frac{P_{2} + \left( 1 - P_{2} \right) \cdot \text{SR}_{2} / \left( \text{SR}_{1} + \text{SR}_{2} \right)}{1 + \left( 1 - P_{2} \right) \cdot \text{SR}_{2} / \left( \text{SR}_{1} + \text{SR}_{2} \right)} \\ P_{1} &= 1 - P_{2} \end{aligned} \right. , \quad \text{if} \ \text{SR}_{2} > \text{SR}_{1}. $$
(14)

To avoid the extinction of either search behavior, the probabilities of the two behaviors in MSD are bounded as follows:

$$ \left\{ \begin{aligned} P_{1} &= \min \left( 0.95, P_{1} \right), \quad P_{1} = \max \left( 0.05, P_{1} \right) \\ P_{2} &= \min \left( 0.95, P_{2} \right), \quad P_{2} = \max \left( 0.05, P_{2} \right) \end{aligned} \right. $$
(15)
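The adaptation of P1 and P2 described by Eqs. (11)–(15) reduces to the small routine below; the per-generation counters are assumed to be collected elsewhere, and a tie (SR1 = SR2) leaves the probabilities unchanged.

```python
def update_msd_probabilities(P1, SNP1, NP1, SNP2, NP2):
    """Return the adapted (P1, P2) for the next generation."""
    SR1 = SNP1 / NP1 if NP1 > 0 else 0.0          # Eq. (11)
    SR2 = SNP2 / NP2 if NP2 > 0 else 0.0          # Eq. (12)
    P2 = 1.0 - P1
    if SR1 > SR2:                                 # Eq. (13): reward behavior 1
        share = SR1 / (SR1 + SR2)
        P1 = (P1 + (1.0 - P1) * share) / (1.0 + (1.0 - P1) * share)
    elif SR2 > SR1:                               # Eq. (14): reward behavior 2
        share = SR2 / (SR1 + SR2)
        P2 = (P2 + (1.0 - P2) * share) / (1.0 + (1.0 - P2) * share)
        P1 = 1.0 - P2
    P1 = min(0.95, max(0.05, P1))                 # Eq. (15): keep both behaviors alive
    return P1, 1.0 - P1
```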

TDS strategy

Both APU and MSD can enlarge the search scope. They can improve the algorithm’s exploration performance but contribute little to the convergence performance. In E3-EDA, the TDS strategy is adopted to narrow the search area by scaling the variances when the algorithm falls into stagnation:

$$ \left[ \boldsymbol{D}^{2} \right]_{\text{diagonal}} = \left( 1 - \text{FEs}/\text{FEs}_{\max } \right) \cdot \left[ \boldsymbol{D}^{2} \right]_{\text{diagonal}}, $$
(16)

where FEs is the number of function evaluations and FEsmax is its maximum budget. As presented in Eq. (16), the search range decreases as the iterations proceed, which makes the algorithm focus on local exploitation in the late stage. The stagnation criterion is defined as follows: if the average fitness value of the better half of the current population is not less than the corresponding average of the previous generation, the algorithm is considered to be stagnating. When the algorithm stagnates, the covariance matrix is no longer updated; only the search variances are shrunk, until the algorithm no longer meets the stagnation criterion. This setting also reduces the computational burden of the covariance matrix calculation.
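A sketch of the TDS trigger follows; the stagnation test implements the criterion stated above (the mean fitness of the better half does not improve), and the eigenvalue array stands for the diagonal of D² in Eq. (9). The function and variable names are assumptions.

```python
import numpy as np

def tds_step(eig_sq, fitness, prev_half_mean, FEs, FEs_max):
    """eig_sq: diagonal of D^2 from C = (B D)(B D)^T; fitness: current values."""
    half_mean = np.sort(fitness)[: len(fitness) // 2].mean()
    stagnated = half_mean >= prev_half_mean
    if stagnated:
        eig_sq = (1.0 - FEs / FEs_max) * eig_sq    # Eq. (16): shrink the variances
    return eig_sq, stagnated, half_mean            # C itself is frozen while stagnated
```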

Moreover, to ensure E3-EDA’s convergence performance, the best result is updated after each generation. The flowchart of the execution framework of E3-EDA is illustrated in Fig. 5.

Fig. 5
figure 5

Flowchart of E3-EDA

The differences between E3-EDA and MLS-EDA

Generally, E3-EDA is an extended study based on our previous MLS-EDA. In MLS-EDA, we found that utilizing population diversity can effectively improve EDA performance. Inspired by this, we have modified the MLS in E3-EDA; the newly proposed MSD adopts new search strategies and a statistics-based adaptive selection of search behaviors, which differ from the mechanisms adopted in MLS-EDA. Furthermore, using only the current population information still restricts the population diversity. Thus, the APU strategy is introduced to utilize historical distribution information to enhance exploration performance. Through the ensemble of APU, MSD and TDS, a more efficient E3-EDA is achieved.

Time complexity analysis of E3-EDA

The computational time complexity is an important issue for evaluating the efficiency of an evolutionary algorithm. The calculation efficiency of the EDA is affected mainly by the calculation of the covariance matrix. E3-EDA is an EDA-based algorithm, and it differs little from the basic EDA in terms of computational cost. As described above, in each iteration, all members of the population participate in the covariance matrix estimation of Eq. (4). The time complexity of this part is O(D²·NP), where D is the problem dimensionality. Moreover, the decomposition of the covariance matrix using the Jacobi method has a complexity of O(D³). In population updating, sampling is executed in each dimension for all individuals, which incurs a computational cost of O(NP·D). Therefore, the maximum time complexity of E3-EDA is determined by O{D² × max(D, NP)}. Additionally, according to the execution process in Fig. 5, the computational cost of updating the covariance matrix can be avoided when the algorithm stagnates. Under this condition, our E3-EDA achieves a per-iteration time complexity below O{D² × max(D, NP)}.

Experimental study using modern IEEE CEC benchmarks

Using benchmarks to verify performance is an indispensable part of developing a new evolutionary algorithm. Since 2005, the IEEE CEC has released several challenging benchmark suites covering different aspects and has held algorithm competitions every year. The CEC 2014 and CEC 2018 test suites are two modern and challenging benchmark sets for single-objective optimization algorithms. The CEC 2014 test suite consists of 30 functions, which can be categorized into four types: unimodal functions F1–F3, multimodal functions F4–F16, hybrid functions F17–F22 and composite functions F23–F30. Similarly, the CEC 2018 testbed contains 29 benchmarks (with F2 excluded) and is also divided into four groups: unimodal functions F1 and F3, multimodal functions F4–F10, hybrid functions F11–F20 and composite functions F21–F30. These four types of test functions have different characteristics, and the difficulty of obtaining the optimal solution gradually increases. The distinction between CEC 2014 and CEC 2018 is that different basic functions are used, which gives the problems different features. More details about these two testbeds are provided in [42] and [43], respectively.

Because the CEC test functions are treated as black-box problems, each result is recorded as the error between the best value found by the algorithm and the optimal value, i.e., f(x) − f(x*). The optimum is considered reached when the error value is less than 1e−08. The maximum number of function evaluations is related to the problem dimensionality D and is set to D × 10,000. All simulations in this section are implemented on a laptop with an i7-8700HQ processor (2.20 GHz) and 16 GB of memory. MATLAB 2018a is used for coding and running.

In this section, we first use the CEC 2014 test suite with 30D problems to determine the optimal parameter settings of E3-EDA. Then, the contributions of the proposed modifications are discussed. Finally, the performance of E3-EDA is compared with that of the top methods in CEC competitions using the CEC 2014 and CEC 2018 testbeds.

E3-EDA parametric study

The tuning of parameters plays a substantial role in the performance of an evolutionary algorithm. If an algorithm possesses many free parameters, it is difficult to determine their optimal values for different problems, which limits the algorithm's robustness and applicability. In E3-EDA, only three parameters need to be tuned: the population size (NP), the maximum size of archive A (Size|A|max), and the maximum size of set L (Size|L|max). In the basic EDA, a large population size is always desired to maintain search diversity. In E3-EDA, five population sizes are considered, namely, NP = 12·D, 15·D, 18·D, 21·D, and 24·D, where D denotes the problem dimensionality. Archive A retains population information of successive generations; therefore, its optimal size is related to the population size, i.e., Size|A|max = NP, 2·NP, 3·NP, 4·NP, and 5·NP. The set L preserves several top leader solutions used to modify the mean point; its optimal value is determined from six cases: Size|L|max = 1, 0.1·NP, 0.2·NP, 0.3·NP, 0.4·NP, and 0.5·NP. When the size of L equals 1, only the best solution is utilized; when it is set to 0.5·NP, the better half of the population is preserved. We associate the parameter values with the problem size so that E3-EDA can adaptively adjust its parameters when dealing with problems of different dimensions. This reduces the parameter sensitivity of the algorithm and avoids the burden of parameter adjustment.

In this investigation, the CEC 2014 benchmarks are employed to evaluate E3-EDA with 150 (5 × 5 × 6) different parameter settings. The experiments are carried out on the 30D problems, and each problem is run 51 times independently. The maximum number of function evaluations is fixed at 300,000. To save space, we do not provide the optimization results of all 150 configurations; instead, the Friedman test is performed to rank the results and reveal the differences. Since the other two free parameters are related to the population size, we analyze the influence of their settings on the algorithm performance for each NP value. For NP set to 12·D, 15·D, 18·D, 21·D, and 24·D, the influence of the parameters Size|A|max and Size|L|max on the algorithm performance is tabulated in Tables 1, 2, 3, 4 and 5, and the ranking differences are illustrated in Figs. 6, 7, 8, 9 and 10. For all five population sizes, most of the best ranks occur when Size|A|max equals 3·NP and Size|L|max equals 0.1·NP. The best parameter setting occurs at NP = 18·D, Size|A|max = 3·NP, and Size|L|max = 0.1·NP, as shown in Table 3, where the statistical ranking value is lowest at 38.2857.

Table 1 Rankings of algorithms with different Size|A|max and Size|L|max settings when NP = 12·D
Table 2 Rankings of algorithms with different Size|A|max and Size|L|max settings when NP = 15·D
Table 3 Rankings of algorithms with different Size|A|max and Size|L|max settings when NP = 18·D
Table 4 Rankings of algorithms with different Size|A|max and Size|L|max settings when NP = 21·D
Table 5 Rankings of algorithms with different Size|A|max and Size|L|max settings when NP = 24·D
Fig. 6
figure 6

When NP = 12·D, the best ranking value 51.9524 is obtained at Size|A| max = 5· NP and Size|L| max = 0.1· NP

Fig. 7
figure 7

When NP = 15·D, the best ranking value 43.9524 is obtained at Size|A| max = 5· NP and Size|L| max = 0.1· NP

Fig. 8
figure 8

When NP = 18·D, the best ranking value 38.2857 is obtained at Size|A| max = 5· NP and Size|L| max = 0.1· NP

Fig. 9
figure 9

When NP = 21·D, the best ranking value 42.5714 is obtained at Size|A| max = 5· NP and Size|L| max = 0.1· NP

Fig. 10
figure 10

When NP = 24·D, the best ranking value 44.6667 is obtained at Size|A| max = 5· NP and Size|L| max = 0.1· NP

We sum the data values in Tables 1, 2, 3, 4 and 5 by rows or columns. Taking the mean values, we can statistically analyze the independent influence of the parameters Size|A|max and Size|L|max on the algorithm performance under different population sizes, as presented in Figs. 11 and 12, respectively. It can be concluded that the optimal values of Size|A|max and Size|L|max are 3·NP and 0.1·NP, respectively. Moreover, the curve variations in Fig. 11 are greater than those in Fig. 12, which indicates that the value of Size|A|max has a greater influence on the algorithm performance than that of Size|L|max. By summing all the values in Tables 1, 2, 3, 4 and 5, we can compare the effects of different population sizes on the algorithm's performance. The satisfactory settings of NP are 18·D and 21·D, with a minimal difference between their ranking values of 70.619 and 70.4032, respectively. We choose 18·D as the optimal population size because a larger population size increases the computational cost, since the maximum time complexity is O{D² × max(D, NP)}. In summary, the optimal parameter settings of E3-EDA are NP = 18·D, Size|A|max = 3·NP and Size|L|max = 0.1·NP.

Fig. 11
figure 11

Ranking variations of different values of Size|A|max

Fig. 12
figure 12

Ranking variations of different values of Size|L|max

Investigation of the influence of E3-EDA search behavior

In this subsection, we investigate the influence of the three search behaviors on the optimization results and convergence performance of E3-EDA. First, four E3-EDA variants are designed, namely, E3-EDA-1, E3-EDA-2, E3-EDA-3, and E3-EDA-4; compared with the original E3-EDA, the first three algorithms remove the APU, MSD, and TDS mechanisms, respectively. In E3-EDA-4, the two behaviors in MSD are selected randomly (i.e., P1 = P2 = 0.5) instead of using the adaptive probability adjustment mechanism. Moreover, the basic multivariate Gaussian EDA, EMNAg, is included as a competitor to verify the effectiveness of the proposed modifications. The performances of the six algorithms are evaluated using the CEC 2014 testbed with 30D problems. Each benchmark is evaluated in 51 independent runs with the same parameter settings, i.e., NP = 18·D, Size|A|max = 3·NP and Size|L|max = 0.1·NP, and FEsmax is fixed at 300,000. The experimental results are recorded in Table 6, whose last row shows the rank differences among the six algorithms according to the Friedman test. E3-EDA, with all three search mechanisms integrated, obtains the best rank value of 1.9500. E3-EDA-4 is the next-best-scoring algorithm, with a rank value of 2.1833, followed by E3-EDA-2 and E3-EDA-3, with rank values of 2.9833 and 3.7167, respectively. The last two methods are E3-EDA-1 and EMNAg, with the poorest ranks of 4.6333 and 5.5333, respectively. The worse an algorithm's ranking value, the more substantial the influence of the missing component on the algorithm performance. APU is thus the most important component for ensuring the high performance of E3-EDA: it uses historical distribution information to support the estimation of the current distribution model while effectively avoiding the ill-conditioned distribution that results from using only the current solution information. Comparing the optimization results of E3-EDA and E3-EDA-3 shows that the TDS strategy enhances the algorithm's ability to solve complex multimodal problems, as TDS shrinks the search scope to help the algorithm perform local exploitation. In addition, E3-EDA performs better than E3-EDA-2 on the unimodal functions, which indicates that MSD can effectively steer the search toward the dominant region. Furthermore, the gap between E3-EDA and E3-EDA-4 is smaller than the other gaps. The adaptive changes in the selection proportions of the two MSD behaviors on different benchmarks can be found in Appendix A. According to the statistics of the test results, the mean value of P1 over the whole test is 0.4949 and that of P2 is 0.5051. Although both values are close to 0.5, Appendix A shows that the trends of P1 and P2 differ across functions as well as across stages of the same function. The advantage of E3-EDA over E3-EDA-4 verifies the rationality and effectiveness of the adaptive adjustment strategy in Eqs. (11)–(15).

Table 6 Comparison results of the four E3-EDA variants

Moreover, we study the effect of the three search mechanisms on the convergence performance of the algorithm. We select eight test functions with considerable differences in Table 6 to analyze the convergence behavior: F1, F9, F11, F15, F17, F21, F27 and F29. The convergence curves of the four algorithms are plotted in Figs. 13a, 14a, 15a, 16a, 17a, 18a, 19a and 20a. The convergence of the population distribution is analyzed using the variation of the mean Euclidean distance between the population and its current optimal individual, as presented in Figs. 13b, 14b, 15b, 16b, 17b, 18b, 19b and 20b. The mean Euclidean distance is calculated as follows:

$$ \text{Mean}D_{\text{E}} = \frac{1}{\text{NP}}\sum_{i = 1}^{\text{NP}} \left\| \boldsymbol{x}_{i} - \boldsymbol{x}_{\text{best}} \right\|. $$
(17)
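For reference, the diversity measure of Eq. (17) amounts to the following short routine, assuming it is taken as the mean Euclidean norm of the differences to the current best solution; the function name is an assumption.

```python
import numpy as np

def mean_distance_to_best(X, x_best):
    """Mean Euclidean distance between each individual and the current best."""
    return np.linalg.norm(X - x_best, axis=1).mean()
```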
Fig. 13
figure 13

Convergence analysis on F1

Fig. 14
figure 14

Convergence analysis on F9

Fig. 15
figure 15

Convergence analysis on F11

Fig. 16
figure 16

Convergence analysis on F15

Fig. 17
figure 17

Convergence analysis on F17

Fig. 18
figure 18

Convergence analysis on F21

Fig. 19
figure 19

Convergence analysis on F27

Fig. 20
figure 20

Convergence analysis on F29

Figures 13a, 14a, 15a, 16a, 17a, 18a, 19a and 20a show that the convergence speed of the different algorithms roughly matches the shrinking speed of their population distributions. Among the four algorithms, E3-EDA-2, which lacks the MSD strategy, has the fastest convergence speed. This is because the MSD strategy in the other three algorithms expands the search scope, which delays convergence. However, the absence of MSD in E3-EDA-2 leads to incomplete exploration of the solution space, which makes its optimization results on the eight benchmarks inferior to those of E3-EDA. The convergence rate and the population distribution convergence of E3-EDA-1 and E3-EDA-3 are poorer than those of the basic E3-EDA, but for different reasons. Without the APU strategy, E3-EDA-1 cannot capture the problem characteristics effectively, resulting in an abnormal population distribution; without the TDS strategy, the search range of E3-EDA-3 is too large, precluding effective convergence to the dominant region. Generally, with the ensemble of the three strategies, E3-EDA achieves the best performance in terms of both optimization results and convergence. According to the above analysis, the influence of the three strategies on the algorithm performance, from largest to smallest, is APU > TDS > MSD.

Comparison with LSHADE-EpSin, UMOEAsII, L-SHADE, MLS-EDA and ACSEDA using the CEC 2014 test suite

To assess the performance of our E3-EDA in solving numerical optimization problems, the CEC 2014 and CEC 2018 test suites are employed as benchmarks. Both sets of benchmarks are evaluated in 10·D, 30·D, 50·D, and 100·D. The maximum numbers of function evaluations are set to 10 × 10,000, 30 × 10,000, 50 × 10,000 and 100 × 10,000, respectively, and each function is run 51 times independently. The error values are summarized statistically in Appendices B and C, including the best, worst, median, mean, and standard deviation (SD) values. As the complexity and scale of the problems increase, the error values of the solutions become larger, especially for the last two types of benchmarks, the hybrid and composite functions.

To further evaluate the performance of our method, we employ several state-of-the-art algorithms that perform competitively on the CEC 2014 test: LSHADE-EpSin, UMOEAsII, L-SHADE, MLS-EDA and ACSEDA. LSHADE-EpSin and UMOEAsII took the first two places in the CEC 2014-based competition held in 2016. L-SHADE is the basic version of LSHADE-EpSin and won the competition held in 2014. MLS-EDA and ACSEDA are two recently developed, promising EDA variants. The superior performance of these five competitors was verified using the CEC 2014 benchmarks in their original works. The parameter settings of all participants follow their original literature, as tabulated in Table 7.

Table 7 Parameter settings of E3-EDA, LSHADE-EpSin, UMOEAsII, L-SHADE, MLS-EDA and ACSEDA in the CEC 2014 test

The mean and SD values obtained by the six algorithms are recorded in Tables 8, 9, 10 and 11 for the 10·D, 30·D, 50·D and 100·D tests, respectively. For the first three unimodal functions, E3-EDA, MLS-EDA and ACSEDA show their robustness by finding the optimal values in all of the 10·D, 30·D and 50·D tests. For the multimodal functions, E3-EDA, LSHADE-EpSin and UMOEAsII are the top three methods in terms of accuracy; specifically, in 30·D, E3-EDA wins more than half of these 13 problems. The remaining algorithms, L-SHADE, MLS-EDA and ACSEDA, are less competitive, but all three can reach the optimal result on F7 regardless of the dimensionality. Additionally, for the hybrid functions F17–F22, our E3-EDA exhibits its superiority by performing best on more than half of the benchmarks in every group of tests except the 10·D test. The hybrid functions are partially separable problems that are close to real-world optimization problems, so the dominance of E3-EDA on this set of tasks indicates its potential for real-world applications. For the last eight composite functions, LSHADE-EpSin and UMOEAsII are the two most competitive algorithms in all tests; MLS-EDA outperforms the other methods in the 30·D test, and our E3-EDA performs similarly to L-SHADE and surpasses ACSEDA. Overall, according to the mean error data, our E3-EDA wins 13 out of 30 benchmarks in the 10·D test, 17 in the 30·D test, 14 in the 50·D test and 6 in the 100·D test.

Table 8 Experimental results derived from six algorithms in the CEC 2014 10·D test
Table 9 Experimental results derived from six algorithms in the CEC 2014 30·D test
Table 10 Experimental results derived from six algorithms in the CEC 2014 50·D test
Table 11 Experimental results derived from six algorithms in the CEC 2014 100·D test

According to the mean error results, the performance of the six competitors on different types of test functions is statistically compared in Table 12. On this basis, E3-EDA has great advantages over other methods in solving hybrid problems and performs competitively in solving unimodal and multimodal functions. ACSEDA, UMOEAsII and LSHADE-EpSin perform best in solving unimodal, multimodal, and composite tests, respectively.

Table 12 Performance ranking of six algorithms for different types of test functions in different dimension CEC 2014 tests

To demonstrate the superiority of our algorithm more comprehensively, nonparametric tests are performed. Table 13 presents the pairwise comparison results of the Wilcoxon signed-rank test for the CEC 2014 10·D, 30·D, 50·D and 100·D tests. The symbols “+/−/≈” indicate that our E3-EDA has superior, inferior, or similar performance compared with the other methods based on the mean error values in Tables 8, 9, 10 and 11. In the following two columns, “R+” denotes the magnitude by which E3-EDA surpasses the competitor, and “R−” expresses the opposite. The rightmost column records the p value derived from the Wilcoxon signed-rank test. If the p value is greater than the significance level (α = 0.05), the comparison of the two algorithms is regarded as statistically insignificant; otherwise, E3-EDA is concluded to be better or worse based on the “R+” magnitude. As seen in Table 13, E3-EDA has notable superiority over MLS-EDA and ACSEDA in the 10·D test, outmatches L-SHADE and ACSEDA in the 30·D test, and surpasses L-SHADE and MLS-EDA in the 100·D test.

Table 13 Pairwise comparison using the Wilcoxon signed-rank test for the CEC 2014 test (α = 0.05)
Table 14 Multi-comparison using the Friedman test for the CEC 2014 test (α = 0.05)

Moreover, the Friedman test is performed as a multiple comparison to rank the six algorithms according to their comprehensive performance. The ranking results for the four tests are presented in Table 14. E3-EDA ranks best among the six algorithms, with values of 3.0167 in the 30·D test, 3.1167 in the 50·D test and 2.8500 in the 100·D test, and ranks second with a score of 3.1000 in the 10·D test. The outperformance of E3-EDA is supported by the nonparametric studies on the CEC 2014 test, which demonstrates that the three developed search components are conducive to improving the algorithm performance.

In addition to the evidence provided by these optimization results, the execution time is another crucial criterion for evaluating an algorithm's effectiveness. The average time consumption of all six participants on each benchmark of the CEC 2014 tests, measured on the same platform, is recorded in Appendix D. On this basis, Table 15 compares the average time consumption of the six algorithms in the different dimensional tests. According to these results, ACSEDA and L-SHADE are the two most efficient algorithms. Our E3-EDA has computational efficiency similar to that of MLS-EDA and is superior to UMOEAsII and LSHADE-EpSin. Although E3-EDA is not the most efficient algorithm, the recorded results show that its gap with ACSEDA is small on the low-dimensional problems.

Table 15 Comparison of the mean time costs on the CEC 2014 test

Comparison with HSES, LSHADE-RSP, ELSHADE-SPACMA, EBOwithCMAR and ACSEDA using the CEC 2018 test suite

To further verify the effectiveness of our proposal, another test using the CEC 2018 testbed is carried out. Five competitors are employed for comparison: HSES, LSHADE-RSP, ELSHADE-SPACMA, EBOwithCMAR and ACSEDA. HSES, LSHADE-RSP, and ELSHADE-SPACMA were the top three methods in the CEC 2018 competition, EBOwithCMAR is the champion of the CEC 2017 competition, and ACSEDA represents a novel EDA variant. The parameters of each algorithm are assigned according to the original works, as shown in Table 16. LSHADE-RSP is coded in C, and its experimental data are taken from its original work. For the other five algorithms, all 29 benchmarks in CEC 2018 are evaluated 51 times with a fixed evaluation budget of D × 10,000. The statistical data, including the mean error and SD values, are presented in Tables 17, 18, 19 and 20.

Table 16 Parameter settings of E3-EDA, HSES, LSHADE-RSP, ELSHADE-SPACMA, EBOwithCMAR, and ACSEDA in the CEC 2018 test
Table 17 Experimental results derived from six participants in the CEC 2018 10·D test
Table 18 Experimental results derived from six participants in the CEC 2018 30·D test
Table 19 Experimental results derived from six participants in the CEC 2018 50·D test
Table 20 Experimental results derived from six participants in the CEC 2018 100·D test

The two unimodal functions are easily addressed because all participants reach acceptable error values in all tests. For the following seven multimodal functions, E3-EDA, EBOwithCMAR and ACSEDA are the top three algorithms on the 10·D problems. Moreover, E3-EDA wins all of these benchmarks except F4 in the 30·D test and ranks first on F7, F9 and F10 in the 50·D test. HSES obtains the best results on F4 in all four groups of tests, and all six algorithms perform robustly on F6 and F9. For the hybrid functions in the 10·D test, EBOwithCMAR obtains better results than the other algorithms, while E3-EDA performs best on F14. In the 30·D test, ACSEDA obtains the best result only on F16; the best values of the remaining nine hybrid functions are achieved by our E3-EDA, and its advantage on these functions is considerable. In the 50·D test, E3-EDA is the best-performing algorithm, with the best values on F11, F12, F13, F16, F18 and F19; HSES ranks at the top on F14 and F15, while ACSEDA ranks first on F17 and F20. In the 100·D test, E3-EDA is also the best problem solver, winning on F12, F13, F15, F16 and F19. For the last group of 10 composite functions, E3-EDA wins on more than half of the functions in every test except the 10·D test.

In general, EBOwithCMAR is the top algorithm in the 10·D test, while our E3-EDA has prominent advantages over the other five algorithms in the remaining three tests. Specifically, E3-EDA obtains the best results on almost all 29 benchmarks in the 30·D test, the exceptions being F4, F16, F21, F28, F29 and F30. In the 50·D test, E3-EDA ranks first on 17 out of 29 benchmarks, and it obtains 13 results better than those of the other algorithms in the 100·D test.

To compare the performance of the different algorithms in solving different types of problems more clearly, Table 21 summarizes the ranking results of the six algorithms. E3-EDA shows the best performance in solving the multimodal, hybrid and composite problems, while HSES performs robustly in obtaining the best results on the unimodal functions.

Table 21 Performance ranking of six algorithms for different types of test functions in different dimension CEC 2018 tests

Similarly, the Wilcoxon signed-rank test is performed for a pairwise comparison between E3-EDA and each competitor. The symbols in Table 22 have the same meanings as explained in the previous section. According to the p values in the right column of Table 22, in the 10·D test, E3-EDA performs similarly to LSHADE-RSP, ELSHADE-SPACMA, and EBOwithCMAR but substantially dominates HSES and ACSEDA. In the remaining three tests, E3-EDA surpasses every competitor, except HSES and ACSEDA in the 100·D test.

Table 22 Pairwise comparison using the Wilcoxon signed-rank test for the CEC 2018 tests

Moreover, the Friedman test is carried out to distinguish the differences among the six algorithms. Table 23 provides the ranking of each participant in the four CEC 2018 tests. E3-EDA ranks at the top with the smallest ranking values of 1.9483 in the 30·D test, 2.0690 in the 50·D test and 2.500 in the 100·D test. In the 10·D test, LSHADE-RSP performs best, and E3-EDA ranks third, following EBOwithCMAR. From an overall view of the ranking results across the different dimensional tests, E3-EDA is more competitive than the other algorithms.

Table 23 Multi-comparison using the Friedman test for the CEC 2018 tests (α = 0.05)

The execution times of all competitors except LSHADE-RSP are compared in Appendix E, which shows the average time required by each of the five remaining algorithms to optimize each function. Table 24 compares the average amounts of time required by these five algorithms to solve the benchmarks of different dimensionalities. ACSEDA is the most efficient algorithm; E3-EDA is similar to ELSHADE-SPACMA in terms of time consumption and surpasses the remaining two algorithms.

Table 24 Comparison of the computational efficiency for the CEC 2018 tests

Through the above experimental studies, the competitive performance of our E3-EDA is fully demonstrated through comparisons with the top methods from the CEC competitions. In addition to its superiority in optimization accuracy, E3-EDA possesses two further advantages. First, E3-EDA has fewer free parameters than the other state-of-the-art methods, which enhances its robustness in dealing with different problems; moreover, the optimal parameter values determined on the CEC 2014 30·D test also support the promising performance of E3-EDA on the CEC 2018 problems, which demonstrates the insensitivity of the algorithm's performance to its parameter settings. Second, E3-EDA not only achieves the best performance but also shows satisfactory computational efficiency. High efficiency and accuracy are two important indexes for assessing an evolutionary algorithm in practical engineering applications. Thus, our proposal has great potential for addressing real-world optimization problems.

Conclusions

As a newly developed EDA variant, E3-EDA ensembles the APU, MSD and TDS search behaviors. APU utilizes historical population information for distribution estimation to avoid ill-conditioning, MSD utilizes the location differences among individuals to extend the search scope toward dominant regions, and TDS shrinks the search scope to help the algorithm focus on local exploitation. Through experimental studies, the optimal parameter settings of E3-EDA are established, and the influence of the proposed modifications on the optimization and convergence performance of the algorithm is discussed.

The E3-EDA performance is evaluated using the CEC 2014 and CEC 2018 test suites; the 10·D, 30·D, 50·D and 100·D tests are carried out on both sets of problems, and comparisons with the top methods in the CEC competitions and other promising EDA variants are performed. According to the nonparametric test results, E3-EDA, with APU, MSD and TDS integrated, is more competitive than the advanced algorithms on the two groups of CEC tests in terms of accuracy and efficiency, and it notably outperforms MLS-EDA, which confirms the effectiveness of extending our previous work to improve EDA performance.

Future work can address three aspects. First, the search framework of E3-EDA needs improvement, especially the MSD strategy; the random values used in the mean point modification should be determined in a more principled way. Second, E3-EDA can be further developed with ensembles of other techniques, such as fitness landscape detection and adaptive population size adjustment, and it would also be highly interesting to explore a hybrid of E3-EDA with modern DE algorithms such as L-SHADE. Finally, it is important to apply E3-EDA to challenging real-world optimization problems.