1 Introduction

There is a controversy regarding the ever-increasing number of new metaheuristic algorithms referred to as “innovative” [1]. As indicated by Sörensen et al. [2], many of these metaheuristics make no real contribution to the research field, having as their only theoretical support an ambiguous metaphor with some natural process. Although these types of new metaheuristics continue to be proposed, other studies focus on combining existing algorithms or adding operators to previously presented metaheuristics to create “improved” versions. Either case results in the publication of many similar studies whose scientific relevance seems questionable due to the low level of innovation. Notwithstanding these objectionable characteristics and the objections raised by part of the scientific community [1], such studies continue to be published in internationally renowned scientific journals. A remarkable feature is that these studies constantly announce outstanding performances, which, in turn, may explain the interest they have generated in the scientific community.

Of particular concern is the number of documented cases where these new metaheuristics are merely plagiarisms of other, more popular algorithms or simple reformulations of metaheuristics already presented. One such case is pointed out by Weyland [3, 4], who shows that Harmony Search [5], a very popular and widespread metaheuristic, has a formulation identical to that of Evolution Strategies, a metaheuristic based on evolutionary processes presented many years earlier [6]. Another example is given by Camacho et al. [7], who indicate that the Intelligent Water Drops Algorithm [8] is a special case of Ant Colony Optimization [9]. Additionally, Camacho et al. [10] analyzed the components of the Grey Wolf Optimizer [11], Firefly Algorithm [12], and Bat Algorithm [13], finding that these metaheuristics are simple reformulations of Particle Swarm Optimization (PSO) [14]. Another example can be found in Tzanetos and Dounias [15], who reviewed multiple metaheuristics and found that many of them have a structure similar to PSO.

The studies above provide a glimpse of a growing problem in the field of metaheuristics: with so many new algorithms being published every year, how can anyone discern those that are genuinely innovative and valuable to the scientific community from those that are not? Such is the seriousness of the matter that some journals have introduced new and stricter conditions for accepting papers that present new metaheuristics. The Journal of Heuristics [16] and ACM Transactions on Evolutionary Learning and Optimization [17] state that they do not accept papers that present algorithms based on already known ideas or that only demonstrate superior performance without clearly explaining why the algorithm performs well. Furthermore, Swarm Intelligence [18] asks its editorial body to reduce the number of accepted studies presenting new metaheuristics, admitting to the review process only those with high quality standards.

Although the controversy over the relevance and innovation of new metaheuristics has been addressed by different publications [1, 2, 15, 19], the truth is that their number only seems to grow year after year. As a renewed call to reconsider the validity and relevance of this myriad of new algorithms, this paper studies the characteristics of a sample of metaheuristics proposed in papers released in 2022. Despite this short observation window, the authors were able to find 111 different algorithms self-described as “new”, “novel”, “advanced”, “enhanced”, “improved”, or the like, an average of two proposed algorithms per week. The main objective of this paper is to conduct a review and a critical analysis of recently proposed metaheuristics. For that purpose, the paper is organized as follows. Section 2 describes the ways in which new algorithms were proposed, as well as the components and strategies most commonly added to improve their performance. Section 3 gives a brief overview of the No Free Lunch (NFL) theorem and mentions the cases where the theorem does not hold; additionally, the validity of the up-the-wall-game argument for proposing new or improved algorithms is discussed. Section 4 describes the procedures and problems used to evaluate the new algorithms and discusses their validity. In Sect. 5, an analysis of two so-called innovative algorithms (Black Widow Optimization and Coral Reef Optimization) is conducted to show how repeated concepts can be masked by metaphorical language. Finally, Sects. 6 and 7 discuss the results and offer the conclusions of the study.

2 On the Interest and Motivation of New Metaheuristics

A sample of 111 recently published papers proposing new metaheuristics was compiled for this study. To gauge the interest that these publications have generated in the scientific community, the citations received by the papers in the sample were recorded. The number of citations was obtained using Google Scholar on December 15, 2022. The results are shown in Fig. 1 as a box plot of the distribution and its histogram. In the box plot, the sample values between the 25th and 75th percentiles are represented by a box referred to as the interquartile range, with an intermediate line marking the 50th percentile or median. The whiskers extending from the box have a range equal to 1.5 times the interquartile range, and all values outside that range are considered outliers. It is essential to point out that, to facilitate interpretation, the outliers with values of 16, 18, 21, 25, 30, 35, 37, 51, and 112 citations are omitted from the box plot.

Fig. 1 a Box plot; b frequency of citations received

Figure 1 shows that the citations’ distribution is positively skewed because many studies have not received any citations yet. Additionally, 50% of the 111 articles in the sample have received three or fewer citations; this threshold rises to six citations or fewer when 75% of the sample is considered. Given that the articles in the sample are less than a year old, it is noteworthy that several of them have already accumulated a considerable number of citations. A remarkable case is the algorithm called Ebola Optimization Search Algorithm [20], which draws a metaphor from the spread of the Ebola virus and had received 112 citations at the time of this study. This number of citations gives an idea of the great interest that these studies have generated in the scientific community.

Another aspect addressed during the study of the sample was the origin of the proposed algorithm, i.e., whether it was based on a new metaphor, was an improved version of an existing algorithm, or was a combination of two already established algorithms. The percentages found for each type are presented in Fig. 2. Sörensen [2, 19] highlights a relevant point on this subject: many articles that present new metaheuristics use as theoretical support some metaphor with natural or physical processes that often has little relation to optimization. However, it can be seen from Fig. 2 that the trend has now changed. According to the data, 65% of the papers present improved versions of existing algorithms, whereas metaheuristics based on a new metaphor or on a combination of already established algorithms account for 19% and 16% of the sample, respectively.

Fig. 2 Percentages of algorithms based on new metaphors, combinations of existing algorithms, and improved versions

2.1 New, Combined, and Improved Algorithms

In the following subsections, the characteristics of the new, hybrid, and improved algorithms are described in detail. Among the 111 articles consulted, only 21 presented an algorithm based on a new metaphor. Figure 3 shows the percentages of the types of metaphors used by these articles, classified as metaphors based on biology, physics, and human behavior. The largest group, with 62% of this subset, comprises algorithms resembling biological processes; their authors state that they took inspiration from animal [21], plant [22], and even disease [23] behaviors. New algorithms resembling human and physical processes account for 24% and 14%, respectively. In the case of metaphors based on human behavior, some algorithms resemble the migration processes of nomadic tribes [24] or the teaching of the sewing [25] and cooking [26] professions. On the other hand, those that mimic physical processes take inspiration from phenomena such as special relativity [27] or the process of fission and fusion in atomic nuclei [28].

Fig. 3 Metaphor types used in new metaheuristics

In addition to proposing algorithms based on entirely new metaphors, combining already established metaheuristics is another common practice among algorithm designers. For this reason, the most frequently combined algorithms in the collected sample were studied. Only 18 of the 111 articles in the sample proposed an algorithm resulting from the combination of two different algorithms, which represents 16% of the sample. Figure 4 shows the algorithms combined in the collected sample with their respective percentages of occurrence. The data indicate that Simulated Annealing and the Sine Cosine Algorithm are the most frequently used building blocks, each accounting for 11% of the combined algorithms. The first was proposed by Kirkpatrick et al. [29] in 1983 and is one of the best-known metaheuristics today. The second was proposed by Mirjalili [30] in 2016 and has a formulation based on the sine and cosine functions. The third most employed metaheuristic as a combination element in the sample was the Slime Mould Algorithm [31], with 8% of occurrence. Algorithms with low appearance percentages were grouped in the “other” category.

Fig. 4 Combined metaheuristics in the sample

As shown in Fig. 2, 65% of the new metaheuristics considered in this study are improved versions of others already established. In the literature, it is possible to find multiple operators and strategies employed by authors to improve the performance of the algorithms. Those present in the collected sample, with their respective percentages of appearance, are shown in Fig. 5. The largest group, accounting for 24% of the modifications, is operator modification, which covers slight changes to an algorithm’s formulation, such as changing the algorithm’s operation rules or modifying some equation. For example, Xue et al. [32] proposed an improved version of the Brain Storm Optimization algorithm [33] in which the strategy for generating new solutions was modified. In the original algorithm, the following equation is used to create new solutions:

$$x_{new}^{i} = x_{old}^{i} + \xi (t) \cdot N(\mu ,\sigma^{2} )$$
(1)

where \(x_{new}^{i}\) is the new solution to be created, \(x_{old}^{i}\) is the solution selected to create a new solution, \(N\left( {\mu ,\sigma^{2} } \right)\) is a normal distribution with mean \(\mu\) and variance \(\sigma^{2}\), and \(\xi \left( t \right)\) is a coefficient that weights the contribution of the normally distributed random variable. The modified version proposed by Xue et al. [32] instead employs one of the following three equations:

$$x_{new}^{i} = x_{old}^{i} + \left( {x_{central}^{icluster} - x_{old}^{i} } \right) \cdot N\left( {\mu ,\sigma^{2} } \right)$$
(2)
$$x_{new}^{i} = x_{old}^{i} + \left( {x_{near} - x_{old}^{i} } \right) \cdot N\left( {\mu ,\sigma^{2} } \right)$$
(3)
$$x_{new}^{i} = x_{old}^{i} + \left( {x_{optimal} - x_{old}^{i} } \right) \cdot N\left( {\mu ,\sigma^{2} } \right)$$
(4)
Fig. 5 Commonly added operators and components to improve metaheuristics’ performance

Other examples of modifications of a similar nature can be found in Li and Xu [34] and Chakraborty et al. [35]. Other frequent modifications are those that add opposition-based learning (15%), chaotic maps (12%), and Lévy flight (7%). Opposition-based learning was initially proposed by Tizhoosh [36] in 2005. This strategy is based on the idea that a randomly generated solution tends to be of low quality and may lie on the “opposite” side of the search space from the problem optimum; therefore, instead of searching in all possible directions for the optimal solution, it can be convenient to also evaluate the “opposite” point of each candidate. As Mahdavi et al. [37] point out, multiple versions of this search strategy exist. Another common strategy is the use of chaotic maps, nonlinear equations exhibiting chaotic behavior. According to Aydemir [38], chaotic maps are beneficial components for optimization algorithms due to their ergodicity and the non-repeatability of chaos; dozens of chaotic maps are currently in use. The third most used strategy is the Lévy flight, a random walk whose step lengths follow a Lévy distribution. Viswanathan et al. [39] point out that some animal behaviors seem to follow a Lévy distribution during foraging. It is worth remembering that foraging is a process in which living beings must optimize their resources (energy or time) to avoid starvation; for this reason, a Lévy flight strategy appears to be beneficial for optimization algorithms, where solutions must “search” the configuration space for the optimum. As can be seen in Fig. 5, other components are also added to improve the performance of the algorithms, such as restart strategies, Cauchy or Gaussian mutation, or the Nelder–Mead simplex method. Strategies with a much lower frequency were grouped in “others”.
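As an illustration of how lightweight these add-on components typically are, the following Python sketch shows opposition-based learning and a logistic chaotic map in their most basic textbook forms; the bounds, dimensionality, and map parameters are illustrative assumptions and do not correspond to any particular paper in the sample.

```python
import numpy as np

def opposite_solution(x, lower, upper):
    """Basic opposition-based learning: the 'opposite' point of x within the bounds."""
    return lower + upper - x

def logistic_map(x0=0.7, r=4.0, n=10):
    """Logistic chaotic map, often used in place of uniform random numbers."""
    seq = [x0]
    for _ in range(n - 1):
        seq.append(r * seq[-1] * (1.0 - seq[-1]))
    return np.array(seq)

# Illustrative usage with a hypothetical 3-dimensional solution in [-10, 10]
lower, upper = np.full(3, -10.0), np.full(3, 10.0)
x = np.random.uniform(lower, upper)
print(x, opposite_solution(x, lower, upper))
print(logistic_map())
```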

2.2 On Improved Algorithms

Due to the high number of improved algorithms presented recently, this topic is studied more deeply. Two cases that draw particular attention are the Whale Optimization Algorithm and the Slime Mould Algorithm, owing to the high number of “improved” versions they received in 2022 alone. The Whale Optimization Algorithm (WOA) is a population-based algorithm proposed by Mirjalili and Lewis [40], based on a metaphor for the hunting behavior of humpback whales. According to its creators, “WOA algorithm is very competitive compared to the state-of-art meta-heuristic algorithms as well as conventional methods” [40]. Despite these statements, in 2022 alone WOA was “improved” at least seven times. The list of modifications made to the algorithm is long and usually consists of changes to the algorithm’s operators. El-Kenawy et al. [41] combined the Sine Cosine Algorithm (SCA) with a modified version of WOA (MWOA) to create the Sine Cosine Modified Whale Optimization Algorithm (SCMWOA). Liu and Zhang [42] proposed the Differential Evolution Chaotic Whale Optimization Algorithm (DECWOA), which employs sine chaos theory to create the algorithm’s initial population, adaptive weights to update the position of the whales, and a differential variance algorithm for creating new populations. It is important to note that these modifications were made because Liu and Zhang [42] claim that WOA suffers from insufficient search capability and low convergence speed. On the other hand, Ma and Yue [43] proposed an improved whale optimization algorithm (RAV-WOA), which considers adaptive weights on the position of whales in addition to using a reverse learning strategy and horizontal and vertical crossover to create the algorithm’s populations. Likewise, Ma and Yue [43] point out that WOA tends to fall into local optima and has convergence speed and convergence accuracy problems. Qiao et al. [44] introduced the worst individual disturbance (WD) and neighborhood mutation (NM) search strategies into the original WOA to create WDNMWOA. Similarly, Qiao et al. [44] indicate that the modified version aims to solve the problems of low exploratory capacity, a tendency to get trapped in local optima, and low optimization accuracy presented by the original algorithm. Seyyedabbasi [45] combined WOA with the Sine Cosine Algorithm (SCA) and Lévy flight (LF) to create a hybrid algorithm called WOASCALF. The decision to combine WOA with SCA was made because the former shows low performance in the exploitation phase, an aspect that can be corrected with the high performance shown by SCA in that same phase. Wang et al. [46] developed the Improved Surrogate-Assisted Whale Optimization Algorithm (ISAWOA), which adds three strategies to improve WOA’s performance: a surrogate-assisted model, Lévy flight, and quadratic interpolation. Again, Wang et al. [46] indicate that WOA tends to be trapped in local optima. Finally, Wang et al. [47] proposed the Improved Whale Optimization Algorithm (IWOA), which considers logistic chaos to improve WOA’s performance.

The Slime Mould Algorithm (SMA) is a population-based algorithm proposed by Li et al. [31] that resembles the behavior of the slime mould Physarum polycephalum. Despite being a recent algorithm, it was modified at least five times in 2022 alone. As is typical for this kind of study, the authors of the original SMA reported that the algorithm “benefits from competitive, often outstanding performance on different search landscapes” [31]. The following are some of the modifications made to SMA. Ewees et al. [48] proposed the Gradient-based Optimizer and Slime Mould Algorithm (GBOSMA), obtained by combining the Gradient-Based Optimizer (GBO) and SMA. In the hybrid GBOSMA algorithm, the current population is operated on either by SMA or by GBO, with both algorithms having the same probability of being selected to modify the solutions. Kaveh et al. [49] developed an Improved Slime Mould Algorithm (ISMA), which includes an elitist population replacement strategy and a new rule for updating the solutions’ positions. In this case, Kaveh et al. [49] mention that the modifications were proposed to solve premature convergence to non-optimal solutions and the slow convergence rate presented by the original SMA. Örnek et al. [50] combined the Sine Cosine Algorithm (SCA) and SMA to develop SCA-SMA. In SCA-SMA, sine and cosine equations are used to update the positions of the population solutions; also, a modified version of the sigmoid function replaces the arctanh function used in SMA. Likewise, the authors justified the development of the new algorithm by mentioning that SMA tends to get trapped in local optima [50]. Qiu et al. [51] presented a modified version of SMA called Improved Slime Mould Algorithm (ISMA), which employs a Cauchy mutation mechanism together with crossover and mutation operators from Differential Evolution (DE). Again, the tendency of the original SMA to get trapped in local optima is mentioned as one of the motivations for developing this new version [51]. Finally, Zhong et al. [52] proposed the Teaching–Learning Slime Mould Algorithm (TLSMA), combining two population-based metaheuristics: SMA and Teaching–Learning Based Optimization (TLBO). In TLSMA, the population is divided into two subgroups: the first is operated on by TLBO, while SMA operates on the second. As the iterations progress, the SMA population is transferred to the TLBO-operated population to balance the exploration and exploitation capabilities of both algorithms. Zhong et al. [52] mention that TLSMA was proposed to solve the problem of diversity loss suffered by SMA.

The multiple versions of WOA and SMA exemplify a long-known flaw in the metaheuristic design field: ignoring that these algorithms are component-based. This means it is possible to combine components from different frameworks to create algorithms that meet the needs of their designers. However, it is necessary to consider the following question: is adding components (operators or strategies) to a framework sufficient to publish the resulting method as an innovative algorithm, or even as an improved version of the base algorithm? Answering this question is extremely difficult in the current state of the field because similar frameworks are hidden under whimsical and unscientific names. To illustrate this point, consider the following case. Kuyu and Vatansever [53] and Jia et al. [54] each modified a PSO-like framework by adding a Cauchy-based mutation mechanism and opposition-based learning. Despite working on similar frameworks and adding the same components, the resulting algorithms are considered different since one “analogizes” to a forensic group [55] and the other to a flock of sparrows [56]. This subjective division leads many to consider that two algorithms represent different areas of study just because they have different names [57,58,59,60]. Treating as valid a division between algorithms based more on their names than on the structure of their frameworks would lead to an unnecessary proliferation of algorithms. For example, all the PSO-like algorithms highlighted by Tzanetos and Dounias [61] could be modified by adding a Cauchy mutation mechanism and opposition-based learning, with each result being considered a different algorithm. Such a succession of studies would be considered frivolous by serious researchers, not only because they mix well-known components but also because they do not enrich our knowledge about optimization algorithms. Therefore, to assess the validity and contribution of these “improved” algorithms, one must first be sure that the underlying framework is genuinely unique and innovative.

Another interesting point observed in both WOA and SMA is that their respective authors reported outstanding performances in the original studies, which disagrees with the arguments outlined to justify the creation of their improved versions (premature convergence, lack of diversity, etc.). The two sets of statements, those claiming superior performance and those alleging performance problems, may seem mutually exclusive. However, theory tells us that both viewpoints are likely to be true. A result of great relevance here is the No Free Lunch (NFL) theorem [62], which implies that any performance gain on a specific set of problems must be paid for with losses on other problems. In practice, these metaheuristics (new and improved) are calibrated to the problems on which they are tested, which is why outstanding performances are so commonly reported. However, when the same algorithms are applied to problems for which they were not calibrated, issues such as lack of diversity or a tendency to get trapped in local optima appear. This point is addressed and explained in detail in Sect. 3.

3 Performance and No Free Lunch (NFL) Theorem

A common motivation employed by metaheuristic designers is to find an algorithm that shows superior performance to all other currently available options. This is known as the up-the-wall-game argument, where the only goal pursued by researchers is to obtain a better result than that achieved by the rest of their peers [19]. A priori, this argument would seem to be sufficient motivation to justify the overwhelming number of new metaheuristics recently proposed. However, the well-known No Free Lunch (NFL) theorem, proposed by Wolpert and Macready [62], directly opposes such an argument. Moreover, the same theorem raises doubts about the superior performances reported in virtually all studies proposing new metaheuristics. To point out these discrepancies between theoretical and reported results, a summary of the formulation by Wolpert and Macready [62] to derive NFL is given below.

3.1 No Free Lunch (NFL) Theorem

Consider a finite search space \(X\) associated with a finite space \(Y\) through an objective function \(f:X \to Y\), where \(Y \subset {\mathbb{R}}\). Let \(F\) be the space of all possible optimization problems. Optimization problems (also referred to as “objective functions”) are represented using probability theory and a uniform distribution \(P\left( f \right)\), defined over \(F\), which gives the probability that each \(f \in F\) is the optimization problem under consideration, namely:

$$P(f) = P(f(x_{1} ),f(x_{2} ),...,f(x_{|X|} ))$$
(5)

Wolpert and Macready [62] indicate that employing a probability distribution also allows for analyses that consider a single objective function whose uncertainties are encoded in \(P\left( f \right)\). Likewise, the search algorithms are considered to check a total of \(m\) distinct solutions \(x \in X\) with their respective objective function evaluations \(y = f\left( x \right) \in Y\). This sample of solutions \(x\) and evaluations \(y\) can be defined as a set of different and chronologically ordered configurations \(\overline{d}_{m}\), that is:

$$\overline{d}_{m} \equiv \left\{ {\left( {d_{m}^{x} (1),d_{m}^{y} (1)} \right),...,\left( {d_{m}^{x} (m),d_{m}^{y} (m)} \right)} \right\}$$
(6)

The probability that a search algorithm \(\alpha\) finds a particular sample \(\overline{d}_{m}\) depends on the objective function \(f\), the number of evaluations \(m\), and the algorithm itself; this is denoted as \(P(d_{m}^{y} |f,m,\alpha )\). Denoting any pair of algorithms as \(a\) and \(b\), the first NFL theorem states:

$$\mathop \sum \limits_{f} P\left( {d_{m}^{y} \left| {f,m,a} \right.} \right) = \mathop \sum \limits_{f} P\left( {d_{m}^{y} \left| {f,m,b} \right.} \right)$$
(7)

This result implies that, for any performance metric \(\phi\) considered, the mean performance of any two search algorithms is identical when averaged over all possible optimization problems. Theoretically, this result holds even when pitting a simple algorithm such as random search against more sophisticated ones such as Genetic Algorithms or Simulated Annealing. Additionally, for equality (7) to hold, high performance of an algorithm on a specific set of problems must be paid for with low performance on the rest of the possible problems. To further clarify this idea, it is helpful to consider the geometric interpretation of NFL. Stavros et al. [63] pointed out that the performance of a search algorithm \(\alpha\) on a particular objective function \(f\) is given by the inner product of two vectors:

$$\phi = \overline{v}_{\alpha } \cdot \overline{P}$$
(8)

where \(\overline{v}_{\alpha } \equiv P(d_{m}^{y} |f,m,\alpha )\) contains only the information of the employed search algorithm, while \(\overline{P} \equiv P\left( f \right)\) describes the uncertainties of the considered objective function \(f\) [64]. This interpretation indicates that the algorithm’s performance is measured “by how well it is 'aligned' with the distribution \(P\left( f \right)\) that governs the problems on which that algorithm is run” [64]. This explains why high optimization capabilities are claimed for virtually every algorithm while, at the same time, premature convergence to local optima is frequently reported: the proposed algorithms are “aligned” with (show high performance on) the specific set of problems on which they are assessed, which in turn causes a “misalignment” with (low performance on) the rest of the possible optimization problems.
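The averaging statement of Eq. (7) can be verified numerically in a toy setting. The Python sketch below enumerates every objective function on a tiny discrete search space and averages the performance of two fixed, non-revisiting search orders; the space sizes, the metric (best value found after \(m\) evaluations), and the two visiting orders are arbitrary choices made only for the sake of the example.

```python
import itertools
import numpy as np

# Tiny discrete setting: |X| = 4 candidate points, |Y| = 3 objective values
X = list(range(4))
Y = [0, 1, 2]
m = 2  # number of distinct points each algorithm is allowed to evaluate

def run(order, f, m):
    """Evaluate the first m points of a fixed (non-revisiting) visiting order
    and return the best value found, used here as the performance metric."""
    return min(f[x] for x in order[:m])

alg_a = [0, 1, 2, 3]  # one deterministic search order
alg_b = [3, 2, 1, 0]  # a different deterministic search order

# Enumerate every possible objective function f: X -> Y
scores_a, scores_b = [], []
for values in itertools.product(Y, repeat=len(X)):
    f = dict(zip(X, values))
    scores_a.append(run(alg_a, f, m))
    scores_b.append(run(alg_b, f, m))

# Averaged over all functions, both search orders perform identically
print(np.mean(scores_a), np.mean(scores_b))
```

The two averages coincide even though the two orders differ on individual functions, which is precisely the trade-off described above: any advantage on some functions is compensated by a disadvantage on others.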

3.2 Criticism and Free Lunch Theorems

In contrast to NFL, some researchers have claimed that the theorem has limited relevance in practice because its formulation relies on tight and unrealistic assumptions [65, 66]. The main assumptions on which NFL rests are: (1) optimization problems follow a uniform probability distribution; (2) the search space is finite; and (3) the algorithm does not revisit solutions during the search process. Regarding the first assumption, Wolpert [64] has shown that NFL holds even when a set of probability distributions \(P\left( f \right)\) is considered; this is because the characteristics of the employed search algorithm do not depend on the chosen distribution \(P\left( f \right)\). Although assumption (2) might make it seem that continuous optimization problems are not subject to NFL, the truth is that no computer can represent a truly infinite search space, no matter its precision. Assumption (3) is violated by most search algorithms (with some exceptions, such as tabu search); however, it has yet to be formally demonstrated that this is a sufficient argument to dismiss the validity of NFL.

Despite researchers’ general acceptance of the NFL theorem, there are cases where the theorem does not hold. Köppen et al. [67] indicated that NFL is not valid for cases where one of the search algorithms employs strategy stealing, i.e., scenarios in which an additional move can never be considered a disadvantage. Similarly, Droste et al. [65] have pointed out that NFL does not hold for some black-box optimization scenarios when complexity theory aspects are considered; despite this result, the authors point out that the possible performance increments of one algorithm with respect to others are small. Another interesting result is provided by Corne and Knowles [68], who indicated that there are multi-objective optimization algorithms that can perform better than others, i.e., that NFL does not hold for this type of problem. This is due to the different ways in which single- and multi-objective algorithms map the search space: while single-objective algorithms keep only the best solution found so far, multi-objective algorithms maintain an archive of limited length containing different solutions. This feature allows the existence of a Free Lunch theorem when considering algorithms with different memory capacities. Additionally, Wolpert and Macready [69] showed that NFL is no longer valid in coevolution and self-play problems, i.e., situations where two competitors collaborate to obtain a champion. Other studies pointing out particular scenarios for which NFL does not hold can be found in Kimbrough et al. [70] and Auger and Teytaud [71].

Interestingly, instead of supporting proposals for new metaheuristics with the above findings, most authors simply ignore this part of optimization theory. Of the 111 articles surveyed for this study, 57% omit any mention of the NFL theorem. In the remaining 43% that do mention it, the authors provide an incomplete description or misinterpret it by using it to support the development of new metaheuristics. For example, Braik et al. [72], Gezici and Livatyali [73], Hassan et al. [74], and others mention that researchers are motivated to propose and develop better metaheuristics because NFL prohibits a single algorithm from solving all possible optimization problems. This approach of constantly searching for an algorithm that always performs better than the others would be valid if authors focused on the exceptional cases where Free Lunch theorems hold. However, the norm seems to be proposing metaheuristics with repeated concepts [15] to solve the same problems. This last point leads to the next section, which reviews the problems commonly used to demonstrate, informally, the allegedly superior performance of the newly proposed metaheuristics.

4 Types of Solved Problems

It is common for newly proposed algorithms to be put into competition with other metaheuristics to demonstrate their optimization capabilities. To learn more about this widespread practice, the problems used to compare the algorithms’ performance were analyzed. For this analysis, the sampled articles were divided into four groups: (1) studies focused on solving specific problems, (2) studies that only solve benchmarking problems, (3) studies that solve benchmarking and engineering problems, and (4) studies that consider both benchmarking and specific problems. The first group comprises studies where an algorithm is proposed for a particular application, which can be as diverse as object identification in images [75], power grid optimization [76], structural optimization [77, 78], chemical process optimization [79], or disease identification [80]. The second group contains those studies that only solve benchmarking problems proposed for the Congress on Evolutionary Computation (CEC). These problems are sets of continuous and scalable functions, which can be unimodal or multimodal, single- or multi-objective, among other characteristics [81,82,83,84]. The third group considers studies that solve both benchmarking and engineering problems; the latter include the Welded Beam Design Problem, the Tension/Compression Spring Design Problem, the Pressure Vessel Design Problem, the Speed Reducer Design Problem, and others. The last group covers cases where benchmarking problems are solved first, and the proposed algorithm is then used to solve a specific problem; only a few articles fall into this group. The occurrence percentages per group are shown in Fig. 6.

Fig. 6 Percentages of problem types solved in the sample

According to Fig. 6, the most frequent studies are those that solve benchmarking and engineering problems, with 51% of the sample. The second largest group is the one where only a specific problem is considered, representing 25% of the sample. The least frequent studies, with 4%, are those that consider both benchmarking and specific problems. Overall, 75% of the articles studied solve benchmarking problems, while specific problems are addressed in 29% of the sample. A fascinating discussion of the benchmarking problems used to study the performance of search algorithms is presented by Kudela [85]. According to his analysis, the benchmarking problems considered in many studies tend to place their global optimum at the center of the search space. This characteristic causes algorithms with a propensity to search in this area to perform better than those without such a bias. For this reason, many of the algorithms that show high performance on benchmarking problems may give poor results in other, more general search spaces.

Additionally, the different types of specific problems included in the sample are shown in Fig. 7 with their respective percentages of occurrence. As can be seen, the specific problems belong to diverse areas, such as chemical engineering, image analysis, and others. The most frequent fields of application are feature selection, electrical engineering, and structural engineering, with 28%, 22%, and 13%, respectively.

Fig. 7 Specific problem types included in the sample

Since engineering problems are quite common (they are addressed by half of the sample) and are used to demonstrate that proposed algorithms can solve “real-world” optimization problems, they will be analyzed in detail in the following sections. Additionally, aspects such as the validity of these problems and some inconsistencies among the reported results will be discussed at the end of this section.

4.1 Engineering Problems

Engineering problems are used as a complement in many studies to demonstrate an algorithm’s optimization capability on real problems. Six engineering problems commonly found in the literature will be analyzed in detail: the Cantilever Beam Design Problem, Welded Beam Design Problem, Pressure Vessel Design Problem, Tension/Compression Spring Design Problem, Speed Reducer Design Problem, and 3-Bar Truss Design Problem. For each problem, a brief description, its mathematical formulation, and a table comparing the results reported by the sample articles are given in the following sections. It is important to mention that the objective function values of the best solutions in the following tables are presented exactly as reported in the original studies. Additionally, Sect. 4.2 mentions relevant features of the solutions found in the literature.

4.1.1 Cantilever Beam Design Problem

The Cantilever Beam Design Problem consists of defining the beam cross-sections so that the beam’s weight is minimized. Initially, this problem considered ten decision variables consisting of the depth and width of five cross-sections [86]. The problem has since been simplified to five decision variables representing the width (\(x_{i}\)) of five square box cross-sections of constant thickness (\(t\)). A layout of the problem and a comparison of the best solutions found by state-of-the-art algorithms are presented in Fig. 8 and Table 1, respectively. The problem formulation is as follows:

Fig. 8 Cantilever beam design problem layout

Table 1 Comparison of best solutions found for Cantilever beam design problem

Consider \(\overline{x} = \left[ {x_{1} , x_{2} ,x_{3} ,x_{4} ,x_{5} } \right]\).

Minimize \(f\left( {\overline{x}} \right) = 0.6224\left( {x_{1} + x_{2} + x_{3} + x_{4} + x_{5} } \right)\).

Subject to:

$$g_{1} \left( {\overline{x}} \right) = \frac{60}{{x_{1}^{3} }} + \frac{27}{{x_{2}^{3} }} + \frac{19}{{x_{3}^{3} }} + \frac{7}{{x_{4}^{3} }} + \frac{1}{{x_{5}^{3} }} - 1 \le 0$$
(9)

where:

$$0.01 \le x_{1} , x_{2} ,x_{3} ,x_{4} ,x_{5} \le 100$$
(10)
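As an illustration of how such constrained formulations are typically handled by the metaheuristics in the sample, the following minimal Python sketch evaluates the cantilever beam objective of this subsection together with a static penalty for violating Eq. (9). The penalty coefficient and the candidate solution are illustrative values, not taken from any of the cited studies.

```python
def cantilever_objective(x):
    """Beam weight to be minimized."""
    return 0.6224 * sum(x)

def cantilever_constraint(x):
    """Constraint g1 of Eq. (9); the design is feasible when the value is <= 0."""
    coeffs = [60, 27, 19, 7, 1]
    return sum(c / xi**3 for c, xi in zip(coeffs, x)) - 1

def penalized_fitness(x, penalty=1e6):
    """Static-penalty fitness of the kind commonly paired with metaheuristics."""
    g = cantilever_constraint(x)
    return cantilever_objective(x) + penalty * max(0.0, g) ** 2

# Hypothetical candidate within the bounds of Eq. (10)
x_candidate = [6.0, 5.3, 4.5, 3.5, 2.2]
print(penalized_fitness(x_candidate))
```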

4.1.2 Welded Beam Design Problem

The Welded Beam Design Problem consists of finding the minimum-cost design of a welded rectangular beam. The problem considers different constraints associated with stresses and deflections [87]. The decision variables of the problem are the weld thickness (\(h\)) and weld length (\(l\)), and the dimensions of the beam cross section (\(t\), \(b\)). A layout of the problem and a comparison of the best solutions found by state-of-the-art algorithms are presented in Fig. 9 and Table 2, respectively. The problem formulation is as follows:

Fig. 9 Welded beam design problem layout

Table 2 Comparison of best solutions found for Welded beam design problem

Consider \(\overline{x} = \left[ {h, l, t, b} \right] = \left[ {x_{1} , x_{2} , x_{3} , x_{4} } \right]\).

Minimize \(f\left( {\overline{x}} \right) = 1.10471x_{1}^{2} x_{2} + 0.04811x_{3} x_{4} \left( {14 + x_{2} } \right)\).

Subject to:

$$\begin{gathered} g_{1} \left( {\overline{x}} \right) = \tau \left( {\overline{x}} \right) - 13,600 \le 0 \\ g_{2} \left( {\overline{x}} \right) = \sigma \left( {\overline{x}} \right) - 30,000 \le 0 \\ g_{3} \left( {\overline{x}} \right) = x_{1} - x_{4} \le 0 \\ g_{4} \left( {\overline{x}} \right) = 0.10471x_{1}^{2} + 0.04811x_{3} x_{4} \left( {14 + x_{2} } \right) - 5 \le 0 \\ g_{5} \left( {\overline{x}} \right) = \delta \left( {\overline{x}} \right) - 0.25 \le 0 \\ g_{6} \left( {\overline{x}} \right) = 6000 - p_{c} \left( {\overline{x}} \right) \le 0 \\ \end{gathered}$$
(11)

where:

$$\begin{gathered} \tau \left( {\overline{x}} \right) = \sqrt {\left( {\tau^{\prime}} \right)^{2} + 2\tau^{\prime}\tau^{\prime\prime}\frac{{x_{2} }}{2R} + \left( {\tau^{\prime\prime}} \right)^{2} } \\ \tau^{\prime} = \frac{6000}{{\sqrt 2 x_{1} x_{2} }};\quad \tau^{\prime\prime} = \frac{MR}{J} \\ M = 6000\left( {14 + \frac{{x_{2} }}{2}} \right);\quad R = \sqrt {\frac{{x_{2}^{2} }}{4} + \left( {\frac{{x_{1} + x_{3} }}{2}} \right)^{2} } \\ J = \frac{{2x_{1} x_{2} }}{\sqrt 2 }\left( {\frac{{x_{2}^{2} }}{4} + \left( {\frac{{x_{1} + x_{3} }}{2}} \right)^{2} } \right) \\ \sigma \left( {\overline{x}} \right) = \frac{504,000}{{x_{4} x_{3}^{2} }} \\ \delta \left( {\overline{x}} \right) = \frac{65,856,000}{{x_{4} x_{3}^{3} \left( {30 \times 10^{6} } \right)}} \\ p_{c} \left( {\overline{x}} \right) = \frac{{4.013\left( {30 \times 10^{6} } \right)\sqrt {\frac{{x_{3}^{2} x_{4}^{6} }}{36}} }}{196}\left( {1 - \frac{{x_{3} \sqrt {\frac{{30 \times 10^{6} }}{{4\left( {12 \times 10^{6} } \right)}}} }}{28}} \right) \\ 0.1 \le x_{1} ,x_{4} \le 2.0; 0.1 \le x_{2} ,x_{3} \le 10.0 \\ \end{gathered}$$
(12)

4.1.3 Pressure Vessel Design Problem

In the Pressure Vessel Design Problem, a cylindrical vessel capped at both ends by hemispherical heads is designed [87], seeking the geometry that minimizes the production cost. The decision variables for this problem are the shell thickness (\(T_{s}\)), the head thickness (\(T_{h}\)), the inner radius (\(R\)), and the length of the cylindrical section excluding the heads (\(L\)). A layout of the problem and a comparison of the best solutions found by state-of-the-art algorithms are presented in Fig. 10 and Table 3, respectively. The problem formulation is as follows:

Fig. 10 Pressure vessel design problem layout

Table 3 Comparison of best solutions found for pressure vessel design problem

Consider \(\overline{x} = \left[ {T_{s} , T_{h} , R, L} \right] = \left[ {x_{1} , x_{2} , x_{3} , x_{4} } \right]\).

Minimize \(f\left( {\overline{x}} \right) = 0.6224x_{1} x_{3} x_{4} + 1.7781x_{2} x_{3}^{2} + 3.1661x_{1}^{2} x_{4} + 19.84x_{1}^{2} x_{3}\).

Subject to:

$$\begin{gathered} g_{1} \left( {\overline{x}} \right) = - x_{1} + 0.0193x_{3} \le 0 \\ g_{2} \left( {\overline{x}} \right) = - x_{2} + 0.00954x_{3} \le 0 \\ g_{3} \left( {\overline{x}} \right) = - \pi x_{3}^{2} x_{4} - \frac{4}{3}\pi x_{3}^{3} + 1296000 \le 0 \\ g_{4} \left( {\overline{x}} \right) = x_{4} - 240 \le 0 \\ \end{gathered}$$
(13)

where:

$$0 \le x_{1} ,x_{2} \le 99;\quad 10 \le x_{3} ,x_{4} \le 200$$
(14)

4.1.4 Tension/Compression Spring Design Problem

The Tension/Compression Spring Design Problem consists of minimizing the weight of a spring. The constraints of the problem cover aspects such as deflection, shear stress, and surge frequency. The decision variables are the wire diameter (\(d\)), the mean coil diameter (\(D\)), and the number of active coils (\(N\)) [87]. A layout of the problem and a comparison of the best solutions found by state-of-the-art algorithms are presented in Fig. 11 and Table 4, respectively. The problem formulation is as follows:

Fig. 11 Tension/compression spring design problem layout

Table 4 Comparison of best solutions found for tension/compression spring design problem

Consider \(\overline{x} = \left[ {d, D, N} \right] = \left[ {x_{1} , x_{2} , x_{3} } \right]\).

Minimize \(f\left( {\overline{x}} \right) = x_{1}^{2} x_{2} \left( {x_{3} + 2} \right)\).

Subject to:

$$\begin{gathered} g_{1} \left( {\overline{x}} \right) = 1 - \frac{{x_{2}^{3} x_{3} }}{{71785x_{1}^{4} }} \le 0 \\ g_{2} \left( {\overline{x}} \right) = \frac{{4x_{2}^{2} - x_{1} x_{2} }}{{12566\left( {x_{2} x_{1}^{3} - x_{1}^{4} } \right)}} + \frac{1}{{5108x_{1}^{2} }} - 1 \le 0 \\ g_{3} \left( {\overline{x}} \right) = 1 - \frac{{140.45x_{1} }}{{x_{2}^{2} x_{3} }} \le 0 \\ g_{4} \left( {\overline{x}} \right) = \frac{{x_{1} + x_{2} }}{1.5} - 1 \le 0 \\ \end{gathered}$$
(15)

where:

$$0.05 \le x_{1} \le 2;\quad 0.25 \le x_{2} \le 1.3;\quad 2 \le x_{3} \le 15$$
(16)

4.1.5 Speed Reducer Design Problem

The Speed Reducer Design Problem consists of designing a gear train with minimum weight. Different stress conditions in the shafts are considered as constraints. The decision variables of the problem are the following: the face width (\(x_{1}\)), the module of the teeth (\(x_{2}\)), the number of teeth in the pinion (\(x_{3}\)), the length of the first shaft between bearings (\(x_{4}\)), the length of the second shaft between bearings (\(x_{5}\)), and the diameters of the first (\(x_{6}\)) and second (\(x_{7}\)) shafts [88]. A layout of the problem and a comparison of the best solutions found by state-of-the-art algorithms are presented in Fig. 12 and Table 5, respectively. The problem formulation is as follows:

Fig. 12 Speed reducer design problem layout

Table 5 Comparison of best solutions found for speed reducer design problem

Consider \(\overline{x} = \left[ {x_{1} , x_{2} , x_{3} , x_{4} , x_{5} , x_{6} , x_{7} } \right]\).

Minimize \(f\left( {\overline{x}} \right) = 0.7854x_{1} x_{2}^{2} \left( {3.3333x_{3}^{2} + 14.9334x_{3} - 43.0934} \right) - 1.508x_{1} \left( {x_{6}^{2} + x_{7}^{2} } \right)\)

$$\quad \quad \quad \quad \quad \quad + 7.4777\left( {x_{6}^{3} + x_{7}^{3} } \right) + 0.7854\left( {x_{4} x_{6}^{2} + x_{5} x_{7}^{2} } \right)$$

Subject to:

$$\begin{gathered} g_{1} \left( {\overline{x}} \right) = \frac{27}{{x_{1} x_{2}^{2} x_{3} }} - 1 \le 0 \\ g_{2} \left( {\overline{x}} \right) = \frac{397.5}{{x_{1} x_{2}^{2} x_{3}^{2} }} - 1 \le 0 \\ g_{3} \left( {\overline{x}} \right) = \frac{{1.93x_{4}^{3} }}{{x_{2} x_{3} x_{6}^{4} }} - 1 \le 0 \\ g_{4} \left( {\overline{x}} \right) = \frac{{1.93x_{5}^{3} }}{{x_{2} x_{3} x_{7}^{4} }} - 1 \le 0 \\ g_{5} \left( {\overline{x}} \right) = \frac{{\sqrt {\left( {\frac{{745x_{4} }}{{x_{2} x_{3} }}} \right)^{2} + 16.9 \times 10^{6} } }}{{110x_{6}^{3} }} - 1 \le 0 \\ g_{6} \left( {\overline{x}} \right) = \frac{{\sqrt {\left( {\frac{{745x_{5} }}{{x_{2} x_{3} }}} \right)^{2} + 157.5 \times 10^{6} } }}{{85x_{7}^{3} }} - 1 \le 0 \\ g_{7} \left( {\overline{x}} \right) = \frac{{x_{2} x_{3} }}{40} - 1 \le 0 \\ g_{8} \left( {\overline{x}} \right) = \frac{{5x_{2} }}{{x_{1} }} - 1 \le 0 \\ g_{9} \left( {\overline{x}} \right) = \frac{{x_{1} }}{{12x_{2} }} - 1 \le 0 \\ g_{10} \left( {\overline{x}} \right) = \frac{{1.5x_{6} + 1.9}}{{x_{4} }} - 1 \le 0 \\ g_{11} \left( {\overline{x}} \right) = \frac{{1.1x_{7} + 1.9}}{{x_{5} }} - 1 \le 0 \\ \end{gathered}$$
(17)

where:

$$\begin{gathered} 2.6 \le x_{1} \le 3.6;\quad 0.7 \le x_{2} \le 0.8;\quad 17 \le x_{3} \le 28 \hfill \\ 7.3 \le x_{4} \le 8.3;\quad 7.8 \le x_{5} \le 8.3;\quad 2.9 \le x_{6} \le 3.9 \hfill \\ 5.0 \le x_{7} \le 5.5 \hfill \\ \end{gathered}$$
(18)

4.1.6 3-Bar Truss Design Problem

The 3-Bar Truss Design Problem seeks to minimize the weight of a metallic structure subject to stress constraints [88]. The decision variables are the cross-sectional areas (\(A_{1}\), \(A_{2}\)) of the elements of the structure, grouped as shown in Fig. 13. A layout of the problem and a comparison of the best solutions found by state-of-the-art algorithms are presented in Fig. 13 and Table 6, respectively. The problem formulation is as follows:

Fig. 13 3-Bar truss design problem layout

Table 6 Comparison of best solutions found for 3-Bar Truss design problem

Consider \(\overline{x} = \left[ {A_{1} , A_{2} } \right] = \left[ {x_{1} , x_{2} } \right]\).

Minimize \(f\left( {\overline{x}} \right) = l\left( {2\sqrt 2 x_{1} + x_{2} } \right)\).

Subject to:

$$\begin{gathered} g_{1} \left( {\overline{x}} \right) = \sigma_{1} = \frac{{\sqrt 2 x_{1} + x_{2} }}{{\sqrt 2 x_{1}^{2} + 2x_{1} x_{2} }}P \le 2 \\ g_{2} \left( {\overline{x}} \right) = \sigma_{2} = \frac{1}{{x_{1} + \sqrt 2 x_{2} }}P \le 2 \\ g_{3} \left( {\overline{x}} \right) = \sigma_{3} = \frac{{x_{2} }}{{\sqrt 2 x_{1}^{2} + 2x_{1} x_{2} }}P \le 2 \\ \end{gathered}$$
(19)

where:

$$0 \le x_{1} ,x_{2} \le 1;\quad l = 100\,{\text{cm}};\quad P = 2\,{\text{kN/cm}}^{2}$$
(20)

4.2 Observations on Engineering Problems

A noteworthy point about the engineering problems presented in Sect. 4.1 is that none of the studies solves them in a properly engineering way. In these problems, the decision variables take continuous values, which is inconsistent with the discrete and standardized values required by professional practice [89,90,91,92]. For this reason, the solutions provided to the problems are quite absurd. For example, has anyone seen a commercially available 0.408084 \({\text{cm}}^{2}\) bar like the one proposed for the 3-Bar Truss Design Problem? Or a 0.38464959 cm thick plate like the one recommended for the Pressure Vessel Design Problem? Or a welder capable of producing a weld with a thickness of 8.29147193 mm, as indicated in the Welded Beam Design Problem? Or can anyone explain what is meant by a spring with 11.285441 coils? These kinds of results are explained by the fact that the proposed metaheuristics are conceived as algorithms for continuous problems. While it is true that modifying the formulation of an algorithm to solve discrete problems is a complicated task, this does not justify claiming high capability to solve “real-world” optimization problems when this is not authentically demonstrated. This practice of “solving” engineering problems that may bear no relation whatsoever to professional practice is just another example of the lack of rigor shown by this kind of study.

Another finding was that some studies report results that differ from those obtained when the objective function is evaluated. The results reported for the Cantilever Beam Design Problem (see Table 1) are a clear example: two algorithms obtained structural weights of around 13 units, while the rest reported 1.3 units. When the objective function was evaluated, it was found that the correct solutions have an order of magnitude of 13 weight units. Other less obvious discrepancies were found for the remaining problems. For the Welded Beam Design Problem, two solutions were found whose costs differed considerably from the other solutions: Zhong, Li and Meng [52] and Daliri et al. [93] reported solutions with costs of 1.37 and 1.17, respectively, contrasting with the mean cost of 1.72 reported by other studies. Unfortunately, both studies omitted the values of their decision variables, so these results could not be verified. For the Pressure Vessel Design Problem, discrepancies were found as follows. Braik et al. [72] report a solution with a value of 5885 units; however, when the objective function is evaluated, the actual result is 302,448 units. Su et al. [94] report a solution cost of 6060 units; nevertheless, when the objective function is evaluated, the actual cost is 10,541 units. Tang and Zhou [95] and Qaraad et al. [96] presented solutions with remarkably low costs of 2310 and 4543 units, respectively. In the first case, it was found that the solution does not meet the constraints \(g_{1} \left( {\overline{x}} \right)\) and \(g_{2} \left( {\overline{x}} \right)\), while the solution presented by Qaraad et al. [96] could not be verified since the values of the decision variables are omitted. Similarly, W. Zhou et al. [97] reported a solution whose decision variables do not satisfy constraint \(g_{2} \left( {\overline{x}} \right)\) of the problem. For the Tension/Compression Spring Design Problem, Hu, Du, and Wang [98] reported a solution with a weight equal to 0.0127; when the objective function was evaluated, its actual weight was 1.89. For the Speed Reducer Design Problem, Daliri et al. [93] reported a solution with a weight of 1400 units, about half the weight of the other reported solutions. This result could not be verified because the study did not provide the values found for the decision variables. For the same problem, X. Zhou et al. [99] report a solution with a cost of 2891; when the objective function is evaluated, the real cost is 3120. Finally, for the 3-Bar Truss Design Problem, Tang and Zhou [95] reported a solution weighing 186, but the objective function evaluation gives a weight of 251. These discrepancies raise doubts about whether the same problems are actually being solved in the aforementioned studies, as well as about the adequacy of the review process carried out prior to publication.
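Discrepancies of this kind can be detected with a few lines of code. The sketch below re-evaluates the pressure vessel objective and the constraints of Eqs. (13)–(14) for a reported design vector; the numerical values used here are hypothetical placeholders, since several of the studies in question omit their decision variables.

```python
import math

def pressure_vessel_cost(x):
    """Objective function of the pressure vessel design problem."""
    x1, x2, x3, x4 = x
    return (0.6224 * x1 * x3 * x4 + 1.7781 * x2 * x3**2
            + 3.1661 * x1**2 * x4 + 19.84 * x1**2 * x3)

def pressure_vessel_constraints(x):
    """Constraints of Eq. (13); every value must be <= 0 for a feasible design."""
    x1, x2, x3, x4 = x
    return [
        -x1 + 0.0193 * x3,
        -x2 + 0.00954 * x3,
        -math.pi * x3**2 * x4 - (4.0 / 3.0) * math.pi * x3**3 + 1296000,
        x4 - 240,
    ]

# Hypothetical design vector [Ts, Th, R, L]; replace with published values to verify a study
x_reported = [1.125, 0.625, 55.0, 70.0]
print(pressure_vessel_cost(x_reported))
print(all(g <= 0 for g in pressure_vessel_constraints(x_reported)))
```

Comparing the printed cost with the reported one, and checking that every constraint value is non-positive, is all that is needed to flag the inconsistencies listed above.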

A widely used argument to justify the creation of new metaheuristics is the need for more powerful algorithms to solve optimization problems. Since the engineering problems considered here have been solved for at least two decades, a performance comparison between the results reported by older papers and the solutions reported in the sample is possible. To make this comparison, the mean and standard deviation (SD) of the solutions reported by the so-called “state-of-the-art” algorithms are contrasted with those reported in papers published more than 20 years ago. The results are shown in Table 7. It is important to note that values that presented discrepancies with the objective function evaluation, and those whose veracity could not be verified, were not considered in the calculation. Also, the comparison for the Cantilever Beam Design Problem is omitted because its formulation has changed over the years. The following nomenclature is used: Welded Beam Design Problem (WB), Pressure Vessel Design Problem (PV), Tension/Compression Spring Design Problem (TCS), Speed Reducer Design Problem (SR), and 3-Bar Truss Design Problem (3B). As can be seen, the results obtained more than 20 years ago for the engineering problems considered are very similar to those reported by state-of-the-art algorithms. The Tension/Compression Spring Design Problem presented the highest performance increase, with solutions 3.2% more economical than the one reported by Coello and Montes [100] in 2002. At the other extreme is the Welded Beam Design Problem, where the solution found by Coello [87] in 2000 saves 0.5% more material than the average of the recently proposed algorithms. Regarding standard deviations, the most significant difference is observed in the Pressure Vessel Design Problem, where there is a difference of 1.4 SD between the sample mean and the solution found by Coello and Montes [100]. For the rest of the problems, there is only a marginal difference (\(\le\) 0.4 SD) between the mean and the solutions reported by the older articles.

Table 7 Comparison between older studies and the mean of the sample
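A minimal sketch of the comparison metric behind Table 7: the gap between an older reference solution and the sample mean, expressed both as a percentage and in standard deviations. The numbers below are illustrative placeholders, not the actual values of Table 7.

```python
def solution_gap(sample_mean, sample_sd, reference_value):
    """Relative gap (%) and SD-normalized gap between the mean of the recent
    algorithms and an older reference solution (minimization assumed)."""
    relative = 100.0 * (sample_mean - reference_value) / reference_value
    normalized = (sample_mean - reference_value) / sample_sd
    return relative, normalized

# Illustrative placeholder values only
print(solution_gap(sample_mean=6000.0, sample_sd=50.0, reference_value=6050.0))
```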

These small performance differences reflect a stagnation in algorithm development and call into question the continuous improvement reported by today’s metaheuristic designers. As mentioned earlier, some recently proposed algorithms are notable for repeating concepts and strategies employed by previously presented frameworks. This recycling of concepts may be the reason why older algorithms show performance similar to that of recently proposed ones. To exemplify how recent algorithms present already known components under different names, the following section reviews the Black Widow Optimization and Coral Reef Optimization algorithms.

5 Black Widow and Coral Reef Optimization

As mentioned in Sect. 2, many algorithms present formulations identical to others previously proposed. Unfortunately, these similarities can go unnoticed for years because they are hidden behind metaphorical language: a solution may be called an empire, a remora, a whale, a black hole, and so on, and something similar happens with the components used by the algorithms. To understand the true innovation of these algorithms, Sörensen [19] points out that it is necessary to go through a deconstruction process rather than relying solely on the algorithm’s name or the analogy on which it is based. In the deconstruction process [101,102,103,104], the contributions (benefits and disadvantages) of each algorithm component are analyzed to distinguish between the truly innovative components and those shared with other algorithms.

In this section, the deconstruction of two algorithms that have recently received modified versions is performed, namely Black Widow Optimization [105] and Coral Reef Optimization [106]. Both algorithms were born from an analogy with the life cycle of the species they are named after. The analysis will show that they not only bear a strong resemblance to Genetic Algorithms [107] but also that some of their components make them more complex without making them more efficient.

5.1 Black Widow Optimization

Black Widow Optimization is a population-based algorithm proposed by Hayyolalam and Pourhaji [105], inspired by the life cycle of black widow spiders. By December 15, 2022, Google Scholar indicated that the study presenting this algorithm had obtained 329 citations. Additionally, four improved versions of the algorithm were presented in 2022 [98, 108,109,110]. The formulation of the algorithm, in the words of its creators, is as follows. The algorithm starts by creating an initial population of \(N_{pop}\) spiders. From this population, the parents of the next generation are selected randomly. Because cannibalism exists among black widows, the algorithm considers three possibilities: (1) the female spider eats the male during mating, (2) the offspring eat each other, and (3) the offspring eat the mother. In all three cases, the value of the objective function is used to define which individuals are cannibalized. Additionally, a mutation process is applied after mating. At this point, anyone with knowledge of Genetic Algorithms can identify the similarities to Black Widow Optimization; to make them more evident, the components of the algorithm are stripped of metaphorical vocabulary. The comparison between Genetic Algorithms and Black Widow Optimization is shown in Table 8.

Table 8 Comparison between black widow optimization and genetic algorithms

As seen from Table 8, the original Black Widow Optimization formulation has the same components as Genetic Algorithms but adds a cannibalism operator, which is redundant. Arguably, Black Widow Optimization’s contribution to the optimization field is that it introduces the destruction of low-quality solutions. However, this additional step is required only because the algorithm’s selection operator is random, i.e., there is no quality-based criterion to define which solutions pass on their characteristics to the next generation. For this reason, many next-generation solutions are of low quality and must be destroyed somehow; if they were not, their characteristics would negatively affect the quality of the following populations. This is an example of how blindly recombining components can give rise to more complex but not more efficient algorithms. Needless to say, since the original version of Black Widow Optimization is a deficient version of Genetic Algorithms, its recently published improved versions are, by extension, deficient as well.
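To make the deconstruction in Table 8 explicit, the following Python sketch expresses one generation of a de-metaphorized Black Widow Optimization: random parent selection, crossover, “cannibalism” as truncation of the worst offspring, and mutation. The operator details (arithmetic crossover, Gaussian mutation, the cannibalism and mutation rates) are simplifying assumptions and not the exact operators of the original paper.

```python
import numpy as np

def bwo_generation(pop, fitness, cannibalism_rate=0.5, mutation_rate=0.1):
    """One generation of Black Widow Optimization stripped of metaphor:
    essentially a Genetic Algorithm with random selection plus a culling step."""
    n, dim = pop.shape

    # Random parent selection (no fitness-based criterion, as in the original)
    parents = pop[np.random.randint(0, n, size=(n, 2))]

    # Crossover (arithmetic crossover used here as a simplifying assumption)
    alpha = np.random.rand(n, 1)
    offspring = alpha * parents[:, 0, :] + (1 - alpha) * parents[:, 1, :]

    # "Cannibalism": discard the worst offspring according to the objective (minimization)
    scores = np.array([fitness(ind) for ind in offspring])
    keep = np.argsort(scores)[: int(n * (1 - cannibalism_rate))]
    survivors = offspring[keep]

    # Mutation: Gaussian perturbation applied to a random fraction of the genes
    mask = np.random.rand(*survivors.shape) < mutation_rate
    return survivors + mask * np.random.normal(0.0, 0.1, size=survivors.shape)

# Illustrative usage on the sphere function with a hypothetical population
population = np.random.uniform(-5, 5, size=(20, 3))
next_population = bwo_generation(population, fitness=lambda x: float(np.sum(x**2)))
print(next_population.shape)
```

Removing the culling step and selecting parents by fitness instead of at random turns this loop into a textbook Genetic Algorithm, which is the point made in Table 8.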

5.2 Coral Reef Optimization

Coral Reef Optimization is a population-based algorithm proposed by Salcedo et al. [106]. In the authors' words, the algorithm “simulates a coral reef, where different corals (solutions to the optimization problem considered) grow and reproduce in coral colonies, fighting by choking out other corals for space in the reef” [106]. By December 15, 2022, the original study presenting Coral Reef Optimization had 195 citations registered in Google Scholar. As with Black Widow Optimization, two modified versions of the algorithm were published in 2022 alone [111, 112]. The formulation of the algorithm is as follows. The first step is to create a grid of size \(N \times M\) that serves to model an artificial reef. Each position of the grid is a space where a coral can be housed. The initial population of the algorithm is randomly distributed over the artificial reef, leaving some spaces empty. The study points out that there are three ways in which corals reproduce: (1) Broadcast Spawning, (2) Brooding, and (3) Budding or Fragmentation. The first type occurs when corals release gametes into the water together; under this form, new corals are created once two reproductive cells meet. The second form of reproduction is like the first one, with the difference that the encounter between gametes occurs inside a coral; the new coral is released after having partially developed inside its parent. The last form of reproduction occurs when a new coral is born from the separation of a single coral. These three forms of reproduction are considered in the algorithm. Once a new coral has been created, it looks for a place to settle on the artificial reef. If it finds an empty space, it occupies it automatically. If another coral occupies the space, the objective function values are compared to determine whether the new coral is rejected or replaces the old one. Each new coral has \(k\) chances to find a position on the reef. At the end of each iteration, the lowest-quality corals are eliminated with probability \(f_{d}\); this step is justified as the natural predation process on the coral reef. Clearly, this algorithm presents a more complex formulation than Black Widow Optimization; however, removing the metaphorical language helps to study each of its components. Table 9 presents a comparison between Coral Reef Optimization and Genetic Algorithms.

Table 9 Comparison between coral reef optimization and genetic algorithms

As seen from Table 9, most of the components of Coral Reef Optimization have an equivalent in Genetic Algorithms. The core component of the algorithm, the artificial reef, is nothing more than an alternative way of storing the solutions in a two-dimensional space; in Genetic Algorithms, this is done with a one-dimensional vector. Additionally, because the solutions “must fight for space in the reef” [106] in step 6, it is necessary to introduce the predation operator in step 7. It is worth noting that steps 6 and 7 more closely resemble Evolution Strategies (ES). In the (\(\mu + 1\))-ES, the algorithm maintains a population of size \(\mu\) and creates a single solution per iteration. Once the solution is created, it is compared to the lowest-quality solution in the current population. If the new solution has a higher quality than the worst solution, the former replaces the latter. In the case of Coral Reef Optimization, this process is performed probabilistically in step 6, i.e., the newly created solution is not directly compared with the worst of the current population. Clearly, the strategy employed by Coral Reef Optimization is not efficient: since there is a low probability that the worst solution will be selected for comparison with the new one, it becomes necessary to add a complementary strategy that increases the chances of discarding the worst solutions and thus prevents their characteristics from being transferred to the next generation. This is done in step 7 by the predation operator. As in the case of Black Widow Optimization, Coral Reef Optimization introduces additional steps that aim to correct the shortcomings of a formulation in which different evolutionary operators are simply mixed carelessly.
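The difference between the probabilistic settlement of Coral Reef Optimization and the direct worst-replacement of the (\(\mu + 1\))-ES can be sketched as follows. The grid, the attempt budget \(k\), and the probability \(f_{d}\) follow the description above, but the identifiers and the fraction of occupants exposed to predation are illustrative assumptions rather than the original paper's notation.

```python
import random

def cro_settlement(reef, larva, fitness, k, f_d):
    """Probabilistic settlement followed by predation, metaphor removed.

    Illustrative sketch: `reef` maps grid positions (i, j) to a solution or
    None (an empty cell); `fitness` returns a value to be minimized.
    """
    positions = list(reef.keys())

    # A new solution gets k attempts at random cells; it is never steered
    # toward the worst occupant of the reef.
    for _ in range(k):
        pos = random.choice(positions)
        occupant = reef[pos]
        if occupant is None or fitness(larva) < fitness(occupant):
            reef[pos] = larva
            break  # settled; if all k attempts fail, the larva is discarded

    # "Predation": a corrective second pass that removes some of the worst
    # occupants with probability f_d (the exposed fraction here is arbitrary).
    occupied = [p for p in positions if reef[p] is not None]
    occupied.sort(key=lambda p: fitness(reef[p]), reverse=True)  # worst first
    for pos in occupied[:max(1, len(occupied) // 10)]:
        if random.random() < f_d:
            reef[pos] = None


def es_mu_plus_one_replacement(population, child, fitness):
    """(mu + 1)-ES survivor selection: one direct comparison with the worst
    member of the population replaces both steps above."""
    worst = max(range(len(population)), key=lambda i: fitness(population[i]))
    if fitness(child) < fitness(population[worst]):
        population[worst] = child
```

The single comparison performed by `es_mu_plus_one_replacement` achieves what Coral Reef Optimization spreads over the \(k\) settlement attempts plus the predation pass.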

6 Discussion

Multiple flaws in studies proposing new metaheuristics were pointed out throughout the paper. These flaws include a lack of knowledge of the NFL theorem and its implications, the recycling of well-known ideas and strategies presented as new, the unrealistic “real-world” problems they solve, the low standards required during the reviewing process, and the limited increase in performance that the algorithms have shown in the last 20 years. Based on the results obtained, the authors agree with the position of Aranha et al. [1], who consider that the dozens of new algorithms proposed each year are symptoms of a lack of scientific rigor rather than an authentic advance in the field. As demonstrated in the cases of Black Widow Optimization and Coral Reef Optimization, algorithms that do not present any innovation can pass as new frameworks for years. If we add to these studies their “improved” versions, what we have is a problem that multiplies itself yearly. Of course, the best way to address this problem is not to analyze each study after publication but to identify the lack of scientific rigor during the review process, or even before papers are sent to reviewers. It is therefore evident that, to solve this problem, it is necessary to raise awareness of the subject and increase the standards required for this type of study.

It is important to remember that research is a human activity and, therefore, is not indifferent to the motivations and interests of those who carry it out. A well-known example of the link between scientific endeavor and the researcher’s interests is the “publish or perish” philosophy. Under this model, researchers are under constant pressure to demonstrate high performance metrics, either by the number of publications or by the number of citations received. As several studies point out [113,114,115], achieving high performance metrics brings rewards such as greater prestige among peers but also economic benefits in the form of salary improvements or research funds. Because of this relationship between performance metrics and economic stability, researchers are willing to publish many scientifically weak papers despite multiple calls to stop this practice. In the case of new metaheuristic development, algorithm designers engage in questionable practices that project an image of innovation onto the submitted work but damage the research field. Undoubtedly, the most harmful practice is the one in which known algorithms are reinvented under new names and their components are hidden behind metaphorical language. This practice not only generates fictitious areas of study [57,58,59,60] but also opens the door to “proposing” improvements that have already been made.

The results obtained in this study reveal that the field must establish an evaluation framework capable of discerning between speculative algorithms and those that genuinely provide new concepts and optimization strategies. Multiple authors have approached this topic from different perspectives. Hooker [116] provides an interesting discussion revealing the drawbacks of comparing algorithms based on their execution time or their performance on benchmark problems. Corstjens et al. [117] propose a methodology to evaluate the influence of the different algorithm components on its performance. Likewise, Campelo and Wanner [118] present a statistical method to determine the experimental sample size when comparing different algorithms. Tzanetos and Dounias [15] propose creating databases of real optimization problems against which proposed algorithms can be compared. Finally, Franzin et al. [119] propose a causal framework to explain the performance and results obtained by a search algorithm. Clearly, it is possible to develop a suitable evaluation framework based, in whole or in part, on the above works. However, the authors believe that an adequate evaluation framework should rely more on identifying the contributions of an algorithm’s components than on the performance achieved on benchmark problems. This is because there are several cases where metaphorical language is used to hide an already-known formulation and present it as a totally new algorithm. In such cases, the “new” algorithm would perform identically to the algorithm it emulates, making a performance-based evaluation framework unenlightening. Additionally, it is well known that different configurations of an algorithm's hyperparameters can produce different performances [120]. This, in turn, can be exploited to construct experiments in which the compared algorithms perform poorly, thus creating a favorable picture for the proposed new algorithm. All these problems can be avoided if the discussion focuses more on the theoretical contributions of the new algorithms than on a competition between their performances.
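As a small illustration of this last point, the sketch below runs the same elementary (1+1) hill climber on the sphere function with two different mutation step sizes. The algorithm, problem, budget, and parameter values are arbitrary choices made only to show that reported performance depends strongly on configuration; they do not reproduce any experiment from the studies cited above.

```python
import random

def sphere(x):
    """Sphere benchmark: global minimum 0 at the origin."""
    return sum(xi * xi for xi in x)

def one_plus_one(step_size, dim=10, budget=5000, seed=0):
    """A plain (1+1) hill climber with Gaussian mutation of fixed step size."""
    rng = random.Random(seed)
    best = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
    best_f = sphere(best)
    for _ in range(budget):
        candidate = [xi + rng.gauss(0.0, step_size) for xi in best]
        candidate_f = sphere(candidate)
        if candidate_f < best_f:
            best, best_f = candidate, candidate_f
    return best_f

# Same algorithm, same budget, same problem: only the step size changes,
# yet the final objective values differ by orders of magnitude.
for sigma in (0.01, 2.0):
    print(f"step size {sigma}: best objective {one_plus_one(sigma):.4g}")
```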

7 Conclusions

In this paper, an analysis was conducted of the characteristics of a sample of 111 recent papers in which metaheuristics described as “new”, “novel”, “advanced”, “improved”, or similar were proposed. Different aspects were studied, such as the number of citations received, the origin of the proposed optimization algorithms, and the problems they solve. Valuable conclusions emerged from the analysis; they are listed below.

  • There is currently a trend to develop improved versions of established algorithms. Unfortunately, these studies present a deficient level of innovation because they only recombine well-known optimization components.

  • The names used by algorithm designers to label their creations do not serve to catalog their features clearly and concisely. Moreover, they only generate a fictitious research field in which the proposed algorithm is compared with others bearing the same name, instead of being contrasted with other frameworks with scientific rigor and objectivity.

  • It is common for metaheuristic designers to have limited knowledge of the NFL theorem, which in turn causes them to erroneously use it as an argument in favor of proposing ever more powerful algorithms.

  • Although Free Lunch theorems do hold in some cases, these cases are simply ignored by metaheuristic designers and represent an unexploited field of research.

  • The constant claims of algorithms with outstanding performance seem to be more a literary resource to get a new algorithm published than a fact. This is especially evident in the high number of “improved” versions that seek to fix the flaws exhibited by algorithms that were once presented as superior.

  • Despite the claim that the newly proposed metaheuristics can solve “real-world” problems, the truth is that the engineering problems they solve are far from applicable in real practical situations.

  • Studies that propose metaheuristics, whether new or improved, may have been subjected to poor review processes.

  • From the review of their components, it was shown that some algorithms, like Black Widow Optimization and Coral Reef Optimization, are not really innovative but rather inefficient mixtures of evolutionary operators. By extension, this may also apply to their recently released improved versions.

  • It is necessary for both editors and reviewers to raise the standards required for this kind of study so that only those that genuinely present innovative ideas enriching the research field are published.