1 Introduction

As the world moves towards competition in all fields, people need to make the best use of limited resources to score better results and thus achieve a better place in the competition. In this context, optimization is strongly needed. Optimization is the process of picking the optimal values of a problem's parameters from a given set of values to achieve the desired output, typically by minimizing or maximizing an objective. In other words, we seek the best solution under a set of limitations and constraints by tuning the parameters of the problem being addressed. As mentioned in [1], the optimization process includes a set of steps: formulating the problem as an optimization problem, constructing the objective (cost or fitness) function, determining the decision variables and the restrictions on them, and then simplifying reality by generating a mathematical model that represents the problem. Finally, the problem solver seeks the most acceptable solution by maximizing or minimizing the value of the objective function.

Stochastic optimization algorithms are the most promising type under the umbrella of optimization and can be classified into heuristic algorithms (HAs) and metaheuristic algorithms (MAs). In simple words, stochastic optimization is the general class of algorithms that rely on randomness in the process of finding an optimal or near-optimal solution. HAs are iterative algorithms that repeatedly seek a solution better than the one obtained previously. They are used to find a feasible and reasonable solution, but it may not be the optimal one, and HAs provide no evidence of the optimality of the solution obtained. HAs suffer from several issues, such as being problem-dependent algorithms specifically designed for a particular problem [2]. Another challenge is immeasurable success, as there is no information about how close the obtained solution is to the optimal one. Finally, there is a dilemma in measuring the computational time. Closing these gaps and achieving a trade-off between solution quality and computational time is a main purpose behind the appearance of metaheuristics [3]. As they can be used to solve different types of problems, metaheuristics are the preferred type of these algorithms. Metaheuristics were introduced for the first time by Glover in [4].

1.1 Understanding Optimization: What and Why?

In this section, we will help researchers understand the fascinating world of optimization. First, we will present examples of what optimization is, then we will try to answer the question, "Why do we study optimization?" Finally, this section will clarify how to optimize anything you have.

What is Optimization? In simple words, optimization is the art of perfection: how to make something in the best possible way. Optimization answers the question of how to obtain the best solution for a problem under a set of limitations. Maximizing profit, minimizing mass, minimizing pollution, reducing noise, and reducing drag are all practical goals that can be achieved by using optimization. In most cases, optimization helps in the design process as a replacement for the traditional approach, which depends mainly on trial and error or human experience. To clarify the simplicity and practical power of optimization in the design process, a diagrammatic view of how optimization methods help in the design process is presented in Fig. 1.

Fig. 1
figure 1

Optimization methods as a replacement for the traditional method in the design process

Why Optimization? People may ask why we study optimization. In most cases, we do not have the opportunity to physically perform trials; instead, we use optimization to simulate a solution to a specific problem and see what the result of the trial would be. Hence, we can decide whether the trial is applicable. People may also benefit from applying optimization in industry to achieve a better competitive position under limited resources.

1.2 Paper Structure

The rest of this paper is structured as follows: various metaheuristic taxonomies and the development process are illustrated in Sects. 2 and 3. Taxonomies of optimization problems based on many criteria and their performance assessment are introduced in Sects. 4 and 5. The applications of metaheuristic algorithms (MAs) in different fields are presented in Sect. 6. The open issues and challenges of MAs and the observations from the experiment are introduced in Sect. 7. Finally, the research review is concluded in Sect. 8. The outline of the article is illustrated in Fig. 2.

Fig. 2
figure 2

The outline of the article

2 Metaheuristics Optimization Algorithms Taxonomies

Due to the rapid growth of the optimization field, many metaheuristic algorithms (MAs) have been proposed recently. These algorithms can be classified according to four main taxonomies: the source of inspiration, the number of search agents, the mechanism followed to update solutions during the optimization process, and the number of parameters included in the algorithm. In this section, these new algorithms are classified accordingly.

2.1 Taxonomy According to the Inspiration Source

This is the most familiar and oldest classification of metaheuristic algorithms (MAs) and is well suited to the largest subcategory of MAs, the nature-inspired metaheuristic algorithms. Different studies classify MAs differently according to the source of inspiration, as illustrated in Table 1. In this study, Fig. 3 shows a more comprehensive taxonomy for MAs.

Table 1 Different trials of classifying the MAs according to inspiration source
Fig. 3
figure 3

The proposed classification of MAs based on the source of inspiration

The subcategories of the sources of inspiration for MAs shown in Fig. 3 are detailed below.

Swarm Intelligence (SI) is a self-organized system of collaborative behavior. SI has a set of characteristics, such as good communication skills between individuals, the ability to share information among individuals, and the ability to learn from doing (adaptable beings). On the other hand, individual organisms cannot defend themselves against predators; they need to be in a swarm to search for food or attack prey. Mimicking the behavior of beings that live in flocks or herds and hunt for prey or search for food is the main inspiration for SI algorithms [8]. One of the most famous algorithms in this category is Particle Swarm Optimization (PSO) [9], which mimics the intelligent behavior of a flock of birds. Monkey Search Optimization (MSO) [10] is another example of an SI algorithm; it simulates the tree-climbing process during food discovery. The hunting strategy and hierarchy-based leadership of grey wolves are the inspiration for the Grey Wolf Optimizer (GWO) [11]. Ant Colony Optimization (ACO) [12], Cuckoo Search (CS) [13], the Ant Lion Optimizer (ALO) [14], and the Honey Badger Algorithm (HBA) [15] are other well-known instances of SI algorithms.
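To make the SI idea concrete, the canonical PSO velocity and position update can be sketched as follows. This is a minimal illustration, not the implementation from any cited study: the sphere objective, the coefficient values (w, c1, c2), and all function names are illustrative assumptions.

```python
import random

def sphere(x):
    return sum(v * v for v in x)

def pso(dim=2, n_particles=10, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's best position
    gbest = min(pbest, key=sphere)[:]           # swarm's shared best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if sphere(pos[i]) < sphere(pbest[i]):
                pbest[i] = pos[i][:]
                if sphere(pbest[i]) < sphere(gbest):
                    gbest = pbest[i][:]
    return gbest

best = pso()
```

The shared `gbest` term is exactly the information-sharing trait of SI described above: every particle's move is partly steered by the best solution found by the whole flock.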

Evolutionary Algorithms (EA) simulate the behavior of evolution, including recombination, mutation, crossover, and selection. EA begins by generating a random population; this population is then evaluated to choose the most fit individuals to contribute to the next generation. After several iterations, the population evolves to find the optimal solution. Genetic Algorithm (GA) [16] is the oldest algorithm in this class, mimicking Charles Darwin's theory of natural evolution. Other well-known EA methods include Differential Evolution (DE) [17], Genetic Programming (GP) [18], Coronavirus Disease Optimization Algorithm (COVIDOA) [19], Liver Cancer Algorithm [20], and Red Deer Algorithm (RDA) [21].
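The selection, crossover, and mutation loop described above can be sketched with a minimal GA on the classic OneMax problem (maximize the number of 1-bits). The operator choices (tournament selection, one-point crossover) and all rates are illustrative assumptions, not the configuration of any cited algorithm.

```python
import random

def onemax(bits):
    return sum(bits)

def ga(length=20, pop_size=30, gens=60, cx_rate=0.9, mut_rate=0.02, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(gens):
        new_pop = []
        while len(new_pop) < pop_size:
            # tournament selection: the fitter of 3 random individuals wins
            p1 = max(rng.sample(pop, 3), key=onemax)
            p2 = max(rng.sample(pop, 3), key=onemax)
            # one-point crossover combines the two parents
            if rng.random() < cx_rate:
                cut = rng.randrange(1, length)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # bit-flip mutation perturbs the child
            child = [b ^ 1 if rng.random() < mut_rate else b for b in child]
            new_pop.append(child)
        pop = new_pop                           # next generation replaces the old
    return max(pop, key=onemax)

best = ga()
```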

Human-based Algorithms (HA) take human behavior and interactions as their main inspiration. Mimicking the learning process between teachers and students led to the introduction of Teaching Learning-Based Optimization (TLBO) [22]. Tabu Search (TS) [23] enhances the search process through long- and short-term memory. Other well-known HA algorithms include the Group Leader Optimization Algorithm (GLO) [24], Stock Exchange Trading Optimization (SETO) [25], and the Social Emotional Optimization Algorithm (SEOA) [26].

Physics-based Algorithms (PhA) are inspired by physical laws or simulate physical phenomena such as gravitation, the Big Bang, black holes, galaxies, and fields. In other words, physical rules are used in the process of generating new solutions. The most popular instances of this class are the Gravitational Search Algorithm (GSA) [27], Big Bang-Big Crunch (BBBC) [28], Heat Transfer Search [29], Henry Gas Solubility Optimization (HGSO) [30], the Archimedes Optimization Algorithm [31], and the Light Spectrum Optimizer (LSO) [32].

Chemistry-based Algorithms (ChAs) concentrate on the principles of chemical reactions, such as molecular reactions, Brownian motion, and molecular radiation. Chemical Reaction Optimization (CRO) [33], Gases Brownian Motion Optimization (GBMO) [34], Artificial Chemical Process (ACP) [35], the Ions Motion Optimization Algorithm (IMOA) [36], and Thermal Exchange Optimization (TEO) [37] are common instances of the ChA category.

Math-based Algorithms (MathA) are inspired by mathematical theorems, concepts, and rules. Gaussian Swarm Optimization (GSO) [38], the Sine Cosine Algorithm (SCA) [39], Lévy Flight Distribution [40], the Exponential Distribution Optimizer (EDO) [41], and the Golden Sine Algorithm (GSA) [42] are common instances of the MathA category.

Plant-based Algorithms (PlA) rely on simulating the intelligent behavior of plants. Specifically, a set of concepts from plant nature is used to inspire new metaheuristic optimization algorithms, such as the flower pollination process, the phenomenon of colonization by invasive weeds in nature, and the ecology and biology of weeds. The Flower Pollination Algorithm (FPA) [43], Invasive Weed Optimization (IWO) [44], the Paddy Field Algorithm (PFA) [45], the Artificial Plant Optimization Algorithm (APOA) [46], Plant Growth Optimization (PGO) [47], the Root Growth Algorithm (RGA) [48], and Rooted Tree Optimization (RTO) [49] are common instances of the PlA category.

Sports and Game-based Algorithms (SpGA) draw on the information and rules applied in sports and gaming; a set of optimization algorithms has been inspired by team game strategies used in football, basketball, and volleyball, as well as the Ludo game. Ludo Game-Based Swarm Intelligence (LGSI) [50], the Team Game Algorithm (TGA) [51], the Football Game Algorithm (FGA) [52], World Cup Optimization (WCO) [53], the Soccer League Competition (SLC) algorithm [54], and the League Championship Algorithm (LCA) [55] are common instances of SpGA algorithms.

Miscellaneous The remaining metaheuristic optimization algorithms can be grouped into the miscellaneous class; the term refers to their miscellaneous sources of ideas, such as politics, artificial thoughts, the atmosphere, trade, and other topics. Processes occurring in clouds, such as cloud movement, spread, and creation, are the basic idea behind the Atmosphere Cloud Model Optimization Algorithm (ACMO) [56], while the exchange of information in the stock market is the basic motivation behind the Exchange Market Algorithm (EMA) [57]. The Grenade Explosion Method (GEM) [58], Passing Vehicle Search (PVS) [59], Small World Optimization (SWO) [60], the Yin-Yang Pair Optimization (YYPO) algorithm [61], the Political Optimizer (PO) [62], and the Great Deluge Algorithm (GDA) [63] are other examples of this category.

2.2 Taxonomy According to the Number of Search Agents

The classification according to the source of inspiration is the most familiar and is usually introduced in studies to summarize the concept of classification. However, it is not enough on its own, as it provides no information about the internal mathematical structure or programming ideas of the algorithms. Hence, a new angle of classification is used: metaheuristics can be categorized, based on the number of search agents seeking the optimum, into two groups, single-solution-based MAs (SMAs) and population-based MAs (PMAs). The following two paragraphs provide more information about each group, and Fig. 4 clarifies this taxonomy.

Fig. 4
figure 4

The classification of MAs based on the number of search agents

Single-solution-based MAs (SMAs) are also called trajectory-based algorithms (TAs), as the algorithms in this class follow a single trajectory in their work; in each iteration, the search is directed along a single path. The optimization procedure of SMAs starts with a single solution (one search agent), and in subsequent iterations this solution is refined with the aim of reaching the optimum. We can say that the algorithm traces a single path to the optimal solution over the course of the iterations. Simulated Annealing (SA) [64] is one of the most familiar SMAs: a single search agent moves through the design (search) space of the problem being tackled. Over the course of the iterations, better moves are always accepted, while worse moves are accepted only with a probability that decreases as the search cools, which helps the agent escape local optima. Applying these rules generates a path through the search space with a good probability of reaching a globally optimal solution. Hill Climbing (HC), reviewed in [65], Tabu Search (TS) [23], the Great Deluge Algorithm (GDA) [63], Iterated Local Search (ILS) [66], and Greedy Randomized Adaptive Search Procedures (GRASP) [67] are some instances of this class.
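The single-trajectory behavior of SA can be sketched in a few lines. This is a minimal illustration under assumed choices: the objective f(x) = x², the geometric cooling schedule, and the step size are all hypothetical, not taken from [64].

```python
import math
import random

def anneal(f, x0=8.0, temp=10.0, cooling=0.95, steps=500, seed=2):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = temp
    for _ in range(steps):
        cand = x + rng.uniform(-1, 1)   # one agent, one random move per step
        fc = f(cand)
        # better moves are always accepted; worse moves only with
        # probability exp(-delta / t), which shrinks as t cools
        if fc < fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling
    return best

best = anneal(lambda x: x * x)
```

Note how a single variable `x` carries the whole search: the algorithm draws one trajectory through the space, exactly as the SMA description above states.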

Population-based MAs (PMAs), in contrast, take advantage of information sharing among agents, collaborative work, and data remembering. More than one agent is generally superior to a single agent in reaching the optimal solution: a large number of search agents work together to extensively explore the search space, so PMAs can also be called exploration-based algorithms. The optimization procedure starts by employing a population of search agents positioned at many distinct locations in the search space; over the course of the iterations, the population exploits shared information to better reach the global best solution. In simple words, a set of paths is traced through the search space to search it extensively and obtain the best solution achieved by all search agents. One of the oldest and most widely used PMAs is the Genetic Algorithm (GA); Chemical Reaction Optimization (CRO), Particle Swarm Optimization (PSO), the Archimedes Optimization Algorithm (AOA), the Sine Cosine Algorithm (SCA), the Exponential Distribution Optimizer (EDO), the Grey Wolf Optimizer (GWO), Ant Colony Optimization (ACO), and the Honey Badger Algorithm (HBA) are further instances of this category.

In general, neither class is totally better than the other: PMAs escape the local-optima dilemma more easily than SMAs, while SMAs consume less computational time than PMAs. For Itr iterations, an SMA performs a lower number of objective-function evaluations, 1 × Itr, while a PMA performs N × Itr evaluations, where N stands for the number of search agents employed by the algorithm. Overall, however, scientists prefer PMAs, as they have a greater probability of reaching the global optimum in a reasonable amount of time.
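The evaluation-budget difference between the two classes can be made concrete with a tiny computation (the iteration count and population size below are arbitrary example values):

```python
# 1 evaluation per iteration for a single-solution MA,
# N evaluations per iteration for a population-based MA with N agents
def evaluations(n_agents, iterations):
    return n_agents * iterations

itr = 1000
sma_budget = evaluations(1, itr)    # single-solution MA: 1 x Itr
pma_budget = evaluations(30, itr)   # population-based MA with N = 30: N x Itr
```

For the same iteration budget, the PMA here costs 30 times as many objective evaluations, which is the computational-time trade-off described above.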

2.3 Taxonomy According to Updating Mechanisms

Although the classification according to the number of search agents provides information about the internal structure of the algorithm, it cannot be treated as a uniform classification, because few algorithms belong to one group while the remainder (the majority) fall under the other. In this context, we need a different classification angle to achieve an acceptable degree of uniformity, based on the most important step of any algorithm: the solution-update process. From this perspective, MAs can be classified into solution-creation-based algorithms (SCBAs) and differential-vector-movement-based algorithms (DVMs) [68]. The following paragraphs introduce a simple classification based on the behavior of the algorithms, and Fig. 5 clarifies this taxonomy.

Fig. 5
figure 5

The classification of MAs based on population update mechanisms

Solution Creation Based Algorithms (SCs) In SCs, a set of parent solutions is merged to generate the new solution; in other words, no single solution is used to create the fresh solution. The SCs can be further categorized into two subcategories: combination-based algorithms and stigmergy-based algorithms. In combination-based algorithms, several solutions are combined or crossed over. The Genetic Algorithm (GA), Gene Expression (GE), Harmony Search (HS), Bee Colony Optimization (BCO), Cuckoo Search (CS), and Dolphin Search (DS) are some examples of this subcategory. In stigmergy-based algorithms, on the other hand, different solutions are indirectly coordinated by an intermediate structure to generate new solutions. Ant Colony Optimization (ACO), the Termite Hill Algorithm (THA), River Formation Dynamics (RFD), the Intelligent Water Drops Algorithm (IWDA), the Water-Flow Optimization Algorithm (WFOA), and the Virtual Bees Algorithm (VBA) are some examples of the second subcategory.

Differential Vector Movement Based Algorithms (DVMs) generate a new solution by applying a mutation or shifting operation to existing ones; the freshly generated solution is compared with the previous one, and the fitter of the two participates in the next iteration of the optimization procedure. DVMs are categorized into three subcategories. In the first, the whole population is used to generate the new solution, as in the Firefly Algorithm (FA), the Gravitational Search Algorithm (GSA), Central Force Optimization (CFO), Human Group Formation (HGF), and Charged System Search (CSS). In the second, a small number of solutions (a neighbourhood) in the population is employed to generate a new solution, as in the Artificial Chemical Process (ACP), Thermal Exchange Optimization (TEO), and the Group Search Optimizer (GSO). In the last, only the relevant (best/worst) solutions are employed to generate the new solution, as in Differential Evolution (DE), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO), the Ant Lion Optimizer (ALO), and the Grey Wolf Optimizer (GWO).
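The differential-vector idea can be sketched with a minimal DE/rand/1/bin loop: a new solution is shifted by the difference of two others, then kept only if it is fitter. The sphere objective and the F and CR values are illustrative assumptions, not the settings of any cited work.

```python
import random

def sphere(x):
    return sum(v * v for v in x)

def de(dim=3, pop_size=20, iters=200, F=0.5, CR=0.9, seed=3):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(iters):
        for i in range(pop_size):
            # three distinct solutions other than the current one
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = rng.randrange(dim)
            trial = [a[d] + F * (b[d] - c[d])        # differential mutation
                     if d == j_rand or rng.random() < CR
                     else pop[i][d]                  # binomial crossover
                     for d in range(dim)]
            if sphere(trial) <= sphere(pop[i]):      # keep only the fitter one
                pop[i] = trial
    return min(pop, key=sphere)

best = de()
```

The `F * (b[d] - c[d])` term is the differential vector itself, and the greedy comparison at the end is the "fitter one participates in the next iteration" rule described above.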

2.4 Taxonomy According to Number of Parameters

This type of classification considers the internal configuration of the algorithm more deeply. Tuning the parameters of an algorithm plays a vital role in its performance when solving a specific problem. As mentioned in [1], choosing the parameter values that yield a better solution is a complicated task. Furthermore, the parameters can enhance the robustness and flexibility of MAs if they are adjusted correctly, and the optimization problem itself plays a vital role in defining their values. From a complexity perspective, the complexity of an algorithm is affected by its number of parameters. In this context, and taking into account the importance of parameters, this classification is introduced. Kanchan Rajwar et al. [68] first classified MAs according to the number of primary parameters employed, as illustrated in Fig. 6.

Fig. 6
figure 6

The classification of MAs according to the number of primary parameters

The number of parameters changes from one algorithm to another and can be 0, 1, 2, 3, and so on. For simplicity, we consider four main groups holding algorithms with up to three parameters, while the rest fall into a miscellaneous group. The following paragraphs explain the five groups of this classification in detail.

Zero-parameter-based algorithms (ZPAs): ZPAs do not have any parameters in their internal configuration, so they are also called free-parameter-based algorithms. The absence of parameters gives the user the opportunity to easily adapt the algorithm to different optimization problems; hence, the algorithms in this group are considered flexible, adaptive, and easy to use. Teaching-Learning-Based Optimization (TLBO) [22], the Black Hole Algorithm (BH) [69], the Multi-Particle Collision Algorithm (M-PCA) [70], Symbiosis Organisms Search (SOS) [71], Vortex Search Optimization (VS) [72], Forensic-Based Investigation (FBI) [73], and Lightning Attachment Procedure Optimization (LAPO) [74] are some examples of ZPAs.
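As an illustration of a parameter-free update rule, the teacher phase of TLBO can be sketched as below: the step toward the best learner uses only random numbers and population statistics, with no algorithm-specific constants to tune. Only the teacher phase is shown, and the bounds, objective, and population size are illustrative assumptions.

```python
import random

def sphere(x):
    return sum(v * v for v in x)

def tlbo_teacher_phase(dim=2, pop_size=15, iters=150, seed=4):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(iters):
        teacher = min(pop, key=sphere)                 # best learner so far
        mean = [sum(p[d] for p in pop) / pop_size for d in range(dim)]
        for i in range(pop_size):
            tf = rng.choice([1, 2])                    # random teaching factor
            cand = [pop[i][d] + rng.random() * (teacher[d] - tf * mean[d])
                    for d in range(dim)]
            if sphere(cand) < sphere(pop[i]):          # greedy acceptance
                pop[i] = cand
        # (full TLBO would follow with a learner phase, omitted here)
    return min(pop, key=sphere)

best = tlbo_teacher_phase()
```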

Single-parameter-based algorithms (SPAs): SPAs are algorithms that own a single primary parameter in their internal configuration, so they are also called mono-parameter-based algorithms. Mostly, this single parameter controls the amount of exploration and exploitation performed by the algorithm. For example, in the Artificial Bee Colony (ABC) algorithm, the single parameter Limit determines when a food source is abandoned [75]; in the Salp Swarm Algorithm (SSA), c1 is the parameter used to achieve a better balance between explorative and exploitative capabilities [76]; and in the Harris Hawks Optimizer (HHO) [77], the switch between soft and hard besiege is driven by the magnitude of the parameter E. Cuckoo Search (CS), the Killer Whale Algorithm (KWO), and Social Group Optimization (SGO) are other examples of this group.
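To show how one parameter can steer the exploration-exploitation balance, the c1 coefficient of the Salp Swarm Algorithm is commonly scheduled as c1 = 2·exp(−(4t/T)²), so early iterations take large (explorative) steps and late iterations take small (exploitative) ones. The sketch below assumes that schedule; the iteration counts are arbitrary example values.

```python
import math

def salp_c1(t, max_iter):
    # large near t = 0 (exploration), decays toward 0 as t -> max_iter
    return 2 * math.exp(-((4 * t / max_iter) ** 2))

T = 100
early = salp_c1(1, T)    # close to 2: big, explorative steps
late = salp_c1(T, T)     # close to 0: tiny, exploitative steps
```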

Two-parameter-based algorithms (TPAs): In TPAs, only two primary parameters exist in the internal structure of the algorithm. For example, in the Grey Wolf Optimizer (GWO), the two primary parameters a and c must be adjusted: a is decreased linearly from 2 to 0, allowing the algorithm to perform a smooth transition from exploration to exploitation, while c allows the algorithm to reach distinct locations around the optimal agent relative to the current location. In the Marine Predators Algorithm (MPA), P and FADs are the two primary control parameters: P scales the predator or prey move, while FADs manages exploration behavior. Finally, in the Whale Optimization Algorithm (WOA), the two primary parameters A and C are tuned to perform the exploration-to-exploitation transition and to allow the algorithm to explore several positions around the optimal agent relative to the present location. Differential Evolution (DE), Simulated Annealing (SA), the Grasshopper Optimization Algorithm (GOA), the Political Optimizer (PO), and the Artificial Chemical Reaction Optimization Algorithm (ACROA) are just a few instances of TPAs.
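The two GWO coefficients just described can be sketched as follows: a decays linearly from 2 to 0 over the iterations, and the random coefficients derived from it (conventionally A = 2a·r − a and C = 2·r) perturb the distance to a leader. This is a sketch of the coefficient schedule only, not a full GWO implementation.

```python
import random

def gwo_coefficients(t, max_iter, rng):
    a = 2 - 2 * t / max_iter        # linear decay from 2 to 0
    A = 2 * a * rng.random() - a    # |A| > 1 favors exploration, |A| < 1 exploitation
    C = 2 * rng.random()            # random weight in [0, 2] on the leader's position
    return a, A, C

rng = random.Random(5)
a_start, A_start, _ = gwo_coefficients(0, 100, rng)    # start of the run
a_end, A_end, _ = gwo_coefficients(100, 100, rng)      # end of the run
```

Because A is drawn from [−a, a], it can exceed 1 in magnitude early on (driving wolves away from the leader) but is forced toward 0 late in the run, which is the smooth exploration-to-exploitation transition described above.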

Three-parameter-based algorithms (TrPAs): In TrPAs, only three primary parameters exist in the internal structure of the algorithm. For example, the mutation rate mr, the crossover rate cr, and the new-population selection criterion are the three parameters used in the Genetic Algorithm (GA) to allow the algorithm to escape from local optima, improve the accuracy of the solution, and generate a fitter new generation, respectively. Randomization, attractiveness, and absorption are the three parameters included in the Firefly Algorithm (FA) to manage the execution of the algorithm and the random walks of fireflies. Finally, the distance bandwidth (BW), the harmony memory considering rate (HMCR), and the pitch adjusting rate (PAR) are the three primary parameters used in Harmony Search (HS) to increase the chance of achieving a global search and to improve local search. The Squirrel Search Algorithm (SSA), Krill Herd (KH), the Spring Search Algorithm (SSA), the Artificial Algae Algorithm (AAA), Gases Brownian Motion Optimization (GBMO), the Hurricane-Based Optimization Algorithm (HOA), the Orca Optimization Algorithm (OOA), the Social Spider Algorithm (SSA), the Water Cycle Algorithm (WCA), the Equilibrium Optimizer (EO), the Parasitism Predation Algorithm (PPA), and the Heap-Based Optimizer (HBO) are a few instances of this group.
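The three HS parameters can be seen working together when a single new harmony is improvised: HMCR decides whether a value is drawn from memory, PAR whether a memorized value is pitch-adjusted by at most BW, and otherwise a fresh random value is generated. The function name, memory contents, and parameter values below are illustrative assumptions.

```python
import random

def improvise(memory, lower, upper, hmcr=0.9, par=0.3, bw=0.05, rng=None):
    rng = rng or random.Random()
    dim = len(memory[0])
    new = []
    for d in range(dim):
        if rng.random() < hmcr:                # memory consideration (HMCR)
            value = rng.choice(memory)[d]
            if rng.random() < par:             # pitch adjustment (PAR) ...
                value += rng.uniform(-bw, bw)  # ... by at most the bandwidth BW
        else:                                  # otherwise: random selection
            value = rng.uniform(lower, upper)
        new.append(min(max(value, lower), upper))   # clip to the bounds
    return new

rng = random.Random(6)
memory = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # harmony memory of 3 solutions
harmony = improvise(memory, lower=0.0, upper=1.0, rng=rng)
```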

Miscellaneous: The rest of the algorithms, those with more than three parameters in their internal configuration, fall under the miscellaneous group. It is not easy to cover all of them, so only three subgroups are introduced. The first subgroup is the four-parameter-based algorithms, such as Ant Colony Optimization (ACO), the Sine Cosine Algorithm (SCA), the Archimedes Optimization Algorithm (AOA), and the Gravitational Search Algorithm (GSA). The second subgroup holds algorithms that employ five primary parameters in their internal structure, such as Particle Swarm Optimization (PSO), the Cheetah Chase Algorithm (CCA), and the Farmland Fertility Algorithm (FFA). The last subgroup contains algorithms with more than five primary parameters: Biogeography-Based Optimization (BBO) with six parameters, Henry Gas Solubility Optimization (HGSO) with twelve, and the Camel Algorithm (CA) with seven are the most familiar algorithms in this subgroup. The Exchange Market Algorithm (EMA) and the Forest Optimization Algorithm (FOA) are also instances of this subgroup.

In general, MAs with few parameters are easy to adapt, which increases their applicability to any optimization problem. On the other hand, MAs with many parameters are harder to apply, as we encounter problems in adapting all of their parameters to suit the problem being tackled; hence their applicability decreases.

2.5 Metaheuristic Algorithms Merits

MAs take priority over HAs in researchers' attention, as they enjoy four characteristics [78], which can be summarized as follows.

Metaheuristics simplicity It is painless to devise an MA, as a natural concept, a physical phenomenon, or an animal behavior can be used in the inspiration process. Utilizing this simplicity, researchers seize the opportunity to extend metaheuristics work: they develop a new method by mimicking a natural idea, use search-enhancement techniques to boost the performance of an existing algorithm, or even take the advantages of two metaheuristic algorithms and generate a new one by applying a hybridization process. Furthermore, simplicity encourages computer scientists and other researchers to easily study existing MAs and then apply them to solve a wide range of problems.

Metaheuristics flexibility Other techniques require modifying the structure of the algorithm to match the problem being solved. Unlike them, the flexibility of metaheuristics allows researchers to easily apply MAs to any problem, as MAs are capable of treating the problem as a black box: they need only the input(s) and output(s) of the problem at hand. No effort is spent modifying the structure; all effort is directed towards formulating the problem being solved as an optimization problem.
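The black-box property can be demonstrated with any simple stochastic optimizer: the same routine is applied, unchanged, to two unrelated objectives, since it only consumes inputs (bounds) and outputs (objective values). Plain random search is used here purely for brevity; the objectives and bounds are arbitrary examples.

```python
import random

def random_search(objective, bounds, iters=2000, seed=7):
    rng = random.Random(seed)
    best, fbest = None, float("inf")
    for _ in range(iters):
        # the optimizer never looks inside the objective: pure black box
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        fx = objective(x)
        if fx < fbest:
            best, fbest = x, fx
    return best, fbest

# two unrelated problems, zero changes to the optimizer itself
_, f1 = random_search(lambda x: sum(v * v for v in x), [(-5, 5)] * 2)
_, f2 = random_search(lambda x: abs(x[0] - 3) + abs(x[1] + 1), [(-10, 10)] * 2)
```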

Metaheuristics stochastic nature Gradient-based optimization techniques must compute derivatives over the search space of the problem to reach an optimal solution. Unlike these techniques, the preponderance of MAs are considered derivative-free mechanisms: they follow a stochastic approach during the search, starting the optimization process by employing a set of search agents that generate random solutions without computing the derivative of the search space. The collaborative work of these search agents allows the algorithm to reach the optimal solution. This merit allows researchers to easily use MAs to tackle compound, expensive, and difficult problems for which derivative information is troublesome to obtain.

Owing to these features, the research community has grown, and researchers from different fields and application areas have been using metaheuristic optimization algorithms in their work. About 4,476 documents found in Scopus have used the word metaheuristics in the last decade. Figure 7a visualizes the distribution of research studies by subject area, while Fig. 7b depicts the number of studies produced in each year of the previous decade.

Fig. 7
figure 7

Scopus statistics from 2014 to 2023

3 Development Process of Metaheuristic Optimization Algorithms

The simplicity merit of MAs allows researchers to easily develop a large number of algorithms in different application areas. To develop a new metaheuristic algorithm, a researcher can follow one of the following development processes according to the type of algorithm that is being developed, and some processes can also be used together.

Develop a new optimization algorithm Most of the work of developing a new optimization algorithm is done by inspiring the main idea of the algorithm from different metaphors or concepts. These metaphors or concepts are mainly simulations of rules or processes in different disciplines such as chemistry, physics, biology, psychology, computation, mathematics, and human behavior. Figure 3 visualizes the different sources of inspiration, with examples in each category. In general, most metaheuristics have been designed to mimic the living and survival systems of beings such as animals, birds, and insects, in addition to mimicking natural evolution. Insects (specifically, bees and ants) are the most popular metaphor used by researchers in the development of new optimization methods.

Develop a new optimization algorithm from an existing one One of the most popular ways to develop a new optimization method is to benefit from the operators of a specific algorithm in enhancing the structure of another. In simple words, operators of other algorithms can be merged into the basic structure of an algorithm to boost the performance of the previously developed algorithm, and hence extend its use to different types of problems and issues. Many enhancement operators are used in the field; one of the most used is opposition-based learning (OBL). OBL is a machine-learning mechanism that increases the performance of an optimization algorithm by considering the opposite position of each solution in the search space. Specifically, two values are computed, the main and the opposite positions; according to the objective-function value, one of the two is maintained in the optimization process and the other is discarded. By keeping only the best values, the optimization process becomes more accurate and a higher level of performance is achieved. The orthogonal learning (OL) strategy is another example of an operator used as an enhancement strategy for MAs; it mainly improves exploitation capabilities. For example, the OL strategy was used to improve the Archimedes optimization algorithm, the cuckoo search algorithm, and the artificial bee colony optimization algorithm. Enhanced solution quality (ESQ) is another mechanism used in the MA enhancement process; it was used to improve the performance of the reptile search algorithm and the Harris Hawks Optimization (HHO) algorithm. Finally, the Local Escaping Operator (LEO) was used to develop an optimized version of the MPA called the Enhanced Marine Predators Algorithm (EMPA).
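The OBL step described above can be sketched as a stand-alone operator: for each solution x in [lb, ub], the opposite point lb + ub − x is also evaluated, and the fitter of the pair is kept. The shifted-sphere objective and all names are illustrative assumptions; in practice this step would be embedded inside a host algorithm's iteration.

```python
import random

def shifted_sphere(x):
    # optimum deliberately off-center so opposites can actually help
    return sum((v - 2) ** 2 for v in x)

def opposition_step(pop, lb, ub, objective):
    refined = []
    for x in pop:
        opp = [lb + ub - v for v in x]               # opposite position
        refined.append(min(x, opp, key=objective))   # keep the fitter of the pair
    return refined

rng = random.Random(8)
pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
refined = opposition_step(pop, lb=-5, ub=5, objective=shifted_sphere)
```

By construction, no solution can get worse in this step, which is why OBL is a cheap, drop-in enhancement for an existing MA.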

Hybridizing two or more optimization algorithms In a further attempt to enhance the performance and applicability of an optimization method, researchers can hybridize two or more optimization algorithms in order to combine the main strengths of each. The idea behind hybridization is to choose one algorithm with better exploration capabilities and another with better exploitation capabilities. Many challenges are encountered when developing a new algorithm through hybridization, such as how to select the algorithms, how to merge them, and whether the new algorithm is actually better than each one separately.

As shown in the previous paragraphs, there are different processes for developing a new optimization method. However, a set of limitations must be considered during the development process, such as the difficulty of transforming all the concepts and their details into a mathematical form, how the algorithm handles changes in the information about the source of inspiration, and how people with little familiarity with the inspiration sources can develop new methods.

3.1 Criteria for Comparative Algorithms

To gauge the effectiveness of newly developed algorithms, it is crucial for research to present the process of comparing them with existing algorithms. This should include a discussion of the selection criteria for comparative algorithms and the methodology used for comparison. The selection criteria depend mainly on the nature of the algorithm and the development process followed. In all cases, the comparative algorithms should satisfy a set of common criteria: state-of-the-art algorithms, newly developed algorithms, CEC winner algorithms, and high-performance algorithms. Specifically, when the algorithm is developed from the inspiration of a phenomenon or process, the list of comparative algorithms must contain algorithms with the same inspiration source or concept, if any exist, in addition to the common criteria algorithms. When the algorithm is developed using the restructuring method (i.e., merging a new operator or strategy), the comparative algorithms must contain the basic algorithm, algorithms developed using the same strategy (if any exist), and algorithms that contain the strategy itself, in addition to the common criteria algorithms. When the algorithm is developed through hybridization, the comparative list must contain the two basic algorithms that participate in the hybridization, in addition to the common criteria algorithms.

3.2 Novelty Claims of Metaphor-Based Methods

The different ways of developing an optimization algorithm and the simplicity of metaheuristics allow researchers to easily develop a large number of MAs. But a question must be asked here: does this inspiration convey any novelty? In this section, we present a set of claims and myths in the inspiration process of metaheuristic optimization algorithms. As introduced in [79], six widely used algorithms were analyzed to show that all components of the six (grey wolf, moth-flame, whale, firefly, bat, and ant lion) are equivalent to components of well-known techniques such as evolutionary algorithms and particle swarm optimization. Hence, the authors called these algorithms misleading or tricky optimization algorithms, as they were built on duplicated metaphors and did not bring any novelty or useful principles to the metaheuristics field. We will present what considerations must be taken when developing a new algorithm and how to judge the novelty of a newly proposed algorithm in the field of metaheuristics.

Recently, a large number of publications have introduced self-proclaimed "novel" metaphor-based methods, but it is not obvious why the metaphors were used or what the novel ideas behind these methods are. The main criticisms of the novelty claims of various metaphor-based methods can be summarized in the following points:

  • Metaphor-based methods often redefine a well-known concept in the field of optimization and deliver it as a new concept or under new terminology.

  • Weak translation of the metaphor into a mathematical model or equations, so that the model does not correctly reflect the metaphor. Moreover, the proposed algorithm may not correctly implement the mathematical model obtained from the metaphor.

  • There is a myth in introducing the motivations behind the use of a metaphor: instead of delivering motivations with a sound scientific basis, the authors rely on claims such as a new metaphor that "has never been used before" or a new mathematical model that "has never appeared in the past". Additionally, there is no concentration on the optimization process itself and how it is employed to introduce effective design choices.

  • Instead of evaluating the proposed algorithms mainly on state-of-the-art problems, the authors of these methods rely on comparisons with other algorithms or on experimental analysis of low-complexity problems to evaluate the performance and applicability of the proposed algorithm.

To avoid these negative points, authors should assess a proposed algorithm against two criteria before naming it "novel":

  • Usefulness: the author must clearly introduce the useful ideas that come from the metaphor and how the metaphor helps in solving optimization problems.

  • Novelty: when proposing a new method in the field of metaheuristics, were the ideas conveyed by the metaphor actually novel?

4 Optimization Problems Overview

Achieving an acceptable solution is the main goal of any algorithm. Due to the rapid expansion in the complexity of problems, scientists need to develop new methods that can cover this expansion. In this context, scientists work to formulate problems as optimization problems so that they can be tackled by optimization algorithms, which provide better solutions than other traditional methods. In different fields, a great number of problems have been formulated as optimization problems: genetic algorithms were used to automatically find and classify solitary lung nodules \cite{de2014automatic}; ant colony optimization was used to classify web pages, mine web content, and dynamically organize web content [80]; in [81], Hussein et al. used HHO for drug discovery and design through chemical descriptor selection and chemical compound activities; HHO was also applied in microchannel heat sinks to minimize entropy generation [82] and for COVID-19 prediction [83]; finally, image segmentation and thresholding were addressed in [84, 85].

4.1 Basic Structure of Optimization Problems

In this section, we support readers who may not be familiar with optimization methods with the basic definitions and terms of the optimization field. The process of solving an optimization problem using a metaheuristic algorithm starts with identifying the real-world problem. After that, we move to the problem description stage, in which we define the characteristics of the problem, determine the functional requirements, and analyze the nonfunctional requirements. After completing the problem description stage, we move to the research stage, in which the researcher first concentrates on formulating the problem in a mathematical form. To formulate the problem, we need to determine the design variables and parameters, formulate the objective function, determine the basic constraints on the variables, analyze the complexity of the problem, and finally justify the use of a metaheuristic algorithm. In the following paragraphs, the three main components that exist in any optimization problem (the objective function, the decision variables, and a set of constraints on these variables) are discussed in detail.

Optimization model: Every system can be considered as a set of inputs producing one or more outputs. The system uses the set of constraints to limit the inputs; in other words, we consider only the inputs that obey the constraints (i.e., valid inputs) and discard those that do not (i.e., invalid inputs). The system processes the valid inputs to produce the optimal solution, which is evaluated using the objective function to obtain the minimum or maximum output value. In fact, the optimization algorithm does not find optimal values of the constraints; instead, it uses the constraints to produce optimal solutions and to construct the feasible region. The feasible region can be considered as the area containing an infinite number of feasible solutions, one or more of which can be classified as optimal.

Formulating the optimization model as an optimization problem: When we solve a problem using an optimization algorithm, we look at all possible combinations of inputs. For example, if we have 3 inputs, each with 10 discrete values, then we get 10^3 = 1000 combinations. The most basic way to solve and evaluate the inputs is to use brute-force techniques. Brute force will find the optimal solution on small instances, but what about large-sized problems? Certainly, brute-force techniques face a big problem in handling them; searching all possible combinations is impossible for most real-world problems.
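The 3-inputs-with-10-values example above can be enumerated directly. The toy objective below is a hypothetical choice, included only to show that brute force must touch every one of the 1000 combinations, which is why it cannot scale:

```python
from itertools import product

# Three design variables, each with 10 discrete values -> 10^3 = 1000 combinations.
values = [list(range(10))] * 3
combinations = list(product(*values))

# Exhaustively evaluate a toy objective over the entire search space.
objective = lambda x: (x[0] - 3) ** 2 + (x[1] - 7) ** 2 + (x[2] - 1) ** 2
best = min(combinations, key=objective)   # guaranteed optimum, at a cost of 1000 evaluations
```

With 30 variables instead of 3, the same loop would need 10^30 evaluations, which is the combinatorial explosion that motivates metaheuristics.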

To avoid confusion for readers unfamiliar with the area of optimization, in this study we introduce the basic and most frequent terminologies used in the field:

  • The search space: it is the area in which all possible combinations of inputs are located.

  • The search landscape: it is the set of all possible combinations of inputs with their corresponding objective values.

  • Decision variables: the unknown quantities that need to be determined by assigning values to them, also known as design variables. The set of all values that can be assigned to a variable is named its scope or domain. Mathematically, they can be written as x_i, where i = 1, 2, 3, …, N.

  • The objective function: an equation over the decision variables, in which all the decision variables appear with different parameters. It is used to judge the quality of the solution obtained for the problem being handled. In other words, after calculating the values of the decision variables, we substitute them into the objective function to obtain the objective value. The minimum objective value is the optimal solution for minimization problems, and the maximum is the optimal solution for maximization problems.
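The terminologies above can be tied together in a few lines of code: decision variables with bounds, a feasibility test built from constraints, and an objective used to rank the feasible candidates. The bounds, constraint, and candidate values below are hypothetical, chosen only for illustration:

```python
def is_feasible(x, constraints):
    """A solution is feasible if it satisfies every constraint g(x) >= 0."""
    return all(g(x) >= 0 for g in constraints)

# decision variables: x = (x1, x2), each bounded in [0, 10]
bounds = [(0, 10), (0, 10)]
objective = lambda x: x[0] ** 2 + x[1] ** 2        # to be minimized
constraints = [lambda x: x[0] + x[1] - 4]          # encodes x1 + x2 >= 4

# rank a few candidate solutions: discard infeasible ones, then compare objectives
candidates = [(1.0, 1.0), (2.0, 2.0), (3.0, 5.0)]
feasible = [x for x in candidates if is_feasible(x, constraints)]
best = min(feasible, key=objective)
```

Here (1.0, 1.0) is discarded because it violates the constraint, and the remaining candidates are compared purely by their objective values.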

Mathematically, a single-objective optimization problem can be formulated as Eq. (1), while a multi-objective optimization problem can be formulated as Eq. (2).

Optimization problems can be categorized in different ways, and categorizing them is an important step in choosing the algorithm that provides the optimal solution. It is not easy to introduce a rigorous or comprehensive taxonomy of optimization problems, due to the multiplicity of classification criteria. Given the important role of such a taxonomy, however, in this paper we present a simplified and summarized version of the available taxonomies, illustrated in Fig. 8. In the following subsections, the different subcategories of optimization problems are discussed in detail.

Fig. 8 Optimization problem taxonomy

4.2 Taxonomy According to the Objective Function

In terms of the number of objectives, there are two types: if the number of objectives is greater than one, the problem is called a multi-objective optimization problem; otherwise, it is a single-objective optimization problem. Usually, real-world optimization problems are multi-objective. For example, if we need to design a table, we might consider two objectives, such as minimizing the weight and the price of the table.

Single-objective optimization Only one global optimal solution exists in single-objective optimization. The objective function considers only one objective; therefore, the best solution can easily be determined by comparing the obtained solutions using the basic comparison operators <, >, ≤, ≥, and =. The nature of this type allows algorithms to tackle such problems easily. Without loss of generality, Eq. (1) gives the mathematical structure of a single-objective optimization problem.

$$\begin{gathered} Minimize:\;f\left( {x_{1} ,x_{2} ,x_{3} , \ldots ,x_{n - 1} ,x_{n} } \right) \hfill \\ Subject\;to: \hfill \\ g_{i} \left( {x_{1} ,x_{2} ,x_{3} , \ldots ,x_{n - 1} ,x_{n} } \right) \ge 0,\quad i = 1,2,3 \ldots m \hfill \\ h_{i} \left( {x_{1} ,x_{2} ,x_{3} , \ldots ,x_{n - 1} ,x_{n} } \right) = 0,\quad i = 1,2,3 \ldots p \hfill \\ lb_{i} \le x_{i} \le ub_{i} ,\quad i = 1,2,3 \ldots n \hfill \\ \end{gathered}$$
(1)

where n is the number of decision variables, and m and p represent the number of inequality and equality constraints, respectively. For the ith variable, ub_i and lb_i represent the upper and lower boundaries, respectively.
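A common way for a metaheuristic to handle the constraints of a problem in the form of Eq. (1) is a static penalty, which folds constraint violations into the objective so that any unconstrained search procedure can be applied. The sketch below is one standard variant; the penalty weight and the toy problem are illustrative assumptions:

```python
def penalized_fitness(x, f, gs, hs, rho=1e3):
    """Static-penalty evaluation of a problem in the form of Eq. (1):
    minimize f(x) subject to g_i(x) >= 0 and h_i(x) = 0.
    Squared violations are added to the objective, scaled by rho."""
    g_viol = sum(max(0.0, -g(x)) ** 2 for g in gs)   # violation of g_i(x) >= 0
    h_viol = sum(h(x) ** 2 for h in hs)              # violation of h_i(x) = 0
    return f(x) + rho * (g_viol + h_viol)

# toy problem: minimize x1 + x2 subject to x1 * x2 >= 4
f = lambda x: x[0] + x[1]
gs = [lambda x: x[0] * x[1] - 4]
hs = []
```

A feasible point such as (2, 2) is scored by its objective alone, while an infeasible point such as (1, 1) receives a large penalty, steering the search back into the feasible region.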

Multi-objective optimization In contrast to single-objective optimization, a set of objectives (more than one) needs to be optimized simultaneously in multi-objective optimization problems. Usually, these objectives conflict with each other, so most of the work in this type is devoted to achieving a trade-off between them. The set of solutions in this type is called the Pareto optimal set. Pareto dominance is employed to compare the obtained solutions in order to determine the optimal ones, and extra storage is needed to hold the Pareto optimal solutions. Without loss of generality, Eq. (2) gives the mathematical structure of a multi-objective optimization problem.

$$\begin{gathered} Minimize: \hfill \\ F\left( x \right) = \left[ {f_{1} \left( x \right),\;f_{2} \left( x \right),\;f_{3} \left( x \right), \ldots ,\;f_{o} \left( x \right)} \right] \hfill \\ Subject\;to: \hfill \\ g_{i} \left( x \right) \ge 0,\quad i = 1,2,3 \ldots m \hfill \\ h_{i} \left( x \right) = 0,\quad i = 1,2,3 \ldots p \hfill \\ l_{i} \le x_{i} \le u_{i} ,\quad i = 1,2,3 \ldots n \hfill \\ \end{gathered}$$
(2)

where n is the number of decision variables, and m and p represent the number of inequality and equality constraints, respectively. For the ith variable, u_i and l_i represent the upper and lower limits, respectively. The number of objectives is denoted by o, and g_i and h_i are the ith inequality and equality constraints, respectively. In general, the clash among objectives forces the problem designer to consider more than one criterion when comparing the obtained solutions; the classical comparison operators therefore no longer suffice, and Pareto dominance, Eq. (3), is used to define the best solutions.

$$\begin{gathered} \forall i \in \left\{ {1,2, \ldots ,o} \right\}:f_{i} \left( x \right) \le f_{i} \left( y \right) \wedge \exists i \in \left\{ {1,2, \ldots ,o} \right\}:f_{i} \left( x \right) < f_{i} \left( y \right) \hfill \\ where\;\vec{x} = \left( {x_{1} ,x_{2} , \ldots ,x_{k} } \right)\;{\text{and}}\;\vec{y} = \left( {y_{1} ,y_{2} , \ldots ,y_{k} } \right) \hfill \\ \end{gathered}$$
(3)

Here, the two solutions are represented by the vectors x and y. The solution x is said to dominate y (denoted x ≺ y) if x is no worse than y in all objectives and strictly better in at least one.
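The dominance test of Eq. (3) translates directly into code, and filtering a set of objective vectors with it yields the Pareto front. The sample points below are hypothetical two-objective (minimization) values:

```python
def dominates(x_obj, y_obj):
    """True if objective vector x dominates y (minimization):
    x is no worse in every objective and strictly better in at least one."""
    no_worse = all(fx <= fy for fx, fy in zip(x_obj, y_obj))
    strictly_better = any(fx < fy for fx, fy in zip(x_obj, y_obj))
    return no_worse and strictly_better

def pareto_front(points):
    """Non-dominated subset of a list of objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

points = [(1, 5), (2, 3), (3, 3), (4, 1), (5, 5)]
front = pareto_front(points)
```

Here (3, 3) drops out because (2, 3) is at least as good in both objectives and strictly better in the first, while (1, 5), (2, 3), and (4, 1) are mutually incomparable and form the front.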

4.3 Taxonomy According to Function Form

From another angle, classification can be done according to the form of the functions. If the constraints are linear equalities and inequalities and the objective function is linear, the problem is said to be a linear optimization problem. In nonlinear optimization, the objective function, the constraints, or both are nonlinear; this is the more realistic and complex case [86].

4.4 Taxonomy According to the Design Variable

According to the nature of the design variables, we can present three different types of optimization problems, as detailed in the following points.

Discrete optimization problems In discrete optimization problems, the values of the design variables are discrete, drawn from a finite set. The shortest path problem and the minimum spanning tree problem are two instances of this type. In more detail, discrete optimization consists of integer programming and combinatorial optimization. Integer programming deals with the formulation and solution of problems whose design variables take integer (or binary) values. Combinatorial optimization, on the other hand, emphasizes the combinatorial origin, formulation, or solution of a problem: it seeks arrangements (i.e., assignments, groupings, orderings) of discrete, finite values under specific constraints, and these arrangements form components of potential solutions to the combinatorial problem [87]. Combinatorial optimization is applied in bioinformatics, artificial intelligence, and other fields, for example in identifying models of propositional formulas, determining the 3D structure of proteins, finding shortest paths in graphs, the travelling salesman problem, the knapsack problem, the bin packing problem, and the quadratic assignment problem, which is addressed in this study.

Continuous optimization problems In continuous optimization problems, a range of values is assigned to each design variable, so every design variable has an infinite set of values. These problems come in two types: constrained continuous optimization problems, in which there are constraints on the variables, and unconstrained continuous optimization problems, in which such constraints are absent. Maximizing the overall yield of differential amplifiers and optimizing a mechanical shock-absorption system are two examples of this type [88].

Mixed discrete–continuous optimization problems In many problems, the design variables are a mixture of discrete and continuous values; in this case, we call the problem mixed discrete–continuous. This type is very common, as numerous real-world problems are complex and possess mixed quantitative and qualitative inputs. In [89], a set of such instances is addressed using black-box optimization techniques.

4.5 Taxonomy According to Constraints

Furthermore, classification can be done according to the restrictions on the design variables.

Unconstrained Optimization Problem If there are no constraints on the design variables, the problem is called an unconstrained optimization problem. Unconstrained optimization methods can be viewed as iterative methods that start with an initial estimate of the optimal solution and then perform a sequence of iterations that progressively improve it until the optimum is reached. In [90], Fletcher notes that unconstrained optimization methods differ according to how much information the user provides, e.g., gradient methods, second-derivative methods, and non-derivative methods.

Constrained optimization problem If there are one or more constraints, the optimization problem falls under the constrained optimization class. There are two subclasses of this type. The first is the equality-constrained problem, in which the design variables are restricted to be equal to specific values. The second is the inequality-constrained problem, in which the design variables are restricted to be greater or smaller than a specific value. From a formulation perspective, every equality constraint can be mathematically transformed into two inequality constraints: for example, ϕ(x) = 0 is equivalent to ϕ(x) ≤ 0 and ϕ(x) ≥ 0 [86]. Mainly, constrained optimization covers three types of optimization: network optimization, bound-constrained optimization, and global optimization. Global optimization includes one of the most widely studied problem classes, engineering design optimization problems.

5 Performance Assessment of Optimization Algorithms

First of all, we must refer to an important term, the efficiency of the algorithm, which means how well the algorithm performs at finding optimal solutions for the problem to be solved. Achieving an optimal solution is not the only purpose of a good optimization algorithm; the algorithm must also be of high quality and applicable to different classes of problems. To judge the quality and applicability of an algorithm, it must be compared against a set of qualitative and quantitative measures; a good algorithm performs well when tested against both. In this section, we present the whole assessment environment used to test the quality and applicability of an algorithm.

CEC Test suite CEC stands for the Congress on Evolutionary Computation. The CEC suites hold different classes of problems, which may be unimodal, multimodal, fixed-dimension multimodal, and composite. CEC is usually used to test the performance of an algorithm and its ability to solve different classes of problems. In the optimization literature, almost all studies use the CEC functions as fitness functions to test an algorithm's performance and to compare it against other algorithms.

Statistical Measures In this metric, the best, worst, mean, and standard deviation of the obtained solutions are computed to judge the quality of all solutions together. The best solution is the one with the minimum fitness value in minimization problems, and the opposite holds for maximization; the worst is the solution with the maximum fitness value in minimization problems, and again the opposite holds for maximization. The mean is the average value of all obtained solutions (from executing the algorithm many times), and a small mean indicates better performance on minimization problems. Finally, the standard deviation (STD) gives the reader insight into the spread among the obtained solutions; an algorithm with a small STD is more consistent than one with a large STD. There is also an important statistical measure capable of summarizing the overall performance of algorithms over any number of functions, which uses the mean rank sum to rank the algorithms. Sorting these values in ascending order enables us to say that the algorithm with the lowest value is the best among all compared algorithms over all functions together.
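The best/worst/mean/STD summary described above is straightforward to compute from the objective values of repeated independent runs. The run values below are hypothetical, standing in for two algorithms executed five times each on a minimization problem:

```python
from statistics import mean, stdev

def run_statistics(results):
    """Best / worst / mean / standard deviation over the objective values
    returned by repeated independent runs (minimization convention)."""
    return {
        "best": min(results),
        "worst": max(results),
        "mean": mean(results),
        "std": stdev(results),
    }

# hypothetical objective values from 5 independent runs of two algorithms
alg_a = [0.01, 0.02, 0.015, 0.03, 0.025]
alg_b = [0.05, 0.02, 0.10, 0.08, 0.06]
stats_a, stats_b = run_statistics(alg_a), run_statistics(alg_b)
```

Comparing the two dictionaries reproduces the usual reporting: algorithm A has the lower mean and smaller spread, so it would be ranked above algorithm B on this function.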

Convergence curve The convergence curve plots the best solution scored by the algorithm against the number of iterations or function evaluations. It summarizes the behavior of the algorithm and is used to judge the speed at which it approaches the global optimum. For a minimization problem, when comparing several algorithms, the lower convergence curve is better than the upper one. We can also compute how fast the algorithm converges towards the optimal solution through the rate-of-convergence measure.

Diversity Diversity is one of the measures related to an algorithm's convergence behavior. In simple words, diversity means how the search agents of the algorithm are distributed in the search space. A high diversity value indicates great exploration ability, and a low value indicates great exploitation ability. Hence, the diversity of the algorithm should transition smoothly from high values in the first iterations to low values in the remaining iterations. In this context, good diversity helps avoid premature convergence and achieves good speed in reaching the optimal solution, hence a high level of efficiency.
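One common way to quantify the diversity described above, assumed here for illustration, is the mean distance of the agents from the population centroid; the two example populations are hypothetical:

```python
import math

def diversity(population):
    """Mean Euclidean distance of the agents from the population centroid,
    a simple proxy for how spread out the search agents are."""
    dim = len(population[0])
    centroid = [sum(x[d] for x in population) / len(population) for d in range(dim)]
    return sum(math.dist(x, centroid) for x in population) / len(population)

# agents spread over the space (exploration) vs. clustered near one point (exploitation)
spread = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
clustered = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
```

Tracking this value over the iterations gives exactly the high-to-low transition the text calls for: large values early, while agents explore, shrinking as the population concentrates around a promising region.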

Trajectory diagram The trajectory diagram is used to test the behavior of a specific agent of the algorithm over the course of iterations. Fluctuations in the curve are an indication of the agent's ability to explore and exploit the search space.

Search history diagram To visualize the history of positions scored by the search agent during the process of optimization, the search history curve is drawn.

Exploration and exploitation The exploration and exploitation (EXPL-EXPT) curves are used to visualize the explorative and exploitative capabilities of the algorithm. The overlap between the two curves indicates the shifting between exploration and exploitation, and therefore the EXPL-EXPT balance.

Real-world problems To test the ability of an algorithm to solve different classes of problems, real-world problems are tackled. Engineering design problems are the most widely used, as many algorithms use the classical design problems (pressure vessel, welded beam, 15/3/25/52-bar truss systems, tension/compression spring, etc.) to quiz the algorithm's performance.

Operation platforms Alongside the previous measures, the measured quality of an algorithm can be affected by the environment in which it is executed, in terms of both software and hardware capabilities. In this context, when we compare more than one algorithm to judge which is better, we must execute all of them in the same environment to achieve a fair comparison.

6 Metaheuristics Applications

As mentioned above, MAs have a great degree of applicability, as they perform well on problems involving computation-time restrictions, high dimensionality, and other difficulties. Specifically, MAs are capable of dealing with different classes of optimization problems in different fields. In the following subsections, the applicability of MAs in some of these fields is illustrated in detail.

6.1 IEEE Congress on Evolutionary Computation (IEEE CEC)

CEC stands for the Congress on Evolutionary Computation. The CEC suites hold different classes of problems, which may be unimodal, multimodal, fixed-dimension multimodal, and composite. CEC is usually used to test the performance of an algorithm and is considered an indication of its capability to solve different classes of problems. In the optimization literature, almost all studies use the CEC functions as fitness functions to test an algorithm's performance and to compare it against other algorithms. A different version of the CEC test suite is introduced almost every year. Tables 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and 12 provide the reader with the basic information on each version of the CEC benchmark functions and how metaheuristic algorithms have been applied to solve them.

Table 2 Metaheuristics optimization algorithms for solving CEC 2005 benchmark functions
Table 3 Metaheuristics optimization algorithms for solving CEC2009 test function
Table 4 Metaheuristics optimization algorithms for solving CEC 2013 test functions
Table 5 Metaheuristics optimization algorithms for solving CEC 2014 test functions
Table 6 Metaheuristics optimization algorithms for solving CEC 2015 test functions
Table 7 Metaheuristics optimization algorithms for solving CEC 2017 test functions
Table 8 Metaheuristics optimization algorithms for solving CEC 2018 test functions
Table 9 Metaheuristics optimization algorithms for solving CEC2019 test functions
Table 10 Metaheuristics optimization algorithms for solving CEC 2020 test functions
Table 11 Metaheuristics optimization algorithms for solving CEC 2021 test functions
Table 12 Metaheuristics optimization algorithms for solving CEC 2022 test functions

6.2 Engineering Design Problems

It is easy to provide an optimal design for a simple problem that contains a small number of design variables with small ranges of values. In contrast, for complex problems with many components, such as a mechanical problem with multiple objectives and constraints, algorithms consume a huge amount of time to develop an optimal design. Engineering design problems are another example of complex problems: the design process traditionally starts by exploiting the experience of designers to guess an optimal design, but this is not the optimal direction. To remedy this, we need a systematic approach that guarantees an optimal design better than any human-made one. Automatic techniques, in other words metaheuristic algorithms (MAs), are used to effectively explore search spaces with many parameters, minimizing cost and improving the product life cycle. Mainly, the MAs tune the parameters of the problem to produce the best values of the design variables, hence achieving the optimal design. Table 13 highlights the work of single-objective metaheuristic optimization algorithms in solving engineering design problems, and Table 14 summarizes the attempts of multi-objective optimization algorithms to tackle them.

Table 13 Single objective algorithms for solving engineering design problems
Table 14 Multi objective algorithms for solving engineering design problems

6.3 NP‑Hard Problems

In NP‑hard problems, NP stands for nondeterministic polynomial time, where "nondeterministic" refers to nondeterministic Turing machines, which formalize the idea of brute-force search, and "polynomial" refers to the amount of time required by a deterministic algorithm to check a single candidate solution, or the time consumed by a nondeterministic Turing machine to perform its extensive search. P is the set of all decision problems solvable in polynomial time. A decision problem has two answers, YES and NO; if all YES answers can be checked in polynomial time, then the problem belongs to the set NP, while co-NP is the analogous class for NO answers. If a polynomial-time solution for a specific problem would lead to polynomial-time solutions for all problems in NP, the problem is said to be NP-hard. A problem is NP-complete iff it is NP-hard and is itself in NP. Due to the high computational complexity, exhaustive search methods are unable to obtain the best solution for such problems.

Quadratic Assignment Problem (QAP) As mentioned in [195], the QAP is NP-hard, and even obtaining an approximate solution close to the optimal one cannot be guaranteed in polynomial time. QAP was first introduced by Koopmans and Beckmann in 1957 [196] as an extension of the linear assignment problem. QAP is a combinatorial optimization problem that has been considered and tackled by many research studies over the last three decades. Despite the good results obtained in these studies, the QAP is still not well solved, as there is no exact algorithm capable of solving instances with more than 20 facilities in a reasonable amount of computational time [197]. In the QAP, we seek to place facilities in appropriate locations under a one-to-one condition: each site can hold only one facility, and each facility must be placed in only one site, where the distances between sites and the flows between facilities are given. The main optimization goal of the QAP is to minimize the total flow-weighted distance between each pair of facilities assigned to their relevant sites. The QAP has recently been addressed by many books, studies, and reviews, as listed in Table 15. From another angle, several problems can be considered special types of QAP (Table 16).

Table 15 Meta-heuristics optimization algorithms for solving the QAP
Table 16 Special types of quadratic assignment problem

The Bin Packing Problem (BPP) The Bin Packing Problem (BPP) is one of the most familiar combinatorial problems and is strongly NP-hard [218]. In the BPP, we need to pack a set of m items into at most n bins with the aim of minimizing the number of bins required to hold all items. The BPP can be mathematically formulated as in Eq. 4 [219].

$$\begin{gathered} Min:\;z = \mathop \sum \limits_{j = 1}^{n} y_{j} \hfill \\ {\text{Subject}}\;{\text{to}}: \hfill \\ \mathop \sum \limits_{i = 1}^{m} w_{i} x_{ij} \le C\,y_{j} \quad \forall j \in \left\{ {1, \ldots ,n} \right\} \hfill \\ \mathop \sum \limits_{j = 1}^{n} x_{ij} = 1\quad \forall i \in \left\{ {1, \ldots ,m} \right\} \hfill \\ x_{ij} = \left\{ {\begin{array}{*{20}c} {1,} & {{\text{if}}\;{\text{item}}\;i\;{\text{is}}\;{\text{placed}}\;{\text{in}}\;{\text{bin}}\;j} \\ {0,} & {{\text{otherwise}}} \\ \end{array} } \right.\quad y_{j} = \left\{ {\begin{array}{*{20}c} {1,} & {{\text{if}}\;{\text{bin}}\;j\;{\text{is}}\;{\text{used}}} \\ {0,} & {{\text{otherwise}}} \\ \end{array} } \right. \hfill \\ \end{gathered}$$
(4)

where wi is the weight of item i and C is the capacity of each bin.

The BPP benchmark data sets are commonly classified into three classes, Easy, Medium, and Hard, as mentioned in [220]. Also, the BPP appears in one-, two-, three-, and multi-dimensional forms (Table 17).
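As an illustration of the one-dimensional case, a common constructive baseline (not one of the surveyed metaheuristics) is the first-fit-decreasing heuristic; the sketch below returns the number of bins it opens for a hypothetical item list:

```python
def first_fit_decreasing(weights, capacity):
    """Greedy first-fit-decreasing heuristic for 1-D bin packing:
    place each item (largest first) into the first open bin that still
    has room, opening a new bin only when no existing bin fits."""
    bins = []  # remaining free capacity of each open bin
    for w in sorted(weights, reverse=True):
        for i, free in enumerate(bins):
            if w <= free:
                bins[i] -= w
                break
        else:
            bins.append(capacity - w)  # open a new bin for this item
    return len(bins)
```

Metaheuristics for the BPP typically start from, or are compared against, exactly this kind of cheap constructive solution.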

Table 17 Applying the meta-heuristics optimization for the bin packing problem

Travelling Salesman Problem (TSP) One of the most familiar NP-hard combinatorial optimization problems is the TSP, in which, given a list of cities and the distances between them, we need to find the shortest possible route that visits every city exactly once and returns to the initial city. For example, a TSP over 20 cities already has a huge number of feasible tours (approximately 1.22 × 10^17), and an exhaustive search over this space would take an impractically long time. Exhaustive methods therefore fail on such problems, whereas MAs overcome this weakness and have been used to find near-optimal solutions in a reasonable amount of time [233]. The vehicle routing problem (VRP) is a generalization of the TSP and a multi-objective real-world problem tackled by many MAs, such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO), as in [234].
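The following sketch illustrates why heuristics matter here: the nearest-neighbor rule builds a complete tour in O(n²) time instead of enumerating every feasible tour, although the tour it returns is generally not optimal (the point data are hypothetical):

```python
import math

def tour_length(tour, pts):
    """Total length of a closed tour over 2-D points."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def nearest_neighbor(pts, start=0):
    """Greedy constructive heuristic: always move to the closest
    unvisited city, then return to the start."""
    unvisited = set(range(len(pts))) - {start}
    tour = [start]
    while unvisited:
        nxt = min(unvisited, key=lambda c: math.dist(pts[tour[-1]], pts[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour
```

MAs such as ACO or GA then improve on this kind of constructive tour rather than searching the full permutation space from scratch.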

Job Shop Scheduling (JSS) JSS is an NP-hard problem, so no algorithm is known that solves it exactly in polynomial time. In JSS, we need to process a finite set of jobs using a limited set of machines. JSS is a general type of scheduling problem and has been addressed by many MAs in [235,236,237], and [238].
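A minimal sketch of how a candidate JSS solution is evaluated: given each job's fixed operation sequence and a dispatching order (both hypothetical here), the makespan — the quantity most MAs for JSS minimize — can be computed as follows:

```python
def makespan(jobs, order):
    """Makespan of a job-shop schedule. `jobs[j]` is a list of
    (machine, duration) operations in fixed technological order, and
    `order` is a dispatching sequence of job indices in which job j
    appears once per operation. Each operation starts as soon as both
    its machine and its job's previous operation are finished."""
    machine_free = {}   # time at which each machine becomes free
    job_free = {}       # time at which each job's last operation ends
    next_op = {}        # index of the next unscheduled operation per job
    end = 0
    for j in order:
        k = next_op.get(j, 0)
        m, d = jobs[j][k]
        start = max(machine_free.get(m, 0), job_free.get(j, 0))
        machine_free[m] = job_free[j] = start + d
        next_op[j] = k + 1
        end = max(end, start + d)
    return end
```

A metaheuristic for JSS would search over dispatching sequences like `order`, using this makespan as the fitness function.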

6.4 Medical Science

Most medical activities (i.e., diagnosis, imaging, treatment, and monitoring) depend on computers or electronic devices that operate using algorithm-based software [68]. Several researchers have used GAs for edge detection of images acquired using different imaging modalities, including magnetic resonance imaging, CT, and ultrasound [239]. In [240], Pereira et al. applied a set of computational tools for mammogram segmentation to improve the detection of breast cancer, using a GA combined with wavelet analysis to detect and segment suspicious areas with 95% sensitivity. The GA has been applied for feature selection to classify a region of interest in mammograms as normal or containing a mass [241]. The GA has also been combined with a support vector machine to differentiate benign and malignant breast tumours in ultrasound images [242], and with a diversity index to discover lung nodules by developing an automatic threshold clustering method [80]. In [243], electroencephalography signals were used to detect hypoglycemia in patients with type 1 diabetes. Using neural networks in conjunction with ant colony optimization (ACO) and particle swarm optimization (PSO), Suganthi and Madheswaran built a more advanced computer-aided decision support system that uses mammograms to classify tumours and detect breast cancer stages, as described in [244]. Based on the artificial bee colony (ABC) algorithm, Kockanat et al. developed a technique for denoising images using a 2D impulse response digital filter, as illustrated in [245].

6.5 Robotics

Robotics is a vital, active research field with several challenges that need to be optimized, such as improving task performance, decreasing the cost of robots, achieving better reliability, and minimizing unit complexity compared with traditional robot systems. In this context, metaheuristics can be used to tune machine learning methods to enhance the collaborative behavior of robots. One of the most active problems in robotics is the redundant humanoid manipulator problem, whose complexity comes from the large number of degrees of freedom and the complex joint structure; this makes it difficult to obtain an inverse kinematics solution. Researchers therefore formulate this problem as a minimization problem, so that MAs can perform well in solving it. In [246], a multilayer perceptron neural network is trained using the exploitative and explorative capabilities of the bees algorithm to learn the inverse kinematics of a robot manipulator arm. To conquer the multi-solution problem, the GA is used to achieve a globally optimal inverse kinematics solution for a seven-degree-of-freedom (7-DOF) manipulator [247]. The inverse kinematics of the 7-DOF manipulator is also tackled effectively by particle swarm optimization (PSO), exploiting the strong collective intelligence and collaborative behavior among particles [248]. Finally, biogeography-based optimization (BBO) is hybridized with differential evolution (DE), using the merits of a hybrid migration operator and an adapted Gaussian mutation operator, to solve the inverse kinematics problem of the 8-DOF redundant humanoid manipulator [249].
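As a concrete (and deliberately simplified) example of the formulation described above, the sketch below casts the inverse kinematics of a hypothetical planar two-link arm as a minimization problem and attacks it with plain random search standing in for a full metaheuristic:

```python
import math, random

def fk(theta, lengths=(1.0, 1.0)):
    """Forward kinematics of a planar 2-link arm: joint angles -> end-effector."""
    x = lengths[0] * math.cos(theta[0]) + lengths[1] * math.cos(theta[0] + theta[1])
    y = lengths[0] * math.sin(theta[0]) + lengths[1] * math.sin(theta[0] + theta[1])
    return x, y

def ik_error(theta, target):
    """Objective to minimize: squared distance from end-effector to target."""
    x, y = fk(theta)
    return (x - target[0]) ** 2 + (y - target[1]) ** 2

def random_search_ik(target, iters=20000, seed=0):
    """Deliberately simple stochastic solver: sample joint angles
    uniformly and keep the best, mimicking the role an MA would play."""
    rng = random.Random(seed)
    best = [0.0, 0.0]
    best_err = ik_error(best, target)
    for _ in range(iters):
        cand = [rng.uniform(-math.pi, math.pi) for _ in range(2)]
        err = ik_error(cand, target)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err
```

For a real 7-DOF or 8-DOF manipulator, the cited studies replace this naive sampling with GA, PSO, or BBO-DE, but the objective-function formulation is the same idea.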

6.6 Finance

Metaheuristic algorithms are among the most promising techniques for solving the different types of problems that occur in finance and banking activities. In the following points, we introduce a list of the most familiar problems and how metaheuristics have been used to solve them.

Portfolio optimization and selection problem (POSP) In this problem, investors seek to assign optimal weights to the assets of a portfolio to achieve minimal investment risk. In [250], the authors provide a survey, with examples, of solving the POSP using metaheuristics; furthermore, the three metaheuristic algorithms GA, TS, and SA are used to solve the POSP. The authors of [251] use the PSO algorithm to solve the cardinality-constrained version of the POSP.
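A toy sketch of the minimization at the heart of the POSP: with a hypothetical two-asset covariance matrix, even naive random search over the weight simplex pushes portfolio risk below the equal-weight baseline (the cited studies use far stronger metaheuristics and richer constraints):

```python
import random

def portfolio_risk(w, cov):
    """Portfolio variance w' * Sigma * w for weights w and covariance Sigma."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))

def random_simplex_search(cov, iters=5000, seed=1):
    """Toy stochastic search over the weight simplex (weights >= 0, sum = 1),
    starting from the equal-weight portfolio."""
    rng = random.Random(seed)
    n = len(cov)
    best = [1.0 / n] * n
    best_risk = portfolio_risk(best, cov)
    for _ in range(iters):
        raw = [rng.random() for _ in range(n)]
        s = sum(raw)
        w = [r / s for r in raw]          # normalize onto the simplex
        risk = portfolio_risk(w, cov)
        if risk < best_risk:
            best, best_risk = w, risk
    return best, best_risk

# Hypothetical covariance: two uncorrelated assets with variances 0.04 and 0.09.
cov = [[0.04, 0.00],
       [0.00, 0.09]]
weights, risk = random_simplex_search(cov)
```

For this instance the true minimum-variance portfolio puts about 69% in the lower-variance asset, and the random search closes most of the gap from the equal-weight risk of 0.0325.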

Index tracking problem (ITP) The ITP arises from a passive trading strategy that depends mainly on two processes (buy and hold). In the ITP, we want to simulate the behavior of a stock market index using a minimal number of stocks; in other words, the ITP is developed to passively replicate the performance of the index. For a specific German index, the authors of [252] use SA to minimize tracking errors, and combinatorial search has been hybridized with DE for solving the ITP. The authors of [253] compare the performance of the GA with quadratic programming and propose a solution approach for tracking the returns of the FTSE100 index using its data. Finally, in [254] the authors conducted a set of experiments to solve a special type of ITP and observed an improvement in index tracking.

Options pricing problem (OPP) Speculative activities are among the most familiar tasks in financial markets, and options are one of the tools for such activities. Due to the fast dynamics of the financial market, it is difficult to estimate the price of an option using traditional methods, so metaheuristic algorithms can be a promising choice in this case. In order to find parameters that achieve consistency between the model and market prices, Gilli and Schumann [255] use PSO and DE to study the option-pricing calibration problem. Finally, the authors of [256] have shown that option pricing can be enhanced compared with the traditional binomial lattice method when the ACO algorithm is used.

6.7 Telecommunications Networks

The recent need to develop complex and large computer systems leads to an urgent demand for designing high-quality, more extensive network design and routing techniques and for optimally solving problems in this area. Most problems in telecommunications are complex and hard to solve using traditional techniques and approximate algorithms, so there is an urgent need to employ metaheuristic algorithms for network design and routing problems. A set of nodes (i.e., computers, databases, equipment, or radio transmitters) can be connected by transmission links (i.e., optical fiber, copper cable, radio, or satellite links) to construct a communication network. Under constraints such as reliability, throughput, delay, and link capacity, we seek a minimum-cost configuration as the objective function for these networks, and many subproblems appear, such as the number of nodes, the number of routing paths, frequency assignment, and capacity installation. A large number of studies use metaheuristics to solve telecommunications problems: for example, Kim et al. [257] employ an SA algorithm in a mobile radio system to allocate the nominal cells of channels. To minimize installation cost and maximize traffic, the authors in [258] use the tabu search algorithm with randomized greedy procedures to find the locations of the base stations of a universal mobile-based communication system; specifically, good approximate solutions for large and medium-sized instances are obtained by the randomized greedy procedures, and these solutions are then improved by tabu search. Finally, a new metaheuristic algorithm based on the genetic algorithm and the ant system was proposed to achieve better, more efficient solutions for real-life transportation network design problems in a large real network (the city of Winnipeg, Canada) [259].

6.8 Food Manufacturing Industry

Recently, metaheuristics have become one of the most widely used efficient decision-making techniques for solving problems in different disciplines. In this section, we present brief information about using metaheuristics in one of these disciplines: the food manufacturing industry. Specifically, metaheuristics can be applied in many food processes, such as thermal drying, fermentation, and distillation. In [260], the authors develop a new hybrid method based on the artificial bee colony (ABC) and the record-to-record travel (RRT) algorithm for optimizing traceability in the food industry; the proposed method solves the batch dispersion manufacturing problem and provides the optimal minimal solution. The hybrid RRT-ABC was used in real-world experiments in the French food industry (that is, sausage manufacturing) and obtained high-performance results compared with traditional methods. The artificial bee colony algorithm (ABCA) was used to develop a delivery route optimization method that achieves fresh food distribution without decreasing the quality of the food [261]. Finally, in [262], simulated annealing (SA) is hybridized with the virus colony search algorithm (VCS) to improve the quality of a sustainable closed-loop supply chain network (CLSCN) design in the olive industry.

7 Open Issues and Challenges

Despite the good features and abilities of the MAs in solving a wide range of problems, they, like other techniques, suffer from a set of problems. In the following points, we refer to these problems.

The stochastic nature and near-optimal solutions As we know, generating an optimal solution is one of the main features of deterministic algorithms such as the simplex method. In contrast, metaheuristic algorithms (being stochastic in nature) do not guarantee the optimality of the obtained solution, only an acceptable one. This is one of the significant disadvantages of MAs. It is worth mentioning that deterministic methods (unlike stochastic ones) face difficulty when dealing with highly complex problems (that is, high-dimensional and non-differentiable problems). Practically, when we choose between the two approaches, we gain one advantage and give up the other.

Scalability and expensive computational cost Practically, the MAs achieve promising results in solving problems of different natures, such as discrete, continuous, and combinatorial problems containing a large number of decision variables. However, when solving large-scale global optimization problems (LSGOP), the MAs consume an expensive amount of computation. This scalability challenge is one of the most important challenges that researchers must consider in the future, due to the great growth in the size of optimization problems when dealing with high-dimensional machine learning and large-scale engineering problems. In this context, many strategies have been developed by researchers to address this problem, such as parallelization, approximation and surrogate modelling, hybridization with local search and memetic algorithms, decomposing big problems into sub-problems, and benefiting from sampling techniques.

The weakness of theoretical and mathematical analysis In most sciences, such as chemistry, biology, and physics, the mathematical analysis of a method can be carried out accurately to specify how much the method costs computationally. Unlike those sciences, in metaheuristics we encounter a challenge in computing the exact computational cost of an algorithm; the reason is that, from a mathematical perspective, it is difficult to analyze why metaheuristic algorithms are so successful. Researchers also need to pay attention to the convergence analysis of many metaheuristic optimization algorithms. Finally, researchers need to develop innovative methods that allow them to easily analyze and compute an algorithm's cost when the algorithm is modified or scaled up.

Intensification and diversification trade-off An algorithm's degree of effectiveness is measured by its ability to transition smoothly between the exploration stage (that is, exploring the feasible area as much as possible) and the exploitation stage (that is, taking good steps towards the region of the optimal solution). Achieving a high degree of balance between intensification and diversification is one of the most important challenges in most MAs. Although some algorithms achieve an acceptable degree of trade-off between exploration and exploitation, the vast majority of MAs still need to address this challenge by scoring a high level of global diversification and local intensification [263].
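One classical mechanism for managing this trade-off is the temperature schedule of simulated annealing, sketched below on a one-dimensional toy objective: early high-temperature iterations often accept worse moves (diversification), while late low-temperature iterations accept almost only improvements (intensification):

```python
import math, random

def simulated_annealing(f, x0, step, iters=10000, t0=1.0, seed=0):
    """Minimize f over the reals with a linear cooling schedule.
    High temperature -> exploration; low temperature -> exploitation."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, best_f = x, fx
    for k in range(iters):
        t = t0 * (1 - k / iters) + 1e-9        # cool linearly towards 0
        cand = x + rng.uniform(-step, step)
        fc = f(cand)
        # Metropolis rule: always accept improvements; accept worse moves
        # with probability exp(-delta / t), which shrinks as t -> 0.
        if fc < fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
        if fx < best_f:
            best, best_f = x, fx
    return best, best_f
```

The cooling schedule is exactly the knob that encodes the diversification-to-intensification transition discussed above; choosing it well is itself a tuning problem.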

Large-scale real-world problem formulation Nowadays, the vast majority of problems in recent fields, such as data science and big data analysis, are considered large-scale real-world problems (LSRP) due to the large number of problem components and dimensions. Formulating an LSRP is one of the crucial issues in metaheuristic algorithms. The issue comes from the large number of optimization (decision) variables included in the problem, how these variables interact, how strongly the variables or components are related to each other, and what effect one variable has on the others. It is also worth mentioning that the large number of variables translates into the problem size, which affects the computational cost of any algorithm that deals with the problem.

The limitations of the No-Free-Lunch theorem One of the most fundamental theorems in the field of optimization is the No-Free-Lunch (NFL) theorem [264], which states that there is no universal optimizer for all kinds of problems; that is, an algorithm may do better on some kinds of problems and no better on others. We cannot fully generalize this theorem: it has been proved for single-objective optimization, but it does not yet hold for problems with continuous and infinite domains or for multi-objective optimization [265, 266]. In this context, researchers in the field of metaheuristics must still answer how the NFL applies along these several dimensions.

Comparing different algorithms Comparing similar algorithms through the absolute value of the objective function or the number of function evaluations is a possible task. On the other hand, we encounter a problem in comparing different algorithms with different objectives through formal theoretical analysis; practically, no fair or rigorous comparisons exist in this field [267].

Parameter tuning and control An algorithm's parameters play the most vital role in determining the performance of any optimization algorithm. The algorithm's designer can change the performance of the algorithm through parameter tuning; specifically, poor tuning leads to poor performance, and the opposite is true. As mentioned in [268], it is practically not an easy task to tune an algorithm's parameters and control them by changing their values. Another point we must mention is that, even for well-tuned parameters, there is no clear reason to keep their values unchanged during the optimization process. Until now, parameter tuning has been implemented by applying parametric tests, while parameter control can be implemented stochastically, with parameter values picked randomly within a prespecified range. Therefore, there is an urgent need for automatic, systematic methods to control and tune the parameters. The authors in [269] and [270] proposed a self-tuning method as an attempt to address the problems of parameter tuning and control, but its computational cost is still expensive. Based on the previous notes, there is an urgent need to develop an automatic method that adaptively changes the parameters while having little effect on the computational cost of the algorithm.

The lack of big data applicability Dealing with big data and developing big data algorithms has become an urgent demand today, as data volumes have increased dramatically with the help of automatic data collection methods. In this context, we notice that there is little focus in the current literature on applying metaheuristics to big data, and few studies on how to benefit from applying metaheuristics alongside big data algorithms. Consequently, in this review, we encourage researchers to spend more effort on developing new, reliable methodologies and algorithms to solve big data problems with the help of metaheuristics.

The lack of machine learning and metaheuristics combination Machine learning (ML) is one of the most powerful and influential methods for decision making and prediction tasks, and very helpful results have recently been achieved by ML techniques. Researchers in the metaheuristics field must therefore pay attention to methods that benefit from ML techniques in optimizing the work of current MAs or in developing new ML-based metaheuristic algorithms. The following points may be helpful and promising in this regard.

  • Using the new advances in reinforcement, ensemble, and deep learning to automatically choose which specific problems should be handled by existing and new optimization algorithms [271].

  • Benefiting from the capabilities of machine learning techniques in the optimization field by automatically generating models to represent optimization problems, adapting analysis techniques for analyzing the search space, and tackling large, complex problems by decomposing them into smaller ones [272]. From another perspective, ML capabilities can be used to automatically configure algorithms by allowing ML algorithms to choose appropriate values for an algorithm's operators, which is especially relevant for metaheuristic algorithms due to their large number of parameters [273].

Shortening the gap between metaheuristic algorithms and problem domain knowledge Treating the problem as a black box is a double-edged sword: it can be considered a strength of metaheuristic algorithms over other algorithms, but it is also a challenge. Considering and integrating the domain knowledge of the problem into the designed algorithm will dramatically increase its performance. For example, a problem-oriented research direction can be pursued by designing the algorithm's operators and search mechanisms based on the characteristics of the problem, which can also reduce the complexity of the algorithm by exploiting the optimality conditions of the problem being considered [273].

In summary, we make the following observations:

  • Apply the MAs on parallel computing and combine metaheuristic techniques with modern parallel computing technologies to generate a powerful method matched to the future generation of computing.

  • Exploit the benefits of artificial intelligence and machine learning techniques to provide new algorithms that have the ability to automatically adjust the parameters and automatically analyze the algorithms.

  • Develop new methods directed towards strengthening the ability of MAs to address large-scale global optimization (LSGO) problems.

  • A great effort must be devoted to hybridization, allowing algorithms to combine the strengths of many algorithms, and to generating intelligent techniques that can give researchers insight into which algorithms are best suited to be hybridized together.

7.1 Emerging Technologies

After discussing the open issues and challenges, we see that there is much future work in the field of metaheuristics, and a set of guidelines must be declared to help future researchers in the field address these challenges. In this section, guidelines for diving deeper into potential future research directions are introduced. Specifically, we concentrate on two emerging technologies, machine learning and quantum computing, and how they can enhance the optimization process.

7.1.1 Quantum-Inspired Metaheuristics

Metaheuristics can be employed to obtain a globally optimal solution for a wide range of problems in different computational settings. These methods can benefit from the concepts of quantum computing (QC) to enhance the solutions obtained. Hybridizing quantum computing with metaheuristics produces quantum-inspired metaheuristic algorithms (QIMAs), which can be considered an alternative to classical optimization methods [274]. The main idea behind QIMAs is to use quantum computing principles together with metaheuristics to boost the performance of classical optimization algorithms, scoring higher-performing results than traditional metaheuristic algorithms. Specifically, the use of QC in metaheuristics can accelerate convergence, enhance exploration, enhance exploitation, and provide a good balance between these two capabilities. The most promising merit affecting the performance of the algorithm is the parallel processing feature of QC [275]. Finally, QIMAs can be used in different disciplines, such as engineering and science.

7.1.2 Intelligent Optimization

In this section, we introduce a new type of optimization that is considered one of the most promising topics in the future of the metaheuristic field. Intelligent optimization (IO) was developed as an attempt to intelligently adjust the set of inputs and their values to achieve optimal output(s). In other words, IO incurs minimal cost in determining and choosing the optimal solution among all possible solutions of the problem. The importance of IO increases dramatically when solving complex and NP-hard problems, for which selecting the optimal solution through exhaustive search is impossible or practically difficult. In addition, IO can be an important remedy for the time consumption of many optimization algorithms. IO can be used in all steps of the optimization process, such as defining the problem and handling and formulating the objective function(s) and constraints.

7.1.3 Hybrid Metaheuristics and Mathematical Programming

In recent years, hybrid optimization algorithms have achieved promising results compared with classical optimization algorithms. The main aim of hybrid metaheuristics is to provide reliable, high-performance solutions for large and complex problems. One of the most widely used combinations is hybridizing metaheuristics with mathematical programming approaches. This combination increases the quality of the solution, as it benefits from both methods in determining an exact solution in a reasonable amount of time. The following points define the mathematical programming approaches that can be used with metaheuristics to increase the quality of the obtained solutions [276].

  • Enumerative algorithms: in this approach we can use one of the well-known tree search methods, such as dynamic programming and branch and bound. These methods follow the divide-and-conquer strategy, where the search space is divided into smaller search spaces and the optimization is applied separately in each subspace. By applying this strategy, the quality of the solution increases and the time consumed decreases.

  • Decomposition and relaxation methods: in this approach we can decompose the large problem using Benders decomposition or apply Lagrangian relaxation to convert the problem into smaller problems.

  • Pricing and cutting-plane algorithms: in this approach, we prune the search using results from polyhedral combinatorics.
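As a tiny illustration of the enumerative family, the sketch below applies branch and bound to a hypothetical 0/1 knapsack instance: each node branches on taking or skipping an item, and a node is pruned whenever an optimistic fractional bound cannot beat the incumbent solution:

```python
def knapsack_bb(values, weights, capacity):
    """Branch and bound for the 0/1 knapsack: branch on take/skip per
    item (in decreasing value-density order), prune nodes whose
    fractional-relaxation bound cannot beat the best solution so far."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    best = 0

    def bound(k, value, room):
        # Optimistic bound: greedily fill the remaining room, allowing a
        # fractional piece of the first item that no longer fits.
        for i in order[k:]:
            if weights[i] <= room:
                room -= weights[i]
                value += values[i]
            else:
                return value + values[i] * room / weights[i]
        return value

    def branch(k, value, room):
        nonlocal best
        best = max(best, value)
        if k == len(order) or bound(k, value, room) <= best:
            return                               # leaf reached or pruned
        i = order[k]
        if weights[i] <= room:                   # branch 1: take item i
            branch(k + 1, value + values[i], room - weights[i])
        branch(k + 1, value, room)               # branch 2: skip item i

    branch(0, 0, capacity)
    return best
```

In hybrid schemes, a metaheuristic typically supplies a strong incumbent quickly, which tightens the pruning of exactly this kind of tree search.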

8 Conclusion

In this review, a comprehensive study of metaheuristic algorithms is introduced. It defines the concept of optimization, traces the appearance of the term metaheuristic, explains the features of MAs relative to other techniques, and presents different taxonomies of MAs according to aspects such as inspiration source, number of search agents, population updating mechanisms, and number of parameters. The metrics used in the performance evaluation of the algorithms are also studied. A great effort is made to clarify the optimization problem in detail, concentrating on different classification techniques, and the study moreover reviews the use of metaheuristics in different application areas, such as engineering design problems, NP-hard problems, medical science, and robotics. Finally, we introduce some of the issues that exist in the MAs literature and the future directions of this important field.