1 Introduction

Optimization is the process of improving on a current solution. In engineering design, optimization has historically been performed manually, with designers using intuition to produce solutions that perform better than the initial starting point. It has now become commonplace, however, to couple numerical optimization algorithms with computational analysis to provide a robust engineering optimization approach; an example is the coupling of iterative computational fluid dynamics (CFD) methods to optimization algorithms to produce an aerodynamic shape optimization (ASO) process (Jameson et al. 1998; Martins et al. 2004; Hicken and Zingg 2010; Allen and Rendall 2013). The solution to many engineering optimization problems must also be feasible; constraints on the total cost, or other physical limits on the solution, must be adhered to. Mathematically, this can be described by statement 1, a single-objective constrained optimization problem.

$$\begin{aligned} \begin{aligned}&\underset{{\mathbf {x}} \in {{\mathbf {S}}} \subset \mathfrak {R}^D}{\text {min}}&f({\mathbf {x}}) \\&\text {subject\,to}&{\mathbf {g}}({\mathbf {x}}) \le {\mathbf {0}} \\&&{\mathbf {h}}({\mathbf {x}}) = {\mathbf {0}} \\ \end{aligned} \end{aligned}$$
(1)

In statement 1, \({\mathbf {x}}\) is the solution vector \([x^1,x^2,\dots ,x^D]^T\) where each element of the vector is a design variable; \(f({\mathbf {x}})\) is the value of the objective function for the given solution vector; \({\mathbf {g}}({\mathbf {x}})\) represents inequality constraints and \({\mathbf {h}}({\mathbf {x}})\) represents equality constraints; \({{\mathbf {S}}}\) is the bounded region of \(\mathfrak {R}^D\) where the solution must lie, which has an upper bound in the kth dimension, \(U^k\), and a lower bound, \(L^k\), where \(k=\{1,2,\dots ,D\}\).

The choice of optimization algorithm for solving statement 1 is often driven by the degree of multimodality present in the problem, with global search algorithms being popular for multimodal problems. Many global search algorithms are agent-based, using a set of agents that evolve and move by various mechanisms (often inspired by nature) to provide robust design space interrogation. To handle constraints, however, an ad-hoc method is often added onto the optimization algorithm, which usually alters either the optimization problem or the algorithm itself. Many state-of-the-art constraint handling methods are not compatible with every agent-based algorithm, so it would be an advantage for a constraint handling method to be applicable to any agent-based optimization algorithm. A user of the framework could then wrap it around any swarm algorithm chosen and perform constrained optimization. Hence, a novel generic framework has been developed which is designed for use with any agent-based optimizer and is presented in Sect. 4 of this paper. This new framework is coupled to four different agent-based optimization algorithms and compared to other commonly employed constraint handling frameworks on purely analytical problems (Sect. 5) and on engineering benchmark problems (Sect. 6). It is also demonstrated on a constrained aerodynamic shape optimization problem (Sect. 7). The following two sections first present a brief outline of agent-based optimization and current methods for handling constraints.

2 Agent-based search algorithms

This section introduces the general agent-based search algorithm system. A system of agents that forms an agent-based search algorithm is made up of N individuals. The nth agent in the population has a location within the search space defined by a vector of design variables \({\mathbf {x}}_n=[ x_n^1, x_n^2, \dots , x_n^D ]^T\) (where D is the number of design variables), and its initial position lies within the search space bounds, hence \({\mathbf {L}} \le {\mathbf {x}}_n(0) \le {\mathbf {U}}\). The optimizers used in this work are briefly outlined below, with references provided for more in-depth discussions of the algorithms. It is important to note that the motivation of the work presented herein is not whether one swarm algorithm is superior to another, but the development and performance analysis of the new framework when coupled to a number of different optimizers. A comprehensive review and archive of swarm intelligence can be found in Engelbrecht (2005).

2.1 Particle swarm optimization (PSO)

The first algorithm considered is PSO (Kennedy and Eberhart 1995). PSO uses knowledge of the cognitive (individual) and social (population) history of the search to construct a search procedure. Equation 2 gives the velocity expression used in a basic PSO optimizer which is used to calculate the movement of an agent.

$$\begin{aligned} {\mathbf {v}}_n(t+1) = w(t) {\mathbf {v}}_n(t) + c_1 {\mathbf {r}}_{1_n}({\mathbf {p}}_n-{\mathbf {x}}_n(t)) + c_2 {\mathbf {r}}_{2_n}({{\mathbf {s}}}-{\mathbf {x}}_n(t)) \end{aligned}$$
(2)

In Eq. 2 the subscript n is the variable for the nth agent in a swarm of N agents; w is the inertia weight; \({\mathbf {p}}_n\) is the best position found by the nth agent so far during the optimization, which gives the cognitive aspect of the search; \({{\mathbf {s}}}\) is the best position found by the whole swarm so far (it should be noted that this location is often denoted as \({\mathbf {g}}\), but to avoid using the same nomenclature to represent constraints, the symbol \({{\mathbf {s}}}\) is used throughout this paper), which gives the social aspect of the search. The individuals’ and swarm’s best positions give the algorithm a memory quality that helps drive it towards the globally optimal solution. The vectors \({\mathbf {r}}_{1_n}\) and \({\mathbf {r}}_{2_n}\) are made up of D different random numbers (i.e., for each dimension, a new random number is used) that add a stochastic nature to the algorithm and are uniformly distributed between 0 and 1. The constants \(c_1\) and \(c_2\) are the cognitive and social parameters respectively, which govern the local and global search extent of the scheme. Full reviews of implementations and performance aspects of PSO are presented in Poli et al. (2007) and Clerc (2013).
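To illustrate, a minimal sketch of the velocity update in Eq. 2, followed by the usual position update, is given below; the array layout, the default values \(c_1=c_2=2.0\) and the NumPy implementation are illustrative assumptions rather than the exact settings used later in this paper.

```python
import numpy as np

def pso_velocity_update(x, v, p, s, w, c1=2.0, c2=2.0, rng=np.random.default_rng()):
    """One velocity update (Eq. 2) for a swarm of N agents in D dimensions.

    x, v, p : (N, D) arrays of positions, velocities and personal-best positions
    s       : (D,) array holding the swarm-best position
    w       : inertia weight at the current iteration
    """
    N, D = x.shape
    r1 = rng.uniform(size=(N, D))   # a fresh random number per agent and per dimension
    r2 = rng.uniform(size=(N, D))
    return w * v + c1 * r1 * (p - x) + c2 * r2 * (s - x)

# Example usage: the agents then move with the new velocity
# v = pso_velocity_update(x, v, p, s, w=0.7)
# x = x + v
```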

2.2 Gravitational search algorithm (GSA)

The second algorithm considered is the gravitational search algorithm (GSA) (Rashedi et al. 2009). GSA is an agent-based global search algorithm for unconstrained global optimization where the principles of basic Newtonian mechanics act as the basis on which the algorithm is constructed. The agents in GSA act as masses, where an agent’s mass is related to its fitness. This information is propagated through the population by global gravitational attractive forces which act as a vehicle to allow each agent in the population to have knowledge of the fitness of all other agents in the population, leading to an efficient optimization process.

The force acting on the agents is controlled by a gravitational constant, which exponentially decays from G(0) at the start of the optimization procedure by a decay constant \(\alpha\). An acceleration is then calculated from the force acting on an agent and its mass, from which the movement can be calculated.

GSA has been considered in this work to test constraint handling on a swarm optimizer that is constructed differently to PSO: in GSA there is a global transfer of data between all agents in the population. The number of computations required to evaluate the movement of the agents is slightly higher than for PSO; however, it has been shown to be an effective algorithm.
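For illustration, the sketch below follows the spirit of the mechanism described above (an exponentially decaying gravitational constant, fitness-derived masses and force-driven accelerations) for a minimization problem; details of the full algorithm, such as the shrinking K-best neighbourhood, are omitted, so this is a simplified sketch rather than a faithful reimplementation of Rashedi et al. (2009).

```python
import numpy as np

def gsa_acceleration(x, fitness, t, T, G0=100.0, alpha=20.0, eps=1e-12,
                     rng=np.random.default_rng()):
    """Illustrative GSA acceleration for N agents in D dimensions (minimization).

    x       : (N, D) agent positions
    fitness : (N,) objective values (lower is better)
    t, T    : current and maximum iteration number
    """
    N, D = x.shape
    G = G0 * np.exp(-alpha * t / T)                 # exponentially decaying gravitational constant
    best, worst = fitness.min(), fitness.max()
    m = (worst - fitness) / (worst - best + eps)    # better fitness -> larger mass
    M = m / (m.sum() + eps)                         # normalised masses
    a = np.zeros((N, D))
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            diff = x[j] - x[i]
            dist = np.linalg.norm(diff) + eps
            # random weighting of each pairwise attraction adds stochasticity
            a[i] += rng.uniform() * G * M[j] * diff / dist
    return a
```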

2.3 Hybrid gravitational search particle swarm (HGSAPSO)

Individually, both PSO and GSA are effective at performing global optimization, however, researchers have also considered the performance of a hybrid version of PSO and GSA (Mirjalili et al. 2012; Tsai et al. 2013). This hybrid GSA/PSO algorithm has no standard name in the literature so for clarity, in this paper it is referred to as hybrid-GSA-PSO (HGSAPSO) and is the third algorithm considered. The HGSAPSO algorithm merges the memory qualities of PSO with the global knowledge qualities of GSA to provide a search algorithm that is superior to both.

The two are merged by taking a weighted combination of the acceleration from PSO, \({\mathbf {a}}_{{pso}_n}\) (the last two terms of the PSO velocity expression, Eq. 2), which is given by:

$$\begin{aligned} {\mathbf {a}}_{{pso}_n}(t)=c_1 {\mathbf {r}}_{1_n}({\mathbf {p}}_n-{\mathbf {x}}_n(t)) + c_2 {\mathbf {r}}_{2_n}({{\mathbf {s}}}-{\mathbf {x}}_n(t)) \end{aligned}$$
(3)

with the acceleration from GSA, \({\mathbf {a}}_{{gsa}_n}\), to give:

$$\begin{aligned} {\mathbf {a}}_n(t)=W {\mathbf {a}}_{{gsa}_n}(t) + (1-W) {\mathbf {a}}_{{pso}_n}(t) \end{aligned}$$
(4)

where W provides tuning between the memory qualities of PSO (emphasised more for lower W) and the knowledge transfer qualities of GSA (emphasised more for higher W). The velocity and movement are calculated using the same method as GSA; hence, if \(W=0\), the PSO velocity equation is recovered with a random inertia term.
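A minimal sketch of the blending in Eq. 4 and the GSA-style movement described above is shown below; `a_gsa` and `a_pso` are assumed to be computed elsewhere (e.g. from a GSA update and Eq. 3 respectively), and the function name and signature are illustrative.

```python
import numpy as np

def hgsapso_step(x, v, a_gsa, a_pso, W=0.5, rng=np.random.default_rng()):
    """One HGSAPSO movement step blending the two accelerations (Eq. 4).

    a_gsa, a_pso : (N, D) accelerations from the GSA and PSO (Eq. 3) components
    W            : blending weight; W=0 recovers PSO with a random inertia term
    """
    a = W * a_gsa + (1.0 - W) * a_pso          # Eq. 4
    N = x.shape[0]
    v = rng.uniform(size=(N, 1)) * v + a       # GSA-style velocity with random inertia
    return x + v, v
```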

2.4 Differential evolution

Differential evolution (DE) (Storn and Price 1995) is the final algorithm considered in this work. It is a swarm intelligence algorithm built around the concepts of evolutionary mechanics and is specifically designed for continuous optimization. An in-depth review of the use of DE and its developments is given in Das and Suganthan (2011). To summarise, DE uses mutation, crossover and selection steps to create candidate solutions, child solutions and new generations. The mutation stage produces a new candidate solution to introduce variability and exploration into the algorithm; the candidate is a weighted combination of three randomly chosen agents within the swarm, where a factor F controls the mutation. The crossover stage enhances diversity in the population by combining components of the new and old solutions based on a crossover probability, CR. The new solution is accepted if its fitness is better than that of the old solution.
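The sketch below shows one generation of the classic DE/rand/1/bin variant, which matches the mutation (three randomly chosen agents scaled by F), crossover (probability CR) and greedy selection described above; the default parameter values are illustrative and not the settings used later in the paper.

```python
import numpy as np

def de_generation(x, fitness, func, F=0.5, CR=0.9, rng=np.random.default_rng()):
    """One illustrative DE/rand/1/bin generation for N agents in D dimensions (minimization)."""
    N, D = x.shape
    x_new, f_new = x.copy(), fitness.copy()
    for n in range(N):
        # Mutation: weighted combination of three distinct, randomly chosen agents (not n itself)
        r1, r2, r3 = rng.choice([i for i in range(N) if i != n], size=3, replace=False)
        mutant = x[r1] + F * (x[r2] - x[r3])
        # Crossover: mix mutant and current agent component-wise with probability CR
        mask = rng.uniform(size=D) < CR
        mask[rng.integers(D)] = True          # ensure at least one component comes from the mutant
        trial = np.where(mask, mutant, x[n])
        # Selection: accept the trial solution only if it improves the fitness
        f_trial = func(trial)
        if f_trial < fitness[n]:
            x_new[n], f_new[n] = trial, f_trial
    return x_new, f_new
```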

3 Constraint handling in agent-based search algorithms

Constraint handling is the name given to the group of approaches developed to solve constrained optimization problems using agent-based optimization algorithms. The most common methods are penalty functions, special operators and separation approaches. These three approaches are briefly outlined here; a comprehensive review of constraint handling in nature-inspired numerical optimization algorithms has been presented previously (Mezura-Montes and Coello Coello 2011).

The principle of penalty functions is that a constrained optimization problem can be transformed into an unconstrained one by incorporating the constraints into an augmented objective function. The augmented objective function can then be solved directly by the optimizer. Common approaches are the death penalty (Hu and Eberhart 2002), static penalty (Venter and Sobieszczanski-Sobieski 2003) or dynamic penalty (Parsopoulos and Vrahatis 2002). Penalty functions tend to be simple to implement and are therefore a good option for users wishing to obtain an optimized result quickly; however, good performance requires careful selection of the penalty parameters. Most penalty approaches are generally applicable to any agent-based search algorithm, which makes their performance particularly important for comparison in this work.

Special operators work on the principle that a feasible solution is better than an infeasible one, and as such manipulate the underlying search algorithm to drive the solution towards the feasible space. The first methods of this nature were based on the idea of feasible directions (Vanderplaats 1999), and these have since been developed into more sophisticated approaches (Venter and Sobieszczanski-Sobieski 2003; Sun et al. 2011; Lu and Chen 2008). An obvious prerequisite of these operators is that at least one agent must initially be feasible. This can be problematic when the feasible space is small compared to the whole design space, or if equality constraints are present, where the feasible design space is a \(D-1\) manifold, i.e., a line in two-dimensional space. Furthermore, these operators also tend to alter the natural self-organising swarm dynamics that make the algorithm effective at interrogating the search space.

Finally, separation approaches are also common. These are the antithesis of penalty functions: instead of combining the constraint violation and objective function, the two are optimized separately (Lu and Chen 2006; Liang and Suganthan 2006). These methods can generally be split into “hard feasibility rules” (binary tournament selection is an example of this, Deb 2000) and “soft feasibility rules” (the \(\alpha\)-constrained (Takahama and Sakai 2005a) and \(\epsilon\)-constrained (Takahama and Sakai 2005b) methods are examples). Care has to be taken with such methods to avoid premature convergence and to avoid introducing many user-defined tuning parameters.

Many constraint handling techniques have been proposed for coupling to agent-based search algorithms; however, not all techniques are suitable for coupling to all agent-based optimizers. GSA, for example, is particularly difficult to couple to many constraint handling techniques. The global transfer of data that occurs in GSA (due to the global attractive forces) means data can be transferred from the infeasible to the feasible region. In practice, this means that if, say, a penalty function is used, the masses of the feasible agents may only vary by a small amount due to the (possibly) very large difference in objective function that can exist between the feasible and infeasible agents. This can severely restrict the ability of GSA to locate the feasible globally optimal solution. Furthermore, GSA is an example of a swarm algorithm that uses the fitness function directly in its update scheme. This makes GSA (and its derivatives such as HGSAPSO) difficult to couple to both traditional and state-of-the-art constraint handling frameworks without compromising the performance of either GSA or the constraint handling framework.

GSA is used here as an example but this argument can be made for other agent-based algorithms too, hence it would be advantageous for a constraint-handling framework to be compatible with any agent-based algorithm. A user of the framework could then wrap it around any swarm algorithm chosen and perform constrained optimization. This is the point that is dealt with in this paper and has led to the development of a generic constraint handling framework that is suitable for coupling to any agent-based search algorithm.

4 Proposed constraint handling framework

The development of the framework proposed in this work is driven by the requirement to have a general constraint handling method that can be used for any agent-based optimization algorithm. Ideally, modifications to the fundamentals of a given optimizer should be avoided as this can alter the underlying self-organising swarm intelligence of the algorithm, but instead, the constraint handling framework should be able to be added on to an existing optimizer. The approach that is presented here adheres to this requirement and works with the general agent-based optimizer system. Discussion points that are considered in the framework’s development process are laid out below, followed by a formal explanation of the constraint handling framework itself.

4.1 Development points for new framework

The requirement highlighted in this paper is for a generic constraint handling framework that can be used with any agent-based optimizer, i.e., a user should be able to add this framework onto an already existing unconstrained optimizer. To achieve this, it is proposed that the total population of N agents be split into two independent sub-swarms, where one sub-swarm contains only feasible agents and the other only infeasible agents. The constraint violation, \(\phi ({\mathbf {x}}_n)\), is defined as follows:

$$\begin{aligned} \phi ({\mathbf {x}}_n)=\sum _{k=1}^{G} \langle g_k({\mathbf {x}}_n) \rangle + \sum _{k=1}^{H} |h_k({\mathbf {x}}_n)| \end{aligned}$$
(5)

where G and H are the number of inequality and equality constraints respectively, and \(\langle g_k({\mathbf {x}}_n) \rangle\) is the violation of the kth inequality constraint, i.e., \(\langle g_k({\mathbf {x}}_n) \rangle = \max (0, g_k({\mathbf {x}}_n))\). An infeasible agent is defined as one where \(\phi ({\mathbf {x}}_n)>0\). At any iteration during the search there will be \(N_f\) feasible and \(N_{\times }\) infeasible agents (so Eq. 6 must hold). It should be noted, however, that \(N_f\) and \(N_{\times }\) may change at every iteration as agents migrate from the feasible to the infeasible space and vice versa.

$$\begin{aligned} N=N_f + N_{\times } \end{aligned}$$
(6)
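As a sketch, Eq. 5 and the population split implied by Eq. 6 can be written as follows, assuming the constraints are supplied as lists of callables (an illustrative choice rather than part of the original formulation):

```python
import numpy as np

def constraint_violation(x, ineq_constraints, eq_constraints):
    """Total constraint violation phi(x) per Eq. 5, with <g_k> = max(0, g_k)."""
    phi = sum(max(0.0, g(x)) for g in ineq_constraints)
    phi += sum(abs(h(x)) for h in eq_constraints)
    return phi

def split_swarm(X, ineq_constraints, eq_constraints):
    """Split the N agents into feasible and infeasible sub-swarms (Eq. 6)."""
    phi = np.array([constraint_violation(x, ineq_constraints, eq_constraints) for x in X])
    feasible = phi <= 0.0                 # boolean mask; N = N_f + N_x
    return X[feasible], X[~feasible], phi
```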

By splitting the population into sub-swarms, the constrained optimization problem can also be split into two separate problems. Hence, the framework uses the problem separation approach where the feasible agents optimize the value of the objective function and the infeasible agents optimize the constraint violation, as shown in Eq. 7 where \(\zeta\) represents the objective function of an agent; this formulation necessitates the use of sub-swarms.

$$\begin{aligned} \zeta ({\mathbf {x}}_n) = \left\{ \begin{array}{ll} f({\mathbf {x}}_n) &{} \quad \text {if }\phi ({\mathbf {x}}_n)\le 0\\ \phi ({\mathbf {x}}_n) &{} \quad \text {otherwise} \\ \end{array} \right. \end{aligned}$$
(7)

The solution to the infeasible swarm's problem is located at every point in the feasible region (where \(\phi = 0\)); hence this acts as a mechanism firstly to find the feasible space and secondly to keep agents there. The solution that the feasible swarm is searching for is the global minimum of the feasible region, which is the solution to the constrained optimization problem given by statement 1, assuming a feasible solution exists.

By considering sub-swarms, separate agent-based search algorithms can be used for each of the two sub-problems. Hence, a user's existing unconstrained optimizer can be used to solve the feasible part of Eq. 7 while being independent of the infeasible part. This avoids unwanted dilution of the feasible data by the infeasible data and preserves the performance qualities of the user's swarm optimizer. To solve Eq. 7 for the infeasible sub-swarm, any agent-based search algorithm can be used; however, for this constraint handling framework a simple PSO is suggested, to take advantage of its memory components. The \({\mathbf {p}}_n\) and \({{\mathbf {s}}}\) positions, which are the individual and global best positions found so far, should be feasible whenever a feasible position has been found, so that a velocity component points towards the feasible region. This can be done using a binary tournament selection procedure. While a simple PSO movement procedure is suggested, a user could in fact use any swarm algorithm for this unconstrained sub-problem too, which adds further flexibility to the framework.

4.2 Separation-sub-swarm (3S) framework

The constraint handling framework described in this paper is hereafter called separation-sub-swarm (3S), and is fully outlined in this section for completeness. At any iteration, t, during the optimization there is a constant number, N, of agents in the whole swarm, where an individual agent is the nth agent of the whole swarm. The objective function and constraint values of each agent must be calculated, and from these values the population can be split into a sub-swarm containing \(N_f\) feasible agents (all constraints are satisfied) and another containing \(N_{\times }\) infeasible agents (at least one constraint is violated). An individual agent in the feasible sub-swarm is the ith feasible agent, whereas an individual in the infeasible sub-swarm is the jth infeasible agent.

The objective function of the ith agent in the feasible swarm is:

$$\begin{aligned} \zeta ({\mathbf {x}}_i) = f({\mathbf {x}}_i) \end{aligned}$$
(8)

and the objective function of the jth agent in the infeasible swarm is:

$$\begin{aligned} \zeta ({\mathbf {x}}_j) = \phi ({\mathbf {x}}_j) \end{aligned}$$
(9)

The movement procedure of the infeasible agents in the population is a simple PSO, so all agents in the population (including the feasible ones) must have their best position, \({\mathbf {p}}_n\), updated. The swarm's best position, \({{\mathbf {s}}}\), also needs to be updated. A domination procedure based on tournament selection is used, where the domination operator \(\prec\) determines the domination between two locations in the search space, \({\mathbf {x}}_a\) and \({\mathbf {x}}_b\). The domination operator works as follows:

$$\begin{aligned} {\mathbf {x}}_a \prec {\mathbf {x}}_b \Leftrightarrow \left\{ \begin{array}{l} f({\mathbf {x}}_b)<f({\mathbf {x}}_a)\;\text { and }\;\phi ({\mathbf {x}}_a),\phi ({\mathbf {x}}_b)\le 0 \\ \phi ({\mathbf {x}}_b)\le 0\; \text { and } \;\phi ({\mathbf {x}}_a)>0 \\ \phi ({\mathbf {x}}_b)<\phi ({\mathbf {x}}_a) \;\text { and }\; \phi ({\mathbf {x}}_a),\phi ({\mathbf {x}}_b)>0 \\ \end{array} \right. \end{aligned}$$
(10)

and

$$\begin{aligned} {\mathbf {x}}_b \mapsto {\mathbf {x}}_a \Leftrightarrow {\mathbf {x}}_a \prec {\mathbf {x}}_b \end{aligned}$$
(11)

Hence, if \({\mathbf {x}}_a\) is dominated by \({\mathbf {x}}_b\) then \({\mathbf {x}}_a\) is updated (replaced) with \({\mathbf {x}}_b\); the tests \({\mathbf {p}}_n\prec {\mathbf {x}}_n\) and \({\mathbf {s}}\prec {\mathbf {x}}_n\) are used to update \({\mathbf {p}}_n\) and \({{\mathbf {s}}}\) respectively. The velocity and movement of the jth infeasible agent are then calculated as:

$$\begin{aligned} {\mathbf {v}}_{j}(t+1)= & {} {\mathbf {r}}_{0_j} {\mathbf {v}}_{j}(t)+ c_1 {\mathbf {r}}_{1_j}({\mathbf {p}}_j-{\mathbf {x}}_j(t)) + c_2 {\mathbf {r}}_{2_j}({\mathbf {s}}-{\mathbf {x}}_j(t)) \end{aligned}$$
(12)
$$\begin{aligned} {\mathbf {x}}_j(t+1)= & {} {\mathbf {x}}_j(t) + {\mathbf {v}}_j(t+1) \end{aligned}$$
(13)

where \({\mathbf {r}}_{0_j}\) is a vector of D random numbers, analogous to \({\mathbf {r}}_{1_j}\) and \({\mathbf {r}}_{2_j}\). The pseudocode of the 3S framework is given in Algorithm 1, where the highlighted line of the code is where the user would add their selected agent-based search algorithm for updating the feasible swarm. This demonstrates that the 3S framework is compatible with any agent-based optimizer, with no modification required to the mechanics of a user's swarm algorithm.

Algorithm 1 Pseudocode of the separation-sub-swarm (3S) framework
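To make the flow of Algorithm 1 concrete, a minimal Python sketch of one 3S iteration is given below. The domination rule implements the feasibility-first ordering of Eq. 10, the infeasible sub-swarm moves by the simple PSO of Eqs. 12 and 13, and `user_swarm_update` stands in for whichever unconstrained agent-based optimizer the user attaches at the highlighted line of Algorithm 1; the function names, signatures and parameter values are illustrative assumptions rather than a definitive implementation.

```python
import numpy as np

def dominates(fa, phia, fb, phib):
    """Return True if candidate b (fb, phib) dominates incumbent a (fa, phia), per Eq. 10."""
    if phia <= 0 and phib <= 0:          # both feasible: compare objective values
        return fb < fa
    if phia > 0 and phib <= 0:           # a feasible candidate beats an infeasible incumbent
        return True
    if phia > 0 and phib > 0:            # both infeasible: compare constraint violations
        return phib < phia
    return False                         # feasible incumbent, infeasible candidate

def s3_iteration(X, V, P, Pf, Pphi, s, sf, sphi, f, phi, user_swarm_update,
                 c1=2.0, c2=2.0, rng=np.random.default_rng()):
    """One illustrative iteration of the 3S framework for N agents in D dimensions."""
    N, D = X.shape
    fx = np.array([f(x) for x in X])          # objective value of every agent
    px = np.array([phi(x) for x in X])        # constraint violation (Eq. 5) of every agent

    # Update personal bests p_n and the swarm best s with the domination operator (Eqs. 10-11)
    for n in range(N):
        if dominates(Pf[n], Pphi[n], fx[n], px[n]):
            P[n], Pf[n], Pphi[n] = X[n].copy(), fx[n], px[n]
        if dominates(sf, sphi, fx[n], px[n]):
            s, sf, sphi = X[n].copy(), fx[n], px[n]

    feasible = px <= 0.0                      # split the population (Eq. 6)

    # Feasible sub-swarm: the user's unconstrained optimizer acts on the objective only
    if feasible.any():
        X[feasible], V[feasible] = user_swarm_update(X[feasible], V[feasible], fx[feasible])

    # Infeasible sub-swarm: simple PSO on the constraint violation (Eqs. 12-13)
    for j in np.where(~feasible)[0]:
        r0, r1, r2 = (rng.uniform(size=D) for _ in range(3))
        V[j] = r0 * V[j] + c1 * r1 * (P[j] - X[j]) + c2 * r2 * (s - X[j])
        X[j] = X[j] + V[j]

    return X, V, P, Pf, Pphi, s, sf, sphi
```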

When implementing the 3S framework to handle constraints with any swarm optimizer, tracking of the swarm’s best position means that the user can track the feasibility of the optimum design found so far. At the end of the optimization, if no feasible solution has been found then the value stored in \({\mathbf {s}}\) is the closest solution to the feasible region that could be found.

5 Constrained analytical optimization

The 3S framework for handling constraints when using agent-based optimization algorithms is outlined fully in Algorithm 1. This framework is compatible with any agent-based algorithm; the user need only add the framework to their optimizer to be able to handle constraints. To demonstrate this, constrained analytical optimization is performed in this section for four different swarm algorithms: PSO, GSA, HGSAPSO and DE. The simple PSO with inertia is used to demonstrate the effect of using 3S with a PSO-based algorithm; it should be noted that many other PSO variants are also available, though the goal of this work is not to prove whether one agent-based algorithm is superior to another, but to prove that the newly designed 3S framework is suitable for use with any agent-based algorithm.

5.1 Methods used for comparison

To compare the performance of 3S, other constraint handling frameworks that are also suitable for use with any swarm optimizer are tested. These methods are a death penalty, a static penalty, a dynamic penalty and a feasible directions method. The formulations of these methods are detailed below.

The death penalty, which is the simplest method implemented, randomly reinitialises an infeasible agent.

For the static penalty, the formulation of the augmented objective function is given as:

$$\begin{aligned} \zeta ({\mathbf {x}}_n) = f({\mathbf {x}}_n) + \theta \phi ({\mathbf {x}}_n) \end{aligned}$$
(14)

where \(\theta\) is the static penalty factor. The penalty factor was chosen by comparing three orders of magnitude (\(\theta =1,10,100\)) on the G7 problem using PSO (details of the runs are given later in Sect. 5.2). When \(\theta =1\), not one of the 25 runs was feasible, indicating that this factor is not large enough to create a penalty sufficient to find a feasible solution. When \(\theta =10\) or \(\theta =100\) the feasibility rate was 100%, but the standard deviation for \(\theta =10\) was approximately half that for \(\theta =100\), indicating that too high a penalty factor weights feasibility over exploitation and therefore inhibits a good optimum being found. Hence, a penalty factor of \(\theta =10\) was chosen to give a reasonable balance between feasibility and exploitation.

The more complicated dynamic penalty formulation (Parsopoulos and Vrahatis 2002) is:

$$\begin{aligned} \zeta ({\mathbf {x}}_n) = f({\mathbf {x}}_n) + \kappa \left( \sum _{k=1}^{G} \theta _k \langle g_k({\mathbf {x}}_n) \rangle ^{\gamma _k} + \sum _{k=1}^{H} \theta _k |h_k({\mathbf {x}}_n)|^{\gamma _k} \right) \end{aligned}$$
(15)

where \(\kappa\) is the dynamic penalty which is given by \(t\sqrt{t}\) at the tth iteration; \(\theta _k=10\) if \(\langle g_k\rangle ,|h_k|<0.001\), else \(\theta _k=20\) if \(\langle g_k\rangle ,|h_k|<0.1\), else \(\theta _k=100\) if \(\langle g_k\rangle ,|h_k|<1.0\), otherwise \(\theta _k=300\); \(\gamma _k=1.0\) if \(\langle g_k\rangle ,|h_k|<1\), otherwise \(\gamma _k=2\).
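A sketch of the augmented objective of Eq. 15, including the staged multipliers and exponents listed above, is given below; passing the constraints as lists of callables is an illustrative choice.

```python
import math

def _theta(v):
    """Staged penalty multiplier for a single violation magnitude v."""
    if v < 0.001:
        return 10.0
    if v < 0.1:
        return 20.0
    if v < 1.0:
        return 100.0
    return 300.0

def dynamic_penalty(x, f, ineq_constraints, eq_constraints, t):
    """Augmented objective of Eq. 15 with kappa = t*sqrt(t) at iteration t."""
    kappa = t * math.sqrt(t)
    penalty = 0.0
    for g in ineq_constraints:
        v = max(0.0, g(x))                              # <g_k>: violated amount, else zero
        penalty += _theta(v) * v ** (1.0 if v < 1.0 else 2.0)
    for h in eq_constraints:
        v = abs(h(x))
        penalty += _theta(v) * v ** (1.0 if v < 1.0 else 2.0)
    return f(x) + kappa * penalty
```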

The feasible directions approach used here is based on an approach developed for PSO (Venter and Sobieszczanski-Sobieski 2003), where the velocity vector of an infeasible agent is manipulated in an attempt to force it towards the feasible region. Hence, for the jth infeasible particle, the velocity vector is constructed as:

$$\begin{aligned} {\mathbf {v}}_j(t+1) = {\mathbf {r}}_{0_j} ({\mathbf {s}}(t)-{\mathbf {x}}_j(t)) \end{aligned}$$
(16)

where \({\mathbf {s}}\) is the swarm's best position found so far, which is feasible if possible. If this position is feasible then the velocity vector will always point towards the feasible region. As before, the swarm's best position is updated using the domination operator, hence \({\mathbf {x}}_n \mapsto {\mathbf {s}} \Leftrightarrow {\mathbf {s}} \prec {\mathbf {x}}_n\).
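As an illustration, the velocity manipulation of Eq. 16 can be sketched as follows; the subsequent position update by simple addition of the velocity is an assumption carried over from the other swarm updates.

```python
import numpy as np

def feasible_direction_velocity(x_j, s, rng=np.random.default_rng()):
    """Velocity of an infeasible agent per Eq. 16: point towards the swarm best s."""
    r0 = rng.uniform(size=x_j.shape)      # per-dimension random scaling
    return r0 * (s - x_j)

# Example usage for the jth infeasible agent:
# v_j = feasible_direction_velocity(x_j, s)
# x_j = x_j + v_j
```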

5.2 Run details

The parameter values used for GSA, PSO, HGSAPSO and DE are given in Table 1, where N is the number of agents, T is the maximum number of iterations, G(0) is the initial gravitational constant, \(\alpha\) is the gravitational decay constant, \(c_1\) is the cognitive parameter, \(c_2\) is the social parameter, w is the inertia factor, and W controls the relative influence of GSA and PSO in HGSAPSO. These values were chosen because they reflect parameter values found in the literature and gave all constraint handling frameworks good performance. If any agent exits the bounds of the design space, i.e., if \({\mathbf {x}}_i \notin {\mathbf {S}}\) (as given in statement 1), then it is reinitialised at its last position. The stopping criterion is reaching the maximum number of function evaluations, \(nf_{max}\). This is kept constant through all the tests to ensure a fair comparison between the different constraint handling techniques.

Table 1 Parameter values used in GSA, PSO, HGSAPSO and DE

The CEC2006 constrained analytical function suite (Liang et al. 2013) is used in this paper to test the performance of the constraint handling techniques. The suite contains 24 test cases, all minimization problems, with various numbers of linear and non-linear inequality and equality constraints, various sizes of feasible search space, and various types of objective function. Cases 20 and 22 are omitted as they do not have a feasible solution within the bounds of the stated design space. The nature of the cases is outlined in Table 2. Equality constraints are transformed into inequality constraints to within a small tolerance: \(|h_j({\mathbf {x}})|-\epsilon \le 0\).

Table 2 Summary of the 24 analytical test cases, where \(\rho\) is the ratio of the feasible search space to the whole search space, \(G_L\), \(G_N\), \(H_L\), \(H_N\) represent the number of linear inequalities, nonlinear inequalities, linear equalities and nonlinear equalities respectively, and a is the number of active constraints at the solution

5.3 Results

25 independent runs of each test function were performed with each constraint handling framework and each swarm algorithm. The tolerance on the equality constraints was \(\epsilon =0.0001\); as a result, optimum solutions better than the theoretical optimum are possible. The error was calculated as the difference between the best feasible solution and the analytical minimum objective value stated in the CEC2006 definitions (Liang et al. 2013). The final results were ranked on the final error in the solution; any infeasible solutions were ranked after the feasible ones, by their final constraint violation. The best result (the one with the lowest error) of the 25 runs using the five different constraint handling methods with GSA, PSO, HGSAPSO and DE is presented in Tables 3, 4, 5 and 6, along with the standard deviation of all of the feasible results and the number of runs that resulted in a feasible solution. If an infeasible solution results from any of the runs then the worst value is presented as INF, meaning infeasible. If feasible solutions are available then the best results presented are from the feasible solutions only; hence, only if no feasible solution was found in the 25 runs will the best solution be infeasible. The average feasibility rate of each constraint handling technique, i.e., the percentage of runs that found a feasible solution, is shown graphically in Fig. 1. Example convergence plots for the best solution found for a number of functions using different swarm algorithms are shown in Fig. 2.

Fig. 1 Average feasibility rates

Fig. 2 Convergence history examples

Table 3 Feasible results of 24 analytical function test suite optimized using various constraint handling methods with GSA swarm (INF means no feasible solution could be found)
Table 4 Feasible results of 24 analytical function test suite optimized using various constraint handling methods with PSO swarm (INF means no feasible solution could be found)
Table 5 Feasible results of 24 analytical function test suite optimized using various constraint handling methods with HGSAPSO swarms (INF means no feasible solution could be found)
Table 6 Feasible results of 24 analytical function test suite optimized using various constraint handling methods with DE swarms (INF means no feasible solution could be found)

The performance of the 3S constraint handling framework compared to the other generic frameworks (penalty functions and feasible directions) demonstrates that, overall, the new framework is superior, both in terms of finding a feasible solution and in terms of the quality of the best feasible solution found. Figure 1 demonstrates that, overall, the 3S framework outperforms all of the other methods tested at finding a feasible solution; the feasibility rate for 3S is consistently above 90% for all of the swarm algorithms tested, supporting its description as a generic constraint handling framework. The primary reason the framework finds a feasible solution at such a high rate is likely the domination procedure, which applies a lexicographic ordering that enforces feasibility before anything else. This introduces a velocity component that points, at least, towards a less infeasible solution, or towards feasibility if a feasible solution is known. Secondary to this is the separation that takes place in the algorithm, which inhibits data being transferred from the infeasible search to the feasible search and so preserves the search properties of the user-specified swarm algorithm.

A clear picture of the quality and repeatability of the final results can be obtained by considering the standard deviation and the number of feasible runs. In particular, the 3S framework has consistently small standard deviations and large numbers of runs that gave a feasible solution, and this behaviour is homogeneous across the four swarm algorithms tested. It is interesting to note that when coupled to DE, 3S provides best solutions that are, overall, closer to the best available solution than with the other swarm algorithms. Another example of high DE performance is the feasible directions framework used with DE: in general, feasible directions gives poor results, both in terms of feasibility and quality of the final solution, but this performance is vastly improved when DE is used as the swarm algorithm rather than PSO, GSA or HGSAPSO.

Comparing Fig. 2a with b, and e with f, allows comparison of the performance of the constraint handling frameworks on problems containing different numbers of inequality and equality constraints. Figure 2a shows the convergence of the frameworks with GSA on problem G11, which has one equality constraint, whereas Fig. 2b shows the convergence of the frameworks with GSA on problem G16, which has 38 inequality constraints. An equality-constrained problem effectively acts as an optimization along a line, hence represents a difficult optimization problem. The 3S framework, however, has performed well in both cases, producing a final error that is multiple orders of magnitude lower than the other frameworks. The same trend is seen when DE is used as the swarm algorithm: comparing Fig. 2e (one equality constraint) with Fig. 2f (two inequality constraints) shows, again, that the 3S framework performs well on both inequality- and equality-constrained problems.

If rate of convergence is a limiting factor in the choice of constraint handling framework, then the 3S framework allows the convergence to be driven by the attached swarm optimizer. This is particularly useful where the number of objective function evaluations must be kept to a minimum. This behaviour is demonstrated by the different rates of convergence in Fig. 2c (HGSAPSO), d (PSO) and f (DE): the convergence rates of these three optimizers are considerably different and are driven by the chosen swarm algorithm. These figures also demonstrate that, unlike the other methods, the 3S framework does not restrict the high rates of convergence seen with methods such as DE and PSO. This functionality of the 3S framework comes about from splitting the population into independent swarms depending on feasibility; the coupling of other high-performance swarm optimizers is therefore facilitated without restricting their performance.

6 Engineering benchmark problems

The 3S framework is also tested on a number of common mechanical engineering benchmark problems. Again, as with the CEC2006 suite of problems, the 3S framework is used with four different swarm algorithms (GSA, PSO, HGSAPSO and DE) and is compared to the other generic constraint handling methods outlined in this paper (death penalty, static penalty, dynamic penalty and feasible directions). The three engineering benchmark cases are commonly used problems for investigating algorithm performance. The chosen problems are (1) welded beam design (Rao 2013), (2) pressure vessel design (Kannan and Kramer 1994) and (3) spring design (Belegundu and Arora 1985).

Table 7 Summary of engineering benchmark problems

A summary of the number of design variables, the design variable designations and the number of constraints (all inequalities) is given in Table 7; the problem geometries are shown in Figs. 3, 4 and 5 respectively. The welded beam problem has four continuous design variables: weld thickness h, weld length l, bar thickness t and bar width b. The pressure vessel problem design variables are the thickness of the cylinder, \(T_s\), the thickness of the head, \(T_h\), the radius of the cylinder, R, and the length of the cylinder, L. The spring problem variables are the number of coils of the spring, N (i.e., the total length of material used to produce the spring), the maximum diameter of the spring, D, and the thickness of the material, d. The pressure vessel problem is a mixed-integer problem: \(x_1\) and \(x_2\) must be integer multiples of 0.0625 to represent the different gauges of steel available. In the implementation of the swarm algorithms, an integer design variable is derived from the continuous set by rounding the continuous variable to the nearest integer multiple of 0.0625, as sketched below. It should be noted that the four constraints are not related to the requirement of having integer variables; these constraints are separate to this requirement.
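A minimal sketch of this rounding, assuming the design vector is held in a NumPy array with the two gauge variables in the first two positions (an illustrative ordering):

```python
import numpy as np

def snap_to_gauge(x, indices=(0, 1), step=0.0625):
    """Round selected design variables to the nearest integer multiple of `step`.

    indices : positions of the integer-multiple variables (here x1 and x2).
    """
    x = np.asarray(x, dtype=float).copy()
    for i in indices:
        x[i] = np.round(x[i] / step) * step
    return x

# Example: snap_to_gauge([1.13, 0.62, 45.0, 140.0]) -> x1 = 1.125, x2 = 0.625
```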

Fig. 3 Welded beam design problem

Fig. 4 Pressure vessel design geometry

Fig. 5 Spring design geometry

Again, each combination of constraint handling framework and swarm optimizer is run on the three benchmark problems 25 times to obtain a distribution of results. The parameter values used in the algorithms are those used for the CEC2006 tests (Table 1). The results of the runs for each of the cases are shown in Tables 8, 9 and 10 and Fig. 6. Tables 8, 9 and 10 also give the number of constraints of the best solution that satisfy a given tolerance, i.e., the total number of constraints satisfying \(g_k({\mathbf {x}}) \le e\) for the best solution, where e is the chosen tolerance. The discussion of the results follows the definition of each problem in the next sections. To compare how the 3S method performs against results beyond the other constraint handling methods outlined in this paper, previously published results are also given in a subsequent section.

Fig. 6 Convergence histories of engineering benchmark problems using DE

Table 8 Results of welded beam design
Table 9 Results of pressure vessel design
Table 10 Results of spring design

6.1 Welded beam design

The results of the 25 runs using each of the constraint handling frameworks with each of the swarm algorithms are shown in Table 8. As in the CEC2006 runs, the 3S framework performs consistently better than the other frameworks: the best, median and mean solutions for all the swarm algorithms are regularly lower when using the 3S framework. Furthermore, the standard deviation is always low, indicating that 3S is a very consistent algorithm compared to the others tested. Again, feasible directions performs the poorest of the constraint handling methods tested, although the static penalty performs on par with 3S when using the PSO and HGSAPSO algorithms. It is also clear from Table 8 that the 3S framework is able to converge the active constraint values down to very small tolerances when coupled with both DE and PSO.

An example convergence history of the welded beam problem using DE with the five constraint handling frameworks is shown in Fig. 6a. Only the first 20,000 function evaluations are shown, to illustrate the convergence. It can be seen that when using 3S for constraint handling, the high convergence rate associated with DE is not compromised and the solution is converged upon within only a few iterations of the optimization; an optimum solution is reached in fewer than 15,000 function evaluations. A high convergence rate was also observed in the CEC2006 runs. The separation of the two swarms promotes good convergence characteristics by allowing the feasible swarm to inherit the convergence characteristics of the parent swarm algorithm, and this is an advantage of using 3S.

6.2 Pressure vessel design

The results of running the optimizations 25 times using each of the swarm algorithms with each of the constraint handling frameworks are shown in Table 9. Over all four of the swarm algorithms tested, the 3S framework has outperformed the other constraint handling frameworks, producing lower best solutions as well as smaller standard deviations, again indicative of an effective algorithm. The death penalty method used with DE is the only combination tested that performs on par with the 3S framework. Convergence of the active constraints is similar to the welded beam example, again showing that 3S is able to converge the active constraints down to very small tolerances.

Figure 6b shows the convergence history of the five constraint handling methods tested when used with DE. The full convergence history is shown (up to the maximum number of allowed function evaluations) to fully compare the constraint handling methods. As in the welded beam problem, the convergence of the 3S framework on the pressure vessel problem is very quick, converging within 10,000 function evaluations. Again, this demonstrates that using the 3S framework does not restrict the good convergence properties of DE, but instead promotes them. Figure 6b also shows that while other commonly used constraint handling frameworks (such as the dynamic penalty and feasible directions) produce final answers comparable to the results obtained using 3S, their convergence is relatively poor.

6.3 Spring design

The final results are shown in Table 10. In this design problem, the 3S framework produces optimization results that are always lower than those of every other constraint handling framework tested, with all four of the swarm algorithms used. The median and mean solutions mirror this result, and the standard deviation is consistently small. These results further demonstrate the high performance of 3S, both in its ability to find a low optimum solution and in its ability to find that solution robustly over multiple runs.

A convergence history for the first 20,000 function evaluations is shown in Fig. 6c. This demonstrates, as in the other engineering benchmark cases tested, the high convergence rate of a swarm algorithm with 3S used to handle the constraints. A solution that is close to the optimum is found within as few as 2000 function evaluations, and a converged solution is obtained within 10,000 function evaluations. The 3S method allows both a better solution to be obtained and a quicker convergence rate.

6.4 Comparison to previously published results

To compare how the 3S method performs, beyond the comparison with the other constraint handling methods outlined in this paper, previously published results are also given. These previously published results all represent good optimization results, with low optimum solutions. It should be noted that, unlike for the CEC2006 functions, the optimum location of these engineering design problems is not known exactly; it therefore makes sense to compare to other previously published results. The results compared to are: (1) Coello Coello (2000); (2) Mezura-Montes and Coello Coello (2005); (3) Parsopoulos and Vrahatis (2005); (4) He and Wang (2007); (5) Kim et al. (2010); (6) Akay and Karaboga (2012); (7) Brajevic and Tuba (2013); (8) Sadollah et al. (2013); (9) Gandomi (2014); (10) Baykasoglu and Ozsoydan (2015); (11) Salimi (2015). These represent a spread of the most competitive results obtained for these benchmarks using both historically successful algorithms and more recent state-of-the-art algorithms. For all of the engineering benchmark problems tested, the 3S framework combined with the differential evolution optimizer produces the best solution with the lowest objective function (compared to using GSA, PSO or HGSAPSO). Hence, the solutions found using this combination are compared to the previously published results in Table 11.

Table 11 Comparison of statistical results in previous publications with 3S-DE

The comparison to the previously published results demonstrates and emphasises the good performance of the 3S framework for handling constraints. The results compare well with, and often outperform, the previously published results. A particularly encouraging result is that the mean value of the runs performed using the 3S-DE combination is lower than all other results highlighted, for all of the benchmark functions tested. The best solution found is always either better than, or at least as good as, all of the results highlighted; an important result.

7 Aerodynamic design example

The performance of the new constrained optimization framework has been successfully demonstrated for a suite of analytical test functions and engineering benchmark functions. The final aspect of the work presented in this paper is demonstration of the framework within a typical ASO process, which is presented in this section.

ASO is the process of numerically solving the optimum design problem for aerodynamic bodies at specific flow conditions. Due to the high cost of the objective function evaluation, which is a CFD solution, the choice of optimization algorithm becomes important, and often, performing a low number of full CFD solutions is the primary driver for choosing a specific class of algorithm. Hence, agent-based algorithms tend not to be chosen, and gradient-based methods are employed instead. However, a number of independent studies have demonstrated multiple local optimum solutions for ASO problems (Namgoong et al. 2002; Khurana et al. 2010; Leung and Zingg 2012), so the use of agent-based algorithms is important when performing aerodynamic design. As such, ASO is a suitable application area in which to consider the 3S constraint handling framework.

7.1 Geometry and mesh control

The shape parameterization and mesh deformation module must be flexible enough to allow sufficient design space investigation, robust enough to be applicable to any geometry, and efficient enough to maximise design space coverage with a minimum number of design parameters. The integration of global search algorithms into the aerodynamic shape optimization process requires a minimum number of design variables to reduce the computational burden as much as possible. An efficient aerodynamic optimization process is obtained by using a singular value decomposition (SVD) approach (Poole et al. 2015b) to derive an efficient, orthogonal set of design parameters. These are obtained by decomposing a set of 100 training aerofoils, all of which have high transonic performance, to find the aerofoil mode shapes. The mode shapes are then used to deform the aerofoil surface during the optimization.

Using the SVD modes, a new shape is constructed by deforming the current aerofoil shape with a combination of the modes, as given by Eq. 17, where \({\mathbf {X}}^{new}\) and \({\mathbf {X}}^{initial}\) are the deformed and initial (undeformed) surfaces respectively. The mth modal deformation is given by the mth column of the modal matrix, \({\mathbf {U}}\), shown in Fig. 7. The first D columns of the modal matrix are used, and the design variables are the magnitudes of the deformations, \(\varvec{\beta }\).

$$\begin{aligned} {\mathbf {X}}^{new} = {\mathbf {X}}^{initial} + \sum _{m=1}^{D} \beta _m {\mathbf {U}}_m \end{aligned}$$
(17)
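Eq. 17 amounts to adding a linear combination of mode columns to the baseline surface; a minimal sketch, assuming the surface coordinates are flattened into a single vector and \({\mathbf {U}}\) stores one mode per column:

```python
import numpy as np

def deform_surface(X_initial, U, beta):
    """Apply modal deformations (Eq. 17): X_new = X_initial + sum_m beta_m * U[:, m].

    X_initial : (P,) flattened baseline surface coordinates
    U         : (P, D) modal matrix, one aerofoil mode per column
    beta      : (D,) design variables (modal deformation magnitudes)
    """
    return X_initial + U @ np.asarray(beta)
```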
Fig. 7 First six aerofoil modes

Deformations of the surface are propagated through the full CFD volume mesh using a set of 24 control points on the surface. These are linked to the CFD volume mesh using radial basis functions (RBFs), where global interpolation provides direct control of the design surface and a high-quality deformation of the CFD mesh (Rendall and Allen 2008).

7.2 Flow-solver

The flow-solver used for objective function evaluation is a structured multiblock, finite-volume, inviscid upwind code using the flux vector splitting of van Leer (1982). Convergence acceleration is achieved through multigrid (Allen 2002). A high-quality single-block O-mesh was generated using a conformal mapping approach, and Fig. 8 shows two views of the \(257 \times 97\) point mesh used for the test case, which extends to 100 chords at the farfield. All surface cells have an aspect ratio of one.

Fig. 8 \(257 \times 97\) O-mesh for RAE2822 aerofoil

7.3 Design problem

The case tested is drag reduction of the RAE2822 aerofoil at \(M=0.73\), \(\alpha =2.67^{\circ }\). This is a highly loaded case with \(C_l=1.02\). The flow is governed by the Euler equations, so this is an inviscid optimization and the primary drag component is the wave drag. The problem is constrained by the lift, pitching moment and internal volume of the aerofoil. Hence, the problem can be written as:

$$\begin{array}{ll}\text {Objective:} &{} \text {Minimize drag } (C_D) \\ &{} \text {with respect to modal deformation magnitudes} \\ \text {Constraint 1 (lift):} &{} C_l \ge C^{initial}_l \\ \text {Constraint 2 (pitching moment):} &{} \left| C_{m}\right| \le \left| C^{initial}_{m}\right| \\ \text {Constraint 3 (internal volume):} &{} V \ge V^{initial} \end{array}$$

A similar optimization problem, albeit for viscous flow, is currently being studied for the AIAA Aerodynamic Design Optimization Discussion GroupFootnote 1; see Telidetzki et al. (2014), Bisson et al. (2014), LeDoux et al. (2015), Poole et al. (2015a) for example results. While the cost of the objective function evaluation for viscous flow is greater than inviscid flow, shocks in equivalent inviscid flows tend to be stronger and less smeared, making it a difficult optimization problem.

The design variables used were the first 6, 8 and 10 modes from the SVD method. A pitch design variable was also included to allow load balancing.

7.4 Results

The final drag results obtained for each number of design variables are given in Table 12. The convergence of the objective function and the percentage of feasible agents through the optimization are shown in Fig. 9 for the 10-mode case. Because orthogonal design variables are used, the design space of N design variables is always contained within that of \(N+n\) design variables; hence, ideally, the optimized objective J should satisfy:

$$\begin{aligned} J(N+n) \le J(N) \end{aligned}$$
Table 12 Aerodynamic optimization results (\(C_d\) in counts)
Fig. 9 Convergence history of objective and percentage of feasible agents through optimization

This is the trend that is seen: the monotonic decrease of the final optimized drag with increasing number of design variables indicates success at finding a global optimum for this highly loaded aerodynamic problem. Furthermore, it is clear that for all three numbers of design variables tested, the final result lies on the lift constraint, as expected. It is interesting to note, however, that the lift constraint is often the only active constraint, and that the volume of the optimized aerofoils is always greater than that of the initial aerofoil; the moment constraint is only active for the six design variable case. The fact that the constraints are active indicates that, even towards the end of the optimization, a good number of agents are likely to be infeasible, and this is shown in Fig. 9. The figure also shows that the agents spend a good amount of time at the start of the optimization searching through the infeasible space for good solutions, indicating the exploration ability of the approach. Towards the end of the optimization, exploitation becomes key and the number of feasible agents increases.

The surface pressure coefficient distributions of the initial and best optimized solution (the 10 design variable case) are shown in Fig. 10. These show that the constrained global search framework has successfully eliminated the shock, giving the largest drag reduction. Finally, a mesh convergence study has been performed using the deformed mesh from the 10-mode optimization as the baseline, and is presented in Fig. 11. This mesh has been doubled in each direction twice, and the study indicates that the final mesh-converged solution is slightly less than 31 drag counts.

Fig. 10 Surface pressure coefficients

Fig. 11 Mesh convergence

8 Concluding remarks

A new, generic framework for handling constraints when performing constrained optimization using agent-based search algorithms has been presented. The constraint handling framework is called separation-sub-swarm (3S) and has the advantage of being a high-performance framework that is independent of the type of swarm algorithm used for an optimization, so it can be added to the general swarm algorithm system. The new method works by considering the whole population of agents as two independent sub-swarms: all feasible agents optimize the objective function value, with the user-selected swarm algorithm acting on this sub-swarm, while the infeasible agents minimise the constraint violation. This independence of the infeasible swarm from the feasible swarm allows the 3S framework to be added onto any existing swarm algorithm, including those algorithms in which the fitness function value is used within the agent update scheme.

Constrained analytical tests on the CEC2006 suite of analytical benchmark functions and on three standard engineering design problems have been presented, using the 3S framework coupled to particle swarm optimization (PSO), a gravitational search algorithm (GSA), a hybrid GSA-PSO and differential evolution (DE). The 3S framework was compared to a death penalty, a static penalty, a dynamic penalty and a feasible directions approach. Results showed that, overall, the 3S framework produced solutions that were much more likely to be feasible, returning average feasibility rates of over 90% for all swarm algorithms tested. Results were, in general, closer to the theoretical best solution than those of the other frameworks tested. For the engineering benchmarks, the framework produced final optimum solutions that are at least as good as (and often better than) previously published results. This demonstrates the high performance of the 3S framework within the umbrella of constraint handling frameworks.

Finally, the constraint handling framework was applied to an aerodynamic shape optimization problem: transonic drag minimization of an aerofoil. The design variables used were obtained using a modal extraction technique that produces orthogonal design variables. Results demonstrate that the new framework is capable of producing shock-free optimized results in inviscid flow. Furthermore, the use of orthogonal design variables should result in improved optimization performance for an increasing number of design variables; this was demonstrated, indicating that the constrained global method has successfully located the global optimum solution.