1 Introduction

Particle swarm optimization (PSO) is an optimization algorithm modelled after the behaviour of birds in a flock (Kennedy and Eberhart, 1995) and belongs to the field of swarm intelligence (SI). PSO searches for a candidate solution by iteratively updating the positions of particles in a swarm. Position updates are informed by the best position a given particle has found, as well as by the best position found by the neighbourhood particles. PSO performance is highly contingent on the selection of appropriate control parameter (CP) values, which govern the search behaviour (Beielstein et al., 2002; Van den Bergh and Engelbrecht, 2006; Bonyadi and Michalewicz, 2016; Bratton and Kennedy, 2007). Control parameter configurations are usually selected as constant values, which is often not ideal for a specific problem (Van den Bergh and Engelbrecht, 2006; Jiang et al., 2007), because different objective function landscapes may benefit from varying degrees of exploration versus exploitation. However, tuning the control parameters for the problem at hand is computationally expensive and inefficient.

Alternatively, self-adaptive PSO (SAPSO) algorithms have been proposed, which attempt to adjust control parameters during the optimization process. Recent studies (Harrison et al., 2018a, 2016) have, however, shown that most SAPSO approaches introduce more parameters to which PSO performance is sensitive. These approaches also result in divergent behaviour, infeasible solutions, and small particle step sizes, and are generally ineffective at attaining better solutions. The self-adaptive process is further complicated by the fact that a previously optimal control parameter configuration may no longer be useful later in the search (Harrison et al., 2018b).

The contribution of this study is a SAPSO algorithm which uses a cultural algorithm's belief space to adjust the PSO control parameters during the search, thereby reducing the number of parameters to which PSO performance is sensitive and improving the performance of the algorithm. The proposed belief space-guided SAPSO (BS-SAPSO) performs 3% to 55% better than the various baselines in terms of objective function value solution quality, and, depending on the implementation, has one or two runtime parameters which have to be adjusted, instead of three.

Section 2 elaborates on PSO and the belief space (BS), and Sect. 3 explains the design decisions pertaining to the BS-SAPSO. Section 4 explains the experimental procedure followed and the evaluation metrics used, and Sect. 5 presents the results obtained. Section 6 concludes the paper.

2 Background

This section elaborates on PSO itself, the effect of control parameter configurations, existing attempts at designing a SAPSO algorithm, metrics used to evaluate such algorithms, and the functioning of the belief space.

2.1 Particle swarm optimization

PSO makes use of a population-based, stochastic search to find candidate solutions to an optimization problem by iteratively updating the positions of particles in a swarm. In addition to having a certain inertia, particles in the swarm update their positions in accordance with both the best positions they have personally found, and the best positions found by the particle neighbourhood. These two influences are referred to as the cognitive and social components of the velocity update rule, respectively (Shi and Eberhart, 1998; Kennedy and Eberhart, 1995):

$$\begin{aligned} v_{i j}(t+1)=\omega v_{i j}(t)+c_{1} {r_1}_{i j}(t)\left[ y_{i j}(t)-x_{i j}(t)\right] +c_{2} {r_2}_{i j}(t)\left[ \hat{y}_{ij}(t)-x_{i j}(t)\right] \end{aligned}$$
(1)

where for particle i in dimension j at time t, \(v_{ij}(t)\) is its velocity, \(x_{ij}(t)\) is its position, \(y_{ij}(t)\) is its personal best position, and \(\hat{y}_{ij}(t)\) its neighbourhood best position. The inertia coefficient is denoted \(\omega\), and the cognitive and social coefficients are \(c_1\) and \(c_2\), respectively. Stochasticity is introduced via the random values \({r_1}_{ij}(t)\) and \({r_2}_{ij}(t)\), sampled independently from a uniform distribution over (0,1) for each particle, dimension, and time step. Position updates are governed by

$$\begin{aligned} \textbf{x}_{i}(t+1)=\textbf{x}_{i}(t)+\textbf{v}_{i}(t+1) \end{aligned}$$
(2)
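To make the updates concrete, the following is a minimal sketch of Eqs. (1) and (2) for a single particle, assuming a NumPy-based implementation; the function and variable names are illustrative, not taken from any particular implementation.

```python
import numpy as np

def update_particle(x, v, y, y_hat, w=0.729844, c1=1.496180, c2=1.496180, rng=None):
    """One velocity and position update for a single particle; all inputs are
    n-dimensional vectors, using the generic constant control parameters."""
    rng = rng or np.random.default_rng()
    r1 = rng.uniform(size=x.shape)  # fresh per-dimension random values in (0, 1)
    r2 = rng.uniform(size=x.shape)
    v_new = w * v + c1 * r1 * (y - x) + c2 * r2 * (y_hat - x)  # Eq. (1)
    x_new = x + v_new                                          # Eq. (2)
    return x_new, v_new
```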

2.2 Control parameter configurations

In order to optimally traverse the search space of the problem at hand, search trajectories should neither diverge nor cyclically repeat already explored paths. Furthermore, any optimization algorithm must manage a trade-off between exploration and exploitation; that is, searching as yet unexplored areas, versus searching more thoroughly in locations which are already known to potentially yield good solutions. This trade-off is governed by three control parameters, namely the inertia weight, \(\omega\), the cognitive acceleration coefficient, \(c_1\), and the social acceleration coefficient, \(c_2\) (Engelbrecht, 2007). Constant control parameter values are often used, e.g. by selecting the generic values of \(\omega = 0.729844\), \(c_1 = 1.496180\) and \(c_2 = 1.496180\) (Harrison et al., 2018a).

However, since more exploration is initially desirable, with exploitative behaviour towards the end of the search, one school of thought holds that PSO should find better solutions if the control parameters can be adapted in a way which reflects this shift from exploration towards exploitation. An example approach would be to choose control parameters as follows [adapted from Sermpinis et al. (2013) to conform to \(c_1 + c_2 \le 4\) (Shi and Eberhart, 1998)]:

$$\begin{aligned} \begin{aligned} \omega (t)&=0.4 \left( \frac{t-n_t}{n_t}\right) ^2+0.4 \\ c_{1}(t)&=-3 \frac{t}{n_t}+3.5 \\ c_{2}(t)&=+3 \frac{t}{n_t}+0.5 \end{aligned} \end{aligned}$$
(3)

where t is the current time step and \(n_t\) is the maximum number of time steps. Equation (3) ensures a large \(c_1\) and a small \(c_2\) initially, after which \(c_1\) is decreased and \(c_2\) increased. Since \(c_1\) promotes exploration by strengthening the personal component, whereas \(c_2\) promotes exploitation by strengthening the social component, particles initially explore more and eventually exploit more. The inertia weight \(\omega\) is also decreased to shift from exploration to exploitation, so that particles initially take large steps to decrease the likelihood of becoming stuck in local minima and to explore more of the search landscape, but eventually take small steps to search more thoroughly in the area which has been found to produce better solutions. A variant of Eq. (3) is the time-variant acceleration coefficients PSO (PSO-TVAC) (Ratnaweera et al., 2004), which is similar with the exception of \(\omega\), which is kept constant. Time-variant SAPSO algorithms therefore consider the total number of time steps available, and attempt to adjust the control parameters accordingly (Harrison et al., 2018a). The premise of time-variant SAPSO is therefore that a sufficient balance between exploration and exploitation can be found within the given computational budget.
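As a concrete illustration, the schedule in Eq. (3) can be computed as in the following sketch; the function name is illustrative.

```python
def time_variant_parameters(t, n_t):
    """Control parameters according to Eq. (3) at time step t of n_t."""
    w = 0.4 * ((t - n_t) / n_t) ** 2 + 0.4  # decreases from 0.8 to 0.4
    c1 = -3.0 * t / n_t + 3.5               # decreases from 3.5 to 0.5
    c2 = 3.0 * t / n_t + 0.5                # increases from 0.5 to 3.5
    return w, c1, c2
```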

Conversely, true self-adaptive approaches adjust control parameters based on introspective information derived during the search process. True SAPSO therefore seeks to adapt more specifically to the problem at hand, and to where in the search landscape a particle finds itself (Hashemi and Meybodi, 2011; Tanweer et al., 2015; Zhan et al., 2009; Jun and Jian, 2009). Adaptation of control parameter configurations is usually achieved by introducing a governing equation which modifies control parameter configurations, and thus particle movement. The governing equation is typically based on a behavioural tendency present in the swarm which is believed to contain information that can be exploited to improve performance (Harrison et al., 2018a). The premise of true SAPSO is therefore that the behaviour of particles contains information useful towards improving algorithm performance.

2.3 Convergence condition

A PSO particle is considered stable if it has convergent control parameters as specified by a derived stability condition (Poli, 2009; Poli and Broomhead, 2007). If the criterion,

$$\begin{aligned} c_{1}+c_{2}<\frac{24\left( 1-\omega ^{2}\right) }{7-5 \omega } \quad \text {and}\quad \omega \in [-1,1] \end{aligned}$$
(4)

holds for all particles' control parameters, the swarm is guaranteed to reach an equilibrium state. Note, however, that the criterion neither places a bound on the number of iterations required to reach this state, nor guarantees that velocities will not assume very large values on the path to equilibrium.
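The criterion translates directly into a check on a configuration \((\omega, c_1, c_2)\); a minimal sketch follows, with an illustrative function name.

```python
def is_stable(w, c1, c2):
    """True if (w, c1, c2) satisfies Poli's stability condition in Eq. (4)."""
    if not -1.0 <= w <= 1.0:
        return False
    return c1 + c2 < 24.0 * (1.0 - w ** 2) / (7.0 - 5.0 * w)
```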

2.4 Velocity clamping

Velocity clamping aims to prevent the explosion of particle velocities, which occurs when particles have sufficient inertia that their velocities grow without bound. Velocity explosion results in particles leaving the search space, often permanently remaining outside of the search boundaries (Oldewage et al., 2017). For a search space bounded by \(\left[ \varvec{\ell }, \textbf{u}\right]\), the velocity can be constrained in each dimension j to a fixed value as follows:

$$\begin{aligned} v_{i, j}^{t+1}= {\left\{ \begin{array}{ll}v_{i, j}^{t+1} &{} \text{ if } -v_{\max , j} \le v_{i, j}^{t+1} \le v_{\max , j} \\ v_{\max , j} &{} \text{ if } v_{\max , j}<v_{i, j}^{t+1} \\ -v_{\max , j} &{} \text{ if } v_{i, j}^{t+1}<-v_{\max , j}\end{array}\right. } \end{aligned}$$
(5)

where

$$\begin{aligned} v_{\max , j}=\delta \left( u_j-\ell _j\right) , \quad \delta \in (0,1) \end{aligned}$$
(6)

Another velocity clamping method is to limit the magnitude of the velocity vector, i.e.

$$\begin{aligned} \textbf{v}_i^{t+1}= {\left\{ \begin{array}{ll}\textbf{v}_i^{t+1} &{} \text{ if } \left\| \textbf{v}_i^{t+1}\right\| \le v_{\max } \\ \frac{v_{\max }}{\left\| \textbf{v}_i^{t+1}\right\| } \textbf{v}_i^{t+1} &{} \text{ if } \left\| \textbf{v}_i^{t+1}\right\| >v_{\max }\end{array}\right. } \end{aligned}$$
(7)

where

$$\begin{aligned} v_{\max }&= \delta \sqrt{\sum _{j=1}^n\left( u_j-\ell _j\right) ^2} \nonumber \\&= \delta \left\| \textbf{u}-\varvec{\ell }\right\| \end{aligned}$$
(8)

with n-dimensional vectors \(\textbf{u}=\left[ u_1, \ldots , u_n\right] ^T\) and \(\varvec{\ell }=\left[ \ell _1, \ldots , \ell _n\right] ^T\).

When velocities are clamped in all dimensions, the result is a line search; clamping per dimension also has the disadvantage that it modifies the direction in which a particle is travelling. However, the random components \(\textbf{r}_1\) and \(\textbf{r}_2\) of the PSO change the particle's trajectory regardless of clamping, and therefore the additional modification of direction is not considered a problem.

Conversely, clamping by magnitude preserves the particle's direction, but if the largest and smallest velocity components differ by orders of magnitude, scaling all components by the same constant factor renders the smallest components negligible.
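Both clamping schemes are straightforward to implement; the following is a sketch assuming NumPy vectors for the velocity and the search space bounds, with \(\delta\) as in Eqs. (6) and (8), and an illustrative default value.

```python
import numpy as np

def clamp_per_dimension(v, lower, upper, delta=0.5):
    """Eqs. (5)-(6): clamp each velocity component to [-v_max_j, v_max_j]."""
    v_max = delta * (upper - lower)   # one bound per dimension, Eq. (6)
    return np.clip(v, -v_max, v_max)  # Eq. (5)

def clamp_by_magnitude(v, lower, upper, delta=0.5):
    """Eqs. (7)-(8): rescale the whole vector, preserving its direction."""
    v_max = delta * np.linalg.norm(upper - lower)  # Eq. (8)
    norm = np.linalg.norm(v)
    return v if norm <= v_max else (v_max / norm) * v  # Eq. (7)
```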

2.5 Self-adaptive particle swarm optimization

Despite multiple SAPSO algorithms having been published, many result in particles diverging towards infeasible search space, or in premature convergence as a consequence of step sizes rapidly tending towards zero (Harrison et al., 2016). Most of the algorithms also introduce more control parameters than were present initially. These findings are corroborated by another paper (Harrison et al., 2018a), in which many of the 18 SAPSO algorithms analysed were found to demonstrate either divergence or premature convergence. Furthermore, in the attempt to suitably adjust the control parameters \(\omega\), \(c_1\) and \(c_2\), merely three manage to decrease the number of parameters to which performance is sensitive, namely the SAPSO by Li, Fu and Zhang (SAPSO-LFZ) (Li et al., 2008), the self-adaptive inertia weight PSO (SA-IWPSO) (Dong et al., 2008), and the PSO with random acceleration coefficients (PSO-RAC) (Harrison et al., 2018a).

Note that the term SAPSO is sometimes also used to refer to hyperheuristic adaptation strategies. In this case, instead of control parameter adaptation, a pool of candidate PSO variants is maintained, and the best performing variant is selected at each time step. Examples of such approaches are the heterogeneous PSO algorithms by Engelbrecht (2010) and Nepomuceno and Engelbrecht (2013), as well as the self-adaptive PSO-based echo state network for time series prediction by Xue et al. (2021). Such approaches are not considered in this paper, because hyperheuristic adaptation is fundamentally different from adjustment of the control parameters of an optimization algorithm.

2.6 Cultural algorithms

Cultural algorithms (CA) are evolution-inspired algorithms which maintain a belief space in parallel with a population space in order to engender dual inheritance of beneficial traits and information (Reynolds, 1994). The population space represents individuals, each with a set of behavioural traits, and the belief space represents beliefs which generalize on individual experiences. The population space may consist of a set of candidate solutions to the optimization problem, while the belief space maintains a set of beliefs about where in the search landscape the optimum resides. Any population-based metaheuristic can be used in the population space to find an optimal solution to the relevant optimization problem, with genetic algorithms (GAs) (Chahar et al., 2021) being generally used due to the analogy to nature with dual genetic and cultural inheritance.

Fig. 1 Cultural algorithm (Jalili and Hosseinzadeh, 2014)

As shown in Fig. 1, at each time step individuals in the population space are evaluated according to a fitness function, and the best individuals' beliefs are accepted into the belief space. The belief space is then used to influence the behaviour of individuals in the population space, thereby affecting the behaviour of the population at later time steps. Since the behaviour of PSO particles is determined by their control parameters, the belief space can steer the particles by influencing the control parameters. CAs therefore simulate the exchange of ideas between individuals, generally through mechanisms such as imitation, adaptation, or recombination. By allowing entities to learn from one another, CAs aim to leverage the collective intelligence of the population to find better solutions.
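To illustrate the accept/influence loop, the following is a deliberately simplified, self-contained toy in which the belief space is a per-dimension interval over real-valued individuals; it is not a specific published CA, and all names, bounds, and defaults are illustrative.

```python
import numpy as np

def toy_cultural_algorithm(fitness, n_ind=20, n_dim=2, n_iter=100, n_accept=5, rng=None):
    """Minimal accept/update/influence loop in the spirit of Fig. 1."""
    rng = rng or np.random.default_rng()
    pop = rng.uniform(-5.0, 5.0, size=(n_ind, n_dim))
    for _ in range(n_iter):
        scores = np.array([fitness(ind) for ind in pop])
        accepted = pop[np.argsort(scores)[:n_accept]]        # accept the best individuals
        lo, hi = accepted.min(axis=0), accepted.max(axis=0)  # belief: promising interval
        pop = rng.uniform(lo, hi, size=(n_ind, n_dim))       # influence: resample within it
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmin(scores)]

# Example usage: minimize the sphere function.
best = toy_cultural_algorithm(lambda x: float(np.sum(x ** 2)))
```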

CAs have seen real-world application in fields such as civil engineering, mechanical engineering, electrical engineering, and computer science, though predominantly in the latter two (Maheri et al., 2021). Specific examples of problems include the optimization of memory usage, the improvement of computational efficiency, and fault detection (Pan et al., 2010). The CA has also seen use in structural optimization, for example of dome structures, which were optimized subject to various constraints, such as stress, displacement, and frequency (Jalili et al., 2019).

3 Design of belief space-guided self-adaptive mechanism

The purpose of this section is to present the design of the BS-SAPSO algorithm. The section starts by describing the general architecture and interaction between the PSO and the BS, and then elaborates on specific design choices regarding selection, sampling, and updating of the belief and population spaces.

3.1 Architecture

In order to automatically tune the PSO control parameters, the PSO is augmented using the belief space concept, borrowed from CAs. The belief space represents the control parameter configurations believed to be best by the particles in the population space. Throughout the PSO search, certain particles affect the belief space, and in turn the belief space influences the control parameters of all particles in the population. An overview of the BS-SAPSO algorithm is given in Algorithm 1.

Algorithm 1 Overview of the BS-SAPSO algorithm

For brevity, swarm henceforth refers to the collection of particle positions, solutions to the collection of objective function values found by the particles at a given time step, particlePos to the position of a specific particle, and particleSol to the solution of a specific particle. Furthermore, function denotes the objective function being optimized, CP the control parameters with a mapping to their particles, and BS the belief space. Given Algorithm 1, the functions which are not self-evident are the subjects of investigation in this paper:

  • Selection methods, used to select the particles that will have their beliefs accepted into the belief space, explained in Sect. 3.2.

  • Update and sampling triggers, to determine when the belief space is updated and control parameters are sampled. Given that a belief space update only affects the search if the control parameters are updated as well, \({\textsc {UpdateBSTrigger}}\) is taken to be the same as \({\textsc {UpdateCPTrigger}}\), with a discussion in Sect. 3.3.

  • Update methods, used to specify how the chosen particles’ beliefs affect the belief space. Here, \({\textsc {UpdateBS}}\) is performed as elaborated on in Sect. 3.4.

  • Sampling methods, to determine how new control parameter values (beliefs) are sampled from the belief space to influence the population space, with \({\textsc {UpdateCP}}\) explained in Sect. 3.5.

Ultimately, all of the abovementioned aspects influence the trade-off between exploration and exploitation, and therefore the performance of the algorithm. It is expected that if the belief space converges too quickly to a certain control parameter configuration, the algorithm will become too exploitative, and if the belief space does not converge quickly enough, the algorithm will become too explorative.
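The control flow of Algorithm 1 can be summarized schematically as follows; the four hooks correspond to the design choices above and are passed in as callables. This is a structural sketch under the paper's terminology, not the actual implementation.

```python
def bs_sapso(pso_step, select, update_trigger, update_bs, update_cp,
             swarm, cp, bs, n_t):
    """Schematic BS-SAPSO main loop: PSO step, then conditional BS/CP update."""
    for t in range(n_t):
        swarm, solutions = pso_step(swarm, cp)  # one PSO iteration
        if update_trigger(t, solutions):        # Sect. 3.3: when to update
            chosen = select(swarm, solutions)   # Sect. 3.2: who influences the BS
            bs = update_bs(bs, chosen, cp)      # Sect. 3.4: how the BS is updated
            cp = update_cp(bs, len(swarm))      # Sect. 3.5: how new CPs are sampled
    return swarm, bs
```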

3.2 Selection methods

Selection methods determine which particles are allowed to influence the belief space, and the following selection methods are proposed:

  • Random selection, where all particles have an equal probability of updating the belief space. While random selection is not expected to perform well, it serves to set a BS-SAPSO performance baseline.

  • Elitist selection, where only the \(n_e\) particles with the best objective function values are allowed to influence the belief space, with \(n_e\) reduced by one on every update. Elitist selection postulates that since the particles have found better solutions, their control parameters are more likely to be correct for the function at hand.

  • Roulette wheel selection, where particles are selected based on a probability proportional to their objective function values. While similar to elitist selection, the use of probabilities prevents the strict exclusion of worse-performing particles, but merely reduces the probability of their selection.

  • Rank selection, where particles are ranked according to objective function value, and selected with a probability proportional to their ranking. This selection method is similar to roulette wheel selection, but prevents giving too much weight to the best particles in cases where the best solution quality is orders of magnitude larger than the worst.

  • Improvement selection, where only the \(n_e\) particles which demonstrated the most improvement in objective function values are allowed to influence the belief space, with \(n_e\) reduced by one on every update. While similar to elitist selection, the difference is that value is placed on the improvement engendered by a control parameter configuration, rather than on the actual objective function value, which might have resulted from a lucky initialization.

  • Improvement-magnitude selection, where the \(n_e\) particles which showed the most improvement in objective function values are allowed to influence the belief space with a selection probability proportional to the objective function value improvement the particle underwent. Similar to improvement selection, but with probabilities instead of hard classes, this method gives more weight to particles that show more improvement, without strictly excluding the other particles.

  • Tournament selection, where \(n_e\) particles are selected at random, after which particles are selected with a probability proportional to their objective function values. Tournament selection increases the randomness of the process, while retaining a degree of elitism after the random selection.

Generally, it is expected that methods which incorporate more randomness will lead to more exploration of the control parameter space, while methods which are more deterministic will lead to more exploitation.
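As an illustration, minimal sketches of two of these schemes are given below, assuming minimization so that smaller objective function values are better; the inversion used to turn objective values into roulette weights is one possible choice, not necessarily the one used in this paper.

```python
import numpy as np

def elitist_selection(solutions, n_e):
    """Indices of the n_e particles with the best (smallest) solutions."""
    return np.argsort(solutions)[:n_e]

def roulette_wheel_selection(solutions, n_e, rng=None):
    """Select n_e particle indices with probability favouring better solutions."""
    rng = rng or np.random.default_rng()
    weights = solutions.max() - solutions + 1e-12  # invert: smaller value, larger weight
    probs = weights / weights.sum()
    return rng.choice(len(solutions), size=n_e, replace=False, p=probs)
```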

3.3 Update and sampling triggers

To keep the effects of belief space updates and control parameter sampling consistent, the same trigger is used for both \({\textsc {UpdateBSTrigger}}\) and \({\textsc {UpdateCPTrigger}}\). The following conditions that set the trigger are investigated (a brief sketch follows the list):

  • Always: setting the trigger always, i.e. on every time step, has the advantage of not introducing an additional control parameter, but will probably lead to rapid convergence of the belief space, because every consecutive belief space update necessarily reduces the range of acceptable control parameter values.

  • Time-variant: the belief space can be updated according to a function of the current iteration, in an analogous fashion to how control parameters are updated by Eq. (3).

  • Fixed: updating the belief space at fixed intervals \(p_f\), instead of at every time step, is an option to prevent early convergence, but introduces an additional hyperparameter.

  • Stagnate: the belief space can be updated when the best solution found by the swarm does not improve (i.e. stagnates) for \(p_s\) iterations, but this introduces an additional hyperparameter.
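A sketch of how these conditions might be expressed follows; `stagnation` is assumed to be a counter of consecutive non-improving time steps maintained by the caller, and the time-variant case is omitted since it can be any schedule over t.

```python
def always_trigger(t):
    """Update on every time step."""
    return True

def fixed_trigger(t, p_f):
    """Update every p_f time steps."""
    return t % p_f == 0

def stagnate_trigger(stagnation, p_s):
    """Update once the global best has not improved for p_s time steps."""
    return stagnation >= p_s
```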

3.4 Update method

The belief space represents a range of acceptable control parameter values, and as such the limits of this range are updated based on the selected particles. Due to the large amount of variation introduced by the various selection methods in Sect. 3.2, \({\textsc {UpdateBS}}\) is kept fixed by setting the belief space boundaries to the minimum and maximum control parameter values out of the group of selected particles.

3.5 Sampling method

In order to update control parameters based on the belief space, \({\textsc {UpdateCP}}\) samples new control parameter values for each particle in the swarm from a uniform distribution between the belief space limits.
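Together, the update and sampling rules amount to the following sketch, where the belief space is represented as per-parameter (min, max) bounds over \((\omega, c_1, c_2)\); the array shapes are illustrative assumptions.

```python
import numpy as np

def update_bs(selected_cp):
    """Sect. 3.4: bounds become the min/max CP values of the selected particles.
    selected_cp has shape (n_selected, 3), one (w, c1, c2) row per particle."""
    return selected_cp.min(axis=0), selected_cp.max(axis=0)

def update_cp(bs, n_s, rng=None):
    """Sect. 3.5: draw new (w, c1, c2) per particle uniformly within the bounds."""
    rng = rng or np.random.default_rng()
    lo, hi = bs
    return rng.uniform(lo, hi, size=(n_s, 3))
```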

4 Experimental procedure

This section details the experimental procedure pertaining to the evaluation metrics and implementational details. Since many different facets of the algorithm are evaluated, a ‘run’ henceforth refers to a single execution of the BS-SAPSO on a single function, and an ‘experiment’ refers to running the BS-SAPSO algorithm for \(r=30\) runs on the whole function set, with a specific control parameter and algorithmic configuration.

4.1 Evaluation metrics

In order to analyse the performance and behaviour of the BS-SAPSO, a number of metrics are employed:

  1. Normalized global best solutions: the objective function values (i.e. solutions) at the global best positions are ultimately the indicator of how well an algorithm performs on a given optimization problem. Since functions often span vastly different magnitudes, normalized global best solutions are used. After all experiments are completed, the highest and lowest global best solutions found throughout all experiments and all runs are used to normalize the global best solutions to [0,1] for each experiment. Note that this does not affect the search, since the BS-SAPSO is executed on the function set as-is, after which the solutions are merely scaled to the same range to allow a score to be calculated per experiment by averaging over the global best solutions of the runs in that experiment. The scale to which the solutions are normalized is arbitrary, but unitary scaling yields easily interpretable results, where a lower value indicates a better solution.

  2. Average swarm diversity: provides information regarding the level of exploration and exploitation. Diversity is calculated using (Olorunda and Engelbrecht, 2008)

    $$\begin{aligned} \mathcal {D}=\frac{1}{n_{s}} \sum _{i=1}^{n_{s}} \sqrt{\sum _{j=1}^{n_{x}}\left( x_{i j}-\bar{x}_{j}\right) ^{2}} \end{aligned}$$
    (9)

    with the swarm centre at

    $$\begin{aligned} \bar{x}_{j}=\frac{\sum _{i=1}^{n_{s}} x_{i j}}{n_{s}} \end{aligned}$$
    (10)

    where \(n_s\) is the number of particles in the swarm, and \(n_x\) is the number of dimensions.

  3. Percentage of particles in infeasible space: if a particle violates the boundaries of the feasible search space in even one dimension, it is considered to be in infeasible space. Infeasible particles should not be considered when updating the best positions found, so as not to direct the search out of feasible space (Engelbrecht, 2013).

  4. Percentage of particles that are stable: a stable particle has a convergent control parameter configuration, in accordance with the stability condition [refer to Eq. (4) in Sect. 2.3].

  5. Average particle velocity: represents average step sizes, which have to decrease to achieve convergence, but should not tend towards zero too early in the search process, so as not to get stuck in a local minimum. Average particle movement is calculated using (Harrison et al., 2018a), as implemented in the sketch after this list:

    $$\begin{aligned} \Delta (t+1)=\frac{1}{n_s} \sum _{i=1}^{n_s}\left\| \textbf{x}_{i}(t+1)-\textbf{x}_{i}(t)\right\| \end{aligned}$$
    (11)
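A minimal sketch of the diversity and average-movement calculations in Eqs. (9)-(11) is given below, assuming a position matrix of shape \((n_s, n_x)\); the function names are illustrative.

```python
import numpy as np

def swarm_diversity(positions):
    """Eq. (9): mean distance of particles to the swarm centre of Eq. (10)."""
    centre = positions.mean(axis=0)
    return np.linalg.norm(positions - centre, axis=1).mean()

def average_movement(positions_prev, positions_next):
    """Eq. (11): mean step size between consecutive time steps."""
    return np.linalg.norm(positions_next - positions_prev, axis=1).mean()
```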

4.2 Implementation

The PSO variant used is the inertia weight PSO (Shi and Eberhart, 1998) as given by Eqs. (1) and (2), and the neighbourhood of each particle is the whole swarm. The swarm is initialized with particles uniformly distributed within the feasible search space, and the belief space is initialized to the ranges \(\omega \in [0, 1]\), \(c_1 \in [0, 4]\) and \(c_2 \in [0, 4]\). For each experiment (i.e. aspect which is investigated), \(r=30\) independent runs are performed over 31 minimization functions, which are given in Appendix C. The evaluation functions used to select the best algorithm cannot, however, also provide an unbiased estimate of performance, which is why a test set of 24 additional functions was introduced; this test set is discussed in Sect. 4.3 and listed in Appendix C.

Objective function values are only calculated for particles which reside within feasible space. Particles outside of feasible space have their objective values set to infinity, which, assuming minimization, automatically disqualifies those particles from updating the personal or global best known positions. A swarm size of \(n_s = 30\) is used, together with a dimensionality of \(n_d = 30\) for each function, and \(i_{max} = 5000\) time steps per run. All plots and scores are the mean and standard deviation over all the functions and runs of a given experiment, thus \((30 \times 45) + (30 \times 7) = 1560\) runs per experiment. Given the combination of algorithmic aspects investigated, a total of 98 experiments are performed and by implication 132,300 runs. The naming convention for experiments is (update_trigger)(p)_(selection_method)(\(n_e\)), in accordance with the explanations in Sect. 3.3. The performance of all experiments is compared in Table 9 in Appendix B according to the normalized global best solution quality achieved by each experiment, with an explanation of the metrics in Sect. 4.1.

4.3 Benchmark function set

Lang and Engelbrecht (2021) proposed a benchmark set of 24 functions, an exploratory landscape analysis-based benchmark suite. This benchmark set was constructed following an intensive analysis of the BBOB (Hansen et al., 2009) and CEC (Liang et al., 2013a, b, 2014; Wu et al., 2016) benchmarks. The analysis used self-organizing maps to cluster the functions according to their fitness landscape characteristics, and showed that many of the functions in the BBOB and CEC benchmarks do not differ significantly. The analysis also indicated gaps, where not all fitness landscape characteristics are sufficiently represented, with overemphasis on others (Cenikj et al., 2022; Lang and Engelbrecht, 2020a, b). Following the analysis, a more comprehensive benchmark set with wide coverage of fitness landscape characteristics was proposed, while still containing many functions which are also used in the CEC, BBOB, and various other benchmark suites.

5 Results

This section explains how implementational details were varied across experiments, and presents the results obtained. Note that some plots contain gaps, for example, in Fig. 21, which appear when particles move so far out of the search space that the swarm diversity calculation in Eq. (9) results in numerical overflow.

5.1 Performance baselines

This section sets four performance baselines, where one is simply the CP-tuned inertia weight PSO, and the others are selected from the self-adaptive approaches which Harrison et al. (2018a) found to exhibit good search characteristics. Because that study surveyed the state of self-adaptive PSO, it provides a good starting point for comparison.

The inertia weight PSO baseline (PSO-IW) uses the constant control parameter configuration from Sect. 2.2, and the second, the time-variant inertia weight PSO (PSO-TVIW), uses the time-variant configuration given by Eq. (3). The third is PSO-TVAC, explained in Sect. 2.2, and the fourth (PSO-RAC) samples random convergent control parameters per particle on every time step (Engelbrecht, 2022; Harrison et al., 2017); that is, the control parameter configurations conform to Poli's convergence criterion in Eq. (4). Table 1 gives the normalized global best solutions per baseline.

Figure 2 shows the CP values of the PSO baselines throughout the search. Figure 3 confirms that all particles are stable for the constant and random baselines, whereas for the time-variant baseline, initially all particles are unstable, which changes exactly halfway through the search, with the entire swarm then becoming stable. Figure 4 shows that for all baselines, initially almost all particles reside outside the feasible search space, but towards the end most are within feasible space. Figure 5 shows that the particle velocities of the constant baseline decrease smoothly, whereas particle velocities for the time-variant baseline explode initially, but then return to smaller values. The random baseline has a high average velocity, but also decreases smoothly. Figure 6 confirms the observations made in Fig. 5, as the swarm diversity is usually directly related to the average particle velocity.

Table 1 Normalized global best solutions for performance baselines
Fig. 2 Control parameters for baseline

Fig. 3 Stable particles for baseline

Fig. 4 Infeasible particles for baseline

Fig. 5 Particle velocity vectors for baseline

Fig. 6 Swarm diversity for baseline

5.2 Belief space and control parameter updates on every time step

In this section, both the belief space and the control parameters are updated on every iteration. The most basic version of the BS-SAPSO algorithm (always_random) randomly selects particles to update the belief space on every time step. Thereafter, one aspect of the algorithm is changed at a time, in accordance with the explanations in Sect. 3.2. Table 2 gives the normalized global best solutions per experiment, and indicates that of the experiments which always update the belief space, elitist and improvement selection tend to perform better, followed closely by improvement-magnitude selection. Only a selection of the metrics is plotted, due to space constraints and because most metrics within a section generally resemble each other across experiments. Since the values given are mean values over 45 functions with 30 runs each, the standard deviation is also given where applicable.

A tendency observed when plotting the belief space boundaries, consisting of { \((w_{min}, w_{max})\), \(({c_1}_{min}, {c_1}_{max})\), \(({c_2}_{min}, {c_2}_{max})\) }, in Fig. 7, is that belief space convergence is almost instantaneous. This occurs because the belief space boundaries are updated on every time step, which is too frequent to allow for exploration of the control parameter space. In Fig. 7, the minimum and maximum belief space boundaries therefore converge to the same value, and the plot shows a flat line.

Another observation is that the belief space boundaries tend to converge around the mean of the initial minimum and maximum belief space boundaries, implying that belief spaces assume statistically likely values rather than displaying self-adaptive behaviour. Belief space updates on every time step therefore do not seem to be a viable approach.

Table 2 Normalized global best solutions for BS/CP updates on every time step

Generally, the number of stable particles decreases when more particles are allowed to influence the belief space (i.e. \(n_e\) is increased), as shown in Fig. 8, with the inverse being true for the number of infeasible particles in Fig. 9. The former is explained by larger belief space ranges being used for selection of new control parameter values if \(n_e\) is higher. Furthermore, for all experiments there are cases of exploding particle velocities, for example, as in Fig. 10, and thus exploding swarm diversity, as in Fig. 11, creating an unstable search process. All methods in Table 2 perform worse than the baselines in Table 1 and therefore need additional modification.

Fig. 7 Belief space boundaries for elitist selection

Fig. 8 Stable particles for elitist selection

Fig. 9 Infeasible particles for elitist selection

Fig. 10 Particle velocity vectors for elitist selection

Fig. 11 Swarm diversity for elitist selection

5.3 Belief space and control parameter updates at fixed intervals

All experiments in this section are based on the best performing experiments of Sect. 5.2 (that is, elitist and improvement selection), but use belief spaces which are updated at fixed intervals, with control parameters also sampled at the same fixed intervals.

Table 3 Normalized global best solutions for BS/CP updates at fixed intervals

Updating the belief space and sampling the control parameters at longer fixed intervals (\(p_f=20\) and \(p_f=50\)) slows belief space convergence, as can be seen in the belief space boundaries in Fig. 12. The delayed convergence improves performance, but increases the number of unstable particles, and introduces the additional parameter \(p_f\). Furthermore, the improvement and elitist selection methods which initially use more particles for belief space updates (higher \(n_e\)) tend to perform better, as shown in Table 3. Finally, despite having a relatively stable swarm, demonstrated in Fig. 13, and low numbers of infeasible particles as shown in Fig. 14, particle velocities in Fig. 15 and swarm diversity in Fig. 16 explode similarly to Sect. 5.2.

Fig. 12 Belief space boundaries for improvement selection at \(p_f=50\) time steps

Fig. 13 Stable particles for improvement selection at \(p_f=50\) time steps

Fig. 14 Infeasible particles for improvement selection at \(p_f=50\) time steps

Fig. 15 Particle velocity vectors for improvement selection at \(p_f=50\) time steps

Fig. 16 Swarm diversity for improvement selection at \(p_f=50\) time steps

5.4 Belief space and control parameter updates upon stagnation of global best solution

All experiments in this section are similar to those in Sect. 5.3, but use belief spaces which are updated when the global best solution does not improve for a certain number of iterations, with control parameters sampled at the same time steps (Figs. 17, 18, 19, 20).

Table 4 Normalized global best solutions for BS/CP updates on stagnation

Delaying convergence by updating the belief space and control parameters only when the objective function value stagnates for longer intervals (\(p_s=20\) and \(p_s=50\)) improves performance compared to the performance reported in Sect. 5.2. Furthermore, improvement and elitist selection methods which initially use more particles for belief space updates (higher \(n_e\)) tend to perform better. Velocity explosion is, however, especially severe here, and regularly exceeds the plot limits, as depicted in Fig. 21 (Table 4).

Fig. 17 Belief space boundaries for improvement selection at \(p_s=20\) time steps

Fig. 18 Stable particles for improvement selection at \(p_s=20\) time steps

Fig. 19 Infeasible particles for improvement selection at \(p_s=20\) time steps

Fig. 20 Particle velocity vectors for improvement selection at \(p_s=20\) time steps

Fig. 21 Swarm diversity for improvement selection at \(p_s=20\) time steps

5.5 Belief space and control parameter updates at delayed intervals

Considering the results of Sects. 5.3 and 5.4, a clear tendency is that performance improves if larger time step intervals are used with high values of \(n_e\). Therefore, this section repeats the experiments from these sections, but with even larger time step intervals, and with \(n_e\) set to the maximum of \(n_e = n_s = 30\). Figure 22 shows a slower belief space convergence, accompanied by fewer stable particles and more infeasible particles in Figs. 23 and 24. Particle velocities and swarm diversity still explode, as shown in Figs. 25 and 26, respectively. The best performing methods in Table 5 outperform the baselines in Table 1, but only marginally.

Table 5 Normalized global best solutions for BS/CP updates at delayed fixed intervals
Fig. 22 Belief space boundaries for improvement selection for \(n_e=30\)

Fig. 23 Stable particles for improvement selection for \(n_e=30\)

Fig. 24 Infeasible particles for improvement selection for \(n_e=30\)

Fig. 25 Particle velocity vectors for improvement selection for \(n_e=30\)

Fig. 26 Swarm diversity for improvement selection for \(n_e=30\)

5.6 Stability-guided BS-SAPSO

The experiments in this section repeat the three best performing experiments from Sects. 5.2 to 5.4, with the additional condition that control parameters must adhere to Poli’s convergence criterion [Eq. (4)], and that the belief spaces are prevented from assuming ranges which preclude sampling of convergent control parameters. The reason for enforcing Poli’s criterion is to prevent the particles from permanently leaving the search space, thereby resulting in a more thorough search of the feasible search space, which will hopefully lead to better solution quality.

Sampling control parameter values which adhere to Poli’s convergence criterion, given by Eq. (4), is straightforward; control parameters which do not adhere to the criterion can simply be discarded and resampled, or else the sampled values for \(\omega\) and \(c_1\) can place boundaries on allowable values for \(c_2\), for more computational efficiency.
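A sketch of the discard-and-resample strategy follows; the retry cap is an illustrative safeguard for the case, discussed next, where the belief space admits no convergent configuration.

```python
import numpy as np

def sample_convergent_cp(bs_lo, bs_hi, rng=None, max_tries=1000):
    """Draw (w, c1, c2) from the belief space until Eq. (4) is satisfied."""
    rng = rng or np.random.default_rng()
    for _ in range(max_tries):
        w, c1, c2 = rng.uniform(bs_lo, bs_hi)
        if -1.0 <= w <= 1.0 and c1 + c2 < 24.0 * (1.0 - w ** 2) / (7.0 - 5.0 * w):
            return w, c1, c2
    raise RuntimeError("belief space admits no convergent configuration")
```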

However, the situation can arise where belief space boundaries assume values which do not permit the sampling of any convergent control parameters. This can happen because belief space boundaries are updated only with reference to a number of influential particles, without considering whether the combination of these new belief space boundaries allows for convergent control parameter values.

To prevent the aforementioned, limits need to be placed on the values which belief space boundaries can assume. Limiting the belief space is, however, not straightforward, due to the parabolic part of Eq. (4). The possible cases which can arise are shown in Fig. 27, where

Fig. 27 Potential belief space configurations for Poli’s convergence criterion

$$\begin{aligned} \omega ' = \frac{24\left( 1-\omega ^{2}\right) }{7-5 \omega }\quad \text {for } \omega \in [-1,1]\quad \text {and}\quad c' = c_{1}+c_{2} \end{aligned}$$

with \(\omega\) the x-axis and \(\omega '\) the y-axis. Also, \(c'_{min}\) (orange) and \(c'_{max}\) (red) represent the range \(c'\) can assume, based on the belief space boundaries \(c_1\) and \(c_2\). Furthermore, \(\omega _{min}\) (blue) and \(\omega _{max}\) (purple) represent the range \(\omega\) (not \(\omega '\)) can assume—these are therefore the belief space boundaries for \(\omega\). The area enclosed by \(c'_{min}\), \(c'_{max}\), \(\omega _{min}\) and \(\omega _{max}\) demarcates the control parameter values which can potentially be sampled. To adhere to the convergence criterion, these control parameters must therefore fall underneath the curve \(\omega '\) (black).

Figure 27a demonstrates the completely convergent case; the enclosed area completely falls under \(\omega '\) —it is therefore impossible to sample non-convergent control parameters. Figure 27b shows overlap: both convergent and non-convergent configurations are possible, which is acceptable, because non-convergent configurations can simply be discarded. In Fig. 27c, the relatively high \(c'\) range, in conjunction with the low \(\omega\)-range, disallows convergent control parameters. The same is true for Fig. 27d, except that here both the \(c'\) range and the \(\omega\) range are too high.

Belief space updates which move belief space boundaries outside the ranges which allow for convergent parameters can be prevented by requiring that the updated belief space adheres to either

$$\begin{aligned} c_{1}+c_{2}<\frac{24\left( 1-\omega _{max}^{2}\right) }{7-5 \omega _{max}} \quad \text {for } \omega _{max} \in [-1,1] \end{aligned}$$
(12)

to prevent the situation in Fig. 27c, or

$$\begin{aligned} c_{1}+c_{2}<\frac{24\left( 1-\omega _{min}^{2}\right) }{7-5 \omega _{min}} \quad \text {for } \omega _{min} \in [-1,1] \end{aligned}$$
(13)

to prevent the situation in Fig. 27d. Note the inversion: \(\omega _{max}\) has to be considered for the small \(\omega\) case, producing Fig. 27e as the corrected version of Fig. 27c with Eq. (12) enforced. Similarly, \(\omega _{min}\) has to be considered for the large \(\omega\) case, producing Fig. 27f as the corrected version of Fig. 27d with Eq. (13) enforced. In order to clearly demonstrate the difference between Fig. 27c, d, the \(\omega\) range has been kept fixed, adjusting only the \(c'\) range. In reality, both ranges would be adjusted to maintain the possibility of sampling convergent parameters, as once again both convergent and non-convergent control parameters can be sampled, discarding non-convergent parameters.
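In code, the endpoint checks of Eqs. (12) and (13) reduce to requiring that the smallest samplable \(c' = c_1 + c_2\) lies below Poli's bound at one end of the \(\omega\) range; the following sketch is a simplified reading of that requirement, with illustrative names.

```python
def poli_bound(w):
    """Right-hand side of Eq. (4) as a function of w."""
    return 24.0 * (1.0 - w ** 2) / (7.0 - 5.0 * w)

def admits_convergent_cp(w_min, w_max, c1_min, c2_min):
    """True if a proposed belief space satisfies Eq. (12) or Eq. (13)."""
    c_prime_min = c1_min + c2_min  # smallest samplable c' under the new bounds
    # Eq. (12) checks the bound at w_max (small-w case, Fig. 27c) and
    # Eq. (13) checks it at w_min (large-w case, Fig. 27d).
    return c_prime_min < poli_bound(w_max) or c_prime_min < poli_bound(w_min)
```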

Table 6 Normalized global best solutions of the stability-guided BS-SAPSO

As is clear from Fig. 28, the belief space never actually converges. Furthermore, the stability guiding mechanism works perfectly, as can be seen from Fig. 29, which shows a completely stable swarm, leading to a very small number of infeasible particles in Fig. 30. As a result, particle velocities in Fig. 31 and swarm diversity in Fig. 32 take on values which are orders of magnitude smaller than in Sect. 5.5. However, while the stability guiding mechanism reduces the instability of the search, it actually degrades performance, as is clear when comparing Table 6, where the prefix ‘sg’ indicates experiments employing stability guiding, to Table 5. The performance degradation is probably attributable to exploration of the search space being reduced when only stable control parameter values can be selected.

Fig. 28 Belief space for ‘Stagnate’, improvement selection, stability guided

Fig. 29 Stable particles for ‘Stagnate’, improvement selection, stability guided

Fig. 30 Infeasible particles for ‘Stagnate’, improvement selection, stability guided

Fig. 31 Velocity vectors for ‘Stagnate’, improvement selection, stability guided

Fig. 32 Swarm diversity for ‘Stagnate’, improvement selection, stability guided

5.7 BS-SAPSO with velocity clamping

The experiments in this section repeat the experiments from Sect. 5.5 while imposing velocity clamping per dimension in an attempt to mitigate velocity explosion, because the stability guidance mechanism from Sect. 5.6, which also sought to reduce velocity explosion, degraded the performance of the PSO. The maximum velocity, \(v_{max}\), is set equal to the velocity that would allow a particle to traverse the whole search space in one step, and is therefore determined by the domain over which each function is defined. The results are shown in Table 7, where the prefix ‘vc’ indicates velocity clamping (Figs. 33, 34, 35).

Table 7 Normalized global best solution of velocity clamped BS-SAPSO

The benefits of velocity clamping are clear, with performance scores that are much better than those of the baseline methods in Table 1. The particle velocities in Fig. 36 and the swarm diversity plots in Fig. 37 now clearly do not grow without bound, and remain in the same order of magnitude as the search space dimensions.

Fig. 33 Belief space boundaries for improvement selection for \(n_e=30\)

Fig. 34 Stable particles for improvement selection for \(n_e=30\)

Fig. 35 Infeasible particles for improvement selection for \(n_e=30\)

Fig. 36 Particle velocity vectors for improvement selection for \(n_e=30\)

Fig. 37 Swarm diversity for improvement selection for \(n_e=30\)

5.8 Overview of results

Generally, the trend observed throughout is that selection methods with higher selective pressure and less randomness, such as elitist and improvement selection, perform better. Furthermore, higher values for \(n_e\) and p lead to slower convergence of the belief space, and perform better. Whereas updating the control parameters at fixed intervals results in the belief space boundaries following the statistical mean, updating the belief space only upon stagnation of the global best position demonstrates more nuanced updating of the belief space boundaries, because the time step at which the objective function value stagnates is largely influenced by the landscape of the specific objective function. This does, however, not improve performance, most likely because the belief space boundaries are only updated when the objective function value has already stagnated, at which point it may be too late to select new control parameters which could otherwise promote more exploration. Introduction of a stability criterion, which ensures that only convergent control parameter configurations can be sampled from the belief space, does not improve performance, but does reduce the number of infeasible particles. Finally, velocity clamping per dimension is shown to be a very effective mechanism for mitigating velocity explosion, and greatly improves performance.

Table 9 in Appendix B shows that the best performing experiments updated the belief space at fixed intervals with \(p=\lbrace 100,200,300 \rbrace\), used improvement-based selection with \(n_e=30\), and implemented velocity clamping. The best performing BS-SAPSO variant achieves an improvement of 20% in global best solution quality over PSO-IW on the evaluation function set, and 3% on the test set. When compared to the PSO-TVIW baseline, the improvements in performance are 46% and 55%, respectively.

5.9 Statistical significance

In order to determine whether the performance differences between the BS-SAPSO variants and the baselines are statistically significant, Friedman and Mann–Whitney U tests are performed on the normalized global best solutions, on both the evaluation and test sets.

For the evaluation set, the Friedman test resulted in a test statistic of 2052.02 and a p-value of 0.0, indicating statistical significance. For the test set, the Friedman test resulted in a test statistic of 2237.54 and a p-value of 0.0, indicating statistical significance.

While the Friedman test was performed for all BS-SAPSO variants, the Mann–Whitney U test (and the subsequent analysis) is only performed for the 18 best performing variants, together with PSO-RAC. The Mann–Whitney U test used the Bonferroni correction to account for multiple comparisons, with \(\alpha = 0.02 / num\_comparisons\). These variants are compared against PSO-IW, because it is the best performing baseline. The results of the Mann–Whitney U tests are shown in Table 8, with box plots in Figs. 38 and 39. Corresponding Mann–Whitney U heatmaps of the p-values are given in Figs. 40 and 41 in Appendix A.

The top-performing algorithms, such as vc_fixed100_improve30, therefore constitute statistically significant improvements over PSO-IW.

Table 8 Mann–Whitney U test results for BS-SAPSO variants compared to PSO-IW
Fig. 38 Box plot of normalized global best solutions on the evaluation set

Fig. 39 Box plot of normalized global best solutions on the test set

5.10 Computational complexity

The computational load of the inertia weight PSO is influenced by a number of factors:

  • The number of particles in the swarm (\(n_s\)): each particle represents a potential solution to the optimization problem, therefore more particles imply the exploration of more potential solutions.

  • The dimensionality of the problem (\(n_d\)): higher dimensionality implies a larger search space and therefore more complexity.

  • The number of iterations (\(i_{max}\)): the PSO algorithm iteratively updates the particles’ positions based on the best-known positions, and more iterations require more compute.

As such, the computational complexity of the BS-SAPSO algorithm is expressed as follows:

  • Initialization of particles has a complexity of \(\mathcal {O}(n_s \cdot n_d)\), because each particle needs an initial position in each dimension.

  • The main loop, where the particles’ positions and velocities are updated, has a complexity of \(\mathcal {O}(i_{max} \cdot n_s \cdot n_d)\). The position and velocity have to be updated in every dimension, for every particle, on every time step.

  • Selection of the global best solution has a complexity of \(\mathcal {O}(n_s)\), because it requires each particle to be checked.

  • Updating of the belief space has a complexity of \(\mathcal {O}(n_s)\), because it requires each particle to be checked.

  • Updating of the control parameters has a complexity of \(\mathcal {O}(1)\), because it is a simple operation.

The total computational complexity of the BS-SAPSO algorithm is \(\mathcal {O}(i_{max} \cdot n_s \cdot n_d)\), which is the same as the complexity of the inertia weight PSO algorithm. The BS-SAPSO does not introduce nested loops, and the additional operations are \(\mathcal {O}(1)\) or \(\mathcal {O}(n_s)\). Therefore, the BS-SAPSO algorithm does not increase the asymptotic computational complexity of the inertia weight PSO algorithm.

6 Conclusions

The study designed a self-adaptive particle swarm optimization (SAPSO) algorithm using a belief space (BS). Considering the performance improvements obtained over the baselines, the proposed belief space-guided self-adaptive particle swarm optimization (BS-SAPSO) algorithm can be considered successful. The best performance is achieved by setting the number of particles which update the BS \((n_e)\) to a high value, as well as by using large values for the interval at which the belief space is updated (p), which prevents premature BS convergence by delaying updates. However, while improving performance and adapting the PSO control parameters (CPs), the algorithm does introduce \(n_e\) and p as new parameters, as well as the velocity clamping bound \(v_{max}\). That being said, if \(n_e\) is always set equal to the swarm size (which seems to be optimal in almost all experiments conducted), and \(v_{max}\) to the distance between the points furthest apart in the search space, the case can be made that the runtime parameters have been reduced from three to one. While not exhaustive, the brute force grid search undertaken in Sect. 5 does serve to elucidate which types of belief space behaviour result in relatively better and worse performance, and shows that more exploration (slower convergence) of the belief space generally leads to better performance. It is also important to note that all experiments were conducted for a fixed swarm size, function dimensionality, and maximum number of time steps. To summarize, the algorithm improves performance, prevents velocity explosion, and reduces the number of CPs from three to one.

7 Future work

Since the belief space tends to converge around the average of the initial belief space boundaries, it might be worth performing an ablation study to see whether the belief space mechanism is truly adapting to each objective function, or whether it is simply converging to the statistical mean. To test this, the convergence of the belief space limits could be hard-coded, after which the performance of the algorithm could be compared to the performance of the algorithm with a self-adaptive belief space mechanism. If the performance is the same, it would suggest that the belief space mechanism is not adapting to the objective function, but that decreasing the belief space boundaries in a manner such as in Fig. 33 does nonetheless improve performance.

Evaluation of the BS approach to self-adaptation on other metaheuristics, such as genetic algorithms and differential evolution, could also be worthwhile and shed light on the generalisability of the algorithm. The same applies to other problem classes, such as large-scale, dynamic, multi-objective, and real-world optimization problems.

A potentially promising avenue for improving adaptiveness might be the use of reinforcement learning (RL). Finding a self-adaptive PSO algorithm essentially amounts to finding a policy which governs the adjustment of control parameter values, based on observations of the particles, in order to find the optimal objective function value. This is an exemplary reinforcement learning problem, in which an agent learns a policy to govern its actions, based on observations, in order to maximise a reward signal. Such an approach stands to gain from the following advantages:

  • an RL agent could explore millions of policies, many more than could be investigated manually;

  • if the RL experiment is configured correctly, and if there is a policy to be found, there is a higher likelihood of finding said policy, because the agent can take into account far more correlations and relationships in the runtime behaviour than can be done manually; and

  • a policy found in this way should be representable as an algorithmic set of actions, and will therefore hopefully not introduce additional control parameters.

An investigation into RL-based SAPSO therefore seems like a promising avenue for future work.