Introduction

Many algorithms have vanished within a decade of their introduction; the particle swarm optimization algorithm has continued to attract researchers’ attention for nearly a quarter century. Swarm intelligence, which is based on a population of individuals, is a collection of nature-inspired search techniques. Particle swarm optimization (PSO), one of the swarm intelligence algorithms, was invented by Eberhart and Kennedy in 1995 [29, 42]. It is a population-based stochastic algorithm modeled on the social behaviors observed in flocking birds. Each particle, which represents a solution, flies through the search space with a velocity that is dynamically adjusted according to its own and its companions’ historical behaviors. The particles tend to fly toward better search areas over the course of the search process [31].

Optimization, in general, is concerned with finding the “best available” solution(s) for a given problem. Optimization problems can be simply divided into unimodal problems and multimodal problems. As the name indicates, a unimodal problem has only one optimum solution; multimodal functions, on the contrary, have several or numerous optimum solutions, many of which are local optima. It is difficult for optimization algorithms to find the global optimum solutions. Avoiding premature convergence is important in multimodal problem optimization, i.e., an algorithm should strike a balance between fast convergence speed and the ability to “jump out” of local optima. A concrete contrast between the two problem classes is sketched below.
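To make the distinction concrete, the following minimal Python sketch contrasts a typical unimodal benchmark with a typical multimodal one; the choice of the sphere and Rastrigin functions here is illustrative, not prescribed by the text.

```python
import numpy as np

def sphere(x):
    # Unimodal: a single optimum at the origin, f(0) = 0.
    return float(np.sum(x ** 2))

def rastrigin(x):
    # Multimodal: the global optimum at the origin is surrounded by a
    # regular grid of local optima that can trap a greedy search.
    return float(10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x)))
```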

Many real-world applications can be modeled as optimization problems. As an outstanding swarm intelligence algorithm, particle swarm optimization has been widely used to solve numerous real-world problems. It is difficult, if not impossible, to list all the problems that can be solved via PSO algorithms. Scheduling problems and data mining problems are two typical real-world applications that can be solved by PSO algorithms.

The aim of this paper is to provide a comprehensive review of particle swarm optimization algorithms. The remainder of the paper is organized as follows. The basic concepts and the developmental history of the PSO algorithm are reviewed in Sect. 2. In Sect. 3, different variants of PSO algorithms and their applications to various problems are introduced. The characteristics and issues of PSO algorithms are described in Sect. 4. Section 5 gives some real-world applications of PSO algorithms. Finally, Sects. 6 and 7 conclude with future research directions and some remarks, respectively.

The historical development

Rome was not built in a day. After a quarter century, several papers on PSO algorithms, including those introducing the original and canonical PSO algorithms, have been cited more than ten thousand times [29, 42, 71]. The PSO algorithm has undergone great development since it was proposed in 1995 [29, 42].

Original particle swarm optimization

Each particle represents a potential solution in particle swarm optimization, and this solution is a point in the n-dimensional solution space. The original PSO algorithm is simple in concept and easy to implement [29, 42]. The velocity \(v_{ij}\) and position \(x_{ij}\) of the jth dimension of the ith particle are updated as follows [32, 43]:

$$\begin{aligned} v_{ij}&= v_{ij} + c_{1}\text {rand}()(p_{ij} - x_{ij}) + c_{2}\text {Rand}()(p_{nj} - x_{ij}) \end{aligned}$$
(1)
$$\begin{aligned} x_{ij}&= x_{ij} + v_{ij} \end{aligned}$$
(2)

where \(c_{1}\) and \(c_{2}\) are positive constants, \(\text {rand}()\) and \(\text {Rand}()\) are two random functions in the range [0, 1) that are sampled anew for each dimension and each particle, \(p_{ij}\) is the jth dimension of the ith particle’s personal best position, and \(p_{nj}\) is the jth dimension of the best position found in the ith particle’s neighborhood.
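A minimal Python sketch of this per-dimension update follows; the setting \(c_{1} = c_{2} = 2.0\) is a commonly cited choice and is assumed here, not mandated by Eqs. (1)–(2).

```python
import random

def original_pso_step(v, x, p_i, p_n, c1=2.0, c2=2.0):
    """One update of a single particle per Eqs. (1)-(2).

    v, x : the particle's velocity and position (lists of floats)
    p_i  : the particle's personal best position
    p_n  : the best position found in the particle's neighborhood
    """
    for j in range(len(x)):
        # rand() and Rand() are drawn anew for every dimension.
        v[j] += c1 * random.random() * (p_i[j] - x[j]) \
              + c2 * random.random() * (p_n[j] - x[j])
        x[j] += v[j]
    return v, x
```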

Canonical particle swarm optimization

In the original particle swarm optimizer, the velocity is difficult to control during the search process, and the final solution is heavily dependent on the initial seeds (population). Different problems require different balances between the local search ability and the global search ability. Shi and Eberhart introduced a new parameter, an inertia weight w, to balance exploration and exploitation [71, 78]. This inertia weight w is added to Eq. (1); it can be a constant, a linearly decreasing value over time [72], or a fuzzy value [73, 76]. The new velocity update equation is as follows:

$$\begin{aligned} v_{ij} = wv_{ij} + c_{1}\text {rand}()(p_{ij} - x_{ij}) + c_{2}\text {Rand}()(p_{nj} - x_{ij}) \end{aligned}$$
(3)

Adding an inertia weight w increases the probability that the algorithm converges to better solutions and provides a way to control the whole search process. Generally speaking, an algorithm should have more exploration ability and less exploitation ability at first, which gives it a high probability of finding more local optima. Exploration should then be decreased and exploitation increased over time to refine candidate solutions. Accordingly, the inertia weight w should be linearly decreased, or even dynamically determined by a fuzzy system, as sketched below.
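As a concrete illustration, the snippet below implements a linearly decreasing schedule; the bounds 0.9 and 0.4 are a frequently used setting in the literature and are an assumption here.

```python
def inertia_weight(t, t_max, w_start=0.9, w_end=0.4):
    # Linearly decrease w from w_start to w_end as iteration t runs
    # from 0 to t_max: more exploration early, more exploitation late.
    return w_start - (w_start - w_end) * t / t_max
```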

The equations of the classical particle swarm optimization algorithm can be rewritten in vector form. The velocity and position update equations are as follows [32, 43, 71]:

$$\begin{aligned} \mathbf {v}_{i}&\leftarrow w \mathbf {v}_{i} + c_{1} \mathbf {rand()} (\mathbf {p}_{i} - \mathbf {x}_{i}) + c_{2} \mathbf {Rand()} (\mathbf {p}_{n} - \mathbf {x}_{i}) \end{aligned}$$
(4)
$$\begin{aligned} \mathbf {x}_{i}&\leftarrow \mathbf {x}_{i} + \mathbf {v}_{i} \end{aligned}$$
(5)

where w denotes the inertia weight [73, 78], \(c_{1}\) and \(c_{2}\) are two positive acceleration constants, \(\text {rand}()\) and \(\text {Rand}()\) are two random functions that generate uniformly distributed random numbers in the range [0, 1) and are sampled anew for each dimension and each particle, \(\mathbf {x}_{i}\) represents the ith particle’s position, \(\mathbf {v}_{i}\) represents the ith particle’s velocity, \(\mathbf {p}_{i}\) is termed the personal best, which refers to the best position found by the ith particle so far, and \(\mathbf {p}_{n}\) is termed the local best, which refers to the position with the best fitness evaluation value found so far by the members of the ith particle’s neighborhood.

The inertia weight w can differ across particles and dimensions, in which case it is written as \(\mathbf {w}_{i}\). Taking the iteration number t into account, the equations are rewritten as:

$$\begin{aligned} \mathbf {v}_{i}(t+1)&\leftarrow \mathbf {w}_{i} \mathbf {v}_{i}(t) + c_{1} \mathbf {rand()} (\mathbf {p}_{i} - \mathbf {x}_{i}(t)) \\&\quad + c_{2} \mathbf {Rand()} (\mathbf {p}_{n} - \mathbf {x}_{i}(t)) \\ \mathbf {x}_{i}(t+1)&\leftarrow \mathbf {x}_{i}(t) + \mathbf {v}_{i}(t+1) \end{aligned}$$

Random variables are frequently utilized in swarm optimization algorithms, and the length of the search step is not predetermined. Such methods belong to an interesting class of algorithms known as randomized algorithms. A randomized algorithm does not guarantee an exact result but instead provides a high-probability guarantee that it will return the correct answer or one close to it. The result(s) of optimization may differ from run to run, but the algorithm has a high probability of finding “good enough” solution(s).

The flowchart of the particle swarm optimization algorithm is shown in Fig. 1, and its procedure is given in Algorithm 1.

Fig. 1 The flowchart of the particle swarm optimization algorithm



A particle updates its velocity according to Eq. (4) and its position according to Eq. (5). The \(c_{1} \text {rand()} (\mathbf {p}_{i} - \mathbf {x}_{i})\) part can be seen as a cognitive behavior, while the \(c_{2} \text {Rand()} (\mathbf {p}_{n} - \mathbf {x}_{i})\) part can be seen as a social behavior.

In particle swarm optimization, a particle learns not only from its own experience but also from its companions. In other words, a particle’s next position is determined by both its own experience and its neighbors’ experience. A minimal implementation of the canonical PSO described above is sketched below.
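Putting Eqs. (4)–(5) together, a minimal NumPy sketch of the canonical PSO with the global-star topology could look as follows; the parameter values (w = 0.729, c1 = c2 = 1.49445) are a widely used constriction-equivalent setting and are assumed here rather than taken from this section.

```python
import numpy as np

def canonical_pso(f, dim, bounds, swarm_size=30, iters=200,
                  w=0.729, c1=1.49445, c2=1.49445, seed=None):
    """Minimal canonical PSO (global-star topology) per Eqs. (4)-(5).

    f      : objective to minimize; f(x) takes a 1-D numpy array
    bounds : (low, high) search range shared by all dimensions
    """
    rng = np.random.default_rng(seed)
    low, high = bounds
    x = rng.uniform(low, high, (swarm_size, dim))   # positions
    v = np.zeros((swarm_size, dim))                 # velocities
    pbest = x.copy()                                # personal bests p_i
    pbest_val = np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pbest_val)].copy()          # swarm best p_n

    for _ in range(iters):
        r1 = rng.random((swarm_size, dim))          # rand(), per dimension
        r2 = rng.random((swarm_size, dim))          # Rand(), per dimension
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)   # Eq. (4)
        x = np.clip(x + v, low, high)                           # Eq. (5)
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, float(pbest_val.min())
```

Since the algorithm is randomized, one would typically run it several times with different seeds, e.g. `[canonical_pso(rastrigin, 10, (-5.12, 5.12), seed=s)[1] for s in range(25)]`, and report statistics over the runs.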

Fully informed particle swarm optimization

Fully informed PSO (FIPS) does not use the concept of a “global/local best”. A particle in FIPS does not follow a single leader in its neighborhood but instead follows all the particles in its neighborhood. The basic equations of the FIPS algorithm are as follows [44, 56]:

$$\begin{aligned} \mathbf {v}_{i}&\leftarrow \chi \left( \mathbf {v}_{i} + \sum _{k = 1}^{N_{i}}\frac{U(0, \varphi ) (\mathbf {p}_{\text {nbr}(k)} - \mathbf {x}_{i})}{N_{i}} \right) \end{aligned}$$
(6)
$$\begin{aligned} \mathbf {x}_{i}&\leftarrow \mathbf {x}_{i} + \mathbf {v}_{i} \end{aligned}$$
(7)

where \(\chi \) denotes the constriction coefficient, \(U(0, \varphi )\) is a random function that generates uniformly distributed random numbers in the range \([0, \varphi ]\), \(N_{i}\) represents the neighborhood size of the ith particle, and \(\mathbf {p}_{\text {nbr}(k)}\) represents the personal best position of the kth neighbor.

Each particle in a PSO algorithm represents a potential solution, which is a point in the D-dimensional solution space, and is associated with two vectors, i.e., the velocity vector and the position vector. Throughout this paper, i is used to index the particles or solutions (from 1 to S) and d is used to index the dimensions (from 1 to D), where S represents the number of particles and D the number of dimensions. The position of the ith particle is represented as \(\mathbf {x}_{i} = [x_{i1}, x_{i2}, \ldots , x_{id}, \ldots , x_{iD}]\), where \(x_{id}\) represents the value of the dth dimension of the ith solution, with \(i= 1, 2, \ldots , S\) and \(d = 1, 2, \ldots , D\). The velocity of a particle is represented as \(\mathbf {v}_{i} = [v_{i1}, v_{i2}, \ldots , v_{id}, \ldots , v_{iD}]\).
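A minimal sketch of one FIPS update applied to every particle; the constriction setting χ ≈ 0.7298 with φ = 4.1 is a common choice assumed here, not a value fixed by Eqs. (6)–(7).

```python
import numpy as np

def fips_step(x, v, pbest, neighbors, chi=0.7298, phi=4.1, rng=None):
    """One FIPS update per Eqs. (6)-(7).

    neighbors : neighbors[i] lists the indices informing particle i,
                so len(neighbors[i]) plays the role of N_i.
    """
    rng = rng or np.random.default_rng()
    for i in range(len(x)):
        nbr = neighbors[i]
        # Every neighbor's personal best pulls on particle i,
        # each weighted by a fresh U(0, phi) draw per dimension.
        pull = sum(rng.uniform(0.0, phi, x.shape[1]) * (pbest[k] - x[i])
                   for k in nbr) / len(nbr)
        v[i] = chi * (v[i] + pull)      # Eq. (6)
        x[i] = x[i] + v[i]              # Eq. (7)
    return x, v
```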

The parameter w was introduced to control the global search and local search abilities [71]; the PSO algorithm with the inertia weight is termed the canonical/classical PSO. The canonical PSO algorithm can be rewritten as the PSO with constriction factor (PSO-CF) [26], whose new velocity update equation was used to analyze the explosion, stability, and convergence of the algorithm. First, a weighted mean of the personal best and the neighborhood best is defined:

$$\begin{aligned} p_{md} = \frac{\varphi _{1} p_{id} + \varphi _{2} p_{nd}}{\varphi _{1} + \varphi _{2}} \end{aligned}$$
(8)

Equation (4) is then reformulated as follows:

$$\begin{aligned} v_{id}^{t+1} = \chi ( v_{id}^{t} + \varphi \text {rand()} (p_{md}^{t} - x_{id}^{t})) \end{aligned}$$
(9)

where \(\varphi = \varphi _{1} + \varphi _{2}\). Equation (9) can also be expanded as follows:

$$\begin{aligned} v_{id}^{t+1} =&\chi [ v_{id}^{t} + \varphi _{1} \text {rand}()(p_{id}^{t} - x_{id}^{t}) + \varphi _{2} \text {rand}()(p_{nd}^{t} - x_{id}^{t})] \end{aligned}$$
(10)

Based on Eq. (10), Eq. (9) can easily be transformed into Eq. (4) via \(w = \chi \), \(c_{1} = \chi \times \varphi _{1}\), and \(c_{2} = \chi \times \varphi _{2}\). The inertia weights and constriction factors in PSO were discussed in [30].
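For reference, χ is usually obtained from the standard Clerc–Kennedy constriction expression, which this section does not restate and is therefore quoted here as background; the mapping above can then be computed directly:

```python
import math

def constriction_to_inertia(phi1=2.05, phi2=2.05):
    # Clerc-Kennedy constriction factor, valid for phi = phi1 + phi2 > 4.
    phi = phi1 + phi2
    chi = 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))
    # Mapping to the inertia-weight form of Eq. (4):
    #   w = chi, c1 = chi * phi1, c2 = chi * phi2.
    return chi, chi * phi1, chi * phi2

print(constriction_to_inertia())   # approx (0.7298, 1.4962, 1.4962)
```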

The state-of-the-art

Many variants of the PSO algorithm have been proposed, including variants with multiple swarms, new efficient learning strategies, diversity maintaining strategies, and hybridization with other algorithms, to solve various optimization problems.

Algorithms

It is difficult, if not meaningless, to count the number of PSO variants. The following list names several PSO variants as examples; they can be generally categorized into five groups:

  1.

    The first adjusts the configuration parameters to balance the global and local search abilities:

    • Standard particle swarm optimization (SPSO-BK) [8],

    • Standard particle swarm optimization 2011 [24],

    • Self-organizing hierarchical particle swarm optimizer with time-varying acceleration coefficients (HPSO-TVAC) [66],

    • Adaptive particle swarm optimization (APSO) [96].

  2.

    The second aims to enhance the population diversity by designing new information propagation strategies:

    • Fully informed particle swarm optimization [44, 56],

    • Particle swarm optimization algorithm with re-initialization strategy [20].

  3.

    The third is the hybridization of PSO algorithm and other auxiliary search techniques:

    • Multiple strategies-based orthogonal design particle swarm optimizer (MSODPSO) [64],

    • Particle swarm optimization algorithm with parasitic behavior (PSOPB) [62],

    • Particle swarm optimization with dynamical exploitation space reduction strategy (DESP-PSO) [21],

    • Particle swarm optimization with an aging leader and challengers (ALC-PSO) [13],

    • Adaptive particle swarm optimization with heterogeneous multicore parallelism and GPU acceleration [85].

  4.

    The fourth introduces multiple swarms or coevolving groups to improve the global search ability:

    • Cooperative particle swarm optimizer (CPSO) [7],

    • Dynamic multi-swarm particle swarm optimizer (DMS-PSO) [50],

    • Particle swarm optimization with interswarm interactive learning strategy [63],

    • Cooperatively coevolving particle swarms (CCPSO2) [48].

  5.

    The fifth is PSO algorithms with new efficient learning strategy:

    • Comprehensive learning particle swarm optimizer (CLPSO) [49],

    • Orthogonal particle swarm optimization (OPSO) [38],

    • Orthogonal learning particle swarm optimization (OLPSO) [97],

    • Genetic learning particle swarm optimization [35].

Different topology structures can be utilized in PSO, each giving a different strategy for sharing search information among particles. The global star and the local ring are the two most commonly used structures. A PSO with the global star structure, where all particles are connected to each other, has the smallest average distance in the swarm; on the contrary, a PSO with the local ring structure, where every particle is connected to its two nearest particles, has the largest average distance in the swarm [22, 56]. The two neighborhood schemes are contrasted in the sketch below.
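A small sketch of how the neighborhood best is determined under the two topologies; the ring here uses the immediate left and right neighbors, per the description above.

```python
import numpy as np

def neighborhood_best(pbest_val, topology="ring"):
    """Index of the neighborhood best for every particle.

    "star": all particles are connected, so everyone follows the
            single swarm-wide best.
    "ring": particle i sees only itself and its two adjacent
            particles (indices wrap around).
    """
    s = len(pbest_val)
    if topology == "star":
        return np.full(s, int(np.argmin(pbest_val)))
    best = np.empty(s, dtype=int)
    for i in range(s):
        ring = [(i - 1) % s, i, (i + 1) % s]
        best[i] = min(ring, key=lambda k: pbest_val[k])
    return best
```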

Problems

Optimization problems can be classified into several categories. According to the number of objectives, they can be divided into single-objective, multiobjective, and many-objective problems. Based on the properties of the decision variables, problems are labeled as dynamic problems, large-scale optimization problems, etc.

Single-objective problems

Single-objective problems are the basic problems in optimization. The original PSO algorithm was tested on single-objective problems with continuous search ranges [29, 42]. Modified PSO algorithms have been used to solve discrete or combinatorial optimization problems [6].

Multiobjective problems

Multiobjective optimization refers to optimization problems that involve two or three conflicting objectives, for which a set of solutions is sought instead of a single one [27]. For multiobjective problems, traditional mathematical programming techniques have to perform a series of separate runs to satisfy the different objectives [27]. Many kinds of PSO algorithms have been used to solve multiobjective optimization problems [28], such as the adaptive multiobjective PSO algorithm [36], the geometric structure-based PSO algorithm [94], and the normalized ranking-based PSO algorithm [17], just to name a few. An essential issue in utilizing a particle swarm optimizer to solve multiobjective or many-objective problems is the setting of the personal best \(\mathbf {p}_{i}\) and the neighborhood best \(\mathbf {p}_{n}\) [17].

Many-objective problems

Many-objective optimization refers to solving problems with more than three conflicting objectives [33]. Unlike in multiobjective optimization, Pareto optimality is not effective because nearly all solutions are Pareto non-dominated when there are more than three objectives [86]. Maintaining population diversity is another difficulty in many-objective optimization because similarity is hard to estimate in high-dimensional space [86]. The comparison of solutions, or the selection of more “representative” solutions, is an essential issue in many-objective optimization. Various PSO algorithms have been proposed to solve many-objective optimization problems, such as the PSO algorithm with a two-stage strategy and a parallel cell coordinate system [39], the normalized ranking-based PSO algorithm [17], and the two-archive algorithm [86], just to name a few.

Multimodal multiobjective problems

PSO algorithms have also been used to solve new kinds of optimization problems, such as multimodal multiobjective problems [95]. For multimodal multiobjective problems, the algorithm needs to find multiple global optima in the search space that satisfy more than one objective in the objective space [19].

Besides the aforementioned problems, PSO algorithms have also been applied to other problems, such as large-scale problems [15, 16], dynamic multimodal optimization problems [88], etc.

Characteristics and issues

Evolution and learning are two basic characteristics of PSO algorithms or, more generally, of swarm intelligence algorithms. There are many unsolved issues in swarm intelligence algorithms, such as exploration versus exploitation, population diversity, and parameter setting.

Evolution

Evolution is an important concept in evolutionary computation and swarm intelligence. In biology, “evolution” is defined as “a change in the heritable characteristics of biological populations over successive generations.” In swarm intelligence algorithms, evolution indicates that solutions are generated iteration by iteration and move toward better and better search areas over the course of the search process. In PSO algorithms, the “leader” particles, or the particles to learn from, are selected in this self-evolution or self-improvement process, and the obtained fitness values improve through it. Normally, there is a single group in a swarm intelligence algorithm, and information is propagated among all individuals. Hybrid algorithms and multiple sub-swarms are two special cases with different evolution methods: in these two kinds of algorithms, the search information differs across groups and is exchanged at certain times.

Hybrid algorithms

The aim of a hybrid algorithm is to combine the strengths of two or more algorithms while simultaneously trying to minimize any substantial disadvantage [81]. The PSO algorithm has been combined with fuzzy modeling [83] and a self-organizing radial basis function (RBF) neural network [37] to solve various problems.

Multiple sub-swarm

Niching methods are able to locate multiple solutions in multimodal search areas, and the PSO algorithm can be combined with niching techniques to solve such problems rapidly [9]. The swarm of particles can be divided into multiple sub-swarms [93], groups with dynamically changing sizes [48], or multiple species [99]. The different swarms can take on different functions during the search, and the search information can be propagated effectively via an interswarm interactive learning strategy [63].

Learning

Learning has two aspects in the particle swarm optimization algorithm. One is learning from the problem, which means the algorithm is able to adjust its search strategies or parameters dynamically during the search. The other is learning from the particles themselves, which concerns how search information is propagated among all particles.

Developmental swarm intelligence (DSI) is a new framework of swarm intelligence [77]. A DSI algorithm has two kinds of functionalities: capability learning and capacity developing. Capacity developing is a top-level (macro-level) learning methodology: it describes the ability of an algorithm to adaptively change its parameters, structures, and/or its learning potential according to the search states of the problem to be solved. In other words, capacity developing is the search potential possessed by an algorithm. Capability learning is a bottom-level (micro-level) learning: it describes the ability of an algorithm to find better solution(s) from the current solution(s) with the learning capacity it possesses. The flowchart of the developmental particle swarm optimization algorithm is shown in Fig. 2.

Fig. 2 The flowchart of the developmental particle swarm optimization algorithm

Learning from problem

Capability learning indicates that the algorithm has the ability to learn from the problem, i.e., PSO can dynamically adjust its search strategy during the search. Several particle swarm optimization algorithms with different capability learning strategies have been proposed, such as the adaptive PSO [96].

The objects of learning

The objects of learning can differ across PSO variants. For example, in the comprehensive learning PSO (CLPSO), a particle is able to learn from the personal best positions of other particles [49]. The learning equation of CLPSO is given in Eq. (11):

$$\begin{aligned} v_{id} = wv_{id} + c \times \text {rand}()(pbest_{fi(d)d} - x_{id}) \end{aligned}$$
(11)

where \(\mathbf {f}_{i} = [f_{i(1)}, f_{i(2)}, \ldots , f_{i(D)}]\) defines which particle’s pbest the ith particle follows in each dimension.
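A minimal sketch of this update for one particle; the construction of the exemplar indices \(\mathbf {f}_{i}\) (CLPSO builds them with a tournament-style selection and a learning probability) is assumed to happen elsewhere, and the parameter values are illustrative.

```python
import numpy as np

def clpso_velocity(v, x, pbest, f_i, w=0.7, c=1.49445, rng=None):
    """CLPSO velocity update per Eq. (11) for one particle.

    pbest : (S, D) array of all particles' personal bests
    f_i   : length-D integer array; f_i[d] names whose pbest
            dimension d of this particle learns from.
    """
    rng = rng or np.random.default_rng()
    d = np.arange(x.size)
    exemplar = pbest[f_i, d]    # pbest_{f_i(d), d} for every dimension d
    return w * v + c * rng.random(x.size) * (exemplar - x)
```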

The speed of learning

A particle updates its position in the search space at each iteration. The velocity update equation consists of three parts: the previous velocity, the cognitive part, and the social part. The cognitive part means that a particle learns from its own search experience; correspondingly, the social part means that a particle learns from other particles, in particular from the best particle in its neighborhood. The topology defines the neighborhood of a particle [22].

The particle swarm optimization algorithm has different kinds of topology structures, e.g., the star, ring, four-clusters, or von Neumann structures. Particles in PSOs with different structures have different numbers of neighbors and different neighborhood scopes. Learning from a different neighbor means that a particle follows a different neighborhood (or local) best; in other words, the topology structure determines the connections among particles and the strategy of search information propagation.

In general, a PSO with the star topology has the smallest diameter and average distance, which means that search information propagates fastest among all topologies; on the contrary, a PSO with the ring topology has the largest diameter and average distance.

Exploration versus exploitation

The most important factor affecting an optimization algorithm’s performance is its ability of “exploration” and “exploitation”. Exploration means the ability of a search algorithm to explore different areas of the search space in order to have a high probability of finding good optima. Exploitation, on the other hand, means the ability to concentrate the search around a promising region to refine a candidate solution.

A good optimization algorithm should optimally balance these two conflicting objectives, which means that the exploration ability and the exploitation ability should be adjusted, e.g., via population diversity analysis, when solving different problems or at different search stages. For example, when solving a multimodal problem, a strong exploration ability gives the algorithm a great possibility to “jump out” of local optima.

Population diversity

Population diversity is a measurement of the population state with respect to exploration or exploitation. It reflects information about the particles’ positions, velocities, and cognition. Particles diverging means that the algorithm is in an exploration state; on the contrary, particles converging into a small search area means that the algorithm is in an exploitation state.

The population diversity of PSO is useful for measuring and dynamically adjusting the algorithm’s exploration or exploitation ability accordingly. Shi and Eberhart gave three definitions of population diversity: position diversity, velocity diversity, and cognitive diversity [74, 75]. Position, velocity, and cognitive diversity are used to measure the distribution of the particles’ current positions, current velocities, and pbests (the best position found so far by each particle), respectively. Useful information can be obtained from these diversity measurements; one common form is sketched below.
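The exact definitions vary across papers; one simple, commonly used form (assumed here) measures diversity as the mean distance of the vectors from their swarm center, applied to positions, velocities, or pbests alike.

```python
import numpy as np

def diversity(vectors):
    # Mean Euclidean distance of the rows from their center: apply to
    # positions for position diversity, to velocities for velocity
    # diversity, and to pbests for cognitive diversity.
    center = vectors.mean(axis=0)
    return float(np.linalg.norm(vectors - center, axis=1).mean())
```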

Low diversity, i.e., particles converging into a small search area, is often regarded as the main cause of premature convergence. Several mechanisms have been proposed to promote diversity in particle swarm optimization, such as PSO with elitist re-initialization.

Parameter setting

In swarm intelligence research, one comment often received on any newly proposed algorithm is that “the authors should give a fair comparison of all algorithms in the paper.” It is very normal for reviewers to ask for a fair comparison. One reason, taken from a real review comment, is as follows: “The thing is that the authors have certainly very carefully tuned their parameter values to get the best possible results on their test functions. However, I am almost certain that they did not do the same for the other methods they compared to.” What, then, is a fair comparison among all algorithms? Usually, the proposed algorithm and the compared algorithms are tested on a set of new benchmark functions, which differ from the benchmark functions used when the other algorithms were first proposed. Should all algorithms have different parameter settings, with each algorithm using exactly the settings from when it was first proposed? Or should all algorithms use the same parameter settings?

In algorithm comparisons, it may be a good option for a new variant of PSO, or of another swarm intelligence algorithm, to be compared with a standard PSO algorithm. It should be noted that there are two variants called standard particle swarm optimization (SPSO): the first, termed SPSO-BK, was defined by Bratton and Kennedy in 2007 [8], and the other, termed SPSO-C, was defined by Clerc in 2006, 2007, and 2011 [25]. An analysis of these two algorithms was given in [23]. The strategy for setting the population size differs for problems of different scales [12].

Real-world applications

Particle swarm optimization, as one of the outstanding swarm intelligence algorithms, has been widely used to solve numerous real-world problems, such as the optimal design of electric machines [47], Wi-Fi indoor positioning [14], indoor high-precision three-dimensional positioning [10], energy management [57], and economic load dispatch [61], just to name a few. It is no exaggeration to say that PSO performs remarkably in nearly every area, including industrial engineering, intelligent manufacturing, data mining, information and communication systems, automatic control systems, and image processing. We cannot list all these applications because the areas are so various. Here, we take the scheduling problem and the data mining problem as examples to review the application of PSO to real-world problems.

Scheduling problem

The scheduling problem is one kind of combinatorial optimization problem and a very popular topic in different industrial fields. The job shop scheduling problem (JSP) [90], the test task scheduling problem (TTSP) [52], and the parallel machine scheduling problem (PMSP) [91] are typical representatives of scheduling problems. What they all have in common is the rational allocation of jobs or tasks to machines or resources. Therefore, we consider the TTSP, the unrelated parallel machine scheduling problem (UPMSP) [91], and the flexible job shop scheduling problem (FJSP) [70] to illustrate the characteristics of these scheduling problems and the application of PSO.

TTSP is one of the key technologies for improving the performance of automatic test systems (ATS) [82]. In its mathematical model, test tasks have to be arranged on test resources: each task may have multiple options to choose from, and each task may be tested on multiple instruments at the same time. Although the FJSP is similar to the TTSP, there are some differences between them. Each operation in the FJSP has to be carried out on only one machine, and the operations of a job follow a predetermined order [53]. In the UPMSP, each job requires a given processing time, and machines are considered unrelated when the processing times of the jobs depend on the machine to which they are assigned [84]. In these mathematical models, each job of the FJSP and the PMSP can be performed on every machine with no constraint, whereas a task in the TTSP has to be performed on certain predetermined resources.

PSO and its various variants and hybrids have demonstrated their performance in solving these scheduling problems.

Test task scheduling problem (TTSP)

For the TTSP, Lu proposed a hybrid of particle swarm optimization and tabu search for the single-objective TTSP with constraints [51]. PSO is used for solving the test task sequencing problem, and tabu search is used for the instrument resource dispatching problem; it is one kind of non-integrated strategy for solving the scheduling problem. A new kind of inertia weight related to the iteration process and a constraint handling mechanism based on the coding strategy were used, and an encoding strategy for every particle was invented to handle the serial task sequence constraints. Lu also combined PSO with a variable neighborhood MOEA/D to solve the multi-objective TTSP, using PSO to find the ideal point in the multi-objective evolutionary algorithm based on decomposition (MOEA/D) [54]; the makespan and the mean instrument load are the two objectives. In addition, PSO was used as an embedded algorithm in an integrated solution framework based on packet scheduling and dispatching rules for job-based scheduling problems. PSO demonstrated its performance through comparison with other kinds of meta-heuristic algorithms.

Flexible job shop scheduling problem (FJSP)

For the FJSP, Nouiri investigated a two-stage particle swarm optimization (2S-PSO), which runs PSO on the initial swarm for the makespan objective and then on the final swarm for stability or another objective, to solve the flexible job shop predictive scheduling problem considering possible machine breakdowns [59]. The objective is to solve the problem under uncertainty with only one breakdown. 2S-PSO was tested on various benchmark data varying from partial FJSP to total FJSP, and it evaluates the effect of disruptions on the solution using robustness and stability measures. Singh proposed a quantum-behaved particle swarm optimization (QPSO) for the FJSP [79]. QPSO can effectively address the drawback of PSO of being easily trapped at a local optimum due to the large reduction in velocity values as the iterations proceed, which makes it difficult to reach the best solution. In addition, mutation was introduced into QPSO to avoid premature convergence. Zhang proposed an effective hybrid particle swarm optimization algorithm for the multi-objective FJSP [98], combining PSO and a tabu search algorithm to obtain both local and global search ability; it is a very useful integrated strategy for multi-objective optimization problems.

Parallel machine scheduling problem (PMSP)

For the PMSP, Hulett focused on scheduling non-identical parallel batch processing machines to minimize the total weighted tardiness, and PSO was used to solve the problem [40]. The smallest position value rule is used to convert the continuous position values of a particle into a discrete job permutation. It is one kind of application for testing printed circuit boards in an electronics manufacturing facility. Likewise, a heuristic was proposed to simultaneously group the jobs into batches and schedule them on a machine. Shahidi-Zadeh investigated a comparison study for solving a bi-objective unrelated parallel batch processing machine scheduling problem [68]. Multi-objective particle swarm optimization (MOPSO), the non-dominated sorting genetic algorithm (NSGA-II), a multi-objective ant colony optimization algorithm (MOACO), and multi-objective harmony search (MOHS) were used to solve the problem, and MOPSO achieved good performance in the diversity and spacing of the Pareto optimal frontiers. Shahvari focused on a bi-objective batch processing problem with dual resources on unrelated parallel machines; four bi-objective PSO-based search algorithms were proposed to efficiently solve the optimization problem for medium- and large-size instances [69].

Data mining problem

The data mining problem has different branches, such as outlier detection, association rule mining, clustering, classification, and prediction. PSO can be used in all these branches; therefore, there are various PSO variants, such as PSO for outlier detection, PSO for classification, PSO for association rule mining, and PSO for prediction analysis of time series. In addition, these PSO variants have been used in sensor networks, medical dialysis, network security, financial monitoring, image processing, and other fields.

Outlier detection

An outlier is a datum that is different from the other data in its domain. Such abnormal data or points can be very useful for describing the abnormality of a system, and outlier detection is useful in many applications [3, 65]. Misinem proposed a rough set outlier detection strategy based on PSO, which seeks a minimum non-reduct [58]. Ye proposed a new algorithm for high-dimensional outlier detection based on constrained PSO [92], where outliers are defined as sparsely populated patterns in lower-dimensional subspaces; PSO is used with a specifically designed particle coding and conversion strategy, as well as dimensionality-preserving updating techniques, to search for the most abnormally sparse subspaces. Condition-based maintenance (CBM) is gaining importance in industry because of the need to increase machine availability, and an application of PSO has been presented for the detection of machinery faults for CBM [67]; it is also one kind of outlier detection application. Alam used a PSO-based clustering strategy to realize web bot detection [2]. Feng proposed a multi-objective vector-evaluated PSO with time-variant coefficients for outlier identification in power systems [34]; it is one kind of unsupervised classification of electric load data.

Association rule mining

Association rule mining aims at extracting important correlations, frequent patterns, associations, or causal structures among the sets of items in data sets [5]. Association rules basically extract patterns from a database based on two measures: minimum support and minimum confidence. Ankita reviewed the application of PSO to association rule mining [5]. PSO is implemented for association rule mining in two ways: one is to generate rules by embedding PSO in a traditional association rule mining algorithm; the other is to use PSO to optimize the association rules generated by a traditional algorithm. Maragatham investigated a weighted particle swarm optimization technique for optimizing association rules [55]: a utility-based temporal association rule mining method generates the rules, and PSO is used to optimize them. Indira proposed an adaptive PSO that yields finer solutions by performing a diversified search over the entire search space [41], in which parameters such as the inertia weight and the acceleration coefficients are adjusted dynamically. In short, PSO has abundant applications in mining association rules [4, 45, 89].

Classification

Data clustering, one of the most important techniques in data mining, aims to group unlabeled data into different groups on the basis of similarities and dissimilarities between the data elements. A typical clustering process involves feature selection, selection of a similarity measure, grouping of the data, and assessment of the output. Alam reviewed the research on particle swarm optimization-based clustering [3]. PSO is often used in this area to optimize the parameters of traditional algorithms, such as support vector machines (SVMs), backpropagation (BP) networks, and others. Porwik focused on signature verification based on a probabilistic neural network (PNN) classifier optimized by the PSO algorithm [60], where the optimal parameters of the PNN are determined by means of the PSO procedure. Cervantes proposed a PSO-based method for SVM classification on skewed data sets [11], in which the PSO algorithm is used to evolve artificial instances and eliminate noise instances to enhance the performance of the support vector machines. Zhang focused on image segmentation using PSO and PCM with the Mahalanobis distance [100], where PSO is used to optimize the initial clustering centers.

Prediction analysis for time series

A time series is an ordered sequence of observations that are evenly spaced at uniform time intervals and measured successively. Prediction of a time series uses a sequence of historical values to develop a model for forecasting future values [46]. PSO has been combined with other algorithms, such as RBF neural networks and regression analysis. Lee used RBF neural networks with a nonlinear time-varying evolution PSO algorithm to realize time series prediction in a practical power system [46]. Akande proposed a hybrid PSO and support vector regression (SVR) model for modelling permeability prediction of a hydrocarbon reservoir [1]; PSO was investigated for the optimal selection of the SVR hyper-parameters for the first time in modelling the hydrocarbon reservoir. Zou combined least squares support vector regression and PSO for short-term load forecasting in a power system to solve the power dispatch problem [101].

Future research

Theory analysis

Particle swarm optimization and, more widely, the swarm intelligence algorithms are based on a “trial and error” strategy. More research should be conducted on the foundational problems of swarm intelligence; for example, the search mechanisms of swarm intelligence algorithms, their learning ability, the balance of exploration and exploitation, and more effective search strategies should be studied.

Data-driven based algorithm

A data-driven algorithm extracts features from the problem being solved and learns the landscape from the data set. Figure 3 gives a framework for data-driven swarm intelligence algorithms. Each candidate solution is a data sample from the search space. The model can be designed or adjusted via data analysis on the previous solutions, and the landscape or the difficulty of a problem can be learned during the search, i.e., the problem can be understood better. With this learning process, more suitable algorithms can be designed for different problems, and thus the performance of optimization can be improved [18]. Several particle swarm optimization algorithms, especially surrogate-assisted PSO algorithms, have been employed as data-driven algorithms, such as a surrogate-assisted PSO algorithm with a committee-based active learning strategy for expensive problems [87] and a surrogate-assisted cooperative swarm optimization for high-dimensional expensive problems [80].

Fig. 3 A framework for data-driven swarm intelligence algorithms

Applications

Different optimization problems can be modeled in many areas of our everyday life. With particle swarm optimization algorithms or, more generally, swarm intelligence, more effective applications or systems can be designed to solve real-world problems. The particle swarm optimization algorithm can be used not only on problems with explicit models but also on problems with implicit models. Through applications to complex engineering or design problems, the strengths and limitations of the various particle swarm optimization algorithms can be revealed and interpreted.

Conclusion

After nearly a quarter century, the particle swarm optimization algorithm has gained a great reputation and a wide range of successful applications in evolutionary computation and swarm intelligence. The particle swarm optimization algorithm, which is modeled on the social behaviors observed in flocking birds, is a population-based stochastic algorithm. In this paper, the historical development, the state-of-the-art, and the applications of the PSO algorithm have been reviewed. In addition, the characteristics and issues of the PSO algorithm have been discussed from the evolution and learning perspectives. Every individual in the PSO algorithm learns from itself and from another particle with a good fitness value, and the search performance and convergence speed are affected by the different learning strategies. The PSO algorithm has been widely used to solve numerous real-world problems; the scheduling and data mining problems were used as illustrations of PSO solving real-world application problems.

“Now this is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.” The particle swarm optimization algorithm was invented a quarter century ago, yet it can still be researched in many disciplines. Through the analysis of different evolution and learning strategies, the particle swarm optimization algorithm can be utilized to solve more real-world application problems effectively, and the strengths and limitations of the various PSO algorithms can be revealed.