1 Introduction

The optimal power flow (OPF) problem is a large-scale, highly nonlinear, non-convex optimization problem (Singh et al. 2021). In 1962, Carpentier extended the economic scheduling problem with additional constraints (Carpentier 1962). This expansion of the economic dispatch problem served as the foundation for the development of the OPF problem. The basic task of a power system is to operate safely and reliably to meet the power demand on the load side (Abdelaziz et al. 2016). OPF is an effective tool that helps experts make decisions on the planning and dispatching of power systems. Its core process is to adjust the control variables, such as the active power generated by thermal power units, and to obtain the power transmitted on each branch and the voltage of each node through power flow calculation. Over multiple iterations, the decision variables of the system are modified to reach a satisfactory operating state (Khunkitti et al. 2021).

The OPF problem has many constraints and multiple local optima (Davoodi et al. 2018). This means that the search space contains many infeasible solutions, and the solution process can easily stagnate in a local optimum. Traditional optimization methods, such as nonlinear programming (Lavaei and Low 2011), the Newton method (Santos and Costa 1995) and the gradient method (Dommel and Tinney 1968), struggle to find satisfactory results. Intelligent algorithms with global search capabilities have therefore received widespread attention. They are widely used in optimization problems in science and engineering and can obtain competitive solutions. Many intelligent algorithms have been employed to tackle OPF and economic dispatch problems, including Particle Swarm Optimization (PSO) (Gomez-Gonzalez et al. 2012), Cuckoo Search (CS) (Ponnusamy and Rengarajan 2014), the Ant Lion Optimizer (ALO) (Ali et al. 2016), Differential Evolution (DE) (Sayah and Zehar 2008) and the Mine Blast Algorithm (MBA) (Ali and Abd Elazim 2018). These algorithms iterate until a predefined termination condition is met: in each iteration, each individual is updated according to certain rules or formulas and moves toward the global optimal position. Many algorithms, however, suffer from an imbalance between exploration and exploitation and from difficulty escaping local optima, so scholars have proposed various improvement strategies for the OPF problem.

Salma et al. proposed an improved salp swarm algorithm (ISSA), which incorporates random mutation and adaptively adjusts the exploration and exploitation process. The study considers multiple fuel costs, valve point effects and prohibited operating zones of generators in the OPF system. ISSA was able to find competitive solutions in multiple case studies (Abd El-sattar et al. 2021). Awad et al. proposed a new differential evolution algorithm, called DEa-AR, to solve stochastic optimal active–reactive power dispatch (OARPD) problems involving renewable energy. DEa-AR uses an arithmetic compound crossover strategy and adjusts the scaling factor based on a Laplace distribution. It also adds an archive that stores inferior solutions for later use. The simulation results show that the proposed algorithm can effectively solve the OARPD problem with renewable energy and provide high-quality solutions (Awad et al. 2019). Farhat et al. proposed an enhanced slime mould algorithm (ESMA) based on a neighborhood dimension learning search strategy to enhance its exploitation capability. Its test system incorporates wind and photovoltaic generators, and its objective function includes a carbon tax to reduce emissions. The test results show that ESMA obtains the optimal solution and exhibits better convergence performance (Farhat et al. 2022). Bentouati et al. proposed an Enhanced Moth Swarm Algorithm (EMSA), which combines MSA with an opposition-based learning strategy to maintain the diversity of the moth population. It was tested on 12 cases of three OPF test systems, and the results showed that EMSA had better performance (Bentouati et al. 2021).

Real problems usually have multiple objective functions. If these objectives do not conflict with each other, a single optimal solution can be found using standard optimization techniques. More commonly, however, the objectives conflict, and improving one objective inevitably degrades another. Such a problem is known as a multi-objective optimization problem (MOP), and its optimal solutions form a set called the Pareto solution set (Rizk-Allah et al. 2020). The OPF problem involves multiple objective functions, such as thermal power unit fuel cost, active power loss and emissions, which inherently conflict with one another (Fonseca and Fleming 1993). Therefore, the OPF problem is treated as a multi-objective optimization problem that balances these conflicting objectives. In much of the literature cited above, the OPF problem is treated as a single-objective optimization problem whose objective functions are optimized separately. However, this approach is no longer adequate. American ecosystem conservation organizations strongly urge power plants not only to pursue the lowest generation cost but also to consider pollution indexes (Taher et al. 2019). The trend in recent years is therefore to develop multi-objective methods for the OPF problem. Intelligent algorithms incorporating multi-objective concepts have achieved encouraging results on this problem, including NSGA-II (Jeyadevi et al. 2011), MOPSO (Hazra and Sinha 2011), MOEA/D (Medina et al. 2014) and MOGJO (Snášel et al. 2023). According to the no free lunch theorem, no single algorithm can perform excellently on every problem, so multi-objective optimization algorithms for OPF require further research.

Shabanpour-Haghighi et al. proposed a modified teaching–learning-based optimization (MTLBO) based on an adaptive wavelet mutation strategy, which attaches an external archive and uses fuzzy clustering techniques to maintain the archive's diversity. It solves the multi-objective OPF problem including generation cost and emissions and obtains a set of Pareto solutions (Shabanpour-Haghighi et al. 2014). El-Sattar et al. used the Jaya optimization algorithm to solve the OPF problem in both single-objective and multi-objective settings. In the multi-objective framework, the Jaya algorithm is combined with the Pareto concept to obtain non-dominated solutions, and fuzzy set theory is then used to select the best compromise solution. However, the solution set obtained by this method for the multi-objective OPF problem is unevenly distributed (El-Sattar et al. 2019). Zhang et al. proposed an improved decomposition-based multi-objective evolutionary algorithm (MOEA/D) to handle the competition among the objectives of optimal power flow. An improved Chebyshev decomposition method is introduced to decompose the objectives so as to obtain uniformly distributed Pareto fronts. Simulation results show that it can find well-distributed Pareto solution sets (Zhang et al. 2016). Khan et al. proposed a multi-objective hybrid firefly and particle swarm optimization algorithm (MOHFPSO) using a multi-objective structure based on non-dominated sorting and crowding distance, and applied the ideal distance minimization method to select the best compromise solution from the Pareto optimal set. Although the Pareto solution set obtained is improved compared with the standard algorithm, its coverage rate decreases (Khan et al. 2020). Chen et al. proposed a Novel Hybrid Bat Algorithm (NHBA) that modifies the local search formula and adds a mutation mechanism using a monotone random fill model based on extreme values (MRFME). To obtain more feasible solutions, a non-dominated sorting method combining constrained Pareto fuzzy dominance (CPFD) is proposed. The OPF results show that this method handles constraints better (Chen et al. 2019). Zhang et al. improved the NSGA-III algorithm, named I-NSGA-III, and applied it to the multi-objective OPF problem. An adaptive elimination strategy was proposed to reduce the use of selection strategies, and a boundary point preservation strategy was integrated to maintain population diversity. Experimental OPF results show that the proposed algorithm performed better on three objectives, but not on two (Zhang et al. 2019).

Multi-objective optimization algorithms often employ one of two strategies: population elitism and archive elitism. Population-elitism algorithms (such as MOJAYA, MOHFPSO and NHBA) typically have a fixed population size, so excellent individuals may not be preserved and can be discarded during the evolutionary process. For algorithms with an archive (such as MTLBO and MOEA/D), the archive is usually used to store non-dominated solutions. However, the evolutionary process of the population is non-greedy, which does not guarantee the convergence and stability of the algorithm's search for optimal solutions. We believe that combining these two mechanisms helps maintain the stability of the search and find better solutions. Moreover, because of the differing nature and characteristics of various problems, the same operator may perform well on one problem and poorly on another; the DE algorithm, for example, has developed many operators to suit different types of optimization problems. In the absence of prior knowledge, we are committed to developing an adaptive, parameter-free local optimizer that allows the algorithm to spontaneously select the appropriate operator for position updating.

In this paper, the coyote optimization algorithm (COA) is selected for research. COA is an optimization algorithm proposed by Pierezan and Coelho in 2018 (Pierezan and Coelho 2018). COA combines the principles of evolution and swarm intelligence and has a unique algorithmic setup, which includes the search of sub-populations (packs) and models the birth and death of coyotes. The algorithm has demonstrated excellent performance and has been successfully applied in numerous fields. Souza et al. proposed a binary version of COA that uses a hyperbolic transfer function to select the best feature subset for classification and employs a naive Bayes classifier to verify its performance. The results show that COA can find subsets with fewer features and achieve better classification accuracy (Souza et al. 2020). Li et al. added a differential evolution strategy to COA and combined it with fuzzy Kapur entropy and a fuzzy median aggregation method for threshold image segmentation, achieving improved segmentation quality (Li et al. 2021). Ali et al. applied COA to the Unit Commitment (UC) problem in power systems, which aims to satisfy constraints while achieving the minimum economic cost over time. In the simulation experiments, COA was employed to determine the optimal generation schedule, and the results demonstrated that COA outperformed the existing literature in terms of both total cost and CPU running time (Ali et al. 2023).

Existing multi-objective algorithms for the multi-objective OPF problem face challenges in balancing convergence and diversity simultaneously. It is necessary to provide sufficient pressure during the offspring selection process and enhance the diversity and convergence of multi-objective optimization algorithms. This will help in discovering a higher quality solution set in the MOOPF problem. In this paper, a multi-objective COA based on hybrid elite mechanism and Meta-Lamarckian learning strategy (MOCOA-ML) is proposed for solving multi-objective OPF problems. The main contributions are as follows.

  1. (1)

    The coyote optimization algorithm is combined with non-dominated sorting. Non-dominated sorting is used to judge the dominance relationships among individuals, and a number of individuals equal to the population size is selected from all individuals to enter the next iteration.

  2. (2)

    An external archive is added to retain excellent individuals; it is similar to the archive in MOPSO and adopts a grid mechanism. The role of the archive is to make the stored solution set more diverse and, when selecting the elite coyote in COA, to give a greater selection probability to elites in sparse regions, which is conducive to exploiting those regions.

  3. (3)

    A local exploitation optimizer based on the Meta-Lamarckian learning strategy is proposed to refine the population. The local optimizer integrates four crossover operators and adaptively adjusts the selection probability of each operator during the optimization process to achieve a more efficient search.

The remaining sections of the paper are organized as follows. In Sect. 2, the mathematical model of the OPF problem is presented. In Sect. 3, the proposed MOCOA based on the hybrid elite mechanism and Meta-Lamarckian learning is introduced. In Sect. 4, the performance of the proposed algorithm is tested on benchmark functions. In Sect. 5, six cases are selected in the IEEE 30-node system and the IEEE 57-node system for simulation experiments. In Sect. 6, the conclusion and future research directions are presented.

2 Mathematical model of multi-objective optimal power flow problem

2.1 Multi-objective optimization problem

Comparing solutions is a simple task in single-objective optimization since there is only one objective function to consider. For the minimization problem, the solution X is superior to Y if and only if f(X) is less than f(Y). However, in the field of multi-objective problems, each solution has multiple evaluation indexes, so some definitions need to be introduced.

Definition 1

Pareto dominance. When a solution X is superior to a solution Y in all objectives, the solution X dominates the solution Y (equivalently, the solution Y is dominated by the solution X). If the solution X is better than the solution Y in at least one objective but worse in at least one other, then the solutions X and Y do not dominate each other.

Definition 2

Pareto optimal solution. Solutions that are not dominated by any other solution are called Pareto optimal solutions, also known as non-dominated solutions.

Definition 3

Pareto solution set. The set of all non-dominated solutions is called the Pareto solution set.

Definition 4

Pareto frontier. The Pareto frontier is the image of the Pareto solution set under the objective function mapping.
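As a minimal illustration of Definitions 1 to 3 (our sketch, not part of the paper's algorithm), the standard dominance check for a minimization problem can be written as:

```python
def dominates(fx, fy):
    """Pareto dominance for minimization: fx dominates fy if it is
    no worse in every objective and strictly better in at least one."""
    no_worse = all(a <= b for a, b in zip(fx, fy))
    strictly_better = any(a < b for a, b in zip(fx, fy))
    return no_worse and strictly_better
```

Two solutions are mutually non-dominated when neither `dominates` call returns True; collecting all such solutions yields the Pareto solution set.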

The multi-objective optimal power flow (MOOPF) problem is a constrained optimization problem. The goal is to minimize the selected objective functions while satisfying the equality and inequality constraints. Since the objectives conflict with one another, the answer to this problem is a Pareto solution set, which represents the optimal trade-off between multiple objectives. Mathematically, the MOOPF problem can be expressed in the following form:

$$\begin{gathered} Minimize \,F\left( {x,y} \right) = \left[ {f_{1} \left( {x,y} \right),\,f_{2} \left( {x,y} \right),\, \ldots ,\,f_{N} \left( {x,y} \right)} \right] \hfill \\ s.t. \,g\left( {x,y} \right) = 0 \hfill \\ \quad \;\;\,h\left( {x,y} \right) \le 0 \hfill \\ \end{gathered}$$
(1)

where, \({f}_{i}(x, y)\) represents the \(i\)-th objective function of the OPF problem; \(F\left(x, y\right)\) represents the vector of objective functions; \(g(x,y)\) represents the equality constraints; \(h(x,y)\) represents the inequality constraints; \(x\) and \(y\) represent the control variables and state variables, respectively.

2.2 Objective function

In this experiment, a total of three objective functions are selected, which are fuel cost, active power loss and pollution emission.

2.2.1 Fuel cost

The fuel cost of each thermal power unit is a function of its active power output. In this study, it is approximated by a quadratic function, as shown in Eq. (2).

$${f}_{cost}={\sum }_{i=1}^{NG}{a}_{i}+{b}_{i}{P}_{{G}_{i}}+{c}_{i}{P}_{{G}_{i}}^{2}$$
(2)

where, \({a}_{i}\), \({b}_{i}\) and \({c}_{i}\) are the fuel cost coefficients of the \(i\)-th generator, \({P}_{{G}_{i}}\) is the active power output of the \(i\)-th generator, and \(NG\) is the total number of generators.
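Eq. (2) can be evaluated directly. The following minimal Python sketch is our illustration (the function name and the coefficient values in the test are ours, not from the paper):

```python
import numpy as np

def fuel_cost(P_G, a, b, c):
    """Eq. (2): total quadratic fuel cost, sum_i a_i + b_i*P_Gi + c_i*P_Gi^2."""
    P_G, a, b, c = (np.asarray(v, dtype=float) for v in (P_G, a, b, c))
    return float(np.sum(a + b * P_G + c * P_G ** 2))
```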

2.2.2 Active power loss

The transmission lines have fixed resistance and conductance parameters, so active power loss occurs when power is transferred through the grid. The mathematical formula for active power loss is shown in Eq. (3).

$${f}_{Ploss}={\sum }_{i=1}^{Nl}{\sum }_{j\ne i}^{Nl}{G}_{ij}[{V}_{i}^{2}+{V}_{j}^{2}-2{V}_{i}{V}_{j}cos({\delta }_{i}-{\delta }_{j})]$$
(3)

where, \(i\) and \(j\) are the \(i\)-th and \(j\)-th nodes respectively, \({G}_{ij}\) is the conductance between the two nodes, \(V\) is the node voltage, and \(\delta\) is the phase angle corresponding to the node voltage; \(Nl\) is the total number of transmission lines.
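A minimal sketch of Eq. (3) as written, summing over ordered node pairs with the voltages and angles taken from a converged power flow (our illustration; names are ours):

```python
import numpy as np

def active_power_loss(V, delta, G):
    """Eq. (3): sum over i and j != i of
    G_ij * [V_i^2 + V_j^2 - 2*V_i*V_j*cos(delta_i - delta_j)]."""
    V, delta, G = (np.asarray(v, dtype=float) for v in (V, delta, G))
    loss = 0.0
    for i in range(V.size):
        for j in range(V.size):
            if j != i:
                loss += G[i, j] * (V[i] ** 2 + V[j] ** 2
                                   - 2.0 * V[i] * V[j] * np.cos(delta[i] - delta[j]))
    return float(loss)
```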

2.2.3 Emission

In today's society, environmental protection is an important topic, and it is necessary to reduce the emissions of thermal power units. The total emission of air pollutants such as \({CO}_{x}\) and \({NO}_{x}\) produced by thermal power units can be defined as:

$${f}_{Emission}={\sum }_{i=1}^{NG}{\gamma }_{i}+{\beta }_{i}{P}_{{G}_{i}}+{\alpha }_{i}{P}_{{G}_{i}}^{2}+{\xi }_{i}{\text{exp}}({\lambda }_{i}{P}_{{G}_{i}})$$
(4)

where, \({\alpha }_{i}\), \({\beta }_{i}\), \({\gamma }_{i}\), \({\xi }_{i}\) and \({\lambda }_{i}\) are the emission coefficients of the \(i\)-th generator.
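Eq. (4) adds an exponential term to the quadratic form of Eq. (2). A minimal sketch (our illustration; the test coefficients are made up):

```python
import numpy as np

def emission(P_G, alpha, beta, gamma, xi, lam):
    """Eq. (4): sum_i gamma_i + beta_i*P_Gi + alpha_i*P_Gi^2 + xi_i*exp(lam_i*P_Gi)."""
    P_G, alpha, beta, gamma, xi, lam = (
        np.asarray(v, dtype=float) for v in (P_G, alpha, beta, gamma, xi, lam))
    return float(np.sum(gamma + beta * P_G + alpha * P_G ** 2
                        + xi * np.exp(lam * P_G)))
```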

2.3 Control variables

The control variables are the quantities that can be adjusted manually in the power system. They mainly include the active power output of the generators, the generator bus voltages, the tap positions of the on-load tap changers and the reactive power of the shunt capacitors. The operating state of the power system can be changed by adjusting the control variables.

$$x=[{P}_{{G}_{2}}, \ldots ,{P}_{{G}_{NG}},{V}_{{G}_{1}}, \ldots ,{V}_{{G}_{NG}},{T}_{1}, \ldots ,{T}_{NT},{Q}_{{C}_{1}}, \ldots ,{Q}_{{C}_{NC}}]$$
(5)

where, \({P}_{{G}_{2}}, \ldots ,{P}_{{G}_{NG}}\) are the active power outputs of the generators; \({V}_{{G}_{1}}, \ldots ,{V}_{{G}_{NG}}\) are the generator bus voltage magnitudes; \({T}_{1}, \ldots ,{T}_{NT}\) are the transformer tap settings; \({Q}_{{C}_{1}}, \ldots ,{Q}_{{C}_{NC}}\) are the reactive capacities of the shunt capacitors; \(NT\) is the number of transformers; \(NC\) is the number of shunt capacitors. The active power of the first generator is excluded because it belongs to the slack bus, whose output is a state variable.

2.4 State variables

State variables, also called dependent variables, change as the control variables change. The state variables of the OPF problem are shown in Eq. (6). Once the control variables of the system are fixed, the power flow of the entire system and the voltage of each node can be determined by the Newton–Raphson method.

$$y=[{P}_{{G}_{1}},{V}_{{L}_{1}}, \ldots ,{V}_{{L}_{NL}},{Q}_{{G}_{1}}, \ldots ,{Q}_{{G}_{NG}},{S}_{{l}_{1}}, \ldots ,{S}_{{l}_{Nl}}]$$
(6)

where, \({P}_{{G}_{1}}\) is the active power injected at the slack (balance) node (in both the 30-node and 57-node systems); \(NL\) represents the number of load (PQ) nodes; \({V}_{{L}_{1}}, \ldots ,{V}_{{L}_{NL}}\) are the voltages of the load nodes in the power system; \({Q}_{{G}_{1}}, \ldots ,{Q}_{{G}_{NG}}\) are the reactive power outputs of the generators; \({S}_{{l}_{1}}, \ldots ,{S}_{{l}_{Nl}}\) are the power flows on the transmission lines.

2.5 Equality constraints

The power in the power system must satisfy the law of conservation of energy, which means that the power generated equals the power consumed. The most typical equality constraints are the active and reactive power balances of the system, as shown in Eqs. (7) and (8).

$${P}_{{G}_{i}}={P}_{{D}_{i}}+{V}_{i}{\sum }_{j=1}^{NB}{V}_{j}[{G}_{ij}cos({\delta }_{i}-{\delta }_{j})+{B}_{ij}sin({\delta }_{i}-{\delta }_{j})], i=1, \ldots ,NB$$
(7)
$${Q}_{{G}_{i}}={Q}_{{D}_{i}}+{V}_{i}{\sum }_{j=1}^{NB}{V}_{j}[{G}_{ij}sin({\delta }_{i}-{\delta }_{j})-{B}_{ij}cos({\delta }_{i}-{\delta }_{j})], i=1, \ldots ,NB$$
(8)

Equation (7) is the active power balance constraint, where \({P}_{{D}_{i}}\) is the active power demand of the load. Equation (8) is the reactive power balance constraint, where \({Q}_{{D}_{i}}\) is the reactive power demand of the load. \({\delta }_{i}\) represents the phase angle of the \(i\)-th bus. \({G}_{ij}\) and \({B}_{ij}\) are the conductance and susceptance of the transmission line between the \(i\)-th and \(j\)-th buses, respectively. \(NB\) indicates the number of nodes.

No additional treatment is needed for these equality constraints, because the convergence criterion of the Newton–Raphson method enforces Eqs. (7) and (8): successful execution of the power flow calculation indicates that the results satisfy the equality constraints.

2.6 Inequality constraints

Inequality constraints mainly restrict the safe operation of devices in the system. The following four parts are considered here: generator constraints, reactive capacitor capacity constraints, transformer constraints and safety constraints.

(1) Generator constraints

$${{P}_{{G}_{i}}^{min}\le P}_{{G}_{i}}\le {P}_{{G}_{i}}^{max}, i=1, ...,NG$$
(9)
$${{Q}_{{G}_{i}}^{min}\le Q}_{{G}_{i}}\le {Q}_{{G}_{i}}^{max}, i=1, ...,NG$$
(10)
$${{V}_{{G}_{i}}^{min}\le V}_{{G}_{i}}\le {V}_{{G}_{i}}^{max}, i=1, ...,NG$$
(11)

(2) Reactive compensation constraint

$${{Q}_{{C}_{j}}^{min}\le Q}_{{C}_{j}}\le {Q}_{{C}_{j}}^{max}, j=1, ...,NC$$
(12)

(3) Transformer constraint

$${{T}_{K}^{min}\le T}_{K}\le {T}_{K}^{max}, K=1, ...,NT$$
(13)

(4) Safety constraints

$${{V}_{{L}_{m}}^{min}\le V}_{{L}_{m}}\le {V}_{{L}_{m}}^{max}, m=1, ...,NL$$
(14)
$${S}_{{l}_{n}}\le {S}_{{l}_{n}}^{max}, n=1, ...,Nl$$
(15)

where, \({S}_{{l}_{n}}^{max}\) represents the maximum transmission power of the \(n\)-th transmission line.

Some of these inequality constraints restrict the range of the control variables and are satisfied automatically by keeping the control variables within their upper and lower limits. The others limit the range of the state variables, and the penalty function method is selected to handle them. The penalty function method transforms the constrained optimization problem into an unconstrained one. Equations (16) and (17) give the penalty function and the modified objective function, respectively.

$$Penalty = k_{P} \times \left( {P_{G1} - P_{G1}^{lim} } \right)^{2} + k_{Q} \times \sum\nolimits_{i = 1}^{NG} {\left( {Q_{Gi} - Q_{Gi}^{lim} } \right)^{2} } + k_{V} \times \sum\nolimits_{m = 1}^{NL} {\left( {V_{Lm} - V_{{L_{m} }}^{lim} } \right)^{2} } + k_{S} \times \sum\nolimits_{n = 1}^{Nl} {\left( {S_{{l_{n} }} - S_{{l_{n} }}^{lim} } \right)^{2} }$$
(16)
$${f}_{i}={f}_{i}+penalty$$
(17)

where, \({f}_{i}\) is the \(i\)-th objective function and \(penalty\) is the penalty term. The coefficients \({k}_{P}\), \({k}_{Q}\) and \({k}_{S}\) are set to \({10}^{6}\), and \({k}_{V}\) is set to \({10}^{9}\); the voltage constraint of the load nodes is the most easily violated, so it is given the largest penalty coefficient. The superscript \(lim\) denotes the violated bound of the corresponding state variable (the upper limit if it is exceeded, otherwise the lower limit), so state variables within their limits contribute nothing to the penalty.
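One group's contribution to Eq. (16) can be sketched as follows (our illustration; the helper name is ours, and the clip-to-bounds reading of \(x^{lim}\) is the standard interpretation assumed here):

```python
import numpy as np

def penalty_term(x, x_min, x_max, k):
    """Quadratic penalty for one group of state variables in Eq. (16):
    distance to the violated bound, so in-range values add nothing."""
    x = np.asarray(x, dtype=float)
    x_lim = np.clip(x, x_min, x_max)     # nearest feasible value (x^lim)
    return k * float(np.sum((x - x_lim) ** 2))
```

The total penalty of Eq. (16) is then the sum of four such terms, with \(k_P = k_Q = k_S = 10^6\) applied to \(P_{G1}\), \(Q_{G}\) and \(S_{l}\), and \(k_V = 10^9\) applied to the load voltages.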

2.7 Fuzzy membership function

After solving the MOOPF problem, a set of Pareto solutions is obtained. Because these solutions are at the same dominance level, their relative merits in the Pareto frontier cannot be judged directly. In MOPs, a fuzzy system can be used to handle the contradictory relations among the objective functions. The concept of the fuzzy membership function was introduced in Hazra and Sinha (2011). The membership function for a single objective function can be described as follows:

$${\mu }_{i}^{k}=\left\{\begin{array}{ll}1 &\quad {f}_{i}\le {f}_{i}^{min}\\ \frac{{f}_{i}^{max}-{f}_{i}}{{f}_{i}^{max}-{f}_{i}^{min}} &\quad {f}_{i}^{min}<{f}_{i}<{f}_{i}^{max} \\ 0&\quad {f}_{i}\ge {f}_{i}^{max}\end{array}\right.$$
(18)

where, \({f}_{i}^{max}\) is the maximum value of the \(i\)-th objective function in Pareto solution set, and \({f}_{i}^{min}\) is the minimum value of the \(i\)-th objective function in Pareto solution set. The image of this function is shown in Fig. 1.

Fig. 1
figure 1

Fuzzy membership function

For the \(k\)-th individual in the solution set, the normalized membership function \({\mu }^{k}\) is defined as follows:

$${\mu }^{k}=\frac{{\sum }_{i=1}^{N}{\mu }_{i}^{k}}{{\sum }_{k=1}^{Po}{\sum }_{i=1}^{N}{\mu }_{i}^{k}}$$
(19)

where, \(Po\) represents the number of solutions in the Pareto set, and \(N\) represents the number of objective functions.

The greater the value of the normalized membership function \({\mu }^{k}\), the higher the satisfaction with the solution. The solution with the maximum membership value is the best compromise solution.
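The selection of the best compromise solution via Eqs. (18) and (19) can be sketched as follows (our illustration; the guard for a flat objective is our own addition):

```python
import numpy as np

def best_compromise(F):
    """Eqs. (18)-(19): F is a Po x N array of objective values of the
    Pareto set; returns the index of the maximum normalized membership."""
    F = np.asarray(F, dtype=float)
    f_min, f_max = F.min(axis=0), F.max(axis=0)
    span = np.where(f_max > f_min, f_max - f_min, 1.0)  # avoid 0/0 on flat objectives
    mu = np.clip((f_max - F) / span, 0.0, 1.0)          # Eq. (18), per solution and objective
    mu_k = mu.sum(axis=1) / mu.sum()                    # Eq. (19)
    return int(np.argmax(mu_k))
```

For example, among the objective vectors (0, 10), (10, 0) and (2, 2), the balanced solution (2, 2) has the largest normalized membership.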

3 Multi-objective coyote optimization algorithm based on hybrid elite mechanism and Meta-Lamarckian learning strategy

3.1 Coyote optimization algorithm

COA is inspired by the behavior of coyotes and operates as a swarm-based approach. COA does not rely on a dominance hierarchy among wolves and has a distinct algorithmic structure; its focus is to imitate the social structure and experience sharing of coyotes. In COA, the population is divided into \({N}_{p}\) packs with \({N}_{c}\) coyotes in each pack. The number of coyotes in each pack is fixed, so the population size is the product of \({N}_{p}\) and \({N}_{c}\). Each coyote has a social condition attribute (a set of decision variables), and the social condition of the \(c\)-th coyote of the \(p\)-th pack is written as:

$${soc}_{c}^{p, t}=\overrightarrow{x}=({x}_{1},{x}_{2},\dots ,{x}_{D})$$
(20)

where, \(soc\) represents the decision variables and \(D\) is the dimension of the search space. The first step is to initialize the coyote population. Since COA is a stochastic algorithm, the initial social condition of each coyote is set randomly: Eq. (21) assigns a random value within the search space to the \(j\)-th dimension of the \(c\)-th coyote of the \(p\)-th pack at the \(t\)-th iteration.

$${soc}_{c,j}^{p,t}={lb}_{j}+{r}_{j}\bullet ({ub}_{j}-{lb}_{j})$$
(21)

where, \({lb}_{j}\) and \({ub}_{j}\) represent the lower and upper bounds of the \(j\)-th control variable, respectively, and \({r}_{j}\) is a random number in [0, 1]. The adaptation of each coyote to its current social condition is then assessed.

$${fit}_{c}^{p,t}=f({soc}_{c}^{p,t})$$
(22)

where, \({fit}_{c}^{p,t}\) is the fitness value (objective function value).
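The initialization and evaluation steps of Eqs. (21) and (22) can be sketched as follows (our illustration; the function names and the generic objective `f` are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_population(Np, Nc, lb, ub):
    """Eq. (21): random initial social conditions for Np packs of Nc coyotes."""
    lb, ub = np.asarray(lb, dtype=float), np.asarray(ub, dtype=float)
    return lb + rng.random((Np, Nc, lb.size)) * (ub - lb)

def evaluate(pop, f):
    """Eq. (22): fitness of every coyote; f maps a D-vector to a scalar."""
    Np, Nc, _ = pop.shape
    return np.array([[f(pop[p, c]) for c in range(Nc)] for p in range(Np)])
```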

There is one alpha coyote in each pack: the individual with the best fitness value. For a minimization problem, the alpha of the \(p\)-th pack at the \(t\)-th iteration is defined as:

$${alpha}^{p,t}=\left\{{soc}_{c}^{p,t}|{arg}_{c=\left\{\mathrm{1,2}\dots {N}_{c}\right\}}minf({soc}_{c}^{p,t})\right\}$$
(23)

COA assumes that coyotes have a certain amount of intelligence and organization, and that each pack shares the social conditions that help it develop. Thus, COA aggregates the information of the individual coyotes and computes it as the cultural tendency of the pack.

$${cult}_{j}^{p,t}=\left\{\begin{array}{ll}{O}_{\frac{{N}_{c}+1}{2},j}^{p,t} , &\quad {N}_{c}\; is\; odd\\ \frac{{O}_{\frac{{N}_{c}}{2},j}^{p,t}+{O}_{\frac{{N}_{c}}{2}+1,j}^{p,t} }{2}, &\quad otherwise\end{array}\right.$$
(24)

where, \({O}^{p,t}\) denotes the social conditions of all coyotes in the \(p\)-th pack, ranked in ascending order in each dimension \(j\in [1, D]\), at the \(t\)-th iteration. In other words, the cultural tendency of the pack equals the median of the social conditions of all coyotes in the pack.
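Since Eq. (24) is exactly the per-dimension median, it can be sketched in one line (our illustration):

```python
import numpy as np

def cultural_tendency(pack):
    """Eq. (24): per-dimension median of the pack's social conditions
    (pack is an Nc x D array); the middle ranked value for odd Nc,
    the mean of the two middle ranked values for even Nc."""
    return np.median(np.asarray(pack, dtype=float), axis=0)
```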

To model how the social conditions of the coyotes in a pack affect each other, COA assumes that each coyote is subject to an alpha influence (\({\delta }_{1}\)) and a pack influence (\({\delta }_{2}\)). The former is the difference between the alpha coyote and a random coyote \({cr}_{1}\), while the latter is the difference between the pack's cultural tendency and a random coyote \({cr}_{2}\). \({\delta }_{1}\) and \({\delta }_{2}\) are shown in Eq. (25).

$$\begin{gathered} \delta_{1} = alpha^{p,t} - soc_{{cr_{1} }}^{p,t} \hfill \\ \delta_{2} = cult^{p,t} - soc_{{cr_{2} }}^{p,t} \hfill \\ \end{gathered}$$
(25)

Therefore, the new social conditions of coyotes are updated by the influence of alpha and group.

$${new\_soc}_{c}^{p,t}={soc}_{c}^{p,t}+{r}_{1}\bullet {\delta }_{1}+{r}_{2}\bullet {\delta }_{2}$$
(26)

where, \({r}_{1}\) and \({r}_{2}\) are the weights of the alpha influence and the pack influence, respectively. \({r}_{1}\) is a uniformly distributed random number in [0, 1], and \({r}_{2}\) decreases linearly with the iteration count: \({r}_{2}=1-it/Maxit\). The new social condition is then assessed by Eq. (27).

$${new\_fit}_{c}^{p,t}=f(new\_{soc}_{c}^{p,t})$$
(27)

Coyotes have the cognitive ability to judge whether the new social condition is better than the old one, which means that a coyote updates only when it obtains a better social condition.

$${soc}_{c}^{p,t+1}=\left\{\begin{array}{ll}new\_{soc}_{c}^{p,t} , &\quad {new\_fit}_{c}^{p,t}<{fit}_{c}^{p,t}\\ {soc}_{c}^{p,t}, &\quad otherwise\end{array}\right.$$
(28)

After each pack's position update, coyote births and deaths are considered. A new coyote is born as a crossover of the social conditions of two randomly chosen parents, combined with a random environmental influence. The formula for the birth is shown in Eq. (29).

$${pup}_{j}^{p,t}=\left\{\begin{array}{ll}{soc}_{{r}_{1},j}^{p,t} , &\quad { rnd}_{j}<{P}_{s} or j={j}_{1}\\ {soc}_{{r}_{2},j}^{p,t} , &\quad { rnd}_{j}\ge {P}_{s}+{P}_{a} or j={j}_{2}\\ {R}_{j} , &\quad otherwise\end{array}\right.$$
(29)

where, \({r}_{1}\) and \({r}_{2}\) are random coyotes of the \(p\)-th pack; \({j}_{1}\) and \({j}_{2}\) are two random dimensions of the problem; \({P}_{s}\) is the scatter probability and \({P}_{a}\) is the association probability; \({R}_{j}\) is a random number within the range of the \(j\)-th control variable; and \({rnd}_{j}\) is a uniformly distributed random number in [0, 1]. The scatter and association probabilities guide the cultural diversity of the coyotes, and \({P}_{s}\) and \({P}_{a}\) are defined as:

$${P}_{s}=1/D$$
$${P}_{a}=(1-{P}_{s})/2$$
(30)

where, \({P}_{a}\) has the same effect on both parents.
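The birth rule of Eqs. (29) and (30) can be sketched as follows (our illustration; the function name is ours, and the dimensions \(j_1\), \(j_2\) guarantee at least one gene from each parent):

```python
import numpy as np

rng = np.random.default_rng(2)

def birth_pup(pack, lb, ub):
    """Eqs. (29)-(30): dimension-wise crossover of two random parents,
    with random environmental genes in the remaining dimensions."""
    pack = np.asarray(pack, dtype=float)
    lb, ub = np.asarray(lb, dtype=float), np.asarray(ub, dtype=float)
    Nc, D = pack.shape
    Ps = 1.0 / D                       # scatter probability, Eq. (30)
    Pa = (1.0 - Ps) / 2.0              # association probability
    p1, p2 = rng.choice(Nc, size=2, replace=False)
    j1, j2 = rng.choice(D, size=2, replace=False)
    rnd = rng.random(D)
    pup = np.where(rnd < Ps, pack[p1],
                   np.where(rnd >= Ps + Pa, pack[p2],
                            lb + rng.random(D) * (ub - lb)))   # R_j otherwise
    pup[j1], pup[j2] = pack[p1, j1], pack[p2, j2]  # guaranteed parental genes
    return pup
```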

After evaluating the fitness values of all coyotes, the one with the best fitness value is chosen as the global solution of the problem. The pseudo code of COA is presented in Algorithm 1.

Algorithm 1
figure a

Pseudo code of the COA

3.2 Meta-Lamarckian learning

Traditionally, Meta-Lamarckian learning is used in memetic algorithms (MA), where a local search process is added after each iterative update (Ong and Keane 2004). A single local search method rarely achieves good results across different problems, so multiple local search (LS) methods are often used in MA. Meta-Lamarckian learning is motivated by the desire to improve search performance and reduce the probability of using an inappropriate local method. Meta-Lamarckian learning (adaptive) strategies can be described as cooperative and/or competitive. Competition means that an LS method with a higher fitness improvement has a higher chance of being selected for subsequent optimization; cooperation means that the LS methods and their improvement rewards jointly determine which LS is selected next (Konstantinidis et al. 2018). Meta-Lamarckian learning usually uses the improvement of a single objective function value as the indicator of its reward mechanism, so it is mostly applied in single-objective optimization and decomposition-based multi-objective optimization. To use it in multi-objective optimization, the reward mechanism is defined as follows:

$${\rho }_{k}=\frac{{n}_{s}}{n}$$
(31)

where, \({\rho }_{k}\) is the reward value of the \(k\)-th LS method, \(n\) is the number of times that the LS method is used in the iteration, and \({n}_{s}\) is the number of times that the LS method is used to generate a non-inferior solution.

The reward mechanism computes, for each local optimization strategy, the ratio of the number of non-inferior solutions it generates to the total number of individuals it generates. In each iteration, after the reward value \({\rho }_{k}\) of each LS method is obtained, the normalized reward values update the roulette-wheel selection probabilities used in the next iteration. In other words, if a certain LS generates the highest proportion of high-quality solutions, it is more likely to be selected in the next iteration. At the beginning, each LS method has an equal probability of being selected. As the iterations progress, methods with high reward values obtain higher adoption probabilities. The roulette wheel works as follows:

  • Step 1: Calculate the reward value \({\rho }_{k}\) of each LS method.

  • Step 2: Standardize (normalize) the reward value of each LS method to obtain the relative reward value.

  • Step 3: Allocate a sector of the roulette wheel to each LS in proportion to its relative reward value.

  • Step 4: Generate a random number and select the LS method whose wheel sector contains the random number.
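The four roulette steps above can be sketched as follows (a minimal illustration; the function name and the uniform fallback for all-zero rewards are assumptions, not from the paper):

```python
import random

def roulette_select(rewards):
    """Select the index of an LS method with probability proportional
    to its reward value; fall back to a uniform choice if all rewards
    are zero (e.g. in the first iteration)."""
    total = sum(rewards)                  # Step 1/2: rewards already computed
    if total == 0:
        return random.randrange(len(rewards))
    r = random.uniform(0, total)          # Step 4: a random point on the wheel
    acc = 0.0
    for k, rho in enumerate(rewards):
        acc += rho                        # Step 3: sector of the k-th LS
        if rho > 0 and r <= acc:
            return k
    return len(rewards) - 1
```

An LS with a larger reward owns a larger sector, so it is more likely to be chosen in the next iteration, while low-reward methods still keep a nonzero chance.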

Common local optimization methods include crossover, mutation, the Powell method and the simplex search method. In this experiment, a total of four crossover operators are adopted in the local optimizer, called the transverse crossover operator, the longitudinal crossover operator, the direct crossover operator and the elite crossover operator.

3.2.1 Transverse crossover operator

Inspired by the crisscross optimization algorithm (CSO) (Meng et al. 2014), the transverse crossover process is selected as an LS. As shown in Eq. (32), this operator generates new individuals between the parents with a high probability, and on the extension line of the parents with a low probability.

$${Xnew}_{i,d}={r}_{1}\bullet {X}_{{i}_{1},d}+\left(1-{r}_{1}\right)\bullet {X}_{{i}_{2},d}+a\bullet ({X}_{{i}_{1},d}-{X}_{{i}_{2},d})$$
(32)

where, \({Xnew}_{i,d}\) is the \(i\)-th new individual in the \(d\)-th dimension; \({r}_{1}\) is a random number between [0, 1]; \(a\) is a random number between [-1, 1]; \({X}_{{i}_{1},d}\) and \({X}_{{i}_{2},d}\) are the randomly selected parents in the crossover operation.
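A per-dimension sketch of Eq. (32) (the function name is illustrative):

```python
import random

def transverse_crossover(x1, x2):
    """Eq. (32): the child lies between the parents with high probability
    and on their extension line with low probability, dimension by dimension."""
    child = []
    for d in range(len(x1)):
        r1 = random.random()             # r1 in [0, 1]: convex combination weight
        a = random.uniform(-1.0, 1.0)    # a in [-1, 1]: extension term
        child.append(r1 * x1[d] + (1 - r1) * x2[d] + a * (x1[d] - x2[d]))
    return child
```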

3.2.2 Longitudinal crossover operator

Inspired by the CSO (Meng et al. 2014), the longitudinal crossover process is selected as the LS. As shown in Eq. (33), the effect of the operator is to change the value of one dimension of the individual.

$${Xnew}_{i,{d}_{1}}={r}_{2}\bullet {X}_{i,{d}_{1}}+\left(1-{r}_{2}\right)\bullet {X}_{i,{d}_{2}}$$
(33)

where, \({r}_{2}\) is a random number between [0, 1]; \({X}_{i,{d}_{1}}\) and \({X}_{i,{d}_{2}}\) are values of the same individual in the dimensions of \({d}_{1}\) and \({d}_{2}\). Since solutions may have different upper and lower limits in different dimensions, the values of each dimension should be normalized.
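A sketch of Eq. (33) with the normalization noted above (the function name and the exact way the bounds `lb`, `ub` enter the normalization are assumptions):

```python
import random

def longitudinal_crossover(x, lb, ub, d1, d2):
    """Eq. (33): blend two dimensions of the same individual in
    normalized [0, 1] space, then map the result back to the
    range of dimension d1."""
    n1 = (x[d1] - lb[d1]) / (ub[d1] - lb[d1])   # normalize dimension d1
    n2 = (x[d2] - lb[d2]) / (ub[d2] - lb[d2])   # normalize dimension d2
    r2 = random.random()                         # r2 in [0, 1]
    n_new = r2 * n1 + (1 - r2) * n2
    child = list(x)
    child[d1] = lb[d1] + n_new * (ub[d1] - lb[d1])  # de-normalize
    return child
```

Without the normalization step, blending two dimensions with different bounds could push the value of one dimension outside its feasible range.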

3.2.3 Direct crossover operator

Equation (29) is selected as the LS; its function is to generate individuals whose dimension values are taken from either parent.

3.2.4 Elite crossover operator

An elite crossover operator is proposed based on the direct crossover operator. As shown in Eq. (34), it serves to cross the position of the current coyote with that of the alpha coyote it follows.

$${Xnew}_{i,d}=\left\{\begin{array}{ll}{alpha}_{i,d} ,&\quad { rnd}_{j}<{P}_{s} or j={j}_{1}\\ {X}_{i,d} ,&\quad { rnd}_{j}\ge {P}_{s}+{P}_{a} or j={j}_{2}\\ {R}_{j} , &\quad otherwise\end{array}\right.$$
(34)

where, \({alpha}_{i,d}\) is an elite coyote that \({X}_{i,d}\) has followed.
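Eq. (34) can be sketched as follows (a minimal illustration; drawing the two guard dimensions j1, j2 by random sampling is an assumption):

```python
import random

def elite_crossover(x, alpha, lb, ub, Ps, Pa):
    """Eq. (34): each dimension comes from the followed alpha coyote
    with probability Ps, stays with the current coyote with probability
    1 - Ps - Pa, and is redrawn from the variable range otherwise.
    Dimensions j1 and j2 guarantee both sources contribute."""
    D = len(x)
    j1, j2 = random.sample(range(D), 2)
    child = []
    for j in range(D):
        rnd = random.random()
        if rnd < Ps or j == j1:
            child.append(alpha[j])                    # inherit from the elite
        elif rnd >= Ps + Pa or j == j2:
            child.append(x[j])                        # keep the current value
        else:
            child.append(random.uniform(lb[j], ub[j]))  # random R_j
    return child
```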

3.3 Multi-objective COA based on hybrid elite framework and Meta-Lamarckian learning strategy

3.3.1 Elite non-dominated sorting

In the proposed MOCOA, NSGA-II’s elite non-dominated sorting method and the crowding distance method for maintaining diversity are introduced. The crowding distance is calculated to rank solutions within the same non-dominated level. First, non-dominated sorting is used to obtain the non-dominated levels of the individuals, and then the crowding distance method is used to maintain diversity within the optimal sets.

3.3.1.1 Fast non-dominated sorting

First, all objectives of the objective function \(F\) are evaluated for each solution obtained from the basic search method (COA) or from the initially generated random population \({P}_{O}\). Each solution \(p\) has two properties: \({n}_{p}\), the number of solutions that dominate individual \(p\), and \({S}_{p}\), the set of solutions that individual \(p\) dominates.

  1. (1)

    Solutions with \({n}_{p}=0\) are not dominated by any individual; their non-dominated level \({p}_{rank}\) is set to 1 and they are stored in set \({F}_{1}\).

  2. (2)

    For each solution \(p\) with \({n}_{p}= 0\), visit each member \(q\) of the set \({S}_{p}\) and decrease its domination count \({n}_{q}\) by 1. If \({n}_{q}\) drops to zero, the corresponding solution \(q\) is stored in the second non-dominated level set \({F}_{2}\), and its non-dominated level \({p}_{rank}\) is set to 2.

  3. (3)

    Repeat the process for each member of the second non-dominated level to obtain the third non-dominated level, and continue in this way until the whole population is divided into non-dominated levels.
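Steps (1)–(3) can be sketched as follows (minimization of all objectives is assumed; function names are illustrative):

```python
def dominates(p, q):
    """p dominates q: no worse in every objective, strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def fast_non_dominated_sort(objs):
    """Return the non-dominated level (1-based rank) of each solution."""
    n = len(objs)
    S = [[] for _ in range(n)]      # S_p: solutions dominated by p
    n_dom = [0] * n                 # n_p: number of solutions dominating p
    rank = [0] * n
    front = []
    for p in range(n):
        for q in range(n):
            if dominates(objs[p], objs[q]):
                S[p].append(q)
            elif dominates(objs[q], objs[p]):
                n_dom[p] += 1
        if n_dom[p] == 0:           # step (1): first front
            rank[p] = 1
            front.append(p)
    level = 1
    while front:                    # steps (2)-(3): peel off fronts
        nxt = []
        for p in front:
            for q in S[p]:
                n_dom[q] -= 1
                if n_dom[q] == 0:
                    rank[q] = level + 1
                    nxt.append(q)
        front, level = nxt, level + 1
    return rank
```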

3.3.1.2 Determine crowding distance

To ensure that the Pareto optimal solutions are well-distributed in the objective space, NSGA-II utilizes a crowding distance method to assess the quality of each solution within the same front, resulting in a more evenly distributed solution set. The main goal of the crowding distance approach is to preserve population diversity by achieving a trade-off between solutions. Specifically, it measures the density of individuals within a single rank layer after the population has been non-dominated sorted according to the dominance relation.

The crowding degree/crowding distance is calculated as follows. For each objective function, find the two solutions adjacent to the current solution and calculate the difference between their objective values. The crowding distance of a given solution is the sum, over all objectives, of these differences between its neighboring solutions. The crowding degree of individuals at the boundary of each non-dominated level is set directly to infinity (Jeyadevi et al. 2011). The sum of the two side lengths of the rectangle in Fig. 2 is the crowding distance of the \(p\)-th individual.
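The calculation can be sketched as follows (normalizing each objective by its range within the front is assumed, as in NSGA-II):

```python
def crowding_distance(objs):
    """Crowding distance of each solution within one non-dominated front:
    per objective, add the normalized gap between the two neighbours;
    boundary solutions get infinity."""
    n = len(objs)
    dist = [0.0] * n
    m = len(objs[0])
    for k in range(m):
        order = sorted(range(n), key=lambda i: objs[i][k])
        fmin, fmax = objs[order[0]][k], objs[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")   # boundary rule
        if fmax == fmin:
            continue                                       # degenerate objective
        for idx in range(1, n - 1):
            i = order[idx]
            dist[i] += (objs[order[idx + 1]][k]
                        - objs[order[idx - 1]][k]) / (fmax - fmin)
    return dist
```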

Fig. 2
figure 2

Crowding distance

3.3.1.3 Crowding comparison operator and elite strategy

After the fast non-dominated sorting and crowding degree calculation above, each individual in the population has two attributes: the non-dominated level \({p}_{rank}\) (the level number) and the crowding distance \({p}_{d}\). Based on these two attributes, the crowded-comparison operator is defined as follows. An individual \(p\) is compared with another individual \(q\); if either of the following conditions holds, individual \(p\) wins.

  1. (1)

    If the non-dominated layer of individual \(p\) is better than the non-dominated layer of individual \(q\), \({p}_{rank}<{q}_{rank}\);

  2. (2)

    If they have the same rank and the individual p has a larger crowding distance than the individual \(q\), that is, \({p}_{rank}={q}_{rank}\) and \({p}_{d}>{q}_{d}\).

The first condition ensures that the selected individual belongs to the better non-inferior rank. The second condition selects, of two individuals with the same non-inferior rank, the one in the less crowded area (with the greater crowding distance).

The elite strategy is used to select the individuals that enter the next iteration. The new population \(P\) generated in the \(t\)-th iteration is combined with the old population \(Q\). Then a series of non-dominated sets is generated by non-dominated sorting, and the crowding degree is calculated. Let the population size be \(N\); individuals are selected level by level, starting from the first level, according to the crowded-comparison operator until \(N\) individuals are chosen. These \(N\) better individuals enter the next iteration and continue to update according to the formulas of COA. This selection process is shown in Fig. 3.
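The crowded-comparison selection can be sketched as follows (assuming ranks and crowding distances of the merged population have already been computed; the function name is illustrative):

```python
def select_next_population(ranks, dists, N):
    """Keep the N best indices: prefer a lower non-dominated level,
    break ties by a larger crowding distance."""
    order = sorted(range(len(ranks)), key=lambda i: (ranks[i], -dists[i]))
    return order[:N]
```

Sorting by the tuple `(rank, -distance)` implements exactly the two conditions of the crowded-comparison operator.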

Fig. 3
figure 3

Individual selection based on non-dominant ranking

3.3.2 Archives based on grid mechanism

3.3.2.1 Grid mechanism

For the solution set stored in the archive, the objective space is divided equally by a grid, and the number of grid cells per objective is set manually. Figure 4 is a schematic diagram in two-dimensional space, with 5 grid cells per objective. Each grid cell containing solutions is given an index number; for example, the index of grid A in Fig. 4 is (2, 2). The purpose of the grid mechanism is to distinguish the density of the archive space in order to find more crowded or sparser areas for the next operation (Coello and Lechuga 2002).
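Computing a grid index in the style of Fig. 4 can be sketched as follows (the clamping of boundary values into the last cell is an assumption):

```python
def grid_index(obj, mins, maxs, n_grid=5):
    """Map an objective vector to its grid index (1-based per objective),
    after dividing each objective range into n_grid equal segments."""
    idx = []
    for f, lo, hi in zip(obj, mins, maxs):
        k = int((f - lo) / (hi - lo) * n_grid) + 1 if hi > lo else 1
        idx.append(min(max(k, 1), n_grid))   # clamp values on the boundary
    return tuple(idx)
```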

Fig. 4
figure 4

Individual selection process of grid mechanism

3.3.2.2 External archive

An external archive is a storage unit of fixed size. It saves and retrieves the non-dominated Pareto optimal solutions obtained so far. The key module of the archive is an archive controller, which decides what happens when a solution attempts to enter the archive or when the archive is full. It is important to note that the archive has a maximum number of members. During the iteration, the non-dominated solutions obtained so far are compared with the archive members. Three different cases can occur.

  1. (1)

    New members are dominated by at least one archive member. In this case, the solution should not be allowed to enter the archive.

  2. (2)

    The new solution dominates one or more solutions in the archive. In this case, delete the dominated solutions from the archive and allow the new solution to enter the archive.

  3. (3)

    If the new solution and the archive members do not dominate each other, the new solution is added to the archive.

If the archive is full, the grid mechanism is first run to rearrange the segmentation of the objective space. Through roulette selection, a grid is chosen from which one solution is removed; the probability of each grid being selected is given by Eq. (35). Then, the new non-dominated solution is recorded in the archive to improve the diversity of the Pareto optimal frontier. As shown in Fig. 4, when the archive is full, the most crowded area B(5, 1) has the greatest probability of being selected, and one of its solutions is deleted at random.

$$P=\frac{n}{E}$$
(35)

where, \(E\) is a constant and \(n\) is the number of solutions in the grid.
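The removal step for a full archive can be sketched as follows (Eq. (35); since the constant \(E\) is the same for every grid, it cancels out of the roulette weights, so raw occupancy counts can be used directly; the function name is illustrative):

```python
import random
from collections import defaultdict

def remove_from_full_archive(archive_grids):
    """Pick a grid with probability proportional to its occupancy n
    (Eq. 35, P = n / E) and return the index of one random member
    of that grid to delete."""
    groups = defaultdict(list)
    for i, g in enumerate(archive_grids):
        groups[g].append(i)                      # grid index -> member indices
    grids = list(groups)
    weights = [len(groups[g]) for g in grids]    # n per grid; E cancels out
    chosen = random.choices(grids, weights=weights, k=1)[0]
    return random.choice(groups[chosen])
```

Crowded grids are thus thinned out preferentially, which keeps the archived front evenly spread.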

3.3.2.3 Elite selection mechanism

The elite is the alpha coyote in COA. First, the grid mechanism is used to divide the archive. Then a solution is selected from the archive as the alpha coyote through roulette. The selection probability is calculated by Eq. (36).

$$P=\frac{E}{n}$$
(36)

where, \(E\) is a constant and \(n\) is the number of solutions in the grid.

The fewer solutions a grid contains, the greater its chance of being selected. As shown in Fig. 4, there is only one solution in A(2, 2), so A has the highest probability of being selected. The mechanism selects a solution in a sparser location as the alpha coyote (elite). The alpha coyote, as the leader of the population, guides the population to search sparser regions of the solution space, which helps to find a more uniform Pareto front.

3.3.3 The process of MOCOA-ML

The proposed multi-objective coyote optimization algorithm (MOCOA) uses a multi-objective framework of non-dominated sorting and an external archive to obtain the Pareto optimal solutions. The single-objective COA yields one and only one optimal solution, namely the solution corresponding to the optimal fitness value. MOCOA adopts the update rules of COA to move the population and merges the new solution set with the old one. Then, non-dominated sorting and the crowding distance method are used to obtain the dominance relations in the merged set. After that, according to the population size, the better individuals are selected to enter the next iteration, and the remaining poor individuals are eliminated (die). The non-dominated solutions (the first-front individuals) obtained from the sorting are archived. If the archive is full, the grid mechanism is used to delete and add individuals. In addition, during the iteration process, the alpha coyotes (leaders) are selected from the archive by the roulette method. After the iterations finish, the Pareto solution set in the archive is output.

The Meta-Lamarckian learning strategy is combined with the multi-objective coyote optimization algorithm, and the result is named MOCOA-ML. In each loop, after the coyote positions in each pack are updated, the local optimizer based on Meta-Lamarckian learning starts working, randomly picking individuals in the pack for a local search. Each time, the LS scheme is selected by roulette according to the reward values from the previous iteration. The individual generated by the local optimizer is compared with one of its parents. If the new individual dominates the parent, it replaces the parent. If the new individual and the parent do not dominate each other, or the new individual is dominated by the parent, the new individual still replaces an individual from the previous iteration. In other words, we tend to retain the solutions generated by the local optimizer. This approach adds no extra computational pressure to the non-dominated sorting process, since the number of individuals participating in non-dominated sorting is still twice the population size. The flow chart of MOCOA-ML is shown in Fig. 5. It should be noted that MOCOA-ML differs from MOCOA only in whether it contains the local optimizer under the Meta-Lamarckian learning strategy.

Fig. 5
figure 5

Flow chart of MOCOA-ML

4 Test function simulation and result analysis

In order to verify the performance of MOCOA-ML, several test functions were selected, and the results were compared with MOCOA, MODA, MOPSO, MOJAYA, NSGA-II, MOEA/D, MOAOS and MOTEO. Because all the algorithms in the experiments are stochastic, each group of experiments was run independently 10 times for fairness. The maximum number of iterations is set to 300, the population size N is set to 100, and the archive size is set to 100. The parameter settings of the improved multi-objective algorithm and the comparison algorithms are shown in Table 1. For each algorithm, the fuzzy membership function of each solution is calculated according to Eqs. (18, 19) in Sect. 2.7, and the solution with the maximum membership function \({\mu }_{max}\) is considered the best compromise solution.

Table 1 Setting of algorithm parameters

4.1 Performance metrics

Convergence and diversity are two key points in finding an appropriate Pareto optimal solution set for a particular problem. Convergence refers to the ability of a multi-objective algorithm to determine an accurate approximation of the Pareto optimal solutions. Diversity refers to the ability of the algorithm to find a more complete Pareto front. The ultimate goal of a multi-objective optimization algorithm is to find the most accurate approximation of the true Pareto optimal solutions (convergence) with a uniform distribution (diversity) over all objectives. In this part, two commonly used indicators are selected to reflect the quality of the Pareto solution set of each algorithm: inverted generational distance (IGD) (Coello and Cortés 2005) and hypervolume (HV) (Zitzler and Thiele 1999). The first indicator is a reverse indicator (smaller is better), and the second is a positive indicator (larger is better).

4.1.1 Inverted generational distance

The IGD metric is used to calculate the minimum distance between an individual on the actual Pareto frontier and the set of individuals generated by the algorithm. It can be expressed as:

$$IGD=\frac{\sqrt{{\sum }_{p=1}^{ko}{\left({d}_{p}^{\prime}\right)}^{2}}}{ko}$$
(37)

where, \(ko\) is the number of true Pareto solutions, and \({d}_{p}^{\prime}\) is the Euclidean distance between the \(p\)-th true Pareto solution and the nearest obtained Pareto solution.
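Eq. (37) can be sketched as follows (a minimal illustration for small fronts; the brute-force nearest-neighbour search is assumed):

```python
import math

def igd(true_front, obtained):
    """Inverted generational distance: for each true Pareto point,
    take the distance to the nearest obtained point, then combine
    as in Eq. (37). Smaller is better."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    d = [min(dist(p, q) for q in obtained) for p in true_front]
    return math.sqrt(sum(x ** 2 for x in d)) / len(true_front)
```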

4.1.2 Hypervolume

The HV value is the volume of the space covered by the Pareto front. The higher the HV value, the better the diversity and convergence of the corresponding Pareto frontier.

$$HV=\delta (\bigcup_{i=1}^{|N|}{c}^{i})$$
(38)

where, \(\delta\) is a Lebesgue measure used to measure volume. \(|N|\) represents the number of Pareto solution sets, and \({c}^{i}\) represents the hypercube formed by the reference point and the i-th solution in the solution set.
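For two objectives (minimization), the HV of Eq. (38) reduces to a sum of rectangle areas and can be sketched as follows (the sweep over points sorted by the first objective is a standard construction, not taken from the paper):

```python
def hv_2d(front, ref):
    """Hypervolume of a two-objective minimization front: the area
    dominated by the front and bounded by the reference point,
    accumulated as rectangles while sweeping along f1."""
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                       # skip dominated points
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv
```

For three or more objectives, exact HV computation needs more involved algorithms (e.g. dimension-sweep methods), which is why library implementations are typically used in practice.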

4.2 Function optimization simulation and result analysis

4.2.1 Simulation result and analysis of benchmark test functions

In order to prove the performance of MOCOA-ML, experiments were carried out on the test functions ZDT1-ZDT4, ZDT6, DTLZ2 and DTLZ4-DTLZ7. MOCOA, MODA, MOPSO, MOJAYA, NSGA-II, MOEA/D, MOAOS and MOTEO were selected as the comparison algorithms. Tables 2 and 3 record the optimal value, average value and standard deviation of each algorithm on IGD and HV. Figure 6 shows the Pareto frontiers of the two-objective test functions, and Fig. 7 shows the Pareto frontiers of the three-objective test functions.

Table 2 Simulation results of test functions (IGD)
Table 3 Simulation results of test functions (HV)
Fig. 6
figure 6figure 6figure 6figure 6

Pareto frontiers obtained by each algorithm on two-objective test functions

Fig. 7
figure 7figure 7figure 7

Pareto frontiers obtained by each algorithm on three-objective test functions

From the experimental results, it can be observed that MOCOA-ML has the ability to find the Pareto front of each test function and has better convergence and coverage. The average rankings obtained from the Friedman test are listed in Tables 2 and 3, and MOCOA-ML ranks first in both IGD and HV. This indicates that it outperforms MOCOA, MODA, MOPSO, MOJAYA, NSGA-II, MOEA/D, MOAOS and MOTEO on most test functions. Additionally, MOCOA-ML performs better than MOCOA in all performance metrics, demonstrating the effectiveness of the Meta-Lamarckian learning strategy. Among the 10 test functions, MOTEO performs well, second only to MOCOA-ML. NSGA-II and MOPSO also have good results, while MOJAYA performs the worst. MOCOA-ML exhibits better competitiveness in both bi-objective and tri-objective problems. In summary, MOCOA-ML showcases commendable performance and can be considered as a viable alternative algorithm. It is important to note that in Table 3, the rankings obtained from the Friedman test are inverted due to HV being a positive indicator.

4.2.2 UF test function results and analysis

In this section, MOCOA-ML was used to solve the test functions UF1-UF10. MOCOA, MODA, MOJAYA, MOPSO, NSGA-II, MOEA/D, MOAOS and MOTEO were selected as the comparison algorithms. The best values, average values and standard deviations of IGD and HV obtained by each algorithm, and the average rankings obtained from the Friedman test, are recorded in Tables 4 and 5. Figure 8 shows the best Pareto fronts obtained from the experiments. From Tables 4 and 5, it can be observed that MOCOA-ML exhibits significantly better convergence and diversity on UF1-UF6, UF8 and UF10, while its performance is slightly worse on UF9. The average rankings obtained from the Friedman test show that MOCOA-ML achieves the first rank in both the IGD and HV indicators among the nine algorithms. MOCOA ranks second, followed by MOPSO. These results indicate that MOCOA-ML is highly competitive on the UF series test functions.

Table 4 Simulation results of test functions (IGD)
Table 5 Simulation results of test functions (HV)
Fig. 8
figure 8figure 8

Pareto frontiers obtained by each algorithm under CEC 2009

4.3 Population diversity analysis

MOCOA-ML utilizes a grid mechanism to maintain the diversity of the archive. To validate the effectiveness of this mechanism, the diversity of the archive during the convergence process of MOCOA-ML on different test functions was analyzed. The diversity curves for some test functions are shown in Fig. 9. The formula for calculating the population diversity \(div\) is given by Eq. (39) (Zamani et al. 2021).

$$div=\frac{1}{N}\sum_{i=1}^{N}\sqrt{\sum_{d=1}^{D}{({x}_{id}-{x}_{mean,d})}^{2}}$$
(39)

where, \(N\) represents the number of individuals in the archive, \(D\) represents the maximum dimension of the decision variables, \({x}_{id}\) represents the value of the \(i\)-th individual on the \(d\)-th dimension, and \({x}_{mean,d}\) represents the mean value of all individuals in the archive on the \(d\)-th dimension.
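Eq. (39) can be sketched as:

```python
import math

def diversity(archive):
    """Eq. (39): mean Euclidean distance of the archive members
    from their centroid in decision space."""
    N, D = len(archive), len(archive[0])
    mean = [sum(x[d] for x in archive) / N for d in range(D)]   # x_mean
    return sum(
        math.sqrt(sum((x[d] - mean[d]) ** 2 for d in range(D)))
        for x in archive
    ) / N
```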

Fig. 9
figure 9

Iteration curves of population diversity

In the curve shown in Fig. 9, smaller values indicate poorer diversity in the archive, while larger values indicate higher population dispersion. It can be observed that in ZDT1, div decreases continuously. This is because the true Pareto solution set of ZDT1 contains a significant number of zeros. During the iterative process, the individuals in the archive gradually approach the true Pareto solution set, resulting in this outcome. On the other hand, in other test functions, the div values remain at a high level, indicating good diversity in the archive. This further demonstrates the effectiveness of the grid mechanism in maintaining diversity.

4.4 Performance index analysis

To compare the performance of the different algorithms, they are evaluated using the Performance Index (PI) (Deep and Thakur 2007). PI is a positive indicator that takes the algorithm’s runtime into account. A higher PI value indicates better algorithm performance. The detailed formulas for calculating PI are given by Eqs. (40) and (41).

$$A{vet}_{i}^{j}=\frac{1}{H}\sum_{T=1}^{H}Time(T)$$
(40)
$${PI}_{i}=\frac{1}{Nf}\sum_{j=1}^{Nf}(\alpha \times \frac{Min{f}^{j}}{A{vef}_{i}^{j}}+\beta \times \frac{Min{t}^{j}}{Ave{t}_{i}^{j}})$$
(41)

where, \(A{vet}_{i}^{j}\) represents the average runtime of the \(i\)-th algorithm on the \(j\)-th test function, \(H\) represents the number of runs of an algorithm on a test function, \(Time(T)\) represents the time taken by the algorithm in the \(T\)-th run, \({PI}_{i}\) is the performance index of the \(i\)-th algorithm, \(Nf\) represents the total number of test functions, \(Min{f}^{j}\) represents the minimum average error value obtained by all algorithms on the \(j\)-th function, \(A{vef}_{i}^{j}\) represents the average error value obtained by the \(i\)-th algorithm on the \(j\)-th function, and \(Min{t}^{j}\) represents the minimum time obtained by all algorithms on the j-th function. \(\alpha\) and \(\beta\) are parameters in the range [0, 1] and have a linear relationship, \(\alpha +\beta =1\).

In this section, \(\alpha\) is set to 0, 0.2, 0.4, 0.6, 0.8, and 1. \(Min{f}^{j}\) and \(A{vef}_{i}^{j}\) are used as the average minimum IGD value and the average IGD value obtained by the \(i\)-th algorithm, respectively. The PI values of these algorithms on different series of test functions are plotted in Fig. 10. It can be observed that the MOCOA-ML algorithm has a certain advantage in terms of PI values on the test functions in all three series.

Fig. 10
figure 10

PI diagram of test functions

4.5 Wilcoxon signed-rank test

The Wilcoxon signed-rank test is a non-parametric hypothesis test used to determine whether the medians of two related samples differ (Zamani et al. 2022). It is suitable for situations where the data of the two related samples do not follow a normal distribution. In this test, the p-value is compared against a significance level of 95% (α = 0.05). If the p-value is less than or equal to the significance level (0.05), the null hypothesis is rejected. If the p-value is greater than the significance level, we fail to reject the null hypothesis.

Wilcoxon signed-rank tests are conducted to compare the IGD and HV values of MOCOA-ML on different test functions with eight other algorithms. The results are shown in Tables 6 and 7. The “+” sign indicates that the algorithm is significantly better than MOCOA-ML, the “−” sign indicates that the algorithm is significantly worse than MOCOA-ML, and the “=” sign indicates that there is no significant difference between the algorithm and MOCOA-ML. From the results in the tables, it can be concluded that in most cases, MOCOA-ML performs better than the eight compared algorithms.

Table 6 Results of Wilcoxon signed-rank test on IGD
Table 7 Results of Wilcoxon signed-rank test on HV

4.6 Mean absolute error

The mean absolute error (MAE) is used to analyze the IGD indicators obtained by all algorithms to determine the difference between the obtained Pareto frontier and the real Pareto frontier. The calculation formula of MAE is shown in Eq. (42) (Zamani et al. 2021). The obtained results are shown in Table 8. MOCOA-ML ranked second in the ZDT test functions and first in the DTLZ and UF test functions. MOCOA-ML also performs well in MAE analysis.

$$MAE=\frac{\sum_{j=1}^{Nf}|{C}_{j}-{D}_{j}|}{Nf}$$
(42)

where, \(Nf\) is the number of functions, \({C}_{j}\) is the reference (best) IGD value for the \(j\)-th function, and \({D}_{j}\) is the optimal IGD value obtained by the algorithm on the \(j\)-th function.

Table 8 MAE results of IGD index for each algorithm

5 Case study of optimal power flow

In order to verify the performance of the proposed MOCOA-ML to solve the MOOPF problem, simulation studies are carried out in IEEE 30-bus system and IEEE 57-bus system respectively in this section.

5.1 Test system and parameter setting

The IEEE 30-bus system is shown in Fig. 11. The system consists of 6 generators, 4 adjustable transformers and 9 capacitors. On a base of 100 MVA, the active power demand at the load side is 283.4 MW and the reactive power demand is 126.2 MVAr. The voltage range of the generator buses is 0.95–1.1 p.u. The normal range of the load bus voltage is 0.95–1.05 p.u. Bus 1 is the slack (balance) bus. The parameters of the generators are listed in Table 9, including cost coefficients and emission coefficients.

Fig. 11
figure 11

Standard IEEE 30-bus system

Table 9 Coefficient values of generators of IEEE 30-bus system (Biswas et al. 2018)

The structure of the IEEE 57-bus system is shown in Fig. 12. The system consists of 7 generators, 50 load buses, 80 transmission lines, 17 adjustable transformers and 3 capacitors. The total active power demand at the load side is 1250.8 MW and the total reactive power demand is 336.4 MVAr. The load bus voltage range is 0.94–1.06 p.u., and the transformer tap range is 0.9–1.1 p.u. The maximum reactive power of each shunt capacitor is 30 MVAr. The cost and emission coefficients of the generators are shown in Table 10. In this experiment, fuel cost, active power loss and total emissions were selected as objective functions and tested in the two systems, giving a total of 6 cases. The specific case combinations are shown in Table 11.

Fig. 12
figure 12

Standard IEEE 57-bus system

Table 10 Coefficient values of generators of IEEE 57-bus system (Biswas et al. 2018)
Table 11 Objective function combination

5.2 Simulation results and analysis

In order to verify the performance of MOCOA-ML, the results were compared with MOCOA, MOPSO, MOGWO and MSSA. For the IEEE 30-bus test system, the population of each algorithm is set to 100, and the maximum number of iterations is 300. For the IEEE 57-bus test system, the population of each algorithm is set to 100, and the maximum number of iterations is 700. Detailed parameter settings of each algorithm are shown in Table 12. The algorithms are run independently 30 times in each case.

Table 12 Setting of algorithm parameters

5.2.1 Case 1

Fuel cost and active power loss are considered in this case. The optimal solutions and compromise solutions of MOCOA-ML and MOCOA on each objective are shown in Table 13. MIN C represents the solution corresponding to the minimum fuel cost, MIN P represents the solution corresponding to the minimum active power loss, and COMP represents the best compromise solution. The minimum fuel cost and minimum power loss obtained by MOCOA-ML are 800.7669 $/h and 3.1147 MW, respectively, and its compromise solution of 834.6730 $/h and 5.3332 MW is not dominated by the compromise solution of 836.5858 $/h and 5.2834 MW obtained by MOCOA. The simulation result is shown in Fig. 13, showing the Pareto frontier found by each algorithm. It can be seen that MOCOA-ML has a Pareto solution set with a more advanced position, and its result is better than that of MOCOA. Table 14 compares the compromise solutions of the algorithms. The compromise solution of MOCOA-ML is superior to those of MOPSO, MOGWO, MSSA, NSGA-III (Chen et al. 2019), PSO-Fuzzy (Liang et al. 2011) and EGA (Herbadji et al. 2019).

Table 13 Solution set obtained in Case 1
Fig. 13
figure 13

Pareto frontier obtained by each algorithm in Case 1

Table 14 Compromise solution obtained by each algorithm

5.2.2 Case 2

Fuel cost and emissions are considered in this case. The simulation result is shown in Fig. 14, showing the Pareto frontier found by each algorithm. It can be seen that MOCOA-ML has a more advanced Pareto solution set, with slightly better results than MOCOA. The optimal solutions and compromise solutions of MOCOA-ML and MOCOA on each objective are shown in Table 15. The minimum fuel cost and minimum emissions obtained by MOCOA-ML are 800.7411 $/h and 0.20485 ton/h respectively, and the compromise solutions are 834.2074 $/h and 0.2454 ton/h. Table 16 compares the compromise solutions of each algorithm. The compromise scheme of MOCOA-ML is superior to that of MOCOA, MOGWO, MSSA and AGSO (Daryani et al. 2016), and it has the same dominant level as the compromise scheme of other algorithms.

Fig. 14
figure 14

Pareto frontier obtained by each algorithm in Case 2

Table 15 Solution set obtained in Case 2
Table 16 Compromise solution obtained by each algorithm

5.2.3 Case 3

Fuel cost, emissions and active power loss are considered in this case. The simulation result is shown in Fig. 15, which shows the Pareto frontier found by each algorithm. It can be seen that MOCOA-ML has a more advanced Pareto solution set, and the result is better than MOCOA. The optimal solutions and compromise solutions of MOCOA-ML and MOCOA in each objective are shown in Table 17. The minimum fuel cost, minimum emission and minimum active power loss obtained by MOCOA-ML are 800.8717 $/h, 0.20484 ton/h and 3.1408 MW respectively, and the compromise solutions are 873.9523 $/h, 0.2191 ton/h and 4.3810 MW. Table 18 compares the compromise solutions of each algorithm. The compromise of MOCOA-ML is superior to that of MOCOA, MOPSO and MOGWO, and is at the same dominant level as that of other algorithms.

Fig. 15
figure 15

Pareto frontier obtained by each algorithm in Case 3

Table 17 Solution set obtained in Case 3
Table 18 Compromise solution obtained by each algorithm

5.2.4 Case 4

In this case, fuel cost and active power loss are considered and simulated in the IEEE 57-bus system. The simulation result is shown in Fig. 16, showing the Pareto frontier found by each algorithm. It can be seen that MOCOA-ML has a more advanced and more widely spread Pareto solution set, and its results are superior to MOCOA. The optimal solutions and compromise solutions of MOCOA-ML and MOCOA in each objective are shown in Table 19. The minimum fuel cost and minimum active power loss obtained by MOCOA-ML are 41,675.44 $/h and 10.0428 MW respectively, and the compromise solution is 42,146.23 $/h and 11.0192 MW. Table 20 compares the compromise solutions of each algorithm. The compromise of MOCOA-ML is superior to that of MOCOA, MOPSO, MOGWO, MSSA, BMPSO (Qian and Chen 2022) and MOJFS (Shaheen et al. 2021).

Fig. 16

Pareto frontier obtained by each algorithm in Case 4

Table 19 Solution set obtained in Case 4
Table 20 Compromise solution obtained by each algorithm

5.2.5 Case 5

This case considers fuel cost and emissions. The simulation result is shown in Fig. 17, which presents the Pareto frontier found by each algorithm. The Pareto solution set of MOCOA-ML shows a clear advantage over that of MOCOA. The optimal and compromise solutions of MOCOA-ML and MOCOA for each objective are listed in Table 21. The minimum fuel cost and minimum emission obtained by MOCOA-ML are 41,698.88 $/h and 0.9546 ton/h, respectively, and its compromise solution is 42,474.51 $/h and 1.0632 ton/h. Table 22 compares the compromise solutions of all algorithms. The compromise solution of MOCOA-ML dominates those of MOCOA, MOPSO, MSSA, MPIO-PFM (Chen et al. 2020), NSGA-III (Chen et al. 2019) and MOJFS (Shaheen et al. 2021).

Fig. 17

Pareto frontier obtained by each algorithm in Case 5

Table 21 Solution set obtained in Case 5
Table 22 Compromise solution obtained by each algorithm

5.2.6 Case 6

This case considers fuel cost, emissions and active power loss. The simulation result is shown in Fig. 18, which presents the Pareto frontier found by each algorithm. MOCOA-ML obtains a better-converged Pareto solution set and outperforms MOCOA. The optimal and compromise solutions of MOCOA-ML and MOCOA for each objective are listed in Table 23. The minimum fuel cost, minimum emission and minimum active power loss obtained by MOCOA-ML are 41,695.75 $/h, 0.9556 ton/h and 10.3392 MW, respectively, and its compromise solution is 42,669.53 $/h, 1.0682 ton/h and 11.0802 MW. Table 24 compares the compromise solutions of all algorithms. The compromise solution of MOCOA-ML dominates those of MOCOA, MOGWO, MSSA, MPIO-PFM (Chen et al. 2020) and MOALO (Herbadji et al. 2019), and is non-dominated with respect to those of the remaining algorithms.

Fig. 18

Pareto frontier obtained by each algorithm in Case 6

Table 23 Solution set obtained in Case 6
Table 24 Compromise solution obtained by each algorithm

5.3 Evaluation on performance metrics

In this section, two performance metrics, IGD and HV, are used to evaluate the algorithms. IGD is a metric to be minimized: it measures the gap between the Pareto solution set found by an algorithm and the true Pareto solution set. HV is a metric to be maximized: it reflects both convergence and coverage. Both metrics are described in detail in the previous section. Since the true Pareto front of the OPF problem is unknown, the solution sets from the 30 independent runs of every algorithm are merged, and the non-dominated solutions of this merged set serve as a surrogate for the true Pareto front. The IGD results and the average rankings from the Friedman test are shown in Table 25, with boxplots in Fig. 19. The HV results and the average rankings from the Friedman test are shown in Table 26, with boxplots in Fig. 20. The values in parentheses in Tables 25 and 26 are the p-values from the Wilcoxon signed-rank test (at the 95% confidence level), which compares MOCOA-ML against each of the other algorithms. MOCOA-ML achieves the best IGD and HV values in all cases. Its standard deviation is also smaller, which indicates the stability of the algorithm to a certain extent.
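The evaluation pipeline above (merge all runs, filter the non-dominated set as a surrogate front, then score each algorithm's set against it) can be sketched in a few NumPy lines. This is a minimal illustration for minimization problems; the paper's exact normalization of objectives before computing the metrics is not stated and may differ:

```python
import numpy as np

def non_dominated(points):
    """Return the non-dominated subset of a set of objective vectors
    (minimization). Used to build the surrogate reference front from
    the merged solution sets of all 30 runs of every algorithm."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        # p is dominated if some point is <= in every objective and < in one
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return pts[keep]

def igd(front, reference):
    """Inverted generational distance: mean Euclidean distance from each
    reference point to its nearest point in the obtained front (lower is
    better)."""
    front = np.asarray(front, dtype=float)
    reference = np.asarray(reference, dtype=float)
    d = np.linalg.norm(reference[:, None, :] - front[None, :, :], axis=2)
    return d.min(axis=1).mean()

def hv_2d(front, ref):
    """Two-objective hypervolume (minimization): area dominated by the
    front and bounded by the reference point `ref` (higher is better)."""
    pts = sorted(non_dominated(front).tolist())  # ascending f1 => descending f2
    area = 0.0
    for (x, y), x_next in zip(pts, [p[0] for p in pts[1:]] + [ref[0]]):
        area += (x_next - x) * (ref[1] - y)  # strip between x and x_next
    return area
```

For example, merging the toy sets `[[1, 5], [2, 3], [3, 2], [4, 4], [5, 1]]` drops only the dominated point `[4, 4]`; the surviving four points then act as the reference front when scoring any one algorithm's set with `igd`, while `hv_2d` needs only a reference point worse than every solution.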

Table 25 Statistical results of IGD in different cases
Fig. 19

Boxplots of the IGD

Table 26 Statistical results of HV in different cases
Fig. 20

Boxplots of the HV

6 Conclusions and future works

In this paper, a MOCOA based on a hybrid elite framework and a Meta-Lamarckian learning strategy is proposed to deal with multi-objective OPF problems with complex constraints. MOCOA-ML retains the main position-update formula of COA and selects better populations by non-dominated sorting. An external archive stores a uniform and diverse Pareto solution set. Combined with the Meta-Lamarckian learning strategy, a local optimizer is established to further improve the performance of the algorithm. The experiments and results are as follows. (1) The proposed method was tested on 20 test functions from the ZDT, DTLZ and UF series, with 10 independent runs per function to record the performance metrics. The results show that MOCOA-ML can find the true Pareto front on most functions and yields the most diverse solution sets. (2) Simulation experiments were conducted on six OPF cases in the IEEE 30-bus and IEEE 57-bus systems, with fuel cost, active power loss and emissions as objective functions. The OPF results demonstrate that MOCOA-ML outperforms other advanced multi-objective optimization algorithms, such as MOPSO, MSSA and MOGWO. It effectively balances convergence and spread, yielding a superior and more uniformly distributed Pareto solution set.
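The Meta-Lamarckian local optimizer summarized above adaptively chooses among several local-search operators. One common form of this idea, sketched below under assumptions (the operator pool, reward rule and function names are illustrative, not the paper's exact design), credits each operator with the improvement it produces and samples operators in proportion to their accumulated credit:

```python
import random

def meta_lamarckian_search(x, f, operators, iters=20, seed=0):
    """Meta-Lamarckian learning sketch (assumed form): several local-search
    operators compete; each is chosen with probability proportional to the
    cumulative improvement it has produced so far (Lamarckian: the improved
    individual replaces the original)."""
    rng = random.Random(seed)
    reward = [1.0] * len(operators)  # optimistic start so every operator is tried
    best_x, best_f = x, f(x)
    for _ in range(iters):
        i = rng.choices(range(len(operators)), weights=reward)[0]
        cand = operators[i](best_x, rng)
        cf = f(cand)
        if cf < best_f:
            reward[i] += best_f - cf  # credit the operator that improved
            best_x, best_f = cand, cf
    return best_x, best_f

# Illustrative use on a sphere function with two toy operators:
f = lambda v: sum(t * t for t in v)
ops = [lambda v, r: [t + r.gauss(0, 0.1) for t in v],  # Gaussian perturbation
       lambda v, r: [t * 0.9 for t in v]]              # contraction toward 0
xb, fb = meta_lamarckian_search([1.0, -2.0], f, ops, iters=50)
```

Because the incumbent is replaced only on improvement, the returned objective value never exceeds the starting one; over time the reward vector steers the search toward whichever operator suits the current landscape.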

The OPF problem has complex constraints, so the choice of constraint-handling method directly affects the quality of the solution set. This paper uses only the basic penalty-function method, so future work will focus on constraint-handling techniques. In addition, the integration of renewable energy into the power grid is a future research trend. Future work will model wind turbines and photovoltaic power stations and add them to the OPF problem and its simulations, so as to address the challenges posed by the uncertainty of distributed-generation output and load demand.