Introduction

With the gradual emergence of the disadvantages of fossil energy, the proportion of renewable energy continues to increase [1]. Due to the temporal and spatial uncertainty of renewable energy power generation, the curtailment rate remains high [2]. Hybrid renewable energy systems (HRESs) can effectively improve the reliability of the power supply system, reduce the cost of power generation, and increase the utilization rate of renewable energy by combining different types of energy, conventional energy storage systems (ESS) and power generation systems [3, 4]. In recent years, it has attracted the attention of many researchers. HRESs can be presented as photovoltaic-battery hybrid systems, photovoltaic-fan-battery hybrid systems, photovoltaic-wind-diesel-storage hybrid systems, and other complex hybrid systems according to their components combinations [5]. A typical HRES mainly includes photovoltaic arrays, wind generators, battery energy storage devices, diesel generators, and user loads.

Due to the need to determine the component capacity of the HRES, we often need to simulate the operating state of the system during the life cycle [5]. Generally speaking, according to user requirements or the estimated life of each component of the system, simulations can be carried out for up to 20 years with a time slot of minutes [6]. For situations where the accuracy requirements are not high, it is also necessary to carry out a year-long simulation with a time slot of hours [3]. Therefore, it takes a long time to perform a single simulation. Since there are many nonlinear constraints and the evaluation contains a simulation process, it is difficult to use traditional linear programming methods to solve HRES [7]. Thus, many researchers tend to use evolutionary multi-objective optimization algorithms (MOEAs) to solve the problem.

Without loss of generality, the multi-objective optimization problems (MOPs) [8, 9] can be expressed in the following form:

$$\begin{aligned} \begin{aligned} \text {Min}~~F(\textbf{x})&= \{f_1(\textbf{x}),f_2(\textbf{x}),\cdots , f_m(\textbf{x}) \} \\ \mathrm {s.t.}~~~~ \textbf{x}&= (x_1,x_2,\ldots , x_n) \in \Omega \end{aligned} \end{aligned}$$
(1)

where \(\Omega \) denotes the search space, m is the number of objectives, and \(\textbf{x}\) is a decision vector that consists of n decision variables \(x_i\). A solution \(\mathbf {x_a}\) is considered to Pareto dominate another solution \(\mathbf {x_b}\) if and only if \(\forall i=1,2,...,m, f_i(\mathbf {x_a}) \le f_i(\mathbf {x_b})\) and \(\exists j=1,2,...,m, f_j(\mathbf {x_a}) < f_j(\mathbf {x_b})\). Furthermore, a Pareto optimal solution is a solution that is not Pareto dominated by any other solution. The set of Pareto optimal solutions is called the Pareto set (PS). The image of PS is known as Pareto Front (PF) [10].

To address MOPs, many multi-objective evolutionary algorithms (MOEAs) have been proposed, which have been proven effective on benchmark problems. Generally speaking, MOEAs can be divided into three types: (1) Pareto dominance based method, e.g., NSGA-II [11] and SPEA2 [12]; (2) decomposition-based method, e.g., MOEA/D [13] and NSGA-III [14]; (3) indicator-based method, e.g., IBEA [15] and HypE [16]. For most MOEAs, searching for the optimal solution set requires a lot of solution evaluations. In many real-world MOPs, the evaluation of the objective function is very time-expensive, e.g., in the aero-engine design problem [17], a lot of time-consuming simulations are required, which greatly limits the application of MOEAs in such problems. So far, one of the commonly used methods for solving time-expensive MOPs is to introduce surrogate models to replace the time-expensive multi-objective calculations, which is usually called surrogate-assisted evolutionary multi-objective optimization (SAEMO) [18, 19]. Generally, SAEMO algorithms can be usually divided into three categories [20]. The first type is to directly use the surrogate model to replace the time-consuming objective function calculation for environmental selection [21, 22]. The second type of SAEMO algorithm is to convert MOP into a single objective optimization problem through the aggregation function [23]. Then a surrogate model is introduced to fit this single objective problem. The third type is to train the classification model according to the dominance relationship, and the surrogate model is used as the classifier to assist the MOEAs [24, 25].

Since HRES is a time-consuming optimization problem, it’s reasonable to utilize the SAEMO algorithm to solve it. However, most of the proposed SAEMO algorithms are designed especially for the continuous optimization problems [18]. The performance of these algorithms on discrete optimization problems is doubted. In addition, the existing SAEMO algorithms mainly focus on improving the performance of convergence in the objective space and ignore enhancing the diversity of solutions in the decision space [26]. There arise many MOPs in that multiple Pareto optimal solution sets correspond to the same point on the Pareto front, which are called multimodal multi-objective optimization problems (MMOPs) [10, 27].

As we can see from Fig. 1, point \(p_1\) and point \(p_3\) are far away in the decision space, while they share the same objective vector \(y_1\). For HRES, two different configurations may have the same objective values, like \(p_1\) and \(p_2\). It is necessary to obtain all optimal solutions such that the decision-makers (DMs) can better understand the problem. Moreover, it is easier to transfer to another solution if some constraints arise [28]. However, obtaining all Pareto solutions for MMOPs is difficult for traditional MOEAs due to the following reasons: (1) the use of the convergence-first strategy to select a new generation will cause premature population; (2) the crowding distance in the objective space will force the algorithm to remove solutions with similar objective values. As a result, simply adopting traditional MOEAs or SAEMO algorithms to solve HRES will face challenges in computation time and the lack of solution diversity.

Fig. 1
figure 1

Illustration of a multimodal optimization with single objective function

To better address HRES, in this study, we first propose a mathematical model for an HRES, which considers the annualized cost of the system, fuel emission and the power shortage rate. Since the optimization process contains a long-time-span simulation, it’s time-expensive. Thus, to accelerate the searching process, we proposed a novel surrogate-assisted multimodal multi-objective evolutionary algorithm, termed SaMMEA. Specifically, the proposed algorithm first uses the Gaussian process (GP) [29] to establish a surrogate model for the original objective functions. During the evolution, the surrogate model will be simultaneously updated. To enhance the diversity of the proposed algorithm, a novel environmental selection strategy is proposed and adopted.

The rest of this paper is structured as follow: Section “Preliminary work” gives the preliminary work of this paper, including the review of HRES and SAEMO; Sections “HRES model” and “Surrogate-assisted MOEA” describe the detail of the proposed HRES model and the proposed surrogate-assisted algorithm, respectively, followed by Section “Experiment”, which analyses the performance of the algorithm. Finally, a brief conclusion of this paper and future work are given in Section “Conclusion”.

Preliminary work

Hybrid renewable energy system

In the planning stage, the HRES seeks to minimize the installed capacity of components, finding a set of ideal configurations, including whether to install a component and how much capacity. To this end, numerous research has developed and improved the HRES model [6, 7]. Since the HRES is often formulated as mixed-integer programming, it may be resolved using standard techniques like branch-and-cut [4]. However, as microgrid components have been updated, the HRES model has progressively displayed nonlinear properties that cannot be easily addressed by conventional linear programming techniques [3]. It has been claimed that some straightforward non-linear problems may be handled using conventional linear programming techniques by transforming the non-linear constraints and objective functions into their linear forms. Such a transition, nevertheless, is constrained and loses the details of the initial issues [30].

The HRES problem, on the other hand, is a common multi-objective optimization problem [31]. The complete life cycle cost, power dependability, pollutant emission, and other factors need to be taken into account concurrently by DMs. Therefore, many studies tend to choose MOEAs to solve HRES. Mayer et al. [32] proposed a multi-objective design framework for household-scale systems considering solar photovoltaic, wind turbine, solar heat collector, heat pump, heat storage, battery, and heat insulation thickness, which is then solved by a multi-objective genetic algorithm. In addition, Wang et al. [33] proposed a scenario-dominance-based multi-objective evolutionary algorithm (denoted as s-NSGA-II) to simultaneously optimize multi-scenario for HRES design. In [34], a multi-objective particle swarm optimization algorithm and the Monte Carlo Simulation method are utilized to study the effect of the deterministic and stochastic behavior of electric vehicles on the number of components. To overcome the heavy computational budget, a surrogate model is introduced. Du et al. [35] proposed a surrogate modeling and analysis methodology to study dynamic hybrid energy systems. In addition, to replace the heavy computational burden that limits the applicability under massive scenarios, Jiang et al. [36] proposed a surrogate model-assisted quantified evaluation method for community integrated energy systems operation considering the variation in energy quality.

To sum up, the HRES model has been well-studied. However, most of the studies focus on using MOEAs to find a well-distributed PF, while the diversity of solutions in the decision space is less considered. In addition, few studies try to use the surrogated-assisted model to accelerate the searching process of the algorithm. Therefore, it’s reasonable and necessary to propose a surrogated-assisted multi-objective evolutionary algorithm considering the solution diversity to better address the HRES.

Surrogated-assisted method

In SAEMO, the uncertainty of model estimation will affect the search direction of the algorithm, thereby affecting performance. Thus, maintaining a model to accurately estimate the solutions is important. So far, many methods have been utilized successfully to construct SAEMO algorithms, e.g., polynomial regression, radial basis function [37], neural networks [38] and GP model [29]. Compared with other surrogate-assisted models, the GP model can provide not only individual estimation but also the uncertainty of the estimation [29]. Therefore, this paper chooses the GP model as the original objective function evaluation model. Through the help of the trained GP model, algorithms can explore different areas where the optimal solution set may exist, thereby helping to improve the search ability.

GP is a machine learning method based on statistical theory. Its properties are defined by the mean function \(\mu (\textbf{x})\) and the covariance function \(k(\textbf{x}_i,\textbf{x}_j)\), which can be expressed as follow:

$$\begin{aligned}{} & {} \mu (\textbf{x})=E[f(\textbf{x})] \end{aligned}$$
(2)
$$\begin{aligned}{} & {} k(\textbf{x}_i,\textbf{x}_j)=E[(f(\textbf{x}_i) - \mu (\textbf{x}_i))\cdot (f(\textbf{x}_j) - \mu (\textbf{x}_j))] \end{aligned}$$
(3)

where \(\textbf{x}_i\) means a decision vector.

Suppose a training data set \(\textbf{X}=[\textbf{x}_1,\textbf{x}_2,...,\textbf{x}_n]\) with its label \(\textbf{Y}=[\textbf{x}_1,\textbf{x}_2,...,\textbf{x}_n]\), then the GP model can be defined as:

$$\begin{aligned} \hat{f}(\textbf{x})=f(\textbf{x})+\epsilon \end{aligned}$$
(4)

where \(\hat{f}(\textbf{x})\) is the predicted value of the GP regression model on \(\textbf{x}\); \(\epsilon \) is a random variable that obeys a Gaussian distribution with a mean value of 0 and a variance of \(\delta ^2\). Then, we have

$$\begin{aligned} \hat{f}(\textbf{x}) \sim N(0,K+\delta ^2I) \end{aligned}$$
(5)

where K is a symmetric positive definite covariance matrix, and each element \(k_{ij}\) represents the correlation between \(\textbf{x}_i\) and \(\textbf{x}_j\). Then

$$\begin{aligned} \left[ \begin{array}{c} \textbf{Y} \\ \textbf{Y}^{*} \end{array}\right] =N\left( 0,\left[ \begin{array}{cc} K(\textbf{X}, \textbf{X})+\sigma ^{2} I &{} K\left( \textbf{X}, \textbf{X}^{*}\right) \\ K\left( \textbf{X}^{*}, \textbf{X}\right) &{} K\left( \textbf{X}^{*}, \textbf{X}^{*}\right) \end{array}\right] \right) \end{aligned}$$
(6)

where \(K(\textbf{X}, \textbf{X}^*)\) represents the covariance matrix between the test sample \(\textbf{X}^*\) and the training sample \(\textbf{X}\), \(K(\textbf{X}^*, \textbf{X}^*)\) is the covariance matrix of the test sample itself.

Subsequently, the maximum likelihood estimation method is used to find the optimal hyperparameters to finally determine the Gaussian process model. When the input \(\textbf{X}^{*}\) is given, it uses the input \(\textbf{X}\) in the training set and its observation target output value \(\textbf{X}\) to predict the posterior distribution of the prediction \(\hat{f}(\textbf{X}^{*})\) with the largest probability, which can be expressed:

$$\begin{aligned}{} & {} \hat{f}\left( \textbf{X}^{*} \mid \textbf{X}^{*}, \textbf{X}, \textbf{Y}\right) \sim N\left( \mu , \sum \right) \end{aligned}$$
(7)
$$\begin{aligned}{} & {} \mu =K\left( \textbf{X}^{*}, \textbf{X}\right) \left( K(\textbf{X}, \textbf{X})+\sigma ^{2} I\right) ^{-1} \textbf{Y} \end{aligned}$$
(8)
$$\begin{aligned}{} & {} \sum = K\left( \textbf{X}^{*}, \textbf{X}^{*}\right) - K\left( \textbf{X}^{*}, \textbf{X}\right) \left( K(\textbf{X}, \textbf{X})+\sigma ^{2} I\right) ^{-1} \nonumber \\{} & {} \qquad \qquad \times K\left( \textbf{X}, \textbf{X}^{*}\right) \end{aligned}$$
(9)

HRES model

As the most widely used renewable energy, solar and wind have become normal energy resources in HRES. In addition, for a normal HRES, energy storage systems, diesel generators and user loads are necessary. This part will mathematically model the main components, e.g., the photovoltaic power generation, wind turbine power generation, battery energy storage system and diesel generator, to provide a basis for the establishment of the HRES model.

Photovoltaic

Solar photovoltaic power generation technology is one of the most ideal ways to use solar energy. It is safe and environmentally friendly without complicated components. The output power of the actual photovoltaic system is greatly affected by weather and panel installation angle. In order to simplify the model in this part, we only consider the light intensity and ambient temperature [39]. The maximum output power formed by \(N_s\) photovoltaic panels can be expressed as:

$$\begin{aligned} P_{pv}= & {} N_s\cdot V_{OC}\cdot I_{SC}\cdot F_{loss} \end{aligned}$$
(10)
$$\begin{aligned} V_{OC}= & {} V_{STC}-K_V \cdot T_c \end{aligned}$$
(11)
$$\begin{aligned} I_{SC}= & {} (I_{STC}+K_I\cdot (T_C-25))\cdot \frac{S_p}{1000} \end{aligned}$$
(12)
$$\begin{aligned} T_C= & {} T_A+\frac{NCOT-20}{800}\cdot S_p \end{aligned}$$
(13)

where \(K_I\) and \(K_V\) are the temperature coefficients, \(I_{STC}\) and \(V_{STC}\) are the short circuit current and open circuit voltage under standard conditions respectively. \(S_p\) and \(F_{loss}\) are solar radiation and the packing factor provided by the manufacturer.

Wind turbine

Another important component of HRES is the wind power generation system. The power output of a wind turbine is usually non-linear. When the wind speed is less than the cut-in wind speed, the wind turbine is in a shutdown state; when the wind speed is greater than the cut-in wind speed, the output power is approximately equal to the wind speed; when the wind speed is greater than the rated wind speed but less than the cut-out wind speed, appropriate measures need to be taken to limit the output power of the wind turbine to prevent the wind power generation system from being overloaded and damaged; when the wind speed exceeds the cut-out wind speed, the wind turbine must be shut down to ensure the safety of the system [40].

According to the power output curve of the wind turbine, the following wind power generation model can be established:

$$\begin{aligned} P_{wt}(v)=\left\{ \begin{array}{l} 0, v<V_{c} \\ \frac{1}{2} C_{P} \rho A_{wg} v^{3}, \quad V_{c} \le v<V_{r} \\ P_{wg}^{max}, \quad V_{r} \le v<V_{f} \\ 0, v \ge V_{f} \end{array}\right. \end{aligned}$$
(14)

where v and \(C_p\) are wind speed and the performance coefficient of the wind turbine, which is provided by the manufacturer. \(\rho \) and \(A_{wg}\) are the air density and the area swept by the rotor, respectively. In addition, \(V_c\), \(V_r\) and \(V_f\) are cut-in, rated and cut-out wind speed, respectively, which are set to 4 m/s, 14 m/s and 20 m/s respectively in this study according to the producer.

Energy storage system

In HRESs, batteries are generally used for ESS. When the renewable energy generation is greater than the load demand, the battery pack is used to store excess energy, and when the weather conditions are so bad that the power generation cannot meet the load, the battery pack discharges to supply energy for the user load. Most battery models will consider the state of charge (SOC), which should be kept within the maximum and minimum range given by the manufacturer to ensure safety [3]. The relationship between SOC changes, renewable energy generation and load power demand can be expressed as:

$$\begin{aligned}{} & {} SOC(t+1)=SOC(t)+\frac{(P_{bat}(t)/V_{bus})\cdot \Delta t \cdot \eta _{bat}}{C_n} \nonumber \\ \end{aligned}$$
(15)
$$\begin{aligned}{} & {} SOC_{min}\le SOC(t) \le SOC_{max} \end{aligned}$$
(16)

where \(P_{bat}(t)\) is the battery input/output power (a positive value indicates charging mode, and a negative value means discharging mode), \(V_{bus}\) and \(\eta _{bat}\) are the DC bus voltage and the bidirectional charging and discharging efficiency. \(C_n\) (Ah) is the total rated capacity of the ESS.

Diesel generator

In an HRES, a diesel generator is generally used as a backup energy source. It will only work when the renewable energy generation is less than the load demand and the ESS cannot meet the lack of power. Although the introduction of diesel generators can further improve the reliability of the HRES, it will also increase the cost of the system. At the same time, the consumption of fossil fuels will increase the emission of harmful pollutants and greenhouse gases. The fuel consumption of a diesel generator depends on its own nature. In order to simplify the calculation, the fuel consumption \(F_{cons}\) of a diesel generator can be approximately assumed to be a linear function of its power output:

$$\begin{aligned} F_{cons}=\gamma _1\cdot P_{dgr}\cdot \Delta t+\gamma _2 \cdot P_{dg}\cdot \Delta t \end{aligned}$$
(17)

where \(P_{dgr}\) and \(P_{dg}\) are the rated power and actual power of the diesel generator, \(\gamma _1\) and \(\gamma _2\) are the coefficients of the fuel consumption curve.

Objectives and simulation process of HRES

The calculation of the objective values of HRES needs to simulate the running process. In this study, the time series simulation method is adopted in order to accurately calculate the loss of power supply (LPSP), the annual cost of the system (ACS) and fuel emission (\(F_{emission}\)). Specifically, the calculation of the above three objectives can be expressed as:

$$\begin{aligned} ACS \,=\, \sum _{i \in E} (C_{inv}^{i}*N_{i} + C_{om}^{i}*N_{i})+C_{rep}*N_{bat} \end{aligned}$$
(18)

where \(C_{inv}^{i}\) and \(C_{om}^{i}\) are the annual cost and operation maintenance cost of equipment i, \(C_{rep}\) is the replacing cost of a battery system. E is the set of equipment, including photovoltaic, wind turbine, ESS and diesel generator.

$$\begin{aligned} LPSP = \frac{\sum _{t=0}^{T} P_{available}(t)<P{load}(t)}{T} \end{aligned}$$
(19)

where \(P_{available}(t)\) and P load(t) are the available power and the needing power at time t. In addition, to keep the stable running of the system, we have \(LPSP<10\%\).

$$\begin{aligned} F_{emission} = \sum _{t=0}^{T} F_{cons}(t)*\eta _{emission} \end{aligned}$$
(20)

where \(F_{cons}(t)\) and \(\eta _{emission}\) are the fuel consumption at time t and the emission factor which depends on the quality of the diesel generator and fuel. Its value is generally in the 2.4-2.8 kg/lit.

Overall, the mathematical model of the HRES problem in this study can be expressed as follows:

$$\begin{aligned} \begin{aligned} \text {Min}~~F(\textbf{x}) =&\{ACS, LPSP, F_{emission} \} \\ \mathrm {s.t.}~~~~&LPSP<10\% \\&\textbf{x} = (N_{s}, N_{w}, N_{b}, N_{d}) \in \Omega \end{aligned} \end{aligned}$$
(21)

where \(N_{s}\), \(N_{w}\), \(N_{b}\) and \(N_{d}\) are the number of photovoltaic panels, wind turbines, energy storage systems and diesel generator respectively. Notably, other constraints are embedded into the simulation process of the HRES. Therefore, for a given solution \(\textbf{x} = (N_{s}, N_{w}, N_{b}, N_{d})\), it’s an infeasible solution if and only if its \(LPSP>10\%\).

Figure 2 shows the simulation flow chart used in the HRES optimization process. Specifically, P(t), \(P_L(t)\), \(P_{need}(t)\) and \(P_{DG}(t)\) are the power of renewable energy, user load, the rest required energy and diesel generator respectively in time slot t. SOC(t) presents the state of charge of the ESS. The data, including solar radiation, average wind speed and the relevant performance indicator values of the system components, is first given at the start of the simulation procedure. Then, in each simulation time step, the optimizer will compute the renewable energy power generation based on the input data and the mathematical model. The power output of renewable energy is then compared with the entire amount of load requirements at the current moment. If the renewable power and the load demand are equal, the following step is skipped; if the power supply is larger than the demand, the ESS will be charged; if the power generation is less than the demand, the ESS will be first considered to supply the energy. Once the ESS can not meet the requirement (SOC is low), the diesel generator will be turned on. If the diesel generator fails to meet the load requirements, the LPSP will be counted. The above process will be repeated until \(t=T\).

Fig. 2
figure 2

Simulation process of HRES

For the HRES in this study, the decision variables are the number of systems components, e.g., wind turbine, photovoltaic panel, energy storage system and diesel generator. As the power suppliers, wind turbines, photovoltaics, and diesel generator can both provide energy to meet the requirement. In addition, if there is no renewable energy supply (no solar energy and no wind), the diesel generator and the ESS can both provide power. That is, they are functionally interchangeable. For example, we can assume that the energy generation of 1 wind turbine is equal to 10 photovoltaic panels. As a result, if there is no other preference, different optimal solutions may have the same/similar objective values, which is the multimodal nature of HRES.

Surrogate-assisted MOEA

Motivation and framework

For now, introducing a model to replace the direct calculation of objectives is proved effective and efficient to solve time-expensive problems. However, most of the existing algorithms focus on continuous optimization problems. In addition, due to the lack of diversity-maintenance mechanisms, the diversity of solutions in the decision space is relatively weak, which will lead to quick premature and loss of optimal solutions. So far, many MMEAs have been proposed to enhance the diversity and obtained as many optimal solutions as possible [10, 28, 41, 42]. However, all of them are utilized to solve continuous optimization problems. For discrete problems like HRES, the performance has not been systematically studied. To address the above issues, we proposed a novel surrogate-assisted multimodal multi-objective evolutionary algorithm, termed SaMMEA. The framework of SaMMEA can be seen in Algorithm 1.

figure b

From Algorithm 1 we can see that the main procedures of SaMMEA are similar to most of the MOEAs. The main difference between SaMMEA and the normal MOEAs is that we introduce a surrogate-assisted model and adopt the multimodal-based method to select solutions. Specifically, we first adopt the Latin Hypercube method to sample n solutions and obtain the decision vectors \(\mathbf {X_{train}}\), see \(line\ 1\). Then, we need to calculate the objective values of these sampling solutions and get the objective vectors \(\mathbf {Y_{train}}\). The above-mentioned data is then used to train the GP model to help the searching process. During the iterations, the GP model \(\textbf{M}\) will be updated, see \(line\ 5\). Then, the objective values of the new offspring generated by variation will be calculated by the trained model \(\textbf{M}\) to accelerate the searching process. After that, a novel environmental selection method is adopted to enhance the diversity of solutions in the decision space as well as the convergence quality of the whole population. Finally, the non-dominated solutions will be the final solution set.

Model management

We introduce a surrogate-assisted model to partially replace the direct function evaluation in SaMMEA. Specifically, before the algorithm run, a set of solutions is first sampled from the whole decision space to roughly train the GP model. However, for the whole decision space, the trained model is inaccurate since the number of samples is small. Therefore, it’s necessary to update the model simultaneously. Since the evaluation of the objective function is time-expensive, it’s unreasonable to add all of the solutions to update the model. Thus, we proposed a novel model management strategy to select solutions. Specifically, the selecting process can be seen in Algorithm 2.

figure c

As we can see from Algorithm 2, we first find the non-dominated solutions from the current population. Then, if the number of non-dominated solutions is larger than the given number n, a second-selection method is performed to select solutions with the largest crowding distance. It’s worth mentioning that, to keep the population diversity and approximate the true objective function, we proposed to choose solutions with minimum crowding distance in the decision space, which is calculated by the following equation.

$$\begin{aligned} CrowdDis_i = \frac{N-1}{\sum _{j=1}^{N}1/ \left\| \textbf{x}_j-\textbf{x}_i \right\| } \end{aligned}$$
(22)

where \(\left\| \textbf{x}_j-\textbf{x}_i \right\| \) indicates calculating the Euclidean distance of solutions \(\textbf{x}_i\) and \(\textbf{x}_j\). Note that \(\textbf{x}_i\) and \(\textbf{x}_j\) have been previously normalized.

Fig. 3
figure 3

The updating process of the GP model during evolution

The updating process of the model can also be seen from Fig. 3. As we can see, from left to the right, the three sub-figures present the trained model (red dashed line) and the true objective function (black solid line) in the early, middle, and final stages, respectively. As the evolution goes, the trained model approximates the true objective function more accurately. As we can see from \(line\ 1\) in Algorithm 2, only the non-dominated solutions will be chosen to update the GP model. In addition, the special crowding distance is utilized to select solutions with good diversity. As a result, the areas with better convergence quality are more likely to be sampled to ensure the exploration and exploitation for final optimal solutions.

Solution generation

Since the HRES in this study is a discrete problem, we used a posterior-fix method to initialize the population. Specifically, solutions are uniformly generated according to the lower and upper bounds of each decision variable, which are rounded afterwards. In addition, the crossover and mutation operation is important for improving the searching efficiency. Based on the discrete encoding, a posterior-fix method is adopted. Specifically, the simulated binary crossover (SBX) and polynomial mutation (PM) operators are employed to generate offspring. After that, all decision variables are rounded to integer numbers.

Environmental selection for discrete optimization problems

There are a variety of solutions with the same objective values. As a result, different solutions with equal objective values may be discarded if we simply use the convergence-first technique and crowding distance in the objective space to choose solutions from the joint population. Traditional MOEAs, on the other hand, perform poorly while solving MMOPs. The reasons may be summarized as follows: (1) the lack of a diversity-maintenance mechanism would lead to early convergence; (2) multiple solutions with the same objective values are difficult to coexist at the same time.

Table 1 Parameter setting for HRES problem

To solve this problem, we start by removing duplicate solutions (\(line\ 1\) in Algorithm 3). By doing so, we can ensure that the future generation inherits the finest solutions discovered so far while still preserving variety. After deleting duplicate solutions, we use the non-dominated Pareto sorting approach to sort the population (see \(lines\ 2--4\)). If the number of solutions in the joint population exceeds the population size N, a second-selection approach is used to preserve both the population size and the population distribution. The second-selection approach is broken into three parts: (1) compute the crowding distance of all solutions using Eq. (22); (2) eliminate the solution with the maximum crowding distance; and (3) repeat steps (1) and (2) until the population size reaches N.

figure d

To summarize, we first delete duplicate solutions to retain the variety of solutions in the decision space, as illustrated in Algorithm 3, line 1. The population is then sorted using the non-dominated sorting approach, lines 2–4, to increase the algorithm’s convergence capabilities. Finally, the second-selection approach is used to preserve the population size and solution distribution.

Experiment

Experimental setting

Compared algorithms

To verify the effectiveness of SaMMEA in solving HRES, NSGA-II/SDR [43], K-RVEA [44], MOEA/D-EGO [22], MMOEA/DC [42] and MMEA-WI [10] are chosen as competitor algorithms. Specifically, MMOEA/DC and MMEA-WI are the latest representative MMEAs. NSGA-II/SDR is selected as the representative normal MOEA, while K-RVEA and MOEA/D-EGO are chosen as the recent work that focuses on expensive optimization problems.

For all algorithms, we set the population size \(N=100\). For fairness, the maximum number of function evaluations \(N_E\) is set to 5000. However, since several SAEMOs are selected as the competitor algorithms, it’s hard to set the number of true function evaluations and the model prediction. Thus, we uniformly set the number of true function evaluations and the model prediction to 1000 and 4000 respectively. Note that the specific parameters in each algorithm are set according to the original papers.

Fig. 4
figure 4

The wind speed, solar radiation, temperature and user load data of \(Case\ 1\) used in this study

Notably, since some algorithms are not designed for problems with constraints, we uniformly adopt the penalty function to deal with infeasible solutions. Specifically, if one solution is unfeasible, then the objective values will plus a large number. All experiments are implemented on a PC configured with an Intel i9-9900X @ 3.50 GHz and 64 G RAM.

Parameter setting for HRES

To comprehensively study the performance of the proposed SaMMEA on solving HRES, three test instances are utilized as the benchmark problems. To be specific, the parameter setting of HRES can be seen in Table 1, which is provided by equipment manufacturers and set according to the local market. In addition, the wind speed, temperature, solar radiation, and user load of \(Case\ 1\) are presented in Fig. 4. It’s worth mentioning that, the time span of the whole data is a year. For better presentation, we only show the first 7 days data.

Performance metric

A number of metrics have been proposed to examine the performance of MOEAs, for example, HyperVolume, GD, and IGD indicators [45]. Most of them measure the performance of solutions in the objective space. However, since we focus on solution diversity in the decision space, we choose IGD and IGDX as the performance metrics, which are widely used indicators. For a solution set, \(\textbf{X}\), the IGD and IGDX are calculated as follows:

$$\begin{aligned} IGD(\textbf{X})= & {} \frac{1}{|\mathbf {X^*} |} \sum _{\textbf{y} \in \mathbf {X^*}} \min _{\textbf{x} \in \textbf{X}} \{ED(F(\textbf{x}),F(\textbf{y})) \}, \end{aligned}$$
(23)
$$\begin{aligned} IGDX(\textbf{X})= & {} \frac{1}{|\mathbf {X^*} |} \sum _{\textbf{y} \in \mathbf {X^*}} \min _{\textbf{x} \in \textbf{X}} \{ED(\textbf{x},\textbf{y}) \}. \end{aligned}$$
(24)

where \(ED(\textbf{x},\textbf{y})\) is the Euclidean distance between \(\textbf{x}\) and \(\textbf{y}\). \(\textbf{X}\) and \(\mathbf {X^*}\) denote the obtained solution set and a set of a finite number of Pareto optimal solutions uniformly sampled from the true PS, respectively.

Notably, a small IGD value means that the solution set \(\textbf{X}\) has both reasonable convergence and diversity in the objective space, which indicates that the algorithm is an effective MOEA. Similar to IGD, the IGDX calculates the convergence and spread of \(\textbf{X}\) in the decision space. A small IGDX value means the solution set \(\textbf{X}\) is a good approximation to the true Pareto solution set \(\mathbf {X^*}\) in the decision space, which indicates the algorithm is an effective MMEA. In [46], the authors mentioned that a solution set with a satisfactory IGDX usually has an acceptable IGD, while a reasonable IGD does not naturally produce a reasonable IGDX.

Since the HRES problem in this study is a real-world engineering problem, the true Pareto optimal solutions \(\mathbf {X^*}\) can not be analytically obtained. Therefore, an exhaustive way is utilized. To be specific, solutions in the entire decision space are enumerated and evaluated individually, which is really time-consuming. After that, a quick non-dominated sorting method is applied to find the true Pareto front.

Comparison with other algorithms

Table 2 The average IGD, IGDX and runtime for all algorithms over 31 independent runs, where the best results are highlighted with bold font, where the second line for each indicator means the variance

For all algorithms, we run the experiments 31 times independently. The average IGD, IGDX and runtime results of these algorithms are listed in Table 2. As we can see, in terms of IGDX, SaMMEA obtained the best results for all three cases, while MMEA-WI performs a bit worse. In terms of IGD, SaMMEA got the best result on \(Case\ 1\), MMEA-WI performed great on \(Case\ 2\) and NSGA-II/SDR received the best performance on \(Case\ 3\). MMOEA/DC and MMEA-WI are two representative MMEAs, they show worse but approximate results compared to SaMMEA. To be specific, SaMMEA is proposed to solve discrete MMOPs, where a special environmental selection method is proposed to keep the diversity of solutions in the discrete decision space. Therefore, it is more suitable for discrete optimization problems than other multimodal techniques in MMOEA/DC and MMEA-WI.

As for NSGA-II/SDR, its performance in obtaining well-distributed solutions in the objective space is great. However, the diversity in the decision space is relatively weak. Notably, MOEA/D-EGO shows poorer performance than K-RVEA and NSGA-II/SDR since we adopt the penalty function to deal with the constraints, which will produce wrong information to lead the algorithm based on decomposition. On the other hand, as we can see from Table 2 in terms of runtime, SaMMEA, K-RVEA and MOEA/D-EGO are time-saving compared to algorithms without the surrogated-assisted strategy. Specifically, these three algorithms are proposed for expensive optimization problems. Instead of directly evaluating solutions by the true objective function, the surrogate-assisted model is introduced to decrease the runtime, which is efficient in dealing with time-expensive problems.

Fig. 5
figure 5

The true PF and the obtained solutions by all compared algorithms

To further analyze the performance of SaMMEA, the solutions closest to the average IGD results obtained by all algorithms on \(Case\ 1\) are selected to present the Pareto front, which is shown in Fig. 5. As we can see, the distributions of solutions obtained by SaMMEA, MMEA-WI and MMOEA/DC are more evenly distributed than other algorithms. In addition, we can find that, in the final obtained solutions by SaMMEA and MMEA-WI, there are many different solutions that have similar objective values. For example, for solutions \(\textbf{x}_1=[17,11,22,0]\) and \(\textbf{x}_2=[2,18,30,1]\), the objective vectors of them are \(\textbf{y}_1=[0,8.69,11829.72,]\) and \(\textbf{y}_2=[0.23,9.45,11493.05]\) respectively. DMs need to know all solutions for better decision-making. However, such solutions will not be maintained in other normal MOEAs.

Fig. 6
figure 6

The true PF, obtained PF and knee points for HRES obtained by SaMMEA on \(Case\ 1\)

Fig. 7
figure 7

Power output of equipment in HRES on four days in different seasons

To sum up, the introduction of a surrogate-assisted model in MOEAs can greatly decrease the runtime of algorithms when solving time-expensive optimization problems. However, such replacement will also bring performance degradation somehow. Therefore, it’s important to figure out a suitable model which can accurately fit the objective functions with small samples. Embedding a multimodal strategy into SAEMO to enhance the diversity quality of solutions in the decision space can help to obtain more equivalent optimal solutions, which is important for DMs. For HRES, the proposed SaMMEA is competitive in terms of convergence and diversity quality compared to K-RVEA and MOEA/D-EGO, and shows better computational complexity compared to other algorithms without the surrogated-assisted model.

Result obtained by SaMMEA

As we mentioned in Section “Comparison with other algorithms”, SaMMEA shows competitive performance in solving HRES. In this section, we will further analyze the result obtained by SaMMEA. Since the final solution set of MOPs contains more than one solution, it’s important for DMs to choose a certain solution to implement, which belongs to multi-criterion decision-making (MCDM). Generally speaking, knee point [47, 48] can be seen as the preferred solution if there is no other DMs’ preference. Specifically, for a two-objective optimization problem, a knee point in the PF refers to the solution with the maximum marginal rates of return, that is, the point at which a small improvement in one objective will lead to severe degradation of at least one other objective. In this study, we use a posterior method to locate the global knee point, namely, the distance-based method. Due to the paper length limits, the detailed approach is not described. For more detailed information about the knee point, please refer to [48]. After the algorithm run on \(Case\ 1\), we select the global knee to present, which can be seen from Fig. 6.

As we can see from Fig. 6, the knee locates in the “pit” area of the whole PF. In this area, a small change in one objective will cause a huge move in other objectives. The objective vector of the knee point is [862.38kg, 1.13\(\%\), 14909.13\(\$\)]. In addition, the numbers of wind turbines, photovoltaic, ESSs and diesel generators are 30, 9, 26 and 2 respectively. Intuitively, the knee point solution is a good trade-off between all objectives. According to [48], the value of the performance indicator HyperVolume for knee point is better than other solutions in the PF. In other words, the knee point is closer to the origin of coordinates than other solutions.

The power output of photovoltaic, wind turbine, diesel generators and the running status of the energy storage system can be seen in Fig. 7. Notably, we choose four representative days in four different seasons to show the running status of the system. As we can see from Fig. 7, the user load in summer is low while in winter the load is high. As a result, the diesel generator will not work during the summer day. For days in spring and summer, the diesel generator will work sometimes. In addition, when the output of renewable energy is high, the ESS will choose to charge while discharging when the energy supply is insufficient. To sum up, the effectiveness of the proposed HRES mathematical model is verified. In addition, the proposed SaMMEA can well solve HRES.

Conclusion

Hybrid renewable energy system (HRES) is a powerful tool to utilize unstable renewable energy. However, the optimization of the HRES configuration needs a long-time simulation, which is generally time-consuming. As a result, using a traditional MOEA to find the optimal solutions is hindered. In addition, since the multi-modality nature of HRES, it’s important to find as many optimal equivalent solutions as possible to provide more information for DMs. In this paper, we proposed to use the surrogated-assisted MOEA to accelerate the searching process. Specifically, we used the GP method to fit objective functions before the optimization. Then, during the evolution, these GP models will be updated and maintained. As a result, the run time can be greatly decreased. In addition, a multimodal-based environmental selection is utilized to enhance the diversity of solutions in the decision space, which also helps to obtain different solutions.

It’s common in real-world engineering problems that a single evaluation of an objective function is time-expensive. So far, introducing the surrogate model to replace the original objective function is one of the most effective ways. However, most of the proposed surrogated-assisted MOEAs focus on continuous optimization problems. In addition, the number of samplings can greatly affect the performance of the algorithm. Thus, for discrete problems or discrete-continuous-mixed problems, such methods perform poorly. Another problem is that, for some problems, the calculation time for different objective functions is different. Directly applying the surrogated-assisted MOEAs to this kind of problem will cause computation resource waste. This is also one of our future research directions.