MOEA/D
The algorithm MOEA/D was proposed by Zhang and Li [27] in 2007 and has since become a popular and effective approach for solving MOPs. Instead of using a nondominated sorting strategy to handle multiple objectives, MOEA/D decomposes a multiobjective optimization problem into a number of single-objective optimization problems through an aggregation function. In this paper, we adopt the Tchebycheff approach to decompose MOPs. Let \({{\varvec{\lambda}}}^{1},\dots ,{{\varvec{\lambda}}}^{N}\) be a set of evenly spread weight vectors. A multiobjective optimization problem at time t can then be decomposed into N scalar optimization problems, and the i-th subproblem (\(i=1,\cdots ,N\)) at time t is given by:
$$\text{minimize}\quad {g}^{t}\left({\varvec{x}}\left|{{\varvec{\lambda}}}^{i},{{\varvec{z}}}^{*}\right.\right)=\underset{1\le j\le m}{\max}\left\{{\lambda }_{j}^{i}\left|{f}_{j}\left({\varvec{x}},t\right)-{z}_{j}^{*}\right|\right\},$$
(5)
$$\text{subject to}\quad {\varvec{x}}\in \Omega ,$$
where \({{\varvec{z}}}^{*}={({z}_{1}^{*},\cdots ,{z}_{m}^{*})}^{T}\) is the reference vector, i.e., \({z}_{j}^{*}=\min\{{f}_{j}\left({\varvec{x}},t\right) \mid {\varvec{x}}\in\Omega \}\) for each \(j=1,\dots ,m\) (for a minimization problem). MOEA/D optimizes these N subproblems simultaneously in a single run.
In MOEA/D, the neighborhood of a weight vector \({{\varvec{\lambda}}}^{i}\) is defined as the set of its several closest weight vectors in \(\left\{{{\varvec{\lambda}}}^{1},\cdots ,{{\varvec{\lambda}}}^{N}\right\}\). The neighborhood of the i-th subproblem consists of all the subproblems whose weight vectors belong to the neighborhood of \({{\varvec{\lambda}}}^{i}\). A population of N solutions is randomly generated, and each solution is randomly allocated to a particular subproblem.
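To make the decomposition concrete, the following minimal Python sketch illustrates the Tchebycheff aggregation of Eq. (5) and the neighborhood construction described above; the function names (tchebycheff, build_neighborhoods) and the neighborhood size T are our own illustrative choices, not part of the original MOEA/D code.

```python
import numpy as np

def tchebycheff(f_x, weight, z_star):
    """Tchebycheff aggregation g^t(x | lambda^i, z*) of Eq. (5).

    f_x:    objective vector f(x, t) of length m
    weight: weight vector lambda^i of length m
    z_star: reference vector z* (componentwise minima of the objectives)
    """
    return np.max(np.asarray(weight) * np.abs(np.asarray(f_x) - np.asarray(z_star)))

def build_neighborhoods(weights, T):
    """Return, for each weight vector, the indices of its T closest weight vectors."""
    weights = np.asarray(weights)
    dist = np.linalg.norm(weights[:, None, :] - weights[None, :, :], axis=2)
    return np.argsort(dist, axis=1)[:, :T]
```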
Multi-layer perceptron
Artificial neural networks (ANNs) are well-known data-driven machine learning models that simulate the neural systems of the human brain [43]. These networks can approximate linear and nonlinear functions by fitting datasets and can generalize to unseen situations; they have been widely used for solving classification and regression problems. The multi-layer perceptron (MLP) is the most widely used class of ANNs and consists of an input layer followed by one or more hidden layers and an output layer [44]. Each layer contains a number of nodes, which represent neurons (processing units) with nonlinear activation functions. Nodes in adjacent layers are connected by weights [45]. Figure 1 illustrates an MLP network with a single hidden layer. The output value of each node in a hidden layer is calculated as follows:
$$\forall l\in \left\{1,2,\dots ,j\right\},\quad {h}_{l}=\varphi \left(\sum \limits_{i=1}^{m}{{\varvec{W}}}_{i}^{l}{{\varvec{I}}}_{i}+{\beta }_{l}\right),$$
(6)
where \({{\varvec{I}}}_{i}\) is the i-th input vector, \({{\varvec{W}}}_{i}^{l}\) denotes the connection weights between \({{\varvec{I}}}_{i}\) and node l, m is the number of input vectors, j is the number of nodes in the layer, \({\beta }_{l}\) is the bias of the l-th node, and \(\varphi \) is an activation function, e.g., the standard logistic sigmoid function:
$$\varphi \left({\varvec{x}}\right)=\frac{1}{1+{e}^{-{\varvec{x}}}}$$
(7)
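As a minimal sketch of Eqs. (6) and (7), the forward pass of a single hidden layer can be written as follows; the array shapes and the names sigmoid and hidden_layer_output are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid activation of Eq. (7)."""
    return 1.0 / (1.0 + np.exp(-x))

def hidden_layer_output(inputs, W, beta):
    """Compute the hidden-node outputs h_l of Eq. (6).

    inputs: input values, length m
    W:      weight matrix of shape (j, m); row l holds the weights from the inputs to node l
    beta:   bias vector of length j
    """
    return sigmoid(np.asarray(W) @ np.asarray(inputs) + np.asarray(beta))
```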
The use of an MLP can be divided into two phases: training and inference. In the training phase, a set of training samples is used to determine the weights and bias values of the MLP. In the inference phase, the trained MLP outputs a result according to the input value. The MLP is trained by modifying the weights and biases over successive iterations such that the error is minimized. The objective of training is to minimize the mean squared error (MSE) of the network [46]:
$${L}_{MSE}=\frac{1}{M}\sum_{i=1}^{M}{({{\varvec{y}}}_{i}^{*}-{{\varvec{y}}}_{i})}^{2},$$
(8)
where \({y}_{i}^{*}\) and \({y}_{i}\) are the target output and predicted output, respectively, of the i-th training iteration, and M is the total number of training iterations.
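As a hedged sketch of this training process, an off-the-shelf MLP regressor that minimizes a squared-error loss can be used; the toy data and the hyperparameters below (one hidden layer of 10 logistic nodes, the Adam solver) are illustrative assumptions rather than the settings used in this paper.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy training data: inputs of dimension 4 with scalar targets.
rng = np.random.default_rng(0)
X_train = rng.random((20, 4))
y_train = X_train.sum(axis=1)  # arbitrary targets, for illustration only

# One hidden layer with logistic (sigmoid) activation; MLPRegressor is trained
# by minimizing a squared-error loss, in the spirit of the MSE objective in Eq. (8).
mlp = MLPRegressor(hidden_layer_sizes=(10,), activation="logistic",
                   solver="adam", max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)
print(mlp.predict(X_train[:3]))
```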
An MLP-based predictor
We build an MLP-based predictor to assist MOEA/D in searching for the new Pareto-optimal solutions in a new environment. The predictor learns from the historical optimal solutions and predicts a new set of solutions, which serves as the initial population for MOEA/D. The historical optimal solutions obtained in the previous environments are denoted as \({\{{\varvec{P}}}_{1},{{\varvec{P}}}_{2},\cdots ,{{\varvec{P}}}_{t}\}\); they are used to estimate a new set of solutions \({{\varvec{P}}}_{t+1}\) for the next environment.
Suppose there are N solutions in each set \({{\varvec{P}}}_{i} (i=1,\cdots ,t)\), where each solution is the optimal result obtained for the corresponding subproblem in an environment. In MOEA/D, each solution is associated with its subproblem, which is determined by the weight vector. The weight vectors are generated in the initialization stage and remain unchanged afterwards. Therefore, the solutions obtained in different environments that are associated with the same weight vector naturally form a time series. Consequently, the historical optimal solution sets \({\{{\varvec{P}}}_{1},{{\varvec{P}}}_{2},\cdots ,{{\varvec{P}}}_{t}\}\) yield N time series, i.e., \({{({\varvec{x}}}_{i}^{1},{{\varvec{x}}}_{i}^{2},\cdots ,{{\varvec{x}}}_{i}^{t})}^{T},{{\varvec{x}}}_{i}^{1}\in {{\varvec{P}}}_{1},{{\varvec{x}}}_{i}^{2}\in {{\varvec{P}}}_{2},\cdots ,{{\varvec{x}}}_{i}^{t}\in {{\varvec{P}}}_{t}, i=1,\cdots ,N\). We assume that each solution has d variables that are independent of each other, i.e., \({{\varvec{x}}}_{i}^{k}=\left({x}_{i,1}^{k},{x}_{i,2}^{k},\cdots ,{x}_{i,d}^{k}\right), k=1,2,\cdots ,t, i=1,\cdots ,N\). Thus, each time series \({{({\varvec{x}}}_{i}^{1},{{\varvec{x}}}_{i}^{2},\cdots ,{{\varvec{x}}}_{i}^{t})}^{T}\) can be further divided into d series of variables, \({(x}_{i,j}^{1},{x}_{i,j}^{2},\cdots ,{x}_{i,j}^{t}), i=1,2,\cdots ,N, j=1,2,\cdots ,d\), which means we need to build N × d individual prediction models to estimate a new set of solutions. Since different variables exhibit different correlations over time, each model is trained separately.
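The reorganization of the historical populations into N × d variable-wise time series can be sketched as follows, assuming each \({{\varvec{P}}}_{k}\) is stored as an array of shape (N, d); the function name variable_series is ours for illustration.

```python
import numpy as np

def variable_series(populations):
    """Rearrange the historical populations {P_1, ..., P_t} into N*d variable-wise series.

    populations: list of t arrays, each of shape (N, d); row i of P_k is the optimal
                 solution of the i-th subproblem obtained in the k-th environment.
    Returns an array of shape (N, d, t) whose [i, j, :] entry is the series
    (x_{i,j}^1, ..., x_{i,j}^t) used to train the (i, j)-th prediction model.
    """
    stacked = np.stack(populations, axis=0)   # shape (t, N, d)
    return np.transpose(stacked, (1, 2, 0))   # shape (N, d, t)
```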
For each sequence of the j-th variable, \({(x}_{i,j}^{1},{x}_{i,j}^{2},\cdots ,{x}_{i,j}^{t}), i=1,2,\cdots ,N, j=1,2,\cdots ,d\), we assume there exists a hidden function that describes the correlation within the sequence. We further assume that each value is strongly correlated with the s preceding ones, so that:
$${x}_{i,j}^{t+1}={f}_{MLP}\left({x}_{i,j}^{t-s+1},\cdots ,{x}_{i,j}^{t}\right),i=1,\cdots ,N,j=1,\cdots ,d,$$
(9)
where \({f}_{MLP}\) denotes the MLP-based predictor.
For each such sequence, \({(x}_{i,j}^{1},{x}_{i,j}^{2},\cdots ,{x}_{i,j}^{t}), i=1,2,\cdots ,N, j=1,2,\cdots ,d\), we can obtain (t − s) training samples of the form \(\left\{\left(\left({x}_{i,j}^{1},\cdots ,{x}_{i,j}^{s}\right),{x}_{i,j}^{s+1}\right),\cdots ,\left(\left({x}_{i,j}^{t-s},\cdots ,{x}_{i,j}^{t-1}\right),{x}_{i,j}^{t}\right)\right\}\) by sliding a time window forward; these samples are used to train the predictor \({f}_{MLP}\). For example, suppose t = 8 and s = 4. The sequence is \({(x}_{i,j}^{1},{x}_{i,j}^{2},\cdots ,{x}_{i,j}^{8})\), and it can be split into \(\left\{\left(\left({x}_{i,j}^{1},\cdots,{x}_{i,j}^{4}\right),{x}_{i,j}^{5}\right),\left(\left({x}_{i,j}^{2},\cdots,{x}_{i,j}^{5}\right),{x}_{i,j}^{6}\right)\right.\), \(\left.\left(\left({x}_{i,j}^{3},\cdots,{x}_{i,j}^{6}\right),{x}_{i,j}^{7}\right),\left(\left({x}_{i,j}^{4},\cdots,{x}_{i,j}^{7}\right),{x}_{i,j}^{8}\right)\right\}\).
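A minimal sketch of this sliding-window construction is given below; the function name sliding_window_samples is an illustrative choice, and for the example above (t = 8, s = 4) it produces exactly the four (input, target) pairs listed.

```python
def sliding_window_samples(series, s):
    """Split a series (x^1, ..., x^t) into (t - s) samples: each input consists of
    s consecutive values and the target is the value that follows them."""
    return [(tuple(series[k:k + s]), series[k + s]) for k in range(len(series) - s)]

# With t = 8 and s = 4 this yields four samples, the first being ((x1, x2, x3, x4), x5).
print(sliding_window_samples([1, 2, 3, 4, 5, 6, 7, 8], s=4))
```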
The pseudo-code for training the MLP-based predictor is given in Algorithm 1.
The framework of MOEA/D-MLP
The framework of our dynamic multiobjective optimization algorithm, MOEA/D-MLP, consists of three parts: an environmental change detection mechanism, an MLP-based predictor, and a multiobjective optimization algorithm for the static environment. The pseudo-code of MOEA/D-MLP is shown in Algorithm 2, in which the static multiobjective optimization algorithm is based on MOEA/D-DE. In Line 6, we randomly choose 10% of the individuals as sensors to evaluate whether the environment has changed: if the average objective values of these sensors change between iterations, an environmental change is assumed to have occurred. In Line 10, the MLP-based predictor starts to work from the (s + 2)-th time step, which ensures that at least one sample is available for training the MLP network.
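A minimal sketch of the sensor-based change detection in Line 6 is given below, under the assumption that re-evaluating a randomly chosen 10% of the population and comparing average objective values is sufficient to signal a change; the function name, the tolerance eps, and the interfaces are illustrative, not the paper's implementation.

```python
import numpy as np

def environment_changed(population, prev_objectives, evaluate, rng, ratio=0.1, eps=1e-8):
    """Re-evaluate a random fraction of the population (the sensors) and compare
    their average objective values with the previously recorded ones.

    population:      array of shape (N, d), the current solutions
    prev_objectives: array of shape (N, m), their objective values from the last generation
    evaluate:        callable returning the objective vector of a solution at the current time
    """
    n_sensors = max(1, int(ratio * len(population)))
    idx = rng.choice(len(population), size=n_sensors, replace=False)
    new_obj = np.array([evaluate(population[i]) for i in idx])
    # A change in the sensors' average objective values signals an environmental change.
    return not np.allclose(new_obj.mean(axis=0), prev_objectives[idx].mean(axis=0), atol=eps)
```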