In this section, the main framework of the proposed ADSAPSO is introduced, followed by the details of the proposed model management, including the adaptive dropout mechanism and a new infill criterion.
Framework of ADSAPSO
The framework of the proposed ADSAPSO is presented in Algorithm 1, which mainly consists of the adaptive dropout mechanism and the infill criterion. First, N solutions are sampled using the Latin hypercube sampling (LHS) strategy [30] and evaluated using the expensive functions. These evaluated candidate solutions are then stored in an initially empty archive Arc. Next, environmental selection selects \(N_\alpha \) solutions from archive Arc to form the elite archive arc. Afterwards, the adaptive dropout mechanism and the infill criterion are applied. The adaptive dropout mechanism selects d decision variables, and the infill criterion chooses the new samples to be evaluated by the expensive functions. Notably, we first select one well-performing solution set and one poorly-performing solution set, which are statistically different in the decision space. Accordingly, the d-dimensional variable set S is selected from the full-length decision variables, where the chosen variables significantly affect convergence. As for the infill criterion, an RBF-based search is used to optimize the selected d-dimensional decision variables and obtain a set \(X_d\) of k optimized d-dimensional decision vectors. A replacement operator then selects k full-length solutions, whose corresponding decision variables are replaced by \(X_d\), to form the individual set \(X_{{\text {new}}}\). \(X_{{\text {new}}}\) is evaluated using the expensive functions to update the archive Arc. Finally, the obtained Arc is output as the final solution set. Note that ADSAPSO uses the same environmental selection operator as NSGA-II [7]; hence, its details are not repeated here.
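To make the overall control flow concrete, the following is a minimal Python sketch of the main loop described above. The helper names (lhs_sample, evaluate_expensive, environmental_selection, adaptive_dropout, infill_criterion) are illustrative placeholders for the corresponding steps in Algorithm 1, not the authors' implementation.

```python
import numpy as np

def adsapso(expensive_funcs, D, N, N_alpha, max_fes):
    """Illustrative main loop of ADSAPSO (helper names and signatures are assumptions)."""
    X = lhs_sample(N, D)                        # Latin hypercube sampling of N initial solutions
    F = evaluate_expensive(expensive_funcs, X)  # expensive evaluations
    Arc = (X, F)                                # archive of all evaluated solutions
    fes = N
    while fes < max_fes:
        arc = environmental_selection(Arc, N_alpha)  # elite archive (NSGA-II-style selection)
        S = adaptive_dropout(Arc, arc)               # indices of the d selected dimensions
        X_new = infill_criterion(arc, S)             # k full-length solutions to re-evaluate
        F_new = evaluate_expensive(expensive_funcs, X_new)
        Arc = (np.vstack([Arc[0], X_new]), np.vstack([Arc[1], F_new]))
        fes += len(X_new)
    return Arc
```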
Adaptive dropout mechanism
The adaptive dropout mechanism is a crucial component of the proposed ADSAPSO (line 8 in Algorithm 1), which reduces the decision space from D dimensions to d by discarding less influential variables. Among the D decision variables, some have a more significant effect on convergence enhancement and are therefore important for expensive multiobjective optimization. Thus, the adaptive dropout mechanism adopts a statistical model-assisted method to select, from the original full-length decision variables, the d decision variables that significantly affect convergence. An example of the adaptive dropout mechanism is shown in Fig. 2, where the dashed circles indicate the discarded decision variables.
Unlike the principal component analysis technique [31], which maps data from a high-dimensional space to a low-dimensional one, the proposed adaptive dropout mechanism retains the original information of the selected d dimensions. Specifically, it learns from the statistical differences between different solution sets in the decision space.
Generally, the proposed adaptive dropout mechanism consists of three steps: (1) Solutions selection; (2) Statistical models construction; (3) Dimension selection.
\(\textit{(1) Solutions selection:}\) We assume that a set of well-performing solutions and a set of poorly-performing ones differ significantly in some dimensions of the decision space. The distribution differences between these two solution sets can reflect the importance of certain decision variables for convergence enhancement. Thus, a suitable solution selection should be used to capture these differences. If the well-performing and poorly-performing solutions were both selected from archive Arc, the poorly-performing solutions would remain the same during the update process; in this case, their rankings in the whole Arc would be almost unchanged. In other words, the distribution of the poorly-performing solution set would not help in the later stages of evolution. To remedy this issue, we select both the well-performing and the poorly-performing solutions from archive arc, whose members are elite solutions selected from Arc. Afterwards, according to the non-dominated sorting, the first \(N_s\) solutions in arc are regarded as well-performing solutions, and the last \(N_s\) solutions in arc are deemed poorly-performing ones. Consequently, the indistinguishable situation can be avoided.
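As a minimal sketch of this step, the snippet below picks the two solution sets from arc after ranking its members by non-dominated sorting. The helper fast_non_dominated_sort (returning index arrays of the fronts, best first) is an assumed placeholder.

```python
import numpy as np

def select_good_bad(arc_X, arc_F, N_s):
    """Select N_s well-performing and N_s poorly-performing solutions from arc (sketch)."""
    fronts = fast_non_dominated_sort(arc_F)  # assumed helper: list of index arrays, best front first
    order = np.concatenate(fronts)           # solutions ranked from best to worst
    good = arc_X[order[:N_s]]                # first N_s solutions: the "Good" set
    bad = arc_X[order[-N_s:]]                # last N_s solutions: the "Bad" set
    return good, bad
```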
\(\textit{(2) Statistical models construction:}\) After solution selection, two statistical models are constructed using the selected well-performing solutions and the selected poorly-performing solutions, respectively. To discover the effect of different variables, we analyze the two sets of solutions in the decision space via the two constructed statistical models. To be more specific, we compute the mean of each dimension of each solution set as its statistical value. Since the two selected sets of solutions differ significantly in the objective space, the influence of different variables on the objective values can be reflected by the statistical differences in the decision space. Thus, dimensions with significant differences in the decision space will be emphasized for better convergence enhancement in the subsequent optimization.
\(\textit{(3) Dimension selection:}\) After constructing the two statistical models, a difference model is obtained by calculating the per-dimension difference between the two models. Then, the \(\beta \cdot D\) dimensions with the largest absolute differences are selected to form set S.
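The following sketch illustrates steps (2) and (3) together: each statistical model is simply the per-dimension mean of one solution set, and the dimensions with the largest absolute difference between the two models are kept. It assumes good and bad are arrays of shape (N_s, D).

```python
import numpy as np

def dropout_dimensions(good, bad, beta):
    """Build the two statistical models and keep the top beta*D dimensions (sketch)."""
    model_good = good.mean(axis=0)          # statistical model of the "Good" solutions
    model_bad = bad.mean(axis=0)            # statistical model of the "Bad" solutions
    diff = np.abs(model_good - model_bad)   # difference model between the two statistical models
    d = max(1, int(beta * good.shape[1]))   # number of dimensions to keep
    S = np.argsort(diff)[::-1][:d]          # indices of the d most different dimensions
    return np.sort(S)
```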
Figure 3 shows an example of the adaptive dropout mechanism, and Fig. 3a demonstrates the detailed process of solutions selection. In this figure, the blue squares are the “Good” solutions and the red triangles are the “Bad” ones. Notably, the black circles in the upper right corner of Fig. 3a are the solutions (in Arc) that are not selected into arc. They are excluded from the “Bad” set to prevent the distribution of “Bad” solutions from stagnating prematurely. Figure 3b presents the two statistical models, where the horizontal axis represents the dimension and the vertical axis the dimension value. The red line indicates the model of the poorly-performing (“Bad”) solutions, and the blue line indicates the model of the well-performing (“Good”) solutions. Figure 3c demonstrates the absolute difference between the above two statistical models, where the horizontal axis represents the dimension and the vertical axis the absolute difference. The dotted line represents the selection threshold, and the shaded parts indicate the dimensions to be selected. Specifically, \(d_1\), \(d_2\), \(d_3\), and \(d_4\) denote the lower and upper boundaries of the dimension intervals lying above the threshold. The dimensions in \([d_1,d_2]\) and \([d_3,d_4]\) are the ones ultimately chosen.
For high-dimensional EMOPs, a large number of training samples are required for constructing accurate surrogate models, which is unrealistic for expensive multiobjective optimization. Our proposed adaptive dropout mechanism reduces the dimension of decision variables from D to d according to the statistical results. Generally, the proposed method helps build relatively accurate d-dimensional surrogate models with the same number of training samples.
Infill criterion
Once the set S of d decision variables, which have a more significant effect on convergence, is selected by the adaptive dropout mechanism, the proposed ADSAPSO is expected to optimize these d decision variables and select promising candidate individuals to be re-evaluated.
In our proposed infill criterion, an RBF-based search, given in Algorithm 2, is adopted to optimize the selected d-dimensional variables. First, m d-dimensional RBF models are trained using the individuals in \({\text {arc}}_{d}\); these models replace the original expensive functions for evaluating offspring. Notably, RBF is used due to its insensitivity to the increment in the dimension of the function to be approximated [32, 33]. Compared with the Kriging model [34], RBF is more suitable for problems with high-dimensional decision variables, since the computation time for training Kriging models becomes unbearable when the number of training samples is large. Then, the particle swarm optimizer (PSO) [35] is adopted for further optimization, owing to its promising capability in solving high-dimensional optimization problems, as suggested in Ref. [34]. To enhance population diversity at the late stage of the evolution, we add a polynomial mutation operation (Line 6 in Algorithm 2); in this late stage the population may easily get trapped in local optima, so \(0.75 \times {\text {MaxFEs}}\) is empirically adopted as the threshold for activating polynomial mutation. After the optimization, k better-converged d-dimensional decision vectors are selected from the final decision vector set. Compared with surrogate models built with D-dimensional training samples, although not all decision variables are optimized at every iteration, the surrogate models built with d-dimensional training samples can better help optimize the d-dimensional variables and accelerate the convergence rate. The RBF-based search in the proposed ADSAPSO thus conducts a local search in a low-dimensional space to quickly obtain better-converged solutions, which is naturally suitable for high-dimensional EMOPs.
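The sketch below gives one possible reading of this RBF-assisted search, assuming the k vectors in arc_d_X/arc_d_F are the d-dimensional training samples. The helpers train_rbf, rbf_predict, and polynomial_mutation are placeholders, and the aggregated objective sum is only a simple stand-in for the convergence-oriented selection in Algorithm 2, not the authors' exact rule.

```python
import numpy as np

def rbf_based_search(arc_d_X, arc_d_F, k, fes, max_fes, iters=20, pop=50,
                     w=0.7, c1=1.5, c2=1.5):
    """RBF-assisted PSO over the d selected dimensions (illustrative sketch)."""
    m = arc_d_F.shape[1]
    models = [train_rbf(arc_d_X, arc_d_F[:, j]) for j in range(m)]  # one RBF per objective
    lb, ub = arc_d_X.min(axis=0), arc_d_X.max(axis=0)
    d = arc_d_X.shape[1]
    X = np.random.uniform(lb, ub, (pop, d))                         # initial swarm
    V = np.zeros_like(X)

    def score(P):  # cheap surrogate evaluation, aggregated over the m objectives
        return np.column_stack([rbf_predict(mo, P) for mo in models]).sum(axis=1)

    pbest, pbest_s = X.copy(), score(X)
    for _ in range(iters):
        g = pbest[np.argmin(pbest_s)]                                # global best particle
        r1, r2 = np.random.rand(pop, d), np.random.rand(pop, d)
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (g - X)       # PSO velocity update
        X = np.clip(X + V, lb, ub)
        if fes > 0.75 * max_fes:                                     # late stage: add diversity
            X = polynomial_mutation(X, lb, ub)
        s = score(X)
        better = s < pbest_s
        pbest[better], pbest_s[better] = X[better], s[better]
    return pbest[np.argsort(pbest_s)[:k]]                            # k better-converged d-dim vectors
```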
Since the optimized k decision vectors of dimension d cannot be evaluated by the expensive functions directly, we introduce a Replacement method that extends the d-dimensional decision vectors to D-dimensional ones. To be more specific, k solutions of dimension D, denoted \(X_D\), are first selected from the non-dominated solution set of arc by the environmental selection. Note that the environmental selection adopted here is the same as that in Algorithm 1. Next, the corresponding dimensions of \(X_D\) are replaced by \(X_d\) to form k new individuals \(X_{{\text {new}}}\). Specifically, solutions in \(X_d\) are well converged, and solutions in \(X_D\) have good diversity; to some extent, the newly generated solutions \(X_{{\text {new}}}\) can thus be considered to inherit the convergence of \(X_d\) and the diversity of \(X_D\) simultaneously. Finally, we re-evaluate \(X_{{\text {new}}}\) to update the archives. An illustrative example of the Replacement is shown in Fig. 4, where each circle denotes a decision variable, and the dashed circles are replaced before evaluating the full-length decision vectors.
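In code, the Replacement step reduces to overwriting the selected dimensions of the k full-length solutions, as in the short sketch below; S is assumed to hold the dimension indices chosen by the adaptive dropout mechanism.

```python
import numpy as np

def replacement(X_D, X_d, S):
    """Embed the optimized d-dimensional vectors X_d into the k full-length solutions X_D (sketch)."""
    X_new = X_D.copy()   # k full-length solutions chosen for their diversity
    X_new[:, S] = X_d    # overwrite the selected dimensions with the optimized values
    return X_new         # k new individuals to be re-evaluated by the expensive functions
```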