
Structural and Multidisciplinary Optimization, Volume 55, Issue 4, pp 1453–1469

Radial basis functions as surrogate models with a priori bias in comparison with a posteriori bias

  • Kaveh Amouzgar
  • Niclas Strömberg
Open Access
RESEARCH PAPER

Abstract

In order to obtain a robust performance, the established approach when using radial basis function networks (RBF) as metamodels is to add a posteriori bias, which is defined by extra orthogonality constraints. We argue that this is not needed; instead, the bias can simply be set a priori by using the normal equation, i.e. the bias becomes the corresponding regression model. In this paper we demonstrate that the performance of our suggested approach with a priori bias is in general as good as, or for many test examples even better than, the performance of RBF with a posteriori bias. With our approach, it is clear that the global response is modelled by the bias and that the details are captured by the radial basis functions. The accuracy of the two approaches is investigated by using multiple test functions of different dimensionality. Furthermore, several modeling criteria, such as the type of radial basis functions used in the RBFs, the dimension of the test functions, the sampling technique and the sample size, are considered in order to study their effect on the performance of the approaches. The power of RBF with a priori bias for surrogate based design optimization is also demonstrated by solving an established engineering benchmark of a welded beam and another benchmark for different sampling sets generated by successive screening, random, Latin hypercube and Hammersley sampling, respectively. The results obtained by evaluating the performance metrics and the modeling criteria, together with the presented optimal solutions, demonstrate the promising potential of our RBF with a priori bias, in addition to the simplicity and straightforward use of the approach.

Keywords

Metamodeling · Radial basis function · Design optimization · Design of experiment

1 Introduction

With exponentially increasing computing power, designers today have the possibility, through simulation driven product development, to create new innovative and complex products in a short time. In addition, simulation based design reduces the cost of product development by eliminating the need to create several physical prototypes. Furthermore, a designer can create an optimized design with respect to multiple objectives under several constraints and design variables. However, the models and simulations, particularly those appearing in multidisciplinary design optimization (MDO), can be very complex and computationally expensive, see e.g. the multi-objective optimization of a disc brake in Amouzgar et al. (2013). Surrogate models, or metamodels, have been widely accepted in the MDO community as a way to deal with this issue. A metamodel is an explicit approximation function that predicts the response of a computationally expensive simulation based model, such as a non-linear finite element model, thereby establishing a relation between the input variables and their corresponding responses. In general, the aim of a metamodel is to approximate the original function over a given design domain. Many metamodeling methods have been developed for metamodel based design optimization problems. Some of the most recognized and studied metamodels are response surface methodology (RSM) or polynomial regression (Box and Wilson 1951), Kriging (Sacks et al. 1989), radial basis functions (Hardy 1971), support vector regression (SVR) (Vapnik et al. 1996) and artificial neural networks (Haykin 1998). Extensive surveys and reviews of different metamodeling methods and their applications are given by e.g. Simpson et al. (2001a, 2008), Wang and Shan (2007) and Forrester and Keane (2009).

Several comparative studies investigating the accuracy and effectiveness of various surrogate models can be found in the literature. However, there is no agreement on the dominance of one specific method over the others. In an early study, Simpson et al. (1998) compared second-order response surfaces with Kriging; the metamodels were applied to a multidisciplinary design problem and four optimization problems. Jin et al. (2001) conducted a systematic comparison of four metamodeling techniques, polynomial regression, Kriging, multivariate adaptive regression splines and radial basis functions, using 13 mathematical test functions and an engineering test problem, and considering various characteristics of the sample data and evaluation criteria. They concluded that, overall, RBF performed best for both large and small scale problems with a high order of non-linearity. Fang et al. (2005) studied RSM and RBF to find the best method for modeling the highly non-linear responses found in impact related problems, comparing the RSM and RBF models on a highly non-linear test function. Despite the computational cost of RBF, they concluded that RBF dominates RSM in such optimization problems. Mullur and Messac (2006) compared the extended radial basis function (E-RBF) with three other approaches: RSM, RBF and Kriging. A number of modeling criteria, including problem dimension, sampling technique, sample size and performance criteria, were employed. The E-RBF was identified as the superior method, since parameter setting is avoided and the method results in an accurate metamodel without a significant increase in computation time. Kim et al. (2009) performed a comparative study of four metamodeling techniques using six mathematical functions and evaluated the results by the root mean squared error; Kriging and moving least squares showed promising results in that study.
In another study, Zhao and Xue (2010) compared four metamodeling methods by considering three characteristics of sample quality (sample size, uniformity and noise) and four performance measures (accuracy, confidence, robustness and efficiency). Backlund et al. (2012) studied the accuracy of RBF, Kriging and support vector regression (SVR) with respect to their capability of approximating base functions with a large number of variables and varying modality. The conclusion was that Kriging appeared to be the dominant method in its ability to approximate accurately with a fewer or equivalent number of training points. Also, unlike RBF and SVR, the parameter tuning in Kriging was done automatically during the training process. RBF was found to be the slowest in building the model for a large number of training points; in contrast, SVR was the fastest for large scale multi-modal problems.

In most of the previously conducted comparison studies, RBF has been shown to perform well on different test problems and engineering applications. Therefore, in this paper, we do not see a need to compare RBF with other metamodeling techniques yet again. Instead, we focus on a detailed, comprehensive comparison of our proposed RBF with a priori bias against the classical augmented RBF (RBF with a posteriori bias). The factors that are present during the construction of a metamodel (the modeling criteria) include the dimension of the problem, the type of radial basis functions used in the RBF, the sampling technique and the sample size. The evaluation of the modeling criteria and their effect on the accuracy, performance and robustness of a metamodel helps the designer choose an appropriate metamodeling technique for a specific application. A first comparison of the two approaches was recently conducted by the authors in Amouzgar and Strömberg (2014); the preliminary results revealed the potential of RBF with a priori bias in predicting the test problem values. This potential is evaluated in detail in this paper for nine established mathematical test functions. A pre-study on the performance of our RBF with a priori bias in metamodel based design optimization is also performed for two benchmarks. The results clearly demonstrate that our RBF with a priori bias is a most attractive choice of surrogate model in MDO.

2 Radial Basis Functions Networks

Radial basis functions were first used by Hardy (1971) for multivariate data interpolation, where they were proposed as approximation functions by solving multiquadric equations of topography based on coordinate data. A radial basis function network of ingoing variables x_i collected in x can be written as
$$ f(\boldsymbol{x}) = \sum\limits_{i=1}^{N_{\Phi}} {\Phi}_{i}(\boldsymbol{x}) \alpha_{i} + b(\boldsymbol{x}), $$
(1)
where f = f(x) is the outgoing response of the network, Φ_i(x) represents the radial basis functions, N_Φ is the number of radial basis functions, α_i are weights and b = b(x) is a bias. The network is depicted in Fig. 1.
Fig. 1

(a) A radial basis function network; (b) the Gauss function

Examples of popular radial basis functions are
$$\begin{array}{@{}rcl@{}} \text{Linear:}\quad {\Phi}_{i}(r)&=&r,\\ \text {Cubic:}\quad {\Phi}_{i}(r)&=&r^{3},\\ \text {Gaussian:}\quad {\Phi}_{i}(r)&=&e^{-\theta_{i} r^{2}},\; 0\leq \theta_{i} \leq1, \\ \text {Quadratic:}\quad {\Phi}_{i}(r)&=&\sqrt{r^{2}+{\theta_{i}^{2}}},\; 0\leq \theta_{i} \leq1, \end{array} $$
(2)
where θ_i represents the shape parameters and
$$ r(\boldsymbol{x})=\sqrt{(\boldsymbol{x}-\boldsymbol{c}_{i})^{T}(\boldsymbol{x}-\boldsymbol{c}_{i})} $$
(3)
is the radial distance. The shape parameters control the width of the radial basis functions. A radial basis function with a small value of θ_i has a narrower effect on the surrounding region: only the points near an unknown point will affect the prediction of the response at that point. In this case there is a risk of overfitting, which means that each sample point influences only a very close neighbourhood. An overfitted response surface does not capture the true function accurately; rather, it describes the noise, even in noise-free data sets. c_i is the center point of each radial basis function. The number of center points is commonly set equal to the number of sample points, and we have found that using the sample points as the center points usually results in a more accurate model.
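As an illustration, the radial basis functions in (2) and the radial distance in (3) can be sketched in NumPy as follows; the function names and the vectorized layout are our own, not taken from the paper's code.

```python
import numpy as np

def radial_distances(X, C):
    """Distances r in (3) between sample points X (n x m) and
    center points C (N_phi x m); returns an (n x N_phi) matrix."""
    diff = X[:, None, :] - C[None, :, :]
    return np.sqrt(np.sum(diff**2, axis=2))

def basis(r, kind="cubic", theta=1.0):
    """The four radial basis functions of (2), applied elementwise."""
    if kind == "linear":
        return r
    if kind == "cubic":
        return r**3
    if kind == "gaussian":
        return np.exp(-theta * r**2)
    if kind == "quadratic":
        return np.sqrt(r**2 + theta**2)
    raise ValueError("unknown basis: " + kind)
```

With centers chosen as the sample points, `basis(radial_distances(X, X))` gives the matrix A of (6).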
In this work we consider the bias to be a polynomial function, which is considered to be known either a priori or a posteriori. The bias is formulated as
$$ b=\sum\limits_{i=1}^{N_{\beta}} \xi_{i}(\boldsymbol{x}) \beta_{i}, $$
(4)
where ξ_i(x) represents the polynomial basis functions, β_i are constants and N_β is the number of terms in the polynomial function.
Thus, for a particular signal \(\hat {\boldsymbol {x}}_{k}\) the outcome of the network can be written as
$$ f_{k}=f(\hat{\boldsymbol{x}}_{k}) = \sum\limits_{i=1}^{N_{\Phi}} A_{ki} \alpha_{i} +\sum\limits_{i=1}^{N_{\beta}} B_{ki}\beta_{i}, $$
(5)
where
$$ \begin{array}{ccc} A_{ki}={\Phi}_{i}(\hat{\boldsymbol{x}}_{k}) & \text{and} & B_{ki}=\xi_{i}(\hat{\boldsymbol{x}}_{k}). \end{array} $$
(6)
Furthermore, for a set of signals, the corresponding outgoing responses f = {f_i} of the network can be formulated compactly as
$$ \boldsymbol{f}= \boldsymbol{A}\boldsymbol{\alpha} + \boldsymbol{B}\boldsymbol{\beta}, $$
(7)
where α = {α_i}, β = {β_i}, A = [A_ij] and B = [B_ij].

2.1 Bias known a priori

We suggest setting up the RBF in (1) by treating the bias as known a priori. This approach is presented here; the established approach, in which the bias is left unknown, is presented next.

The network in (1) is trained in order to fit a set of known data \(\{\hat {\boldsymbol {x}}_{k},\hat {f}_{k}\}\). We assume that the number of data points is N_d and we collect all \(\hat {f}_{k}\) in \(\hat {\boldsymbol {f}}\). The training is performed by minimizing the error
$$ \boldsymbol{\epsilon} = \boldsymbol{f} -\hat{\boldsymbol{f}} $$
(8)
in the least squares sense. We begin by considering this problem when the constants \(\boldsymbol {\beta }=\hat {\boldsymbol { \beta }}\) are known a priori. The minimization problem then reads
$$ \min_{\boldsymbol{\alpha}}\, \frac{1}{2} \left( \boldsymbol{A}\boldsymbol{\alpha}-(\hat{\boldsymbol{f}} -\boldsymbol{B}\hat{\boldsymbol{\beta}})\right)^{T}\left( \boldsymbol{A}\boldsymbol{\alpha}-(\hat{\boldsymbol{f}} -\boldsymbol{ B}\hat{\boldsymbol{\beta}})\right). $$
(9)
The solution to this problem is given by
$$ \hat{\boldsymbol{\alpha}}= \left( \boldsymbol{A}^{T}\boldsymbol{A} \right)^{-1} \boldsymbol{A}^{T}\left( \hat{\boldsymbol{f}}-\boldsymbol{B}\hat{\boldsymbol{\beta}}\right). $$
(10)
An obvious possibility to define \(\hat {\boldsymbol {\beta }}\) a priori, which is used in this work, is to use the following optimal regression coefficients:
$$ \hat{\boldsymbol{\beta}}=\left( \boldsymbol{B}^{T}\boldsymbol{B}\right)^{-1}\boldsymbol{B}^{T}\hat{\boldsymbol{f}}. $$
(11)
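A minimal sketch of the a priori training in (10) and (11); least squares solves replace the explicit inverses (A^T A)^{-1} and (B^T B)^{-1}, which is numerically equivalent for full-rank A and B. The function name is illustrative.

```python
import numpy as np

def fit_rbf_pri(A, B, f_hat):
    """A priori bias: beta from the regression normal equation (11),
    then alpha from (10) on the residual f_hat - B beta."""
    beta, *_ = np.linalg.lstsq(B, f_hat, rcond=None)              # (11)
    alpha, *_ = np.linalg.lstsq(A, f_hat - B @ beta, rcond=None)  # (10)
    return alpha, beta
```

Note that β is fixed first by the regression problem alone; the radial basis weights α then only model the residual left by the bias, which is exactly the separation of global trend and local detail described in the abstract.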

2.2 Bias known a posteriori

If the bias is considered not to be known a priori, then (9) modifies to
$$ \min_{(\boldsymbol{\alpha},\boldsymbol{\beta})} \, \frac{1}{2} \left( \boldsymbol{A}\boldsymbol{\alpha}+\boldsymbol{B}\boldsymbol{ \beta}-\hat{\boldsymbol{f}}\right)^{T}\left( \boldsymbol{A}\boldsymbol{\alpha}+\boldsymbol{B} \boldsymbol{\beta}-\hat{\boldsymbol{ f}}\right). $$
(12)
Furthermore, if we also assume that N_Φ + N_β > N_d, then the following orthogonality constraints are introduced:
$$ \sum\limits_{i=1}^{N_{\Phi}} \xi_{j}(\boldsymbol{c}_{i})\alpha_{i} =0, j=1,\ldots,N_{\beta}. $$
(13)
This can be written in matrix form as
$$ \boldsymbol{R}^{T}\boldsymbol{\alpha} = \boldsymbol{0}, $$
(14)
where
$$ \begin{array}{cc} \boldsymbol{R}=[R_{ij}], & R_{ij} = \xi_{j}(\boldsymbol{c}_{i}). \end{array} $$
(15)
In conclusion, for a bias known a posteriori, we have to solve the following problem:
$$ \left\{ \begin{array}{l} \min\limits_{(\boldsymbol{\alpha},\boldsymbol{\beta})} \, \frac{1}{2} \left( \boldsymbol{A}\boldsymbol{\alpha}+\boldsymbol{B}\boldsymbol{\beta}- \hat{\boldsymbol{f}}\right)^{T}\left( \boldsymbol{A}\boldsymbol{\alpha}+\boldsymbol{B}\boldsymbol{\beta}-\hat{\boldsymbol{ f}}\right) \\ \text{s.t. } \boldsymbol{R}^{T} \boldsymbol{\alpha} =\boldsymbol{0}. \end{array}\right. $$
(16)
The corresponding Lagrangian function is given by
$$ \mathcal{L}(\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{\lambda}) = \frac{1}{2} \left( \boldsymbol{A}\boldsymbol{ \alpha}+\boldsymbol{B}\boldsymbol{\beta}-\hat{\boldsymbol{f}}\right)^{T}\left( \boldsymbol{A}\boldsymbol{\alpha}+\boldsymbol{ B}\boldsymbol{\beta}-\hat{\boldsymbol{f}}\right) + \boldsymbol{\lambda}^{T} \boldsymbol{R}^{T} \boldsymbol{\alpha}. $$
(17)
The necessary optimality conditions become
$$\begin{array}{@{}rcl@{}} \frac{\partial \mathcal{L}}{\partial \boldsymbol{\alpha}}&=& \boldsymbol{A}^{T} (\boldsymbol{A}\boldsymbol{\alpha}+\boldsymbol{ B}\boldsymbol{\beta}-\hat{\boldsymbol{f}})+\boldsymbol{R}\boldsymbol{\lambda} =\boldsymbol{0}, \\ \frac{\partial \mathcal{L}}{\partial \boldsymbol{\beta}} &= &\boldsymbol{B}^{T}(\boldsymbol{A}\boldsymbol{\alpha}+\boldsymbol{ B}\boldsymbol{\beta}-\hat{\boldsymbol{f}})=\boldsymbol{ 0},\\ \frac{\partial \mathcal{L}}{\partial \boldsymbol{\lambda}}& =& \boldsymbol{R}^{T}\boldsymbol{\alpha}=\boldsymbol{0}. \end{array} $$
(18)
The optimality conditions in (18) can also be written in matrix form as
$$ \left[ \begin{array}{ccc} \boldsymbol{A}^{T}\boldsymbol{A} \; & \boldsymbol{A}^{T}\boldsymbol{B} \; & \boldsymbol{R} \\ \boldsymbol{B}^{T}\boldsymbol{A} \; & \boldsymbol{B}^{T}\boldsymbol{B} \; & \boldsymbol{0} \\ \boldsymbol{R}^{T} \; & \boldsymbol{0} \; & \boldsymbol{0} \end{array}\right] \left\{ \begin{array}{c} \boldsymbol{\alpha} \\ \boldsymbol{\beta} \\ \boldsymbol{\lambda} \end{array}\right\} = \left\{ \begin{array}{c} \boldsymbol{A}^{T}\hat{\boldsymbol{f} }\\ \boldsymbol{B}^{T}\hat{\boldsymbol{f}}\\ \boldsymbol{0} \end{array} \right\}. $$
(19)
By solving this system of equations, the radial basis function network with a bias known a posteriori is established.
If the center points c_i are chosen to be equal to \(\hat {\boldsymbol {x}}_{i}\), then R = B, the network becomes an interpolation and (19) can be reduced to
$$ \left[ \begin{array}{ccc} \boldsymbol{A} \;& \boldsymbol{B} \\ \boldsymbol{B}^{T} \; & \boldsymbol{0} \end{array}\right] \left\{ \begin{array}{c} \boldsymbol{\alpha} \\ \boldsymbol{\beta} \end{array}\right\} = \left\{ \begin{array}{c} \hat{\boldsymbol{f}}\\ \boldsymbol{0} \end{array} \right\}. $$
(20)
This is the established approach for setting up the RBF in (1). We suggest that one can simply use (10) and (11) instead, which are nothing more than two normal equations: (10) is the normal equation of (9) and (11) is the normal equation of the corresponding regression problem. Obviously, the two approaches will produce different RBFs. This is demonstrated in Fig. 2, where the biases obtained with the two approaches are compared for the same benchmark problem. In the following, the performance of these two approaches is studied in detail. In the remainder of the paper, the RBF with a bias known a posteriori is briefly called the a posteriori RBF and abbreviated RBF_pos, and the radial basis functions with a bias known a priori are called the a priori RBF and abbreviated RBF_pri.
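The interpolation system (20) can be sketched as follows. This is a sketch under the assumption that the assembled saddle-point matrix is nonsingular, which holds e.g. for the cubic basis with a linear polynomial bias and distinct sample points; the function name is our own.

```python
import numpy as np

def fit_rbf_pos(A, B, f_hat):
    """A posteriori bias with centers at the sample points: solve the
    saddle-point system (20), [[A, B], [B^T, 0]] [alpha; beta] = [f_hat; 0]."""
    n, p = B.shape
    K = np.block([[A, B], [B.T, np.zeros((p, p))]])
    rhs = np.concatenate([f_hat, np.zeros(p)])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]
```

The returned α automatically satisfies the orthogonality constraint (14), and the network interpolates the data exactly.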
Fig. 2

The bias plotted for test function 4 using both approaches: (a) RBF with a priori known bias, (b) RBF with a posteriori known bias

3 Test functions

The comparison of the two RBF approaches is based on nine different mathematical test functions presented below. These test functions are commonly used as benchmarks for unconstrained global optimization problems.
  1. Branin-Hoo function (Branin 1972)
    $$ {f}_{1}=\left(x_{2}-\frac{5.1{x_{1}^{2}}}{4\pi^{2}}+\frac{5x_{1}}{\pi}-6\right)^{2}+10\left(1-\frac{1}{8\pi}\right)\cos(x_{1})+10. $$
    (21)
     
  2. Goldstein-Price function (Goldstein and Price 1971)
    $$ \begin{array}{ll} f_{2}=&\left( 1+(x_{1}+x_{2}+1)^{2}\left( 19-14x_{1}+3{x_{1}^{2}}-14x_{2}\right.\right.\\ &\left.\left.+6x_{1}x_{2}+3{x_{2}^{2}}\right)\right)\\ &\times\left( 30+(2x_{1}-3x_{2})^{2}\left( 18-32x_{1}+12{x_{1}^{2}}+48x_{2}\right.\right.\\ &\left.\left.-36x_{1}x_{2}+27{x_{2}^{2}}\right)\right) . \end{array} $$
    (22)
     
  3. Rastrigin function
    $$\begin{array}{@{}rcl@{}} {f}_{3} = 20 + \sum\limits_{i=1}^{N}{{x_{i}^{2}} - 10\cos (2\pi x_{i})}. \end{array} $$
    (23)
    In this study, the Rastrigin function with 2 variables is used (N=2).
     
  4. Three-Hump Camel function
    $$ {f}_{4}=2{x_{1}^{2}}-1.05{x_{1}^{4}}+\frac{{x_{1}^{6}}}{6}+x_{1}x_{2}+{x_{2}^{2}}. $$
    (24)
     
  5. Colville function
    $$ \begin{array}{ll} {f}_{5}=& 100({x_{1}^{2}}-x_{2})^{2}+(x_{1}-1)^{2}+(x_{3}-1)^{2}+90({x_{3}^{2}}-x_{4})^{2}\\ &+10.1((x_{2}-1)^{2}+(x_{4}-1)^{2})+19.8(x_{2}-1)(x_{4}-1). \end{array} $$
    (25)
     
  6. Math 1
    $$ \begin{array}{ll} {f}_{6}=& (x_{1}-10)^{2}+5(x_{2}-12)^{2}+{x_{3}^{4}}+3(x_{4}-11)^{2}\\ &+10{x_{5}^{6}}+7{x_{6}^{2}}+{x_{7}^{4}}-4x_{6}x_{7}-10x_{6}-8x_{7}. \end{array} $$
    (26)
     
  7. Rosenbrock-10 function (Rosenbrock 1960)
    $$ {f}_{7}=\sum\limits_{n=1}^{N-1}\left( 100(x_{n+1}-{x_{n}^{2}})^{2}+(x_{n}-1)^{2}\right). $$
    (27)
    In this study, the Rosenbrock function with 10 variables is used (N=10).
     
  8. Math 2 (a 10-variable mathematical function)
    $$ {f}_{8}=\sum\limits_{m=1}^{10}\left( \frac{3}{10}+\sin\left(\frac{16}{15}x_{m}-1\right)+\sin^{2}\left(\frac{16}{15}x_{m}-1\right)\right). $$
    (28)
     
  9. Math 3 (a 16-variable mathematical function) (Jin et al. 2001)
    $$ {f}_{9}=\sum\limits_{m=1}^{16}\sum\limits_{n=1}^{16}a_{mn}({x_{m}^{2}}+x_{m}+1)({x_{n}^{2}}+x_{n}+1), $$
    (29)
    where a is defined in Jin et al. (2001).
     
The properties of the test functions are summarized in Table 1.
Table 1
Mathematical test functions

Function   Function name      No. of variables   Design range(s)
f_1        Branin-Hoo         2                  x_1: [−5, 10], x_2: [0, 15]
f_2        Goldstein-Price    2                  x_1, x_2: [−2, 2]
f_3        Rastrigin          2                  x_1, x_2: [−5.12, 5.12]
f_4        Three-Hump Camel   2                  x_1, x_2: [−5, 5]
f_5        Colville           4                  x_i: [−10, 10], i = 1, 2, ..., 4
f_6        Math 1             7                  x_i: [−10, 10], i = 1, 2, ..., 7
f_7        Math 2             10                 x_i: [−1, 1], i = 1, 2, ..., 10
f_8        Rosenbrock-10      10                 x_i: [−5, 10], i = 1, 2, ..., 10
f_9        Math 3             16                 x_i: [−1, 1], i = 1, 2, ..., 16
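For concreteness, two of the benchmarks above may be sketched as follows; the vectorization choices and function names are ours.

```python
import numpy as np

def branin(x1, x2):
    """Branin-Hoo function, f1 in (21)."""
    return ((x2 - 5.1 * x1**2 / (4 * np.pi**2) + 5 * x1 / np.pi - 6)**2
            + 10 * (1 - 1 / (8 * np.pi)) * np.cos(x1) + 10)

def rosenbrock(x):
    """Rosenbrock function, (27); N is the length of x (N = 10 in this study)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(100 * (x[1:] - x[:-1]**2)**2 + (x[:-1] - 1)**2))
```

At the well-known Branin minimizer (π, 2.275) the quadratic term vanishes and the value reduces to 10/(8π) ≈ 0.398; the Rosenbrock function is zero at x = (1, ..., 1).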

4 Modelling and performance criteria for comparison

Standard statistical error analysis is used to evaluate the accuracy of the two RBF approaches. Details of this analysis are presented in this section.

4.1 Performance metrics

Two standard performance metrics are applied at the off-design test points: (i) the root mean squared error (RMSE) and (ii) the maximum absolute error (MAE). The lower the RMSE and MAE values, the more accurate the metamodel; the aim is to have these two error measures as close to zero as possible.

The RMSE is calculated by
$$\begin{array}{@{}rcl@{}} \text{RMSE} &=& \sqrt{\frac{{\sum}_{i=1}^{n}\left( \hat{f}_{i}-{f}_{i}\right)^{2}}{n}} \end{array} $$
(30)
and MAE is defined by
$$\begin{array}{@{}rcl@{}} \text{MAE}&=& \max\limits_{i} \vert \hat{f}_{i}-{f}_{i}\vert, \end{array} $$
(31)
where n is the number of off-design test points selected to evaluate the model, \(\hat {f}_{i}\) is the exact function value at the i-th test point and f_i represents the corresponding predicted function value.
RMSE and MAE are typically of the same order of magnitude as the actual function values, so these error measures do not by themselves indicate the relative quality of the RBFs across different functions. Therefore, to compare the performance measures of the two approaches over the test functions, normalized values of the two errors, NRMSE and NMAE, are calculated using the actual function values:
$$\begin{array}{@{}rcl@{}} \text{NRMSE} &=& \sqrt{\frac{{\sum}_{i=1}^{n}\left( \hat{f}_{i}-{f}_{i}\right)^{2}}{{\sum}_{i=1}^{n}\left( \hat{f}_{i}\right)^{2}}}, \end{array} $$
(32)
$$\begin{array}{@{}rcl@{}} \text{NMAE}&=& \frac{\max\limits_{i} \vert \hat{f}_{i}-{f_{i}}\vert}{\sqrt{\frac{1}{n}{\sum}_{i=1}^{n}\left( \hat{f}_{i}-\bar{f}\right)^{2}}}, \end{array} $$
(33)
where \(\bar {f}\) denotes the mean of the actual function values at the test points.
In addition, the NRMSE and NMAE of the a priori RBF are compared to those of the a posteriori RBF approach by defining the corresponding relative differences. The relative difference in NRMSE (D^{NRMSE}) of the a posteriori RBF is given by
$$\begin{array}{@{}rcl@{}} {{D}_{RBF_{pos}}^{NRMSE}}&=& \frac{NRMSE_{RBF_{pos}}-NRMSE_{RBF_{pri}}}{NRMSE_{RBF_{pri}}} \times 100\%, \end{array} $$
(34)
and the relative difference in NMAE (D^{NMAE}) of the a posteriori RBF is defined by
$$\begin{array}{@{}rcl@{}} {{D}_{RBF_{pos}}^{NMAE}}&=& \frac{NMAE_{RBF_{pos}}-NMAE_{RBF_{pri}}}{NMAE_{RBF_{pri}}} \times 100\%, \end{array} $$
(35)
where \( NRMSE_{RBF_{pos}} \) and \( NMAE_{RBF_{pos}} \) denote the NRMSE and NMAE values of the RBF_pos approach, and \( NRMSE_{RBF_{pri}} \) and \( NMAE_{RBF_{pri}} \) are the corresponding values of the RBF_pri approach.
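The error measures (30) to (35) can be computed as in the following sketch; note that NumPy's population standard deviation matches the denominator of (33).

```python
import numpy as np

def metrics(f_exact, f_pred):
    """RMSE and MAE of (30)-(31) and the normalized NRMSE and NMAE of
    (32)-(33), with f_exact and f_pred evaluated at the n test points."""
    e = f_exact - f_pred
    rmse = np.sqrt(np.mean(e**2))
    mae = np.max(np.abs(e))
    nrmse = np.sqrt(np.sum(e**2) / np.sum(f_exact**2))
    nmae = mae / np.std(f_exact)   # np.std defaults to the population form
    return rmse, mae, nrmse, nmae

def rel_diff(err_pos, err_pri):
    """Relative differences (34)-(35) of RBF_pos w.r.t. RBF_pri, in percent."""
    return (err_pos - err_pri) / err_pri * 100.0
```

A positive `rel_diff` thus means that the a posteriori RBF has the larger (worse) error.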

4.2 Radial basis functions

Several different radial basis functions can be used when constructing the RBF, as mentioned in Section 2. Each will yield a different result depending on the nature of the problem. However, in real world applications, the mathematical properties of the problem are usually not known in advance. Thus, a designer needs a robust choice of radial basis function that is as independent as possible of the nature of the problem and results in an acceptably accurate metamodel. In this paper, four different radial basis functions, (i) linear, (ii) cubic, (iii) Gaussian and (iv) quadratic, as formulated in (2), are used to study the effect of the radial basis function on the accuracy of the metamodels.

4.3 Sampling techniques

Sampling techniques are used to create the DoEs to which the RBF is then fitted. A sampling technique that is robust across different problems is desired; in other words, one would like the metamodeling technique to be as independent as possible of the sampling technique. In this study, three different sampling techniques are chosen, (i) random sampling (RND), (ii) Latin hypercube sampling (LHS) and (iii) Hammersley sequence sampling (HSS), and their effects on the accuracy of the two approaches are investigated. For the optimization problems studied at the end, we also compare these sampling techniques to a successive screening approach for generating appropriate DoEs.

In random sampling, a desired set of uniformly distributed random numbers within the variable bounds of each test function is chosen. As expected, there is no uniformity in the resulting DoE. The Latin hypercube sampling technique creates samples that are relatively uniform in each single dimension, while subsequent dimensions are randomly paired to fill an m-dimensional cube. LHS can be regarded as a constrained Monte Carlo sampling scheme, developed by McKay et al. (1979) specifically for computer experiments. Hammersley sequence sampling produces more uniform samples over the m-dimensional space than LHS. This can be seen in Fig. 3, which illustrates the uniformity of a set of 15 sample points over a unit square using RND, LHS and HSS. Hammersley sequence sampling uses a low discrepancy sequence (the Hammersley sequence) to uniformly place N points in an m-dimensional hypercube, given by the following sequence:
$$ {Z}_{m}(n)=\left( \frac{n}{N}, \phi_{R_{1}}(n), \phi_{R_{2}}(n),...,\phi_{R_{m-1}}(n)\right),\qquad n=1,2,...,N, $$
(36)
where R_1, R_2, ..., R_{m−1} are the first m−1 prime numbers. The function φ_R(n) is constructed by reversing the order of the digits of the integer n, written in radix-R notation, around the radix point. In this work, HSS is coded in Matlab based on the theory in the original paper by Kalagnanam and Diwekar (1997), where a detailed definition and the theory of Hammersley points can be found.
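A minimal sketch of the Hammersley sequence (36): `radical_inverse` implements φ_R(n) by digit reversal, and the hard-coded prime list (an illustration choice of ours) limits this sketch to m ≤ 7.

```python
import numpy as np

def radical_inverse(n, base):
    """phi_R(n): reverse the base-R digits of the integer n around the radix point."""
    inv, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        inv += digit / denom
    return inv

def hammersley(N, m, primes=(2, 3, 5, 7, 11, 13)):
    """N Hammersley points in the m-dimensional unit hypercube, as in (36),
    with first coordinate n/N and remaining coordinates phi_{R_j}(n)."""
    return np.array([[n / N] + [radical_inverse(n, primes[j]) for j in range(m - 1)]
                     for n in range(1, N + 1)])
```

For example, in base 2 the integer 3 is `11`, so φ_2(3) = 0.11 in binary, i.e. 0.75.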
Fig. 3

Uniformity of different sampling techniques: (a) RND, (b) LHS, (c) HSS

4.4 Size of samples

The DoE size (sample size) has an important effect on the accuracy of a surrogate model. In general, increasing the size of the DoE improves the quality of the metamodel when using the RBF approach; however, overfitting is a critical issue in these approaches. Three different sample sizes are used in this paper: (i) low, (ii) medium and (iii) high. The number of samples in each group is proportional to a reference value for the low and high dimension problems. The number of coefficients k = (m + 1)(m + 2)/2 of a second order polynomial in m variables is used as the reference, and for all test functions the number of DoE points is chosen as a multiple of k. The sample sizes for the low dimension test functions are: (i) 1.5k for the low sample size, (ii) 3.5k for the medium sample size and (iii) 6k for the high sample size. For the high dimension test functions the DoE sizes are: (i) 1.5k for the low sample size, (ii) 2.5k for the medium sample size and (iii) 5k for the high sample size.

4.5 Test functions dimensionality

The dimension of a test function, i.e. the number of variables of the problem, is one of the most important properties when generating an accurate surrogate model. In order to investigate the effect of this modeling criterion on the two approaches, we divide the test functions into two categories: (i) low, where the number of variables is less than or equal to 4, and (ii) high, for test functions with more than 4 variables. Labelling the second group "high" should be understood relative to the first group; high dimensional engineering problems generally involve a considerably higher number of variables. The results are grouped separately for the low and high dimension test functions for all modeling criteria, so that a final conclusion can be drawn by studying the results.

5 Comparison procedure

In this section, we describe the procedure used to compare the two metamodeling approaches (RBF_pri and RBF_pos) under the multiple modeling criteria mentioned in the previous sections. The comparison is based on the nine mathematical test functions and the performance metrics described above. We summarize the comparison procedure in the following six steps:
  • Step 1: The number of DoE points is determined for each test function based on the three sample size groups (low, medium and high) in Table 2.

  • Step 2: The design domains are mapped linearly onto the unit hypercube, i.e. between 0 and 1 in each dimension. The surrogate models are fitted to the mapped variables using the two approaches. For calculating the performance metrics, the metamodel is mapped back to the original space.

  • Step 3: To avoid sensitivity of the metamodels to any specific DoE, 50 distinct sample sets are generated for each sample size of step 1 using the RND and LHS techniques described in the previous section. Since the HSS technique is deterministic, only one sample set per sample size is generated with this method. The Latin hypercube sampling is performed with the Matlab function "lhsdesign", using 20 iterations to maximize the minimum distance between points. The Hammersley (HSS) samples are created from the Hammersley quasirandom sequence, using successive primes as bases, with an in-house Matlab code.

  • Step 4: Metamodels are constructed using the two RBF approaches (RBF_pri and RBF_pos) with each of the four radial basis functions (linear, cubic, Gaussian and quadratic) for each set of DoE generated by the three sampling techniques. Therefore, for each test function 2 (RBF approaches) × 4 (radial basis functions) × 3 (sampling techniques) × 3 (sample sizes) × 50 (sets of DoE) = 3600 surrogate models are constructed.

  • Step 5: 1000 test points are randomly selected within the design space. The exact function value \(\hat {f}_{i}\) and the predicted function value f_i at each test point are calculated, and RMSE, MAE and the corresponding normalized values are computed using (30) to (33). The normalized errors are then averaged over the 50 sample sets; these averages are simply denoted NRMSE and NMAE in the remainder of this paper. Finally, the relative difference measures of the averaged errors for RBF_pos are calculated using (34) and (35).

  • Step 6: The procedure from step 1 to 5 is repeated for all test problems. In addition to the mean normalized errors (NRMSE and NMAE), the average over the low dimension problems (the first five test functions), denoted "Ave. Low", the average over the high dimension problems (test functions 6 to 9), denoted "Ave. High", and the average error metrics over all nine test functions, denoted "Ave. All", are computed for the surrogate approaches using the different sampling techniques.
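The steps above nest as follows; this loop-structure sketch only reproduces the model count of step 4 (the model building and error evaluation themselves are omitted).

```python
from itertools import product

# Step 4 counts 2 x 4 x 3 x 3 x 50 = 3600 models per test function.
approaches = ("RBF_pri", "RBF_pos")
bases = ("linear", "cubic", "gaussian", "quadratic")
samplers = ("RND", "LHS", "HSS")
sizes = ("low", "medium", "high")
n_sets = 50  # DoE sets per combination (HSS itself is deterministic; see step 3)

runs = list(product(approaches, bases, samplers, sizes, range(n_sets)))
# one surrogate model would be built and evaluated per entry of `runs`
print(len(runs))  # 3600
```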

Table 2
Modeling criteria of test functions

Function   Function name      No. of      Problem     Sample size            No. of
                              variables   dimension   Low   Medium   High    test points
f_1        Branin-Hoo         2           Low         9     30       60      1000
f_2        Goldstein-Price    2           Low         9     30       60      1000
f_3        Rastrigin          2           Low         9     30       60      1000
f_4        Three-Hump Camel   2           Low         9     30       60      1000
f_5        Colville           4           Low         23    75       150     1000
f_6        Math 1             7           High        54    90       180     1000
f_7        Math 2             10          High        99    165      330     1000
f_8        Rosenbrock-10      10          High        99    165      330     1000
f_9        Math 3             16          High        229   380      765     1000

It should be noted that, because the variables are mapped to a unit cube (in step 2), the parameter setting can be done without considering the magnitude of the design variables. Thus, the shape parameter θ used in the radial basis functions in (2) is set to one (θ = 1). The bias in (4) chosen for this study is a quadratic polynomial with 6 terms.
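For illustration, a full quadratic bias basis as in (4) can be assembled as follows; for m variables it has (m + 1)(m + 2)/2 columns, i.e. the 6 terms 1, x_1, x_2, x_1^2, x_1 x_2, x_2^2 when m = 2. The monomial ordering and function name are our own choices.

```python
import numpy as np

def quadratic_basis(X):
    """Columns xi_i(x) of the bias in (4) for a full quadratic polynomial:
    the constant term, the linear terms and all second-order monomials,
    evaluated at the n sample points in X (an n x m array)."""
    n, m = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(m)]
    cols += [X[:, i] * X[:, j] for i in range(m) for j in range(i, m)]
    return np.column_stack(cols)
```

The returned matrix is exactly the B matrix of (6) for the quadratic bias used in this study.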

6 Results and discussion

In this section, the results from the metamodels constructed according to the comparison procedure of the previous section are presented. The effect of each modeling criterion is discussed by comparing the two main error measures, NRMSE and NMAE, for the two RBF approaches in several tables and charts. Including all modeling criteria in the comparison of each criterion for all test functions would require an extensive and very detailed results section incorporating all 3600 surrogate models. This is beyond the scope of this work and can be the topic of future studies. Therefore, for studying the effect of each modeling criterion, a specific selection of the other criteria is chosen; these selections are stated in the forthcoming sections.

Before presenting the results, it is worth mentioning that the computational cost of the proposed RBF_pri is lower than that of RBF_pos. This has been investigated by measuring the training time of the two approaches for test functions 3 and 8 with 100 variables and 15453 sampling points, using the cubic radial basis function and the HSS sampling method. The training times for f3 are 346.67 and 396.97 seconds for RBF_pri and RBF_pos, respectively, while test function 8 is trained in 350.48 and 591.76 seconds using RBF_pri and RBF_pos, respectively.

6.1 Effect of basis functions

Table 3 shows the NRMSE and NMAE values for the high sample size and the LHS sampling technique of RBF_pri and RBF_pos using the four different basis functions across all test problems. The bold-faced values highlight the lowest errors for each test function. It can be seen that the basis function yielding the minimum error varies between the test functions. However, the cubic basis function results in lower values of both NRMSE and NMAE for f1, f8 and f9. Also, by studying and comparing the results obtained from all 3600 constructed metamodels, one can conclude that the cubic basis function is the preferred choice when there is no prior knowledge of the mathematical properties of the problem, because of its robust behaviour under different criteria. This may be due to the absence of any extra parameter in the cubic radial basis function: parameter setting and finding the optimal shape parameter are not required.
Table 3

NRMSE and NMAE (LHS sampling with high sample size)

| Test function | RBF approach | NRMSE (Linear) | NRMSE (Cubic) | NRMSE (Gaussian) | NRMSE (Quadratic) | NMAE (Linear) | NMAE (Cubic) | NMAE (Gaussian) | NMAE (Quadratic) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| f1 | RBF_pri | 0.1908 | 0.0952 | 0.4538 | 0.1349 | 1.8655 | 0.8358 | 6.0043 | 1.5904 |
|  | RBF_pos | 0.1951 | 0.0975 | 0.1979 | 0.4017 | 2.5765 | 0.9711 | 2.5429 | 6.3368 |
| f2 | RBF_pri | 0.3158 | 0.2594 | 0.1874 | 0.1620 | 3.0204 | 2.4426 | 1.9849 | 1.5635 |
|  | RBF_pos | 0.3735 | 0.2496 | 0.3674 | 0.1769 | 3.6027 | 2.4930 | 3.5192 | 1.8219 |
| f3 | RBF_pri | 0.3080 | 0.4122 | 8.1544 | 8.1544 | 2.3752 | 4.7238 | 225.677 | 87.9028 |
|  | RBF_pos | 0.3067 | 0.4162 | 0.3078 | 10.9940 | 2.4914 | 4.2085 | 2.4926 | 312.287 |
| f4 | RBF_pri | 0.3409 | 0.2634 | 0.1184 | 0.1521 | 2.0918 | 1.7894 | 1.3465 | 1.4218 |
|  | RBF_pos | 0.4612 | 0.2709 | 0.4538 | 0.1473 | 3.0001 | 1.9679 | 3.0208 | 1.6017 |
| f5 | RBF_pri | 0.2012 | 0.1967 | 0.1590 | 0.1752 | 1.2435 | 1.2760 | 1.4980 | 1.3941 |
|  | RBF_pos | 0.3220 | 0.2146 | 0.3219 | 0.1767 | 2.4984 | 1.5238 | 2.4598 | 1.6206 |
| f6 | RBF_pri | 0.4469 | 0.5012 | 0.6332 | 0.5617 | 2.1897 | 2.5543 | 3.3058 | 2.9341 |
|  | RBF_pos | 0.6063 | 0.5254 | 0.7355 | 0.6154 | 3.1961 | 2.8564 | 4.1352 | 3.3001 |
| f7 | RBF_pri | 0.1249 | 0.1185 | 0.1247 | 0.1178 | 3.1090 | 2.9903 | 3.3601 | 3.0411 |
|  | RBF_pos | 0.1162 | 0.1175 | 0.1162 | 0.1141 | 2.7253 | 2.9346 | 2.7001 | 2.9427 |
| f8 | RBF_pri | 0.1683 | 0.1646 | 0.1741 | 0.1659 | 1.1983 | 1.2398 | 1.3317 | 1.2620 |
|  | RBF_pos | 0.1842 | 0.1653 | 0.1847 | 0.1697 | 1.4382 | 1.2617 | 1.5064 | 1.3617 |
| f9 | RBF_pri | 0.0211 | 0.0190 | 0.0215 | 0.0196 | 0.4572 | 0.3441 | 0.4860 | 0.3834 |
|  | RBF_pos | 0.0329 | 0.0209 | 0.0388 | 0.0248 | 1.0555 | 0.4795 | 1.5252 | 0.7643 |

It is cumbersome to compare the two approaches under each modeling criterion using all the radial basis functions. Therefore, for each test function and modeling criterion, one radial basis function is chosen, and the two metamodels are constructed using that radial basis function. Table 4 summarizes the chosen radial basis functions for each test function and modeling criterion.
Table 4

Summary of chosen basis functions

| Test function | Sampling technique | Sample size | Problem dimension | Overall accuracy |
| --- | --- | --- | --- | --- |
| f1 | Cubic | Cubic | Cubic | Cubic |
| f2 | Quadratic | Quadratic | Quadratic | Quadratic |
| f3 | Linear | Cubic | Cubic | Linear |
| f4 | Cubic | Cubic | Cubic | Quadratic |
| f5 | Quadratic | Cubic | Cubic | Cubic |
| f6 | Cubic | Cubic | Cubic | Cubic |
| f7 | Cubic | Cubic | Cubic | Cubic |
| f8 | Cubic | Cubic | Cubic | Cubic |
| f9 | Cubic | Cubic | Cubic | Cubic |

In cases where the best-performing basis function differs between the two approaches under a modeling criterion, the basis function which performed better with RBF_pos is selected. This enables a more reliable comparison between the two approaches.

6.2 Effect of sampling technique

The error measures of the surrogate models constructed by the two approaches using the three sampling techniques are shown in Table 5. The values are extracted based on the basis functions chosen according to Table 4. Figure 4 depicts a summary of the results in Table 5 by comparing the performance metrics of the “Ave. Low”, “Ave. High” and “Ave. All” rows. Observing the NRMSE values in Fig. 4a, the lowest errors for both approaches correspond to the HSS technique, followed by LHS, with the random sampling technique giving the highest NRMSE values. The only exceptions, where LHS generates a better metamodel, are test function 3 (f3) and the last high dimensional test function (f9). Considering the NMAE values in Fig. 4b, the HSS method again yields the lowest errors. Also, the low dimension problems perform better with the random sampling technique than with the LHS technique. Both RBF_pri and RBF_pos perform better when the LHS sampling technique is used in high dimension problems (Fig. 4); however, this gain is marginal compared to the two other techniques.
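For reference, the DoE families compared here can be generated as follows. This is a minimal sketch of the standard Latin hypercube and Hammersley constructions on the unit cube, with plain uniform random sampling as the third option; the authors' exact generators may differ in detail:

```python
import numpy as np

def random_design(n, d, rng):
    """Plain uniform random sampling (RND) on the unit cube."""
    return rng.uniform(size=(n, d))

def latin_hypercube(n, d, rng):
    """Standard LHS: each variable's range is cut into n equal strata,
    and every stratum is sampled exactly once per variable."""
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (strata + rng.uniform(size=(n, d))) / n

def hammersley(n, d):
    """Hammersley set: first coordinate i/n, remaining coordinates are
    van der Corput radical-inverse sequences in successive prime bases."""
    def van_der_corput(i, base):
        q, bk = 0.0, 1.0 / base
        while i > 0:
            q += (i % base) * bk
            i //= base
            bk /= base
        return q
    primes = [2, 3, 5, 7, 11, 13, 17]
    pts = np.empty((n, d))
    for i in range(n):
        pts[i, 0] = i / n
        for j in range(1, d):
            pts[i, j] = van_der_corput(i, primes[j - 1])
    return pts

rng = np.random.default_rng(0)
X_lhs = latin_hypercube(60, 2, rng)   # the "high" sample size for a 2-D function
X_hss = hammersley(60, 2)
```

The low-discrepancy structure of the Hammersley set is one plausible explanation for HSS giving the lowest NRMSE values in Table 5.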
Table 5

NRMSE and NMAE of each sampling technique (high sample size)

| Test function | RBF approach | NRMSE (RND) | NRMSE (LHS) | NRMSE (HSS) | NMAE (RND) | NMAE (LHS) | NMAE (HSS) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| f1 | RBF_pri | 0.1171 | 0.0952 | 0.0747 | 1.0179 | 0.8358 | 0.8646 |
|  | RBF_pos | 0.1221 | 0.0975 | 0.0752 | 1.1759 | 0.9711 | 1.3284 |
| f2 | RBF_pri | 0.2365 | 0.1620 | 0.1397 | 2.0912 | 1.5635 | 1.9509 |
|  | RBF_pos | 0.3110 | 0.1769 | 0.1602 | 3.1755 | 1.8219 | 2.3215 |
| f3 | RBF_pri | 0.3164 | 0.3080 | 0.3144 | 2.5913 | 4.7238 | 2.1904 |
|  | RBF_pos | 0.3117 | 0.3067 | 0.3098 | 2.5876 | 4.2085 | 2.4636 |
| f4 | RBF_pri | 0.3250 | 0.2634 | 0.1126 | 2.2031 | 1.7894 | 0.7961 |
|  | RBF_pos | 0.3270 | 0.2709 | 0.1260 | 2.2347 | 1.9679 | 1.1646 |
| f5 | RBF_pri | 0.1853 | 0.1752 | 0.1649 | 1.3390 | 1.2760 | 1.3270 |
|  | RBF_pos | 0.1863 | 0.1767 | 0.1730 | 1.6783 | 2.8564 | 1.6052 |
| Average Low | RBF_pri | 0.2361 | 0.2008 | 0.1613 | 1.8485 | 2.0377 | 1.4258 |
|  | RBF_pos | 0.2516 | 0.2058 | 0.1688 | 2.1704 | 2.3652 | 1.7767 |
| f6 | RBF_pri | 0.5021 | 0.5012 | 0.4839 | 2.5461 | 2.5543 | 2.6711 |
|  | RBF_pos | 0.5300 | 0.5254 | 0.5090 | 2.8477 | 2.8564 | 2.7938 |
| f7 | RBF_pri | 0.1196 | 0.1185 | 0.1138 | 3.0491 | 2.9903 | 2.9283 |
|  | RBF_pos | 0.1186 | 0.1175 | 0.1134 | 2.9861 | 2.9346 | 2.8446 |
| f8 | RBF_pri | 0.1669 | 0.1646 | 0.1586 | 1.2700 | 1.2398 | 1.3377 |
|  | RBF_pos | 0.1674 | 0.1653 | 0.1615 | 1.2915 | 1.2617 | 1.4356 |
| f9 | RBF_pri | 0.0192 | 0.0190 | 0.1586 | 0.3513 | 0.3441 | 2.0628 |
|  | RBF_pos | 0.0209 | 0.0209 | 0.0233 | 0.4878 | 0.4795 | 0.6263 |
| Average High | RBF_pri | 0.2020 | 0.2008 | 0.2287 | 1.8041 | 1.7821 | 2.2500 |
|  | RBF_pos | 0.2092 | 0.2073 | 0.2018 | 1.9032 | 1.8831 | 1.9251 |
| Average All | RBF_pri | 0.2209 | 0.2008 | 0.1913 | 1.8288 | 1.9241 | 1.7921 |
|  | RBF_pos | 0.2328 | 0.2064 | 0.1835 | 2.0517 | 2.1509 | 1.8426 |

Fig. 4

Comparison of different sampling techniques: (a) normalized root mean squared error (NRMSE); (b) normalized maximum absolute error (NMAE)

The “Ave. All” bars in Fig. 4a and b, along with the data in Table 5, show 4.7 % and 6.8 % improvements in NRMSE and NMAE when using the HSS technique instead of LHS in the RBF_pri approach, while the corresponding values are 11.1 % and 14.3 % for the RBF_pos approach. These percentages reveal an advantage of RBF_pri over RBF_pos: it is more robust, in terms of NRMSE and NMAE, with respect to the change of sampling technique.

6.3 Effect of sampling size

Figure 5a depicts the NRMSE values of “Ave. Low”, “Ave. High” and “Ave. All” for the three different sample sizes using the LHS sampling technique, while Fig. 5b shows the corresponding NMAE values. Both metamodeling approaches improve in quality with increasing sample size, regardless of the problem’s dimension. Table 6 shows the relative differences (in percent) of NRMSE and NMAE between RBF_pri and RBF_pos with respect to sample size. The RBF_pos approach performs better than RBF_pri for the low sample size considering both error metrics; the negative percentage values in Table 6 reveal the degree of superiority. However, as the sample size increases to medium and high, the performance changes and RBF_pri becomes the dominant approach. This advantage is noticeable in the NMAE values, in contrast to the marginal improvement of the NRMSE values when using RBF_pri; especially in high dimensional problems with the medium sample size, the relative difference is less than one percent (0.74 %). Looking at the “Average All” row of Table 6, we observe a 13.2 % improvement in NMAE and a 4 % better accuracy in NRMSE when using RBF_pri with the high sample size.
Fig. 5

Comparison of different sample sizes using the LHS technique: (a) normalized root mean squared error (NRMSE); (b) normalized maximum absolute error (NMAE)

Table 6

Relative differences of NRMSE and NMAE comparing RBF_pri and RBF_pos considering sampling size

| Sample size | D_NRMSE (%) LOW | D_NRMSE (%) MED | D_NRMSE (%) HIGH | D_NMAE (%) LOW | D_NMAE (%) MED | D_NMAE (%) HIGH |
| --- | --- | --- | --- | --- | --- | --- |
| Average Low | 9.10 | 6.64 | 4.47 | 16.36 | 9.68 | 13.40 |
| Average High | −17.61 | 0.74 | 3.41 | −12.84 | 6.83 | 12.77 |
| Average All | −2.77 | 4.02 | 4.00 | 3.38 | 8.42 | 13.12 |

6.4 Effect of dimension

The effect of a test function’s dimension on metamodel performance can be studied by summarizing the average NRMSE and NMAE values of the low and high dimension problems in Table 7. The values are obtained by averaging the “Ave. Low” and “Ave. High” rows over all sampling techniques in Table 5. Considering the NRMSE, the RBF_pri approach performs better on low dimensional problems, while the RBF_pos approach generates better performance metrics on high dimensional problems. Although the advantage of RBF_pos on high dimensional problems compared to low dimensional ones is only around 1 % for NRMSE, it increases to approximately 10 % for the NMAE metric. RBF_pri has a performance advantage of around 9.5 % on problems with low dimension compared to high dimensional problems considering NMAE. In addition, the last two rows in Table 7 compare the performance of RBF_pri with RBF_pos for low and high dimension problems separately. The results confirm the advantage of RBF_pri over RBF_pos on low dimensional problems, with margins of 4.6 % and 17.2 % with regards to NRMSE and NMAE, respectively. On the other hand, the better performance of RBF_pos compared to RBF_pri on the high dimensional test functions is marginal, around 2 % for both NRMSE and NMAE.
Table 7

NRMSE, NMAE and their related relative difference values averaged over all sampling techniques

| Performance metrics | RBF approach | Average Low | Average High |
| --- | --- | --- | --- |
| NRMSE | RBF_pri | 0.1994 | 0.2105 |
|  | RBF_pos | 0.2087 | 0.2061 |
| NMAE | RBF_pri | 1.7707 | 1.9454 |
|  | RBF_pos | 2.1041 | 1.9038 |
| D_NRMSE (%) | RBF_pos | 4.59 | −2.12 |
| D_NMAE (%) | RBF_pos | 17.21 | −2.16 |

6.5 Overall accuracy

For the test functions with two input variables (the first four test functions), three-dimensional surface plots are shown in Figs. 6–9, respectively. The plots depict the actual function and the corresponding metamodels constructed using RBF_pri and RBF_pos. The metamodel surfaces in Figs. 6, 7, 8 and 9 are generated using the same set of DoE, created with the HSS technique and the high sample size, for each test function. The overall accuracy of RBF_pri and RBF_pos can be compared by observing the surface plots and using Table 8. The table presents the relative differences in NRMSE and NMAE (as percentages) for all test functions, together with the averages over the low dimensional, high dimensional and all test functions. The values are extracted for each metamodeling approach using the basis function given in Table 4 and the three sampling techniques with the high sample size. With regards to the NRMSE relative differences, 6 out of 9 test functions using LHS and 7 out of 9 test functions using RND and HSS have a positive percentage, which clearly reveals the advantage of the new approach over RBF_pos. This advantage is even more recognizable for NMAE, where 8 test functions have positive relative difference values for all sampling techniques. The average rows show approximately 6 %, 3 % and 11 % better NRMSE performance of RBF_pri for RND, LHS and HSS, respectively, regardless of the dimension of the test functions, while this superiority is around 16 %, 13 % and 23 % when considering the NMAE for RND, LHS and HSS, respectively. This difference demonstrates the strength of RBF_pri in predicting the local deviations of functions, which is what the NMAE metric captures. On the other hand, the superiority of RBF_pri in the global error measure, using RND and LHS, is minor compared to the RBF_pos approach.
Fig. 6

Test function 1: Branin function (a) actual function; (b) RBF_pri; (c) RBF_pos

Fig. 7

Test function 2: Goldstein-Price function (a) actual function; (b) RBF_pri; (c) RBF_pos

Fig. 8

Test function 3: Rastrigin function (a) actual function; (b) RBF_pri; (c) RBF_pos

Fig. 9

Test function 4: Three-Hump Camel function (a) actual function; (b) RBF_pri; (c) RBF_pos

Table 8

Overall accuracy performance of RBF_pri over RBF_pos

| Test functions | RND D_NRMSE (%) | RND D_NMAE (%) | LHS D_NRMSE (%) | LHS D_NMAE (%) | HSS D_NRMSE (%) | HSS D_NMAE (%) |
| --- | --- | --- | --- | --- | --- | --- |
| f1 | 4.18 | 15.52 | 2.39 | 16.18 | 0.75 | 53.64 |
| f2 | 27.21 | 51.85 | 8.85 | 16.53 | 13.64 | 19.00 |
| f3 | −1.50 | −0.14 | −0.42 | 4.90 | −1.48 | 12.47 |
| f4 | 0.61 | 1.43 | −3.18 | 12.65 | 4.05 | 9.93 |
| f5 | 12.59 | 25.35 | 8.71 | 19.42 | 14.55 | 20.97 |
| Ave. Low | 8.62 | 18.80 | 3.27 | 13.93 | 6.30 | 23.20 |
| f6 | 5.40 | 11.84 | 4.72 | 11.83 | 5.06 | 4.59 |
| f7 | −0.85 | −2.07 | −0.84 | −1.86 | −0.37 | −2.86 |
| f8 | 0.30 | 1.69 | 0.42 | 1.76 | 1.78 | 7.33 |
| f9 | 9.76 | 38.85 | 9.35 | 39.34 | 59.91 | 79.85 |
| Ave. High | 3.65 | 12.58 | 3.41 | 12.77 | 16.59 | 22.23 |
| Ave. All | 6.41 | 16.04 | 3.33 | 13.42 | 10.88 | 22.77 |

7 Optimization examples

RBF is a most attractive choice for surrogate models in metamodel based design optimization. This is demonstrated here by studying two examples using our approach of RBF with a priori bias. We begin our study with the following non-linear example:
$$ \left\{\begin{array}{l} \displaystyle{\min_{x_{i}} \, \sqrt{1000\left( \frac{4}{x_{1}}-2\right)^{4}+1000\left( \frac{4}{x_{2}}-2\right)^{4}} } \\ \text{s.t. } (x_{1}-0.5)^{4}+(x_{2}-0.5)^{4}-2 \leq 0. \end{array}\right. $$
(37)
The analytical optimal solution is (1.5,1.5) and the minimum of the unconstrained objective function is found at (2,2). The objective function is plotted in Fig. 10.
Fig. 10

Analytical “black-box” function (a) contour plot; (b) four successive iterations generating 12 sample points; (c) contour plot of the RBF of the objective for the DoE with 12 sample points; (d) contour plot of the augmented DoE with 12+3 sample points. The three augmented points are marked with a cross

The problem in (37) is now solved by performing a DoE procedure and setting up corresponding RBFs, which in turn define a new optimization problem that is solved using a global search with a genetic algorithm and a local search with sequential linear and/or quadratic programming. First, a set of sampling points is generated by successive linear response surface optimization of the problem in (37) using four successive iterations with automatic panning and zooming (Gustafsson and Strömberg 2008). This screening generates 12 sampling points according to Fig. 10. Then, RBFs are fitted to this DoE and an optimal point is identified. The DoE is then augmented with this optimal point and the RBFs are set up again. This procedure is repeated three times, generating in total a DoE with 12 sampling points from screening and three optimal points from RBFs. Finally, metamodel based design optimization using our RBFs for this DoE of 12+3 sampling points is performed. The optimal solution generated with this procedure is (1.4962,1.5049), which is very close to the analytical optimum of (37).
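The fit–optimize–augment loop described above can be sketched as follows for problem (37). This is only an illustration of the augmentation idea: the successive screening step is replaced by random sampling, the surrogate optimum is found by enumeration over a feasible grid rather than by a genetic algorithm plus SLP/SQP, and the variable bounds [0.6, 2.0] are an assumption (the paper does not state them in this excerpt):

```python
import numpy as np

def f_true(X):
    """Objective of problem (37)."""
    return np.sqrt(1000*(4/X[:, 0] - 2)**4 + 1000*(4/X[:, 1] - 2)**4)

def g(X):
    """Constraint of (37); feasible where g <= 0."""
    return (X[:, 0] - 0.5)**4 + (X[:, 1] - 0.5)**4 - 2.0

def fit_cubic_rbf(X, f):
    """Cubic RBF with a priori quadratic bias (sketch)."""
    B = np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1],
                         X[:, 0]**2, X[:, 0]*X[:, 1], X[:, 1]**2])
    beta, *_ = np.linalg.lstsq(B, f, rcond=None)
    Phi = np.linalg.norm(X[:, None] - X[None, :], axis=2)**3
    lam = np.linalg.solve(Phi, f - B @ beta)
    def predict(Q):
        Bq = np.column_stack([np.ones(len(Q)), Q[:, 0], Q[:, 1],
                              Q[:, 0]**2, Q[:, 0]*Q[:, 1], Q[:, 1]**2])
        return np.linalg.norm(Q[:, None] - X[None, :], axis=2)**3 @ lam + Bq @ beta
    return predict

rng = np.random.default_rng(2)
X = rng.uniform(0.6, 2.0, size=(12, 2))        # stand-in for the 12 screening points
y = f_true(X)
grid = np.stack(np.meshgrid(np.linspace(0.6, 2.0, 200),
                            np.linspace(0.6, 2.0, 200)), axis=-1).reshape(-1, 2)
cand = grid[g(grid) <= 0.0]                    # exact constraint, surrogate objective
for _ in range(3):                             # three augmentation rounds
    surrogate = fit_cubic_rbf(X, y)
    idx = np.argmin(surrogate(cand))
    x_star = cand[idx]
    cand = np.delete(cand, idx, axis=0)        # avoid duplicate sample points
    X = np.vstack([X, x_star])                 # augment the DoE with the optimum
    y = np.append(y, f_true(x_star[None])[0])
print(X[-1])  # last augmented point; should move toward the optimum (1.5, 1.5)
```

Evaluating the true function at each surrogate optimum and refitting is what the text calls optimal augmentation; after a few rounds the DoE is dense precisely where the optimum lies.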

The DoEs presented in Fig. 3 are also studied for this example. The corresponding RBFs are set up and the optimal solutions for the random, LHS and HSS DoEs are obtained as (1.7842,1.8003), (1.5618,1.5076) and (1.5698,1.553), respectively. It is clear that not only the choice of metamodel but also the choice of DoE influences the result. The solution from the random DoE is poor, while the solutions from the LHS and HSS DoEs are similar and acceptable. Thus, the successive screening procedure with optimal augmentation for generating the DoE performs best for this problem. This is a general observation we have made for many examples. We have also used this strategy to solve reliability based design optimization problems using metamodels. This is discussed in a recent paper by Strömberg (2016), where this first example is also formulated as a reliability based design optimization (RBDO) problem and solved for variables with non-Gaussian distributions using a SORM-based RBDO approach. We also consider the following well-known engineering benchmark of a welded beam:
$$ \left\{\begin{array}{l} \displaystyle{ \min_{x_{i}} \, 1.10471{x_{1}^{2}}x_{2}+0.04811x_{3}x_{4}(14+x_{2}) }\\ \text{s.t. } \left\{ \begin{array}{l} \tau(x_{i})-13600\leq 0 \\ \sigma(x_{i}) - 30000\leq 0\\ x_{1}-x_{4} \leq 0\\ 0.125 -x_{1} \leq0\\ \delta(x_{i}) -0.25 \leq0\\ 6000-P_{c}(x_{i}) \leq 0, \end{array}\right. \end{array}\right. $$
(38)
where definitions of the shear stress τ(x_i), normal stress σ(x_i), displacement δ(x_i) and critical force P_c(x_i) can be found in e.g. the recent paper by Garg (2014), where several solutions obtained by different algorithms are also presented. In addition, the variables are bounded by 0.1 ≤ x_1, x_4 ≤ 2 and 0.1 ≤ x_2, x_3 ≤ 10. We obtain the analytical solution (0.24437, 6.2175, 8.2915, 0.24437), which is more or less identical to the solution obtained by Garg: (0.24436, 6.2177, 8.2916, 0.24437).
Now, we instead solve this problem by generating a set of sampling points, to which RBFs are fitted and then optimized. The procedure is similar to the one presented above. First, 45 sampling points are generated by quadratic response surface screening. This set of points is presented in Fig. 11. The choice of quadratic instead of linear screening is due to the non-linear constraint domain; linear screening might result in an empty feasible domain. After screening, 15 additional points are added, generated as optima from sequentially augmented RBFs. Finally, for the 45+15 sampling points, we set up the corresponding RBFs and obtain the optimal solution (0.414710, 3.925900, 6.620100, 0.414710). This solution satisfies almost all constraints in (38): the first constraint is slightly violated, τ(x_i) = 13729 > 13600, but the five other constraints are fully satisfied. The value of the cost function is 3.113586, which should be compared to the analytical optimum value of 2.381. This solution could of course be improved further by augmenting the DoE with additional optimal points.
Fig. 11

Successive quadratic response surface screening generating 45 sampling points. The plots are showing the same DoE from two different views

8 Concluding remarks

In this paper, a new approach for setting up radial basis function networks is proposed, letting the bias be defined a priori by a corresponding regression model. Our new approach is compared with the established treatment of RBF, where the bias is obtained by using extra orthogonality constraints. It is demonstrated numerically that the performance of our approach with a priori bias is in general as good as that of RBF with a posteriori bias. In addition, we find our approach easier to set up and interpret: it is clear that the bias captures the global behavior and the radial basis functions tune the local response. It is also demonstrated that our RBF with a priori bias performs excellently in metamodel based design optimization, accurately handling DoEs with simultaneously coarse and dense sampling densities generated from successive screening and optimal augmentation. In conclusion, the paper shows that our new RBF approach with a priori bias is a most attractive choice for surrogate modeling. We believe that our approach has promising potential and opens up new possibilities for surrogate modeling in optimization, which we hope to explore in the near future.

References

  1. Amouzgar K, Strömberg N (2014) An approach towards generating surrogate models by using RBFN with a priori bias. In: ASME 2014 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers
  2. Amouzgar K, Rashid A, Strömberg N (2013) Multi-objective optimization of a disc brake system by using SPEA2 and RBFN. In: Proceedings of the ASME 2013 International Design Engineering Technical Conferences, vol 3B. American Society of Mechanical Engineers, Portland. doi:10.1115/DETC2013-12809
  3. Backlund PB, Shahan DW, Seepersad CC (2012) A comparative study of the scalability of alternative metamodelling techniques. Eng Optim 44(7):767–786
  4. Box GEP, Wilson KB (1951) On the experimental attainment of optimum conditions. J R Stat Soc Series B (Methodological) 13(1):1–45
  5. Branin FH (1972) Widely convergent method for finding multiple solutions of simultaneous nonlinear equations. IBM J Res Develop 16(5):504–522. doi:10.1147/rd.165.0504
  6. Fang H, Rais-Rohani M, Liu Z, Horstemeyer MF (2005) A comparative study of metamodeling methods for multiobjective crashworthiness optimization. Comput Struct 83(25–26):2121–2136. doi:10.1016/j.compstruc.2005.02.025
  7. Forrester AIJ, Keane AJ (2009) Recent advances in surrogate-based optimization. Progress Aerospace Sci 45(1–3):50–79. doi:10.1016/j.paerosci.2008.11.001
  8. Garg H (2014) Solving structural engineering design optimization problems using an artificial bee colony algorithm. J Ind Manag Optim 10(3):777–794
  9. Goldstein AA, Price JF (1971) On descent from local minima. Math Comput 25(115):569–574
  10. Gustafsson E, Strömberg N (2008) Shape optimization of castings by using successive response surface methodology. Struct Multidiscip Optim 35(1):11–28
  11. Hardy RL (1971) Multiquadric equations of topography and other irregular surfaces. J Geophys Res 76(8):1905–1915
  12. Haykin S (1998) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall. ISBN 0132733501
  13. Jin R, Chen W, Simpson TW (2001) Comparative studies of metamodelling techniques under multiple modelling criteria. Struct Multidiscip Optim 23(1):1–13
  14. Kalagnanam JR, Diwekar UM (1997) An efficient sampling technique for off-line quality control. Technometrics 39(3):308–319
  15. Kim B-S, Lee Y-B, Choi D-H (2009) Comparison study on the accuracy of metamodeling technique for non-convex functions. J Mech Sci Technol 23(4):1175–1181
  16. McKay MD, Beckman RJ, Conover WJ (1979) A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21(2):239–245
  17. Mullur A, Messac A (2006) Metamodeling using extended radial basis functions: a comparative approach. Eng Comput 21(3):203–217. doi:10.1007/s00366-005-0005-7
  18. Rosenbrock HH (1960) An automatic method for finding the greatest or least value of a function. Comput J 3(3):175–184. doi:10.1093/comjnl/3.3.175
  19. Sacks J, Schiller SB, Welch WJ (1989) Designs for computer experiments. Technometrics 31(1):41–47. doi:10.2307/1270363
  20. Simpson TW, Mauery TM, Korte JJ, Mistree F (1998) Comparison of response surface and kriging models for multidisciplinary design optimization. In: 7th AIAA/USAF/NASA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, AIAA paper 98-4755
  21. Simpson TW, Lin DKJ, Chen W (2001a) Sampling strategies for computer experiments: design and analysis. Int J Reliab Appl 2(3):209–240
  22. Simpson TW, Poplinski JD, Koch PN, Allen JK (2001b) Metamodels for computer-based engineering design: survey and recommendations. Eng Comput 17(2):129–150
  23. Simpson TW, Toropov V, Balabanov V, Viana FAC (2008) Design and analysis of computer experiments in multidisciplinary design optimization: a review of how far we have come or not. In: 12th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference
  24. Strömberg N (2016) Reliability based design optimization by using a SLP approach and radial basis function networks. In: ASME 2016 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers (to appear)
  25. Vapnik V, Golowich SE, Smola A (1996) Support vector method for function approximation, regression estimation, and signal processing. In: Advances in Neural Information Processing Systems, vol 9, pp 281–287
  26. Wang GG, Shan S (2007) Review of metamodeling techniques in support of engineering design optimization. J Mech Des 129(4):370–380
  27. Zhao D, Xue D (2010) A comparative study of metamodeling methods considering sample quality merits. Struct Multidiscip Optim 42(6):923–938. doi:10.1007/s00158-010-0529-3

Copyright information

© The Author(s) 2016

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Product Development Department, School of Engineering, Jönköping University, Jönköping, Sweden
  2. Department of Mechanical Engineering, School of Science and Technology, University of Örebro, Örebro, Sweden
  3. School of Engineering Science, University of Skövde, Skövde, Sweden
