On efficient global optimization via universal Kriging surrogate models

  • RESEARCH PAPER
  • Structural and Multidisciplinary Optimization

Abstract

In this paper, we investigate the capability of the universal Kriging (UK) model for single-objective global optimization within an efficient global optimization (EGO) framework. We implemented this combined UK-EGO framework and studied four UK variants: UK with a first-order polynomial trend, UK with a second-order polynomial trend, a blind Kriging (BK) implementation from the ooDACE toolbox, and a polynomial-chaos Kriging (PCK) implementation. The UK-EGO framework with automatic trend function selection, derived from the BK and PCK models, builds UK surrogate models and then optimizes the expected improvement criterion on the Kriging model with the lowest leave-one-out cross-validation (LOOCV) error. We compared the UK-EGO variants and standard EGO on five synthetic test functions and one aerodynamic problem. Our results show that a proper choice of trend function through automatic feature selection can improve the optimization performance of UK-EGO relative to EGO. PCK-EGO was the best variant, as it performed more robustly than the other UK-EGO schemes; however, a total-order expansion should be used to generate the candidate trend function set for high-dimensional problems. For some test functions, UK with predetermined polynomial trend functions performed better than BK and PCK, indicating that automatic trend function selection does not always lead to the best-quality solutions. We also found that although some UK variants are not as globally accurate as ordinary Kriging (OK), they can still identify better-optimized solutions because the added trend function helps the optimizer locate the global optimum.

Acknowledgements

Koji Shimoyama was supported in part by the Grant-in-Aid for Scientific Research (B) No. H1503600 administered by the Japan Society for the Promotion of Science (JSPS).

Author information

Corresponding author

Correspondence to Pramudita Satria Palar.

Appendices

Appendix A: Trend function selection and hyperparameter optimization strategy for PCK

We propose a sequential hyperparameter optimization strategy in which the BFGS search at each iteration of the UK construction process starts from the optimum found in the previous iteration. This strategy can be applied to both BK and PCK, since both methods work by scanning through a provided polynomial set. Our primary motivation is that the likelihood function may change only slightly from one iteration to the next, so the optimum hyperparameters of consecutive iterations are likely to lie close to one another. A GA is used only in the first iteration, to find the optimum hyperparameters of the OK before any trend functions are added. Here, we use a GA with a population size of 100 and a maximum of 200 generations, followed by a BFGS search. This exhaustive search is applied only in the first iteration because the accuracy of the subsequent hyperparameter optimizations relies on the accuracy of the OK hyperparameters. After the final trend function is identified, the GA+BFGS approach is applied once more with this final trend function to search for possibly higher values of the likelihood. We call this the simplified GA+BFGS strategy, as opposed to the exhaustive GA+BFGS strategy.
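The warm-start idea can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation, and several pieces are assumptions made for illustration: SciPy's `differential_evolution` stands in for the GA, the bounded `L-BFGS-B` method stands in for plain BFGS, the correlation function is Gaussian, the hyperparameters are searched as log10(theta) within hypothetical bounds, and the candidate trend columns are simply added in a fixed order instead of being selected by LOOCV.

```python
import numpy as np
from scipy.optimize import differential_evolution, minimize

NUGGET = 1e-8  # small diagonal jitter for numerical stability (assumption)


def neg_log_likelihood(log_theta, X, y, F):
    """Concentrated negative log-likelihood of a Kriging model with a
    Gaussian correlation function and trend matrix F (n x p)."""
    theta = 10.0 ** np.asarray(log_theta)
    n = X.shape[0]
    sq_diff = (X[:, None, :] - X[None, :, :]) ** 2
    R = np.exp(-(sq_diff * theta).sum(axis=2)) + NUGGET * np.eye(n)
    try:
        L = np.linalg.cholesky(R)
    except np.linalg.LinAlgError:
        return 1e10  # penalize ill-conditioned correlation matrices
    Ft, yt = np.linalg.solve(L, F), np.linalg.solve(L, y)
    beta = np.linalg.lstsq(Ft, yt, rcond=None)[0]  # generalized least squares
    resid = yt - Ft @ beta
    sigma2 = max(resid @ resid / n, 1e-300)
    return 0.5 * (n * np.log(sigma2) + 2.0 * np.log(np.diag(L)).sum())


def local_search(x0, X, y, F, bounds):
    """Gradient-based search started from x0 (L-BFGS-B as a BFGS stand-in)."""
    return minimize(neg_log_likelihood, x0, args=(X, y, F),
                    method="L-BFGS-B", bounds=bounds)


def global_then_local(X, y, F, bounds, seed):
    """Global search (GA stand-in) followed by a local polish."""
    coarse = differential_evolution(neg_log_likelihood, bounds, args=(X, y, F),
                                    maxiter=30, popsize=10, polish=False,
                                    seed=seed)
    return local_search(coarse.x, X, y, F, bounds)


def simplified_strategy(X, y, candidate_columns, bounds, seed=0):
    F = np.ones((X.shape[0], 1))                      # OK: constant trend only
    best = global_then_local(X, y, F, bounds, seed)   # first iteration
    history = [best.fun]
    for column in candidate_columns:                  # grow the trend function
        F = np.column_stack([F, column])
        best = local_search(best.x, X, y, F, bounds)  # warm start from previous optimum
        history.append(best.fun)
    final = global_then_local(X, y, F, bounds, seed)  # re-check with the final trend
    if final.fun < best.fun:
        best = final
    return best.x, best.fun, history


# Toy data (hypothetical): 12 random points in the unit square.
rng = np.random.default_rng(0)
X = rng.random((12, 2))
y = 3.0 * X[:, 0] + np.sin(6.0 * X[:, 1])
bounds = [(-2.0, 2.0)] * X.shape[1]
log_theta, nll, history = simplified_strategy(X, y, [X[:, 0], X[:, 1]], bounds)
print(log_theta, nll, history)
```

In this sketch only the first and last fits pay for a global search; every intermediate fit reuses the previous optimum as its starting point, which is the source of the time saving.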

To verify the performance of our simplified GA+BFGS strategy, we compared it with the exhaustive GA+BFGS strategy on the five test functions listed in Appendix B. We set the sample size to 20, 60, and 40 for the two-dimensional problems, the Hartman-6 problem, and the borehole problem, respectively. We generally used N_s = 10 × m, where N_s is the sample size and m is the number of input variables; the only exception is the borehole problem, for which we set the sample size to 40 to make the optimization problem more difficult. We also compared our simplified GA+BFGS strategy with a simple BFGS strategy that performs a single BFGS search from a random initial solution at each UK iteration. Specifically, we compared the lowest LOOCV errors obtained by these three strategies.

For all five test functions, we observe from Fig. 11 that the LOOCV errors of the UK obtained with the exhaustive GA+BFGS and simplified GA+BFGS strategies are similar. The simple BFGS strategy did not perform as well as the other two strategies, except on the Hartman-6 problem, where all strategies performed approximately the same. This problem is highly nonlinear and difficult, and the UK did not approximate it better than the standard OK, which explains why UK hyperparameter tuning has minimal effect on the LOOCV error. The lower performance of the simple BFGS strategy signifies that finding the optimum of the UK likelihood function is sensitive to the choice of the initial point.

Fig. 11 Comparing the LOOCV error for PCK with hyperparameters tuned using various strategies on the (a) Branin, (b) Sasena, (c) Hosaki, (d) Hartman-6, and (e) borehole problems

The time required to train the hyperparameters with the simplified and exhaustive strategies on a two-dimensional function with p = 4 was approximately 3 and 40 seconds, respectively, on a computer with an Intel® Xeon® E5-1630 v4 8-core CPU @ 3.70 GHz running MATLAB. Thus, our simplified strategy performs similarly to the exhaustive strategy in only about 7.5% (3 s / 40 s) of the time required by the exhaustive approach.

Appendix B: Test functions

  1. Branin function (two variables).

    $$ f_{1}(\boldsymbol{x}) = \left( b_{2} - \frac{5.1}{4\pi^{2}} b_{1}^{2} + \frac{5}{\pi} b_{1} - 6 \right)^{2} + 10 \left[ \left( 1 - \frac{1}{8\pi} \right) \cos(b_{1}) + 1 \right], $$
    (29)

    where $b_{1} = 15x_{1} - 5$, $b_{2} = 15x_{2}$, and $\boldsymbol{x} = (x_{1}, x_{2}) \in [0, 1]^{2}$.

  2. Sasena function (two variables).

    $$ f(\boldsymbol{x}) = 2 + 0.01(x_{2} - x_{1}^{2})^{2} + (1 - x_{1})^{2} + 2(2 - x_{2})^{2} + 7\sin(0.5x_{1})\sin(0.7x_{1}x_{2}), \quad x_{1} \in [0,5],\ x_{2} \in [0,5]. $$
    (30)
  3. Hosaki function (two variables).

    $$ f(\boldsymbol{x}) = \left( 1 - 8x_{1} + 7x_{1}^{2} - \frac{7}{3}x_{1}^{3} + \frac{1}{4}x_{1}^{4} \right) x_{2}^{2} e^{-x_{1}}, \quad x_{1} \in [0,5],\ x_{2} \in [0,5]. $$
    (31)
  4. Hartman-6 function (six variables).

    $$ f(\boldsymbol{x}) = -\sum_{i=1}^{4} c_{i} \exp\left\{ -\sum_{j=1}^{n} A_{ij} (x_{j} - P_{ij})^{2} \right\}, \quad \boldsymbol{x} = (x_{1}, x_{2}, \ldots, x_{n})^{T},\ x_{i} \in [0,1], $$
    (32)

    where

    $$ \boldsymbol{c} = [1.0, 1.2, 3, 3.2]^{T}, $$
    (33)
    $$ \mathbf{A} = \left[\begin{array}{cccccc} 10 & 3 & 17 & 3.5 & 1.7 & 8 \\ 0.05 & 10 & 17 & 0.1 & 8 & 14 \\ 3 & 3.5 & 1.7 & 10 & 17 & 8 \\ 17 & 8 & 0.05 & 10 & 0.1 & 14 \end{array}\right], $$
    (34)
    $$ \mathbf{P} = 10^{-4} \left[\begin{array}{cccccc} 1312 & 1696 & 5569 & 124 & 8283 & 5886 \\ 2329 & 4135 & 8307 & 3736 & 1004 & 9991 \\ 2348 & 1451 & 3522 & 2883 & 3047 & 6650 \\ 4047 & 8828 & 8732 & 5743 & 1091 & 381 \end{array}\right]. $$
    (35)
  5. Borehole function (eight variables).

    $$ f(\boldsymbol{x}) = \frac{2\pi T_{u}(H_{u}-H_{l})}{\ln(r/r_{w})\left( 1+\frac{2LT_{u}}{\ln(r/r_{w}){r_{w}^{2}}K_{w}}+\frac{T_{u}}{T_{l}}\right)} $$
    (36)

    where the input variables are defined as shown in Table 3.

    Table 3 The input variables and their input ranges for the borehole test function
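As an illustration of how these definitions translate into code, the following Python sketch implements the Branin function of Eq. (29), including the rescaling to the unit square, and the Hartman-6 function of Eqs. (32) to (35). The function names `branin_unit` and `hartman6` are hypothetical, and the evaluation points used at the end are minimizers commonly reported in the test-function literature rather than values taken from this paper.

```python
import math


def branin_unit(x1, x2):
    """Branin function, Eq. (29), with inputs scaled to the unit square."""
    b1, b2 = 15.0 * x1 - 5.0, 15.0 * x2
    quad = b2 - 5.1 / (4.0 * math.pi**2) * b1**2 + 5.0 / math.pi * b1 - 6.0
    return quad**2 + 10.0 * ((1.0 - 1.0 / (8.0 * math.pi)) * math.cos(b1) + 1.0)


# Hartman-6 constants, Eqs. (33)-(35); P is stored already multiplied by 1e-4.
C = [1.0, 1.2, 3.0, 3.2]
A = [
    [10.0, 3.0, 17.0, 3.5, 1.7, 8.0],
    [0.05, 10.0, 17.0, 0.1, 8.0, 14.0],
    [3.0, 3.5, 1.7, 10.0, 17.0, 8.0],
    [17.0, 8.0, 0.05, 10.0, 0.1, 14.0],
]
P = [[1e-4 * v for v in row] for row in [
    [1312, 1696, 5569, 124, 8283, 5886],
    [2329, 4135, 8307, 3736, 1004, 9991],
    [2348, 1451, 3522, 2883, 3047, 6650],
    [4047, 8828, 8732, 5743, 1091, 381],
]]


def hartman6(x):
    """Hartman-6 function, Eq. (32), for x in [0, 1]^6."""
    total = 0.0
    for c_i, a_row, p_row in zip(C, A, P):
        inner = sum(a * (xj - p) ** 2 for a, xj, p in zip(a_row, x, p_row))
        total += c_i * math.exp(-inner)
    return -total


# Evaluate at commonly reported minimizers (assumption: not given in this paper).
branin_min = branin_unit((math.pi + 5.0) / 15.0, 2.275 / 15.0)
hartman_min = hartman6([0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573])
print(branin_min, hartman_min)
```

Evaluating each function at a known minimizer is a quick sanity check that the constants and the input scaling were transcribed correctly.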

Appendix C: Boxplot

For the boxplots, the bottom and top of each box represent the lower quartile Q1 (i.e., 25%) and upper quartile Q3 (i.e., 75%), respectively. The line inside the box represents the median (i.e., 50%). The whiskers below and above the box extend to Q1 − 1.5 IQR and Q3 + 1.5 IQR, respectively, where IQR = Q3 − Q1 is the interquartile range. Observations that lie beyond the whiskers are identified as outliers. Finally, the circle denotes the mean of the observations.
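The quantities described above can be computed with a short Python sketch. The helper name `boxplot_summary` and the sample data are hypothetical, and the quartiles use the "inclusive" interpolation of the standard `statistics` module; plotting tools differ in their quartile conventions, so the values may not match a particular package exactly.

```python
import statistics


def boxplot_summary(values):
    """Return the quartiles, whisker limits, outliers, and mean of a sample."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # whisker limits
    outliers = [v for v in values if v < lower or v > upper]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_limit": lower,
        "upper_limit": upper,
        "outliers": outliers,
        "mean": statistics.mean(values),  # drawn as the circle in the plots
    }


summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```

The single large observation in this sample lies above the upper whisker limit and is therefore flagged as an outlier, while also pulling the mean well above the median.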

Cite this article

Palar, P.S., Shimoyama, K. On efficient global optimization via universal Kriging surrogate models. Struct Multidisc Optim 57, 2377–2397 (2018). https://doi.org/10.1007/s00158-017-1867-1

  • DOI: https://doi.org/10.1007/s00158-017-1867-1
