On efficient global optimization via universal Kriging surrogate models

Palar, Pramudita Satria; Shimoyama, Koji

doi:10.1007/s00158-017-1867-1

RESEARCH PAPER
Published: 05 December 2017

Structural and Multidisciplinary Optimization, Volume 57, pages 2377–2397 (2018)

Abstract

In this paper, we investigate the capability of the universal Kriging (UK) model for single-objective global optimization within an efficient global optimization (EGO) framework. We implemented this combined UK-EGO framework and studied four UK variants: a UK with a first-order polynomial trend, a UK with a second-order polynomial trend, a blind Kriging (BK) implementation from the ooDACE toolbox, and a polynomial-chaos Kriging (PCK) implementation. The UK-EGO framework with automatic trend function selection, derived from the BK and PCK models, works by building UK surrogate models and then optimizing the expected improvement criterion on the Kriging model with the lowest leave-one-out cross-validation error. We compared the UK-EGO variants and standard EGO on five synthetic test functions and one aerodynamic problem. Our results show that a proper choice of trend function through automatic feature selection can improve the optimization performance of UK-EGO relative to EGO. PCK-EGO was the best variant, as it performed more robustly than the other UK-EGO schemes; however, a total-order expansion should be used to generate the candidate trend function set for high-dimensional problems. For some test functions, the UK with predetermined polynomial trend functions performed better than BK and PCK, indicating that automatic trend function selection does not always lead to the best-quality solutions. We also found that although some UK variants are not as globally accurate as ordinary Kriging (OK), they can still identify better optimized solutions because the trend function helps the optimizer locate the global optimum.
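To make the expected improvement (EI) criterion concrete, the following minimal Python sketch evaluates the standard closed-form EI for minimization from a surrogate's predictive mean and standard deviation. The function name `expected_improvement`, the candidate labels, and the numerical values are illustrative assumptions and are not taken from the study's implementation.

```python
import math


def expected_improvement(mu: float, sigma: float, y_best: float) -> float:
    """Closed-form EI for minimization at one candidate point.

    mu, sigma: surrogate predictive mean and standard deviation (assumed given).
    y_best: best (lowest) objective value observed so far.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(y_best - mu, 0.0)
    z = (y_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (y_best - mu) * cdf + sigma * pdf


# Hypothetical candidates: label -> (predicted mean, predicted std).
candidates = {"a": (0.2, 0.05), "b": (0.5, 0.6)}
y_best = 0.3
ei = {name: expected_improvement(m, s, y_best) for name, (m, s) in candidates.items()}
next_point = max(ei, key=ei.get)  # infill point with the largest EI
print(next_point, ei)
```

In this toy setting the candidate with the worse predicted mean but larger uncertainty is selected, which shows how EI balances exploitation against exploration.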

Acknowledgements

Koji Shimoyama was supported in part by the Grant-in-Aid for Scientific Research (B) No. H1503600 administered by the Japan Society for the Promotion of Science (JSPS).

Author information

Authors and Affiliations

Institute of Fluid Science, Tohoku University, Sendai, Miyagi Prefecture, 980-8577, Japan
Pramudita Satria Palar & Koji Shimoyama

Corresponding author

Correspondence to Pramudita Satria Palar.

Appendices

Appendix A: Trend function selection and hyperparameter optimization strategy for PCK

We propose a sequential hyperparameter optimization strategy in which the BFGS search at each iteration of the UK construction process starts from the optimum found in the previous iteration. This strategy can be applied to both BK and PCK, since both methods work by scanning candidate terms from a provided polynomial set. Our primary motivation is that the likelihood function may change only slightly from one iteration to the next, so the optimum hyperparameters of successive iterations are likely to lie close to one another. A GA is used only in the first iteration, to find the optimum hyperparameters of the OK before any trend functions are added. Here, we use a GA with a population size of 100 and a maximum of 200 generations, followed by a BFGS search. This exhaustive search is performed only in the first iteration because the accuracy of the subsequent hyperparameter optimizations relies on the accuracy of the OK hyperparameters. After the final trend function is identified, the GA+BFGS approach is applied once more with this final trend function to search for possibly higher values of the likelihood. We call this the simplified GA+BFGS strategy, as opposed to the exhaustive GA+BFGS strategy.
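The control flow of this simplified strategy can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in the study: `simplified_strategy` and its arguments are hypothetical names, a coarse grid search stands in for the GA, a shrinking-step pattern search stands in for BFGS, and shifted quadratics stand in for the negative log-likelihoods of successive UK iterations.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Objective = Callable[[Vector], float]


def simplified_strategy(
    objectives: Sequence[Objective],
    global_search: Callable[[Objective], Vector],
    local_search: Callable[[Objective, Vector], Vector],
    final_index: int = -1,
) -> List[Vector]:
    """Return the hyperparameters found at each step (last entry = final result).

    objectives[0] plays the role of the OK negative log-likelihood and
    objectives[k] the one obtained after adding k trend terms (assumed setup).
    """
    # First iteration only: global search followed by local refinement.
    theta = local_search(objectives[0], global_search(objectives[0]))
    history = [theta]
    # Later iterations: local search warm-started from the previous optimum.
    for objective in objectives[1:]:
        theta = local_search(objective, theta)
        history.append(theta)
    # Final trend: repeat global + local search and keep the better solution.
    final = objectives[final_index]
    warm = history[final_index]
    fresh = local_search(final, global_search(final))
    history.append(fresh if final(fresh) < final(warm) else warm)
    return history


def grid_global(obj: Objective, lo: float = -5.0, hi: float = 5.0, n: int = 21) -> Vector:
    """Stand-in for the GA: best point on a coarse one-dimensional grid."""
    points = [[lo + (hi - lo) * i / (n - 1)] for i in range(n)]
    return min(points, key=obj)


def pattern_local(obj: Objective, start: Vector, step: float = 0.5, tol: float = 1e-8) -> Vector:
    """Stand-in for BFGS: shrinking-step pattern search from a start point."""
    x = list(start)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                if obj(trial) < obj(x):
                    x, improved = trial, True
        if not improved:
            step *= 0.5
    return x


# Toy objectives whose minimizer drifts slightly between iterations.
centers = [1.0, 1.1, 1.15]
objectives = [lambda t, c=c: (t[0] - c) ** 2 for c in centers]
history = simplified_strategy(objectives, grid_global, pattern_local)
print([round(theta[0], 4) for theta in history])
```

Because each local search starts near the previous optimum, only the first and last steps pay for a global search, which is the source of the time savings the strategy aims for.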

To verify the performance of our simplified GA+BFGS strategy, we compared it with the exhaustive GA+BFGS strategy using the five test functions listed in Appendix B. In this study, we set the sample size to 20, 60, and 40 for the two-dimensional problems, the Hartman-6 problem, and the borehole problem, respectively. We generally used N_s = 10 × m to generate the sample set, where N_s is the sample size and m is the number of input variables; the only exception is the borehole problem, for which we set the sample size to 40 to make the optimization problem more difficult. We also compared our simplified GA+BFGS strategy with a simple BFGS strategy that performs a single BFGS search from a random initial solution at each UK iteration. Specifically, we compared the lowest LOOCV errors resulting from these three strategies.
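The comparison criterion used here, the lowest LOOCV error, can be illustrated with a brute-force leave-one-out loop. The sketch below rests on several assumptions: it uses the root-mean-square of the leave-one-out residuals as the error measure, which may differ from the exact measure used in the study, and simple constant and linear least-squares fits stand in for Kriging models with different trend functions. All function names are hypothetical.

```python
import math
from typing import Callable, Sequence

Predictor = Callable[[float], float]
Fitter = Callable[[Sequence[float], Sequence[float]], Predictor]


def loocv_rmse(xs: Sequence[float], ys: Sequence[float], fit: Fitter) -> float:
    """Refit the model n times, each time predicting the single held-out point."""
    squared = 0.0
    for i in range(len(xs)):
        train_x = [x for j, x in enumerate(xs) if j != i]
        train_y = [y for j, y in enumerate(ys) if j != i]
        model = fit(train_x, train_y)
        squared += (model(xs[i]) - ys[i]) ** 2
    return math.sqrt(squared / len(xs))


def fit_constant(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Constant model (stand-in for a model without a trend function)."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Least-squares line (stand-in for a model with a first-order trend)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


# Toy data with a purely linear response.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2.0 * x + 1.0 for x in xs]
errors = {
    "constant": loocv_rmse(xs, ys, fit_constant),
    "linear": loocv_rmse(xs, ys, fit_linear),
}
best = min(errors, key=errors.get)  # candidate with the lowest LOOCV error
print(best, errors)
```

For Kriging models, the refitting loop is usually avoided in practice through closed-form leave-one-out expressions; the explicit loop is used here only because it makes the definition easy to follow.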

For all five test functions, we observe from Fig. 11 that the errors of the UK obtained with the exhaustive GA+BFGS and simplified GA+BFGS strategies are similar to one another, whereas the simple BFGS strategy did not perform as well as the other two. The exception is the Hartman-6 problem, for which all strategies performed approximately the same; this is a highly nonlinear and difficult problem in which the UK did not outperform the standard OK in terms of approximation quality, which explains why UK hyperparameter tuning has little effect on the LOOCV error. The lower performance of the simple BFGS strategy indicates that finding the optimum of the UK likelihood function is sensitive to the choice of the initial point.

The time required to train the hyperparameters with the simplified and exhaustive strategies on a two-dimensional function with p = 4 was approximately 3 and 40 seconds, respectively, on a computer with an Intel® Xeon® E5-1630 v4 8-core CPU @ 3.70 GHz running MATLAB. This indicates that our simplified strategy can perform similarly to the exhaustive strategy in only about 7.5% (3 s versus 40 s) of the time required by the exhaustive approach.

Appendix B: Test functions

1. Branin function (two variables).
$$ \begin{aligned} f_{1}(\boldsymbol{x}) &= \left( b_{2}-\frac{5.1}{4\pi^{2}}b_{1}^{2}+\frac{5}{\pi}b_{1}-6 \right)^{2} \\ &\quad + 10 \left[\left( 1-\frac{1}{8\pi} \right) \cos(b_{1})+ 1\right], \end{aligned} \tag{29} $$
where b₁ = 15x₁ − 5, b₂ = 15x₂, and (x₁, x₂) ∈ [0, 1]².
2. Sasena function (two variables).
$$ \begin{aligned} f(\boldsymbol{x}) &= 2 + 0.01(x_{2}-x_{1}^{2})^{2}+(1-x_{1})^{2} + 2(2-x_{2})^{2} \\ &\quad + 7\sin(0.5x_{1}) \sin(0.7x_{1}x_{2}), \end{aligned} \tag{30} $$
where x₁ ∈ [0, 5] and x₂ ∈ [0, 5].
3. Hosaki function (two variables).
$$ f(\boldsymbol{x}) = \left( 1-8x_{1}+ 7x_{1}^{2} - \frac{7}{3}x_{1}^{3} + \frac{1}{4}x_{1}^{4} \right) x_{2}^{2}e^{-x_{1}}, \tag{31} $$
where x₁ ∈ [0, 5] and x₂ ∈ [0, 5].
4. Hartman-6 function (six variables).
$$ \begin{aligned} f(\boldsymbol{x}) &= -\sum_{i = 1}^{4}c_{i}\exp\left\{-\sum_{j = 1}^{n}\mathrm{A}_{ij}(x_{j}-\mathrm{P}_{ij})^{2}\right\}, \\ \boldsymbol{x} &= (x_{1},x_{2},\ldots,x_{n})^{T}, \quad x_{i}\in[0,1], \end{aligned} \tag{32} $$
where n = 6 and
$$ \boldsymbol{c} = [1.0, 1.2, 3, 3.2]^{T}, \tag{33} $$
$$ \mathbf{A} = \left[\begin{array}{cccccc} 10& 3& 17& 3.5& 1.7& 8\\ 0.05& 10& 17& 0.1& 8& 14\\ 3& 3.5& 1.7& 10& 17& 8\\ 17& 8& 0.05& 10& 0.1& 14 \end{array}\right], \tag{34} $$
$$ \mathbf{P} = 10^{-4} \left[\begin{array}{cccccc} 1312& 1696& 5569& 124& 8283& 5886\\ 2329& 4135& 8307& 3736& 1004& 9991\\ 2348& 1451& 3522& 2883& 3047& 6650\\ 4047& 8828& 8732& 5743& 1091& 381 \end{array}\right]. \tag{35} $$
5. Borehole function (eight variables).
$$ f(\boldsymbol{x}) = \frac{2\pi T_{u}(H_{u}-H_{l})}{\ln(r/r_{w})\left( 1+\frac{2LT_{u}}{\ln(r/r_{w})r_{w}^{2}K_{w}}+\frac{T_{u}}{T_{l}}\right)}, \tag{36} $$
where the input variables are defined as shown in Table 3.
Table 3: The input variables and their input ranges for the borehole test function
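For reference, the five test functions can be transcribed into Python as below. This is a hedged transcription of Eqs. (29) to (36) as printed in this appendix, not the code used in the study; the function names and the keyword-style arguments of `borehole` are assumptions, the numerical inputs in the usage lines are illustrative only, and the input ranges are not enforced.

```python
import math


def branin(x1: float, x2: float) -> float:
    """Rescaled Branin function, Eq. (29), with x1, x2 in [0, 1]."""
    b1 = 15.0 * x1 - 5.0
    b2 = 15.0 * x2
    return ((b2 - 5.1 / (4.0 * math.pi ** 2) * b1 ** 2 + 5.0 / math.pi * b1 - 6.0) ** 2
            + 10.0 * ((1.0 - 1.0 / (8.0 * math.pi)) * math.cos(b1) + 1.0))


def sasena(x1: float, x2: float) -> float:
    """Sasena function, Eq. (30), with x1, x2 in [0, 5]."""
    return (2.0 + 0.01 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2 + 2.0 * (2.0 - x2) ** 2
            + 7.0 * math.sin(0.5 * x1) * math.sin(0.7 * x1 * x2))


def hosaki(x1: float, x2: float) -> float:
    """Hosaki function, Eq. (31), with x1, x2 in [0, 5]."""
    poly = 1.0 - 8.0 * x1 + 7.0 * x1 ** 2 - (7.0 / 3.0) * x1 ** 3 + 0.25 * x1 ** 4
    # Exponent transcribed as printed in Eq. (31); the commonly cited
    # Hosaki form uses exp(-x2) instead, so check before reuse.
    return poly * x2 ** 2 * math.exp(-x1)


C = [1.0, 1.2, 3.0, 3.2]
A = [[10, 3, 17, 3.5, 1.7, 8],
     [0.05, 10, 17, 0.1, 8, 14],
     [3, 3.5, 1.7, 10, 17, 8],
     [17, 8, 0.05, 10, 0.1, 14]]
P = [[1e-4 * v for v in row] for row in
     [[1312, 1696, 5569, 124, 8283, 5886],
      [2329, 4135, 8307, 3736, 1004, 9991],
      [2348, 1451, 3522, 2883, 3047, 6650],
      [4047, 8828, 8732, 5743, 1091, 381]]]


def hartman6(x) -> float:
    """Hartman-6 function, Eqs. (32)-(35), with each x_i in [0, 1]."""
    total = 0.0
    for c_i, a_row, p_row in zip(C, A, P):
        inner = sum(a * (xj - p) ** 2 for a, xj, p in zip(a_row, x, p_row))
        total += c_i * math.exp(-inner)
    return -total


def borehole(*, rw, r, Tu, Hu, Tl, Hl, L, Kw) -> float:
    """Borehole function, Eq. (36); keyword-only arguments avoid ordering errors."""
    log_ratio = math.log(r / rw)
    denom = log_ratio * (1.0 + 2.0 * L * Tu / (log_ratio * rw ** 2 * Kw) + Tu / Tl)
    return 2.0 * math.pi * Tu * (Hu - Hl) / denom


# Illustrative evaluations (input values chosen only for demonstration).
print(branin((math.pi + 5.0) / 15.0, 2.275 / 15.0))
print(hartman6([0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573]))
print(borehole(rw=0.1, r=100.0, Tu=80000.0, Hu=1050.0, Tl=90.0, Hl=750.0, L=1400.0, Kw=10000.0))
```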

Appendix C: Boxplot

For the boxplots, the bottom and top of each box represent the lower quartile Q1 (i.e., 25%) and the upper quartile Q3 (i.e., 75%), respectively. The line inside the box represents the median (i.e., 50%). The whiskers extend below and above the box to Q1 − 1.5 IQR and Q3 + 1.5 IQR, where IQR is the interquartile range (i.e., Q3 − Q1). Observations that lie beyond the whiskers are identified as outliers. Finally, the circle denotes the mean of the observations.
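The quantities described above can be computed with a short Python sketch. The function name `boxplot_summary` and the sample data are hypothetical, and the quartiles use the "inclusive" interpolation rule of Python's `statistics.quantiles`, which is an assumption and may differ slightly from the quartile convention of the plotting software used for the figures.

```python
import statistics
from typing import Dict, List, Sequence, Union


def boxplot_summary(data: Sequence[float]) -> Dict[str, Union[float, List[float]]]:
    """Quartiles, whisker limits, outliers, and mean for one boxplot."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr  # lower whisker limit
    upper = q3 + 1.5 * iqr  # upper whisker limit
    outliers = [v for v in data if v < lower or v > upper]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower": lower,
        "upper": upper,
        "outliers": outliers,
        "mean": statistics.fmean(data),  # drawn as the circle
    }


# Hypothetical observations with one extreme value.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(summary)
```

With a single extreme observation, the mean is pulled well away from the median while the box itself is unaffected, which is why both statistics are shown.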

About this article

Cite this article

Palar, P.S., Shimoyama, K. On efficient global optimization via universal Kriging surrogate models. Struct Multidisc Optim 57, 2377–2397 (2018). https://doi.org/10.1007/s00158-017-1867-1

Received: 18 May 2017
Revised: 11 November 2017
Accepted: 15 November 2017
Published: 05 December 2017
Issue Date: June 2018
DOI: https://doi.org/10.1007/s00158-017-1867-1
