
Performance study of multi-fidelity gradient enhanced kriging

  • Research Paper
  • Published:
Structural and Multidisciplinary Optimization

Abstract

Multi-fidelity surrogate modelling offers an efficient way to approximate computationally expensive simulations. In particular, Kriging-based surrogate models are popular for approximating deterministic data. In this work, the performance of Kriging is investigated when multi-fidelity gradient data are introduced along with multi-fidelity function data to approximate computationally expensive black-box simulations. To achieve this, the recursive CoKriging formulation is extended by incorporating multi-fidelity gradient information. This approach, denoted Gradient-Enhanced recursive CoKriging (GECoK), is initially applied to two analytical problems. As expected, results on these benchmark problems show that additional gradient information of different fidelities can significantly improve the accuracy of the Kriging model. Moreover, GECoK provides a better approximation even when the gradient information is only partially available. A further comparison between CoKriging, Gradient-Enhanced Kriging (GEK) and GECoK highlights the respective advantages of employing single- and multi-fidelity gradient data. Finally, GECoK is applied to two real-life examples.




Notes

  1. www.eesof.com, Agilent Technologies EEsof EDA, Santa Rosa, CA.

  2. www.cst.com, CST Computer Simulation Technology AG, Darmstadt, Germany.


Acknowledgments

This research has been funded by the Interuniversity Attraction Poles Programme BESTCOM initiated by the Belgian Science Policy Office. Additionally, this research has been supported by the Fund for Scientific Research in Flanders (FWO-Vlaanderen). Ivo Couckuyt and Francesco Ferranti are post-doctoral research fellows of the Research Foundation Flanders (FWO-Vlaanderen). The authors would like to thank Frank Mosler from Computer Simulation Technology (CST) for providing the microwave inter-digital filter example.

Author information


Corresponding author

Correspondence to Selvakumar Ulaganathan.

Appendices

Appendix A: Analytical expressions for gradient, Hessian and likelihood gradients of correlation functions

A.1 Gaussian correlation function

Gradient of correlation function with respect to X (i.e., cross-correlation):

$$\frac{\partial\boldsymbol{\Psi}^{(i,j)}}{\partial x^{(j)}} = 2 \theta d \boldsymbol{\Psi}^{(i,j)}$$
(28)

Hessian of correlation function with respect to X (i.e., cross-correlation):

$$\frac{\partial^{2}\boldsymbol{\Psi}^{(i,j)}}{\partial x_{u}^{(i)} \partial x_{v}^{(j)}}= \left\{\begin{array}{ll} -4\theta_u \theta_v d_u d_v \boldsymbol{\Psi}^{(i,j)} & \text{if } u \neq v \\ \left[2 \theta -4 \theta^2 d^2\right]\boldsymbol{\Psi}^{(i,j)} & \text{if } u = v \end{array}\right.$$
(29)
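For readers who wish to verify (28) and (29) numerically, a minimal NumPy sketch is given below. It is not code from the paper: the function names, the vectorised layout and the finite-difference check are illustrative only, and here θ denotes the actual correlation parameters (not the log10-scale values used in the likelihood derivatives further below).

```python
import numpy as np

def gauss_corr(x_i, x_j, theta):
    """Gaussian correlation psi = exp(-sum_m theta_m d_m^2), with d = x_i - x_j."""
    d = x_i - x_j
    return np.exp(-np.sum(theta * d**2))

def gauss_corr_grad(x_i, x_j, theta):
    """Eq. (28): gradient of psi w.r.t. x^(j); component v equals 2 theta_v d_v psi."""
    d = x_i - x_j
    return 2.0 * theta * d * gauss_corr(x_i, x_j, theta)

def gauss_corr_hess(x_i, x_j, theta):
    """Eq. (29): Hessian of psi w.r.t. x_u^(i) (rows) and x_v^(j) (columns)."""
    d = x_i - x_j
    psi = gauss_corr(x_i, x_j, theta)
    hess = -4.0 * np.outer(theta * d, theta * d) * psi                         # u != v entries
    np.fill_diagonal(hess, (2.0 * theta - 4.0 * theta**2 * d**2) * psi)        # u == v entries
    return hess

# quick finite-difference check of Eq. (28) on a random 3D point pair
rng = np.random.default_rng(0)
x_i, x_j, theta = rng.normal(size=3), rng.normal(size=3), np.array([0.5, 1.0, 2.0])
eps = 1e-6
fd = np.array([(gauss_corr(x_i, x_j + eps * e, theta) - gauss_corr(x_i, x_j - eps * e, theta))
               / (2 * eps) for e in np.eye(3)])
assert np.allclose(fd, gauss_corr_grad(x_i, x_j, theta), atol=1e-6)
```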

Derivative of the correlation function with respect to $\theta_k$:

$$ \frac{\partial}{\partial \theta_{k}} \left(\psi (d_{u(v)})\right) = -10^{\theta_{u(v)}}\, d_{u(v)}^2 \log(10)\, \exp\left(-\sum\limits_{m=1}^{k} \theta_{m} d_{m}^2 \right) $$
(30)

Derivatives of the cross-correlation functions with respect to $\theta_k$:

$$\frac{\partial}{\partial \theta_{k}} \left(\frac{\partial\boldsymbol{\Psi}^{(i,j)}}{\partial x_{v}^{(j)}}\right) = \left\{\begin{array}{ll} 2 d_v 10^{\theta_{v}} \log(10)\, \boldsymbol{\Psi}^{(i,j)} \left[1-10^{\theta_{k}}d_{k}^2 \right] & \text{if } v = k \\ 2 d_v 10^{\theta_{v}} \log(10)\, \boldsymbol{\Psi}^{(i,j)} \left[-10^{\theta_{k}}d_{k}^2 \right] & \text{if } v \neq k \end{array}\right.$$
(31)
$$\frac{\partial}{\partial \theta_{k}} \left(\frac{\partial^{2}\boldsymbol{\Psi}^{(i,j)}}{\partial x_{u}^{(i)}\partial x_{v}^{(j)}} \right) = \left\{\begin{array}{ll} -4 d_u d_v 10^{\theta_{u}} 10^{\theta_{v}} \log(10)\, \boldsymbol{\Psi}^{(i,j)} \left[1-10^{\theta_k}d_k^2 \right] & \text{if } u|v = k \\ 4 d_u d_v d_k^2\, 10^{\theta_{u}} 10^{\theta_{v}} 10^{\theta_k} \log(10)\, \boldsymbol{\Psi}^{(i,j)} & \text{otherwise} \end{array}\right.$$
(32)
$$\frac{\partial}{\partial \theta_{k}} \left(\frac{\partial^{2}\boldsymbol{\Psi}^{(i,j)}}{\partial x_{u=v}^{(i)}\partial x_{u=v}^{(j)}}\right) = \left\{\begin{array}{ll} \log(10)\, \boldsymbol{\Psi}^{(i,j)} \left[2(10^{\theta}) + 4 (10^{3\theta}) d^4 - 10 (10^{2\theta}) d^2 \right] & \text{if } (u = v) = k \\ -\log(10)\, \boldsymbol{\Psi}^{(i,j)}\, 10^{\theta_k}d_k^2 \left[2(10^{\theta}) - 4 (10^{2\theta}) d^2 \right] & \text{if } (u = v) \neq k \end{array}\right.$$
(33)
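The hyperparameter derivatives (30)-(33) can be checked in the same way. A small sketch for (30) and (31) follows, assuming, as the $10^{\theta}$ factors above suggest, that each $\theta_k$ is tuned on a log10 scale so that the actual correlation parameter is $10^{\theta_k}$; (32) and (33) follow the identical pattern and are omitted. The function names are again illustrative.

```python
import numpy as np

LOG10 = np.log(10.0)

def dpsi_dtheta_k(x_i, x_j, t, k):
    """Eq. (30): derivative of the Gaussian correlation with respect to the k-th
    log10-scale hyperparameter t_k, where the actual parameter is theta_m = 10**t_m."""
    d = x_i - x_j
    theta = 10.0**t
    psi = np.exp(-np.sum(theta * d**2))
    return -theta[k] * d[k]**2 * LOG10 * psi

def dcross_dtheta_k(x_i, x_j, t, k):
    """Eq. (31): derivative of the cross-correlation dpsi/dx_v^(j) with respect to t_k,
    returned for all v at once; the bracket is [1 - theta_k d_k^2] when v == k and
    [-theta_k d_k^2] otherwise."""
    d = x_i - x_j
    theta = 10.0**t
    psi = np.exp(-np.sum(theta * d**2))
    bracket = np.full_like(d, -theta[k] * d[k]**2)
    bracket[k] += 1.0
    return 2.0 * d * theta * LOG10 * psi * bracket
```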

A.2 Matérn $\frac{5}{2}$ correlation function

Gradient of correlation function with respect to X (i.e., cross-correlation):

$$ \frac{\partial\boldsymbol{\Psi}^{(i,j)}}{\partial x^{(j)}} = \frac{5\theta d (\sqrt{5}a+1) \exp(-\sqrt{5}a)}{3} $$
(34)

Hessian of correlation function with respect to X (i.e., cross-correlation):

$$ \frac{\partial^{2}\boldsymbol{\Psi}^{(i,j)}}{\partial x_{u}^{(i)}\partial x_{v}^{(j)}}= \left\{\begin{array}{ll} \frac{-25 \theta_{u} \theta_{v} d_u d_v \exp(-\sqrt{5}a )}{3} & \text{if } u \neq v \\ \left[\frac{-25 \theta^2 d^2 + 5\theta(\sqrt{5}a+1)}{3} \right] \exp(-\sqrt{5}a) & \text{if } u = v \end{array}\right. $$
(35)

Derivative of the correlation function with respect to $\theta_k$:

$$ \frac{\partial}{\partial \theta_{k}} \left(\psi_{\nu = 5/2} (d_{u(v)})\right) = \frac{-(5+5\sqrt{5} a)\, 10^{\theta} \log(10)\, d_{u(v)}^2 \exp(-\sqrt{5}a)}{6} $$
(36)

Derivatives of the cross-correlation functions with respect to $\theta_k$:

$$\frac{\partial}{\partial \theta_{k}} \left(\frac{\partial\boldsymbol{\Psi}^{(i,j)}}{\partial x_{v}^{(j)}}\right) = \left\{\begin{array}{ll} 10^{\theta_v} d_v C_2 \left[C_1 - \frac{25}{6}\, 10^{\theta_{k}}d_{k}^2\right] & \text{if } v = k \\ -\frac{25 C_2}{6}\, 10^{\theta_v} 10^{\theta_k} d_v d_k^2 & \text{if } v \neq k \end{array}\right.$$
(37)
$$\frac{\partial}{\partial \theta_{k}} \left(\frac{\partial^{2}\boldsymbol{\Psi}^{(i,j)}}{\partial x_{u}^{(i)}\partial x_{v}^{(j)}} \right) = \left\{\begin{array}{ll} \frac{-25C_2 \left(1 - \frac{\sqrt{5}\, 10^{\theta_{k}}d_{k}^2 }{2a} \right) 10^{\theta_u} 10^{\theta_v} d_u d_v }{3} & \text{if } u|v = k \\ \frac{25 \sqrt{5}\, C_2\, 10^{\theta_u} 10^{\theta_v} 10^{\theta_k} d_u d_v d_{k}^2}{6a} & \text{otherwise} \end{array}\right.$$
(38)
$$ \frac{\partial}{\partial \theta_{k }} \left(\frac{\partial^{2}\boldsymbol{\Psi}^{(i,j)}}{\partial x_{u=v}^{(i)}\partial x_{u=v}^{(j)}}\right) = \left\{\begin{array}{ll} V_3 + V_4 & \text{if \((u = v) = k \)}\\ V_3 & \text{if \((u=v) \neq k \)} \end{array}\right. $$
(39)

where

$$ V_3 = \left[ \left(\frac{25\sqrt{5}}{6a}\right)(10^{\theta_{u=v}})^2(d_{u=v})^2 - \left(\frac{25}{6}\right)10^{\theta_{u=v}} \right] C_2\, 10^{\theta_{k}}d_{k}^2 $$
(40)
$$ V_4 = -\frac{50\, C_2 (10^{\theta})^2 d^2}{3} + C_1 C_2\, 10^{\theta}, \qquad C_1 = \frac{5\sqrt{5}}{3} a + \frac{5}{3}, \qquad C_2 = \log(10)\exp\left(-\sqrt{5}a\right) $$
(41)
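The Matérn 5/2 expressions can be cross-checked with the same kind of sketch. The correlation itself is not restated in this appendix; the code below assumes the standard form $\psi = (1 + \sqrt{5}a + \frac{5}{3}a^2)\exp(-\sqrt{5}a)$ with $a^2 = \sum_m \theta_m d_m^2$, which is consistent with (34) and (35). Function names and layout are illustrative, not taken from the paper.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)

def matern52_corr(x_i, x_j, theta):
    """Assumed Matern 5/2 correlation: psi = (1 + sqrt(5) a + 5 a^2 / 3) exp(-sqrt(5) a),
    with a^2 = sum_m theta_m d_m^2 and d = x_i - x_j."""
    d = x_i - x_j
    a = np.sqrt(np.sum(theta * d**2))
    return (1.0 + SQRT5 * a + 5.0 * a**2 / 3.0) * np.exp(-SQRT5 * a)

def matern52_grad(x_i, x_j, theta):
    """Eq. (34): gradient w.r.t. x^(j); component v equals 5 theta_v d_v (sqrt(5) a + 1) exp(-sqrt(5) a) / 3."""
    d = x_i - x_j
    a = np.sqrt(np.sum(theta * d**2))
    return 5.0 * theta * d * (SQRT5 * a + 1.0) * np.exp(-SQRT5 * a) / 3.0

def matern52_hess(x_i, x_j, theta):
    """Eq. (35): Hessian w.r.t. x_u^(i) (rows) and x_v^(j) (columns)."""
    d = x_i - x_j
    a = np.sqrt(np.sum(theta * d**2))
    e = np.exp(-SQRT5 * a)
    hess = -25.0 * np.outer(theta * d, theta * d) * e / 3.0                                   # u != v
    np.fill_diagonal(hess, (-25.0 * theta**2 * d**2 + 5.0 * theta * (SQRT5 * a + 1.0)) / 3.0 * e)  # u == v
    return hess
```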

Appendix B: Comparison of MLE and Least Squares Estimation (LSE) of scaling parameter (ρ)

Table 7 Comparison of NRMSE on the validation data set for different ways of estimating ρ: 1D function with $n_e=4$, $n_c=11$ and $n_p=500$; Peaks, inductance (L) and quality factor (Q) functions with $n_e=9$, $n_c=30$ and $n_p=500$; and $|S_{11}|_{mean}$ with $n_e=30$, $n_c=60$ and $n_p=50$. LSE(C) and LSE(L) correspond to Least Squares Estimation with a constant and a linear distribution of ρ, respectively
Table 8 Comparison of $R^2$ on the validation data set for different ways of estimating ρ (same sampling settings and LSE variants as Table 7)
Table 9 Comparison of RAAE on the validation data set for different ways of estimating ρ (same sampling settings and LSE variants as Table 7)
Table 10 Comparison of RMAE on the validation data set for different ways of estimating ρ (same sampling settings and LSE variants as Table 7)
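The tables compare maximum likelihood estimation of ρ against two least-squares variants. As a rough illustration of what LSE(C) and LSE(L) mean, the sketch below fits ρ by ordinary least squares under the common autoregressive multi-fidelity assumption $y_e(x) \approx \rho(x)\, y_c(x) + \delta(x)$, with ρ either constant or linear in the inputs. This is only one plausible reading of the setup; the paper's exact estimation procedure is not reproduced here, and all names are illustrative.

```python
import numpy as np

def rho_lse(X_e, y_e, y_c_at_Xe, linear=False):
    """Least-squares estimate of the scaling parameter rho, assuming the autoregressive
    relation y_e(x) ~ rho(x) * y_c(x) + delta(x) evaluated at the expensive sample sites.

    linear=False -> constant rho, as in LSE(C)
    linear=True  -> rho(x) = b_0 + sum_m b_m x_m, as in LSE(L)
    """
    if linear:
        # regress y_e on y_c and on y_c multiplied by each input dimension
        cols = [y_c_at_Xe] + [y_c_at_Xe * X_e[:, m] for m in range(X_e.shape[1])]
        A = np.column_stack(cols)
    else:
        A = y_c_at_Xe[:, None]
    coeffs, *_ = np.linalg.lstsq(A, y_e, rcond=None)
    return coeffs  # [rho] for LSE(C), or [b_0, b_1, ..., b_d] for LSE(L)
```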


About this article


Cite this article

Ulaganathan, S., Couckuyt, I., Ferranti, F. et al. Performance study of multi-fidelity gradient enhanced kriging. Struct Multidisc Optim 51, 1017–1033 (2015). https://doi.org/10.1007/s00158-014-1192-x

