Analysis of dataset selection for multi-fidelity surrogates for a turbine problem

  • RESEARCH PAPER
Structural and Multidisciplinary Optimization

Abstract

Multi-fidelity surrogates (MFS) have become a popular way to combine a small number of expensive high-fidelity (HF) samples with many cheap low-fidelity (LF) samples. In some situations, LF samples can come from multiple sources, and sometimes the HF samples alone yield a more accurate surrogate than the combination (HF&LF). This paper therefore considers using maximum likelihood (ML) and cross validation (CV) to select the dataset leading to the best surrogate accuracy when multiple sample sources are available. Kriging and co-kriging were employed to build the surrogates. Unlike conventional model selection, multi-fidelity dataset selection by ML and CV has to compare the surrogate accuracy of different true functions. The effectiveness of ML and CV is examined through a two-variable turbine problem, where samples can come from one HF and two LF models. The indicators were used to select between using only HF samples and combining them with one set of LF samples or the other. The best selection proved to depend on the design of experiments (DOE), so datasets were generated for a large number of DOEs. CV and ML worked relatively well in selecting between the two LF sample sources for MFS. When selecting between only HF and HF&LF, ML, which is frequently used in co-kriging hyper-parameter estimation, failed to detect when the surrogate accuracy of only HF was better than that of HF&LF, and CV was successful only part of the time. The reasons behind the poor performance are analyzed with the help of a 1D example.

Abbreviations

  • R: Correlation matrix
  • x: Design variable
  • y: Function values
  • ρ: Scaling factor between high- and low-fidelity models
  • σ²: Variance
  • d: Discrepancy
  • H: High fidelity
  • L: Low fidelity
  • CV: Cross Validation
  • DF: Discrepancy Function
  • HF: High Fidelity
  • LF: Low Fidelity
  • LHS: Latin Hypercube Sampling
  • Loo-CV: Leave-one-out Cross Validation
  • MFS: Multi-Fidelity Surrogate
  • ML: Maximum Likelihood
  • RMSE: Root Mean Squared Error
  • TT: Transient Rotor Blade model with Time Transformation
  • Transient: Full Transient model

References

  • ANSYS (2010) ANSYS CFX-Solver theory guide, release 13.0. ANSYS Inc., Canonsburg, PA

  • Arlot S, Alain C (2010) A survey of cross validation procedures for model selection. Stat Surv 4:40–79

  • Cherry DG, Gay CH, Lenahan DT (1982) Energy efficient engine. Low pressure turbine test hardware detailed design report. NASA CR167956

  • Dixon SL, Cesare H (2013) Fluid mechanics and thermodynamics of turbomachinery. Elsevier Inc, Butterworth-Heinemann

  • Fernández-Godino MG, Park C, Kim NH, Haftka RT (2016) Review of multi-fidelity models. arXiv preprint arXiv:1609.07196. http://arxiv.org/abs/1609.07196

  • Forrester AIJ, Keane AJ (2009) Recent advances in surrogate-based optimization. Prog Aerosp Sci 45(1–3):50–79

  • Forrester AIJ, Alexander IJ, Sóbester A, Keane AJ (2007) Multi-fidelity optimization via surrogate modeling. Proc R Soc Lond A Math Phys Eng Sci 463(2088):3251–3269

  • Hodson HP, Howell RJ (2005) The role of transition in high-lift low-pressure turbines for aeroengines. Prog Aerosp Sci 41(6):419–454

  • Kennedy MC, O'Hagan A (2000) Predicting the output from a complex computer code when fast approximations are available. Biometrika 87(1):1–13

  • Liu HT, Ong YS, Cai J (2018a) A survey of adaptive sampling for global metamodeling in support of simulation-based complex engineering design. Struct Multidiscip Optim 57(1):393–416

  • Liu HT, Ong YS, Cai J, Wang Y (2018b) Cope with diverse data structures in multi-fidelity modeling: a Gaussian process method. Eng Appl Artif Intell 67:211–225

  • Lophaven SN, Nielsen HB, Sondergaard J (2002) DACE: a Matlab kriging toolbox, version 2.0. Technical Report IMM-TR-2002-12, Technical University of Denmark, Copenhagen. http://www2.imm.dtu.dk/projects/dace/dace.pdf

  • Luo JQ, Liu F, McBean I (2015) Turbine blade row optimization through endwall contouring by an adjoint method. J Propuls Power 31:505–518

  • Martin JD, Simpson TW (2005) Use of kriging models to approximate deterministic computer models. AIAA J 43(4):853–863. https://doi.org/10.2514/1.8650

  • Myers RH, Montgomery DC (2002) Response surface methodology: process and product optimization using designed experiments, 2nd edn. Wiley Series in Probability and Statistics, John Wiley & Sons, Inc., New York

  • Myung IJ, Mark AP (1997) Applying Occam’s razor in modeling cognition: a Bayesian approach. Psychon Bull Rev 4(1):79–95

  • Namura N, Shimoyama K, Obayashi S (2017) Kriging surrogate model with coordinate transformation based on likelihood and gradient. J Glob Optim 68(4):827–849

  • Neath AA, Joseph EC (2012) The Bayesian information criterion: background, derivation, and applications. Wiley Interdisc Rev: Comput Stat 4(2):199–203

  • Park C, Haftka RT, Kim NH (2017) Remarks on multi-fidelity surrogates. Struct Multidiscip Optim 55(3):1–22

  • Rasmussen CE, Williams CK (2006) Gaussian processes for machine learning. MIT Press, London. http://www.gaussianprocess.org/gpml/

  • Shan SQ, Wang GG (2010) Survey of modeling and optimization strategies to solve high-dimensional design problems with computationally-expensive black-box functions. Struct Multidiscip Optim 41(2):219–241

  • Suzen YB, Huang PG (2005) Numerical simulation of unsteady wake/blade interactions in low-pressure turbine flows using an intermittency transport equation. J Turbomach 127(3):431–444

  • Viana FAC, Haftka RT, Steffen V (2009) Multiple surrogates: how cross validation errors can help us to obtain the best predictor. Struct Multidiscip Optim 39(4):439–457

  • Zhang Y, Schutte J, Meeker J, Palliyaguru U, Kim NH, Haftka RT (2017) Predicting B-basis allowable at untested points from experiments and simulations of plates with holes. In: 12th world congress on structural and multidisciplinary optimization, Braunschweig, Germany. https://www.researchgate.net/publication/318909364

Acknowledgments

The authors would like to express appreciation for the financial support from the China Scholarship Council (201606280218) and the National Natural Science Foundation of China (51676149) for this work, undertaken as part of the first author’s Ph.D. project. The authors also gratefully acknowledge the use of the facilities of the Structural & Multidisciplinary Optimization Group of the University of Florida (http://www2.mae.ufl.edu/mdo/) in carrying out this work.

Author information

Corresponding author

Correspondence to Raphael T. Haftka.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

ESM 1

(RAR 5574 kb)

Appendices

Appendix 1: ML criterion for choosing between datasets

The ML of the DF in Forrester’s version (Forrester et al. 2007) covers the case in which the LF function is given. It can also be used to select among samples from alternative LF sources when fitting a surrogate to the limited HF samples. In that case, the LF source is treated as an additional hyper-parameter D L , i.e., we have to choose D L in addition to ρ and θ d when fitting the co-kriging by (4). The corresponding Bayesian posterior probability is formulated as:

$$ P\left({D}_L,\rho, {\boldsymbol{\uptheta}}_d|{\mathbf{y}}_H\right)= Likelihood\left({\mathbf{y}}_H|{\tilde{D}}_L,\tilde{\rho},{\tilde{\boldsymbol{\uptheta}}}_d\right)\cdot Prior\left({D}_L,\rho, {\boldsymbol{\uptheta}}_d\right)/m\left({\mathbf{y}}_H\right) $$
(A.1)

where m(y H ) is the marginal distribution of the dataset y H , Prior(D L , ρ, θ d ) is the prior probability of the hyper-parameters D L , ρ and θ d , and \( Likelihood\left({\mathbf{y}}_H|{\tilde{D}}_L,\tilde{\rho},{\tilde{\boldsymbol{\uptheta}}}_d\right) \) is the likelihood of the co-kriging when \( {\tilde{D}}_L,\tilde{\rho},{\tilde{\boldsymbol{\uptheta}}}_d \) are given, which is equal to the formulation of (4). The term m(y H ) is constant with respect to the hyper-parameters D L , ρ, θ d ; thus, for the purpose of selection with fixed HF samples, m(y H ) can be discarded. Moreover, we have no prior knowledge of the hyper-parameters, so it is defensible to simplify the justification by using the non-informative prior Prior(D L , ρ, θ d ) = 1 (Neath and Joseph 2012). With a constant prior and a constant m(y H ), the posterior is proportional to the likelihood, so maximizing one maximizes the other. Hence, the ML of the DF in (4) may still be useful for selecting between LF samples coming from alternative LF sources.
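The selection rule of this appendix can be illustrated with a short sketch. It is not the authors' implementation. It assumes that (4) has the usual concentrated-likelihood form of an ordinary-kriging model fitted to the discrepancy d = y H − ρ y L (x H ), with a Gaussian correlation function, a single isotropic θ, and a coarse grid search in place of a proper optimizer. The function names, the nugget, and the variance floor are hypothetical choices made for illustration only.

```python
import numpy as np


def concentrated_log_likelihood(x, d, theta, nugget=1e-10):
    """Concentrated ln-likelihood of an ordinary-kriging fit to d(x).

    Assumed form: -n/2 * ln(sigma^2) - 1/2 * ln|R|, with a Gaussian
    correlation R_ij = exp(-theta * ||x_i - x_j||^2).
    """
    d = np.asarray(d, dtype=float)
    n = d.size
    x = np.asarray(x, dtype=float).reshape(n, -1)
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    R = np.exp(-theta * sq_dist) + nugget * np.eye(n)  # nugget: assumption
    L = np.linalg.cholesky(R)  # R = L @ L.T

    def solve(b):
        # Solve R z = b through the two triangular factors.
        return np.linalg.solve(L.T, np.linalg.solve(L, b))

    ones = np.ones(n)
    mu = ones @ solve(d) / (ones @ solve(ones))  # generalized least-squares mean
    resid = d - mu
    # Floor on the variance (assumption) avoids log(0) for a constant d.
    sigma2 = max(resid @ solve(resid) / n, 1e-12)
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * n * np.log(sigma2) - 0.5 * log_det


def select_lf_source(x_h, y_h, lf_at_hf, rhos, thetas):
    """Pick the LF source whose discrepancy has the largest likelihood.

    lf_at_hf maps a source name to its LF values at the HF sites.
    Returns the winning name and {name: (ln L, rho, theta)}.
    """
    y_h = np.asarray(y_h, dtype=float)
    scores = {}
    for name, y_l in lf_at_hf.items():
        best = (-np.inf, None, None)
        for rho in rhos:
            d = y_h - rho * np.asarray(y_l, dtype=float)
            for theta in thetas:
                ll = concentrated_log_likelihood(x_h, d, theta)
                if ll > best[0]:
                    best = (ll, rho, theta)
        scores[name] = best
    winner = max(scores, key=lambda k: scores[k][0])
    return winner, scores
```

Because the HF samples are fixed, the likelihoods of the candidate sources are directly comparable, which is what discarding m(y H ) and the flat prior amount to in the argument above.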

Appendix 2: Polynomial smoothing of turbine data

Kriging and co-kriging are sensitive to data noise, which also influences the selection by CV and ML and hence complicates the problem. Therefore, polynomial regression is employed to smooth the datasets. The RMSE, the adjusted R2 (see B.1), the mean absolute error, and the standard deviation (denoted by σ) of the polynomial regression (Myers and Montgomery 2002) are calculated to inspect the quality of the datasets.

$$ adjusted\kern0.5em {R}^2=1-\left\{\sum \limits_{i=1}^n{\left({y}_i-{\hat{y}}_i\right)}^2/\left(n-p\right)\right\}/\left\{\sum \limits_{i=1}^n{\left({y}_i-\overline{y}\right)}^2/\left(n-1\right)\right\} $$
(B.1)

where n is the number of samples, p is the number of polynomial coefficients, \( {y}_i \) and \( {\hat{y}}_i \) are the true and estimated function values of the ith sample, respectively, and \( \overline{y} \) is the mean function value of the samples. In addition, the relative errors of the polynomial regressions are calculated by dividing σ by the corresponding function range. Table 9 shows the results for the cubic polynomials, and Table 10 provides the regression coefficients of the cubic polynomials for the different flow models. The cubic regression predicts the function trend of the different flow models well, as the data noise is small.

Table 9 Data smoothing by using cubic polynomial regressions
Table 10 Regression coefficients of the cubic polynomials of turbine stage efficiency (100%-Loss) for different flow models with normalized design variables
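The smoothing step and the indicators of (B.1) can be sketched as follows. This is an illustrative sketch rather than the code used for Tables 9 and 10: the function names are hypothetical, the cubic is assumed to be a full two-variable cubic (10 coefficients), and σ is assumed to be the residual standard deviation sqrt(SS_res/(n − p)).

```python
import numpy as np


def cubic_design_matrix(x1, x2):
    """Columns x1^i * x2^j for all i + j <= 3 (10 terms in two variables)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    cols = [x1 ** i * x2 ** j for i in range(4) for j in range(4 - i)]
    return np.column_stack(cols)


def fit_cubic(x1, x2, y):
    """Least-squares cubic fit; returns coefficients and smoothed values."""
    A = cubic_design_matrix(x1, x2)
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef, A @ coef


def regression_metrics(y, y_hat, p):
    """RMSE, adjusted R^2 as in (B.1), mean absolute error, sigma, relative error."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    resid = y - y_hat
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    sigma = np.sqrt(ss_res / (n - p))  # assumed definition of sigma
    return {
        "rmse": np.sqrt(ss_res / n),
        "adjusted_r2": 1.0 - (ss_res / (n - p)) / (ss_tot / (n - 1)),
        "mae": np.abs(resid).mean(),
        "sigma": sigma,
        "relative_error": sigma / (y.max() - y.min()),
    }
```

The smoothed values returned by `fit_cubic` would then replace the raw simulation outputs before kriging or co-kriging is fitted.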

Appendix 3: HF Sampling strategy

The HF sampling strategy was devised to prevent poor designs of experiments for the turbine problem. It is not intended to serve as a general approach, as it is tailored to the specifics of this 2D problem, where samples were available on a grid. The HF sampling is based on the strategy of nearest neighbor sampling (NNS) (Park et al. 2017). The basic idea of NNS is shown in the lower left of Fig. 12a: First, m HF samples are generated independently by using Latin hypercube sampling (abbreviated as LHS). Second, each LHS sample (circles) is moved to its nearest LF site (squares). When the number of HF samples is smaller than or equal to the number of LF values in each dimension (e.g., 4 or 6 HF samples), the NNS strategy is used directly.
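The two NNS steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the design variables are normalized to [0, 1], the LF sites are passed as an array of coordinates, and the helper names are hypothetical. The sketch does not apply the distance criterion, so two LHS points may snap to the same LF site.

```python
import numpy as np


def latin_hypercube(m, dim, rng):
    """m points in [0, 1]^dim with one point per stratum in every dimension."""
    u = rng.random((m, dim))
    strata = np.column_stack([rng.permutation(m) for _ in range(dim)])
    return (strata + u) / m


def nearest_neighbor_sampling(lf_sites, m, rng):
    """Step 1: draw m LHS points. Step 2: move each to its nearest LF site.

    Returns the selected LF coordinates and their row indices in lf_sites.
    """
    lf_sites = np.asarray(lf_sites, dtype=float)
    lhs = latin_hypercube(m, lf_sites.shape[1], rng)
    dist = np.linalg.norm(lhs[:, None, :] - lf_sites[None, :, :], axis=2)
    idx = dist.argmin(axis=1)
    return lf_sites[idx], idx
```

A call such as `nearest_neighbor_sampling(lf_sites, 4, np.random.default_rng(0))` returns four HF locations that coincide with LF sites, so the LF value is available at every HF sample.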

Fig. 12 HF sampling strategy in the case of a large number of HF samples

When the number of HF samples is larger, they are generated as follows. First, the HF samples are sequentially generated by NNS in the four shaded subspaces of Fig. 12. Second, some local samples may not meet the distance criterion d > 0.2 (see Fig. 12b); each violating sample is moved to an adjacent LF sample location, as shown by the arrows. The new location is chosen to maximize the summed distance to the neighboring HF samples, max{d1 + d2 + ⋯}. A similar fine-tuning strategy is also applied in the case of a small number of HF samples when the distance criterion is violated.

When the number of samples is a multiple of 4, e.g., 8, the samples can be evenly distributed in the four subspaces. When the number of samples is not a multiple of 4, e.g., 10, we should have 2 in two subspaces and 3 in the other two. The specific number in each subspace is determined randomly.

Appendix 4: Accuracy of trapezoidal integration

Tables 11 and 12 show the comparison results for the trapezoidal integration-based RMSE. The reference RMSE is calculated on a dense testing grid of 101 × 101 points and can be regarded as accurate enough owing to the large number of testing samples. Table 11 shows that the RMSE values estimated by trapezoidal integration are reasonably close to those of the 101 × 101 points. Further, Table 12 shows the changing rate of the accuracy order when the trapezoidal integration-based RMSE is used; the accuracy order estimated by the trapezoidal integration-based RMSE is consistent with that of the 101 × 101 points. In other words, the trapezoidal integration-based RMSE is accurate enough to judge the selection success of CV and ML in multi-fidelity dataset selection.

Table 11 Averaged RMSE and relative errors of Trapezoidal integration-based RMSE w.r.t. that of 101 × 101 points
Table 12 The changing rate of accuracy order by using the Trapezoidal integration-based RMSE w.r.t that of 101 × 101 points
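A trapezoidal integration-based RMSE can be sketched as follows. The exact formula used in the paper is not reproduced in this appendix, so the sketch assumes a uniform rectangular grid of test points and defines the RMSE as the square root of the integral of the squared error divided by the domain area. The function names are hypothetical.

```python
import numpy as np


def trapezoid_weights(n):
    """Trapezoidal weights on n uniformly spaced points, normalized to sum to 1."""
    w = np.ones(n)
    w[0] = w[-1] = 0.5
    return w / (n - 1)


def trapezoid_rmse(err_grid):
    """RMSE from errors on a uniform 2D grid, using the trapezoidal rule.

    err_grid[i, j] is the surrogate error at grid point (i, j). Because the
    weights are normalized, the weighted sum is the area-averaged squared error.
    """
    err_grid = np.asarray(err_grid, dtype=float)
    w1 = trapezoid_weights(err_grid.shape[0])
    w2 = trapezoid_weights(err_grid.shape[1])
    mean_sq = w1 @ (err_grid ** 2) @ w2
    return np.sqrt(mean_sq)
```

The same function can be evaluated on the coarse grid and on the 101 × 101 grid to reproduce the kind of comparison summarized in Tables 11 and 12.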

Cite this article

Guo, Z., Song, L., Park, C. et al. Analysis of dataset selection for multi-fidelity surrogates for a turbine problem. Struct Multidisc Optim 57, 2127–2142 (2018). https://doi.org/10.1007/s00158-018-2001-8
