Analysis of dataset selection for multi-fidelity surrogates for a turbine problem

Guo, Zhendong; Song, Liming; Park, Chanyoung; Li, Jun; Haftka, Raphael T.

doi:10.1007/s00158-018-2001-8

RESEARCH PAPER
Published: 25 May 2018

Volume 57, pages 2127–2142, (2018)

Structural and Multidisciplinary Optimization

Zhendong Guo¹,
Liming Song¹,
Chanyoung Park²,
Jun Li¹ &
…
Raphael T. Haftka ORCID: orcid.org/0000-0003-0417-6911²


Abstract

Multi-fidelity surrogates (MFS) have become a popular way to combine a small number of expensive high-fidelity (HF) samples with many cheap low-fidelity (LF) samples. In some situations LF samples can come from multiple sources, and sometimes the HF samples alone yield a more accurate surrogate than the combination (HF&LF). This paper therefore considers using maximum likelihood (ML) and cross validation (CV) to select the dataset leading to the best surrogate accuracy when multiple sample sources are available. Kriging and co-kriging were employed to build the surrogates. Unlike conventional model selection, multi-fidelity dataset selection by ML and CV has to compare the surrogate accuracy of different true functions. The effectiveness of ML and CV is examined through a two-variable turbine problem, where samples can come from one HF and two LF models. The indicators were used to select between using only HF samples and combining them with one set of LF samples or the other. The best selection proved to depend on the design of experiments (DOE), so datasets were generated for a large number of DOEs. CV and ML were found to work relatively well in selecting between the two LF sample sources for MFS. When selecting between only HF and HF&LF, ML, which is frequently used in co-kriging hyper-parameter estimation, failed to detect when the surrogate built from only HF samples was more accurate than the one built from HF&LF. CV was successful only part of the time. The reasons behind the poor performance are analyzed with the help of a 1D example.

Abbreviations

R :: Correlation matrix
x :: Design variable
y :: Function values
ρ :: Scaling factor between high- and low-fidelity models
σ² :: Variance
d :: Discrepancy
H :: High fidelity
L :: Low fidelity
CV :: Cross Validation
DF :: Discrepancy Function
HF :: High Fidelity
LF :: Low Fidelity
LHS :: Latin Hypercube Sampling
Loo-CV :: Leave-one-out Cross Validation
MFS :: Multi-Fidelity Surrogate
ML :: Maximum Likelihood
RMSE :: Root Mean Squared Error
TT :: Transient Rotor Blade model with Time Transformation
Transient :: Full Transient model

References

ANSYS (2010) ANSYS CFX-solver theory guide, release 13.0. ANSYS Inc., Canonsburg, PA
Arlot S, Alain C (2010) A survey of cross validation procedures for model selection. Stat Surv 4:40–79
Cherry DG, Gay CH, Lenahan DT (1982) Energy efficient engine. Low pressure turbine test hardware detailed design report. NASA CR167956
Dixon SL, Cesare H (2013) Fluid mechanics and thermodynamics of turbomachinery. Elsevier Inc, Butterworth-Heinemann
Fernández-Godino MG, Park C, Kim NH, Haftka RT (2016) Review of multi-fidelity models. arXiv preprint arXiv:1609.07196. http://arxiv.org/abs/1609.07196
Forrester AIJ, Keane AJ (2009) Recent advances in surrogate-based optimization. Prog Aerosp Sci 45(1–3):50–79
Forrester AIJ, Alexander IJ, Sóbester A, Keane AJ (2007) Multi-fidelity optimization via surrogate modeling. Proc R Soc Lond A Math Phys Eng Sci 463(2088):3251–3269
Hodson HP, Howell RJ (2005) The role of transition in high-lift low-pressure turbines for aeroengines. Prog Aerosp Sci 41(6):419–454
Kennedy MC, O'Hagan A (2000) Predicting the output from a complex computer code when fast approximations are available. Biometrika 87(1):1–13
Liu HT, Ong YS, Cai J (2018a) A survey of adaptive sampling for global metamodeling in support of simulation-based complex engineering design. Struct Multidiscip Optim 57(1):393–416
Liu HT, Ong YS, Cai J, Wang Y (2018b) Cope with diverse data structures in multi-fidelity modeling: a Gaussian process method. Eng Appl Artif Intell 67:211–225
Lophaven SN, Nielsen HB, Sondergaard J (2002) DACE: a Matlab kriging toolbox, version 2.0. Technical Report IMM-TR-2002-12, Technical University of Denmark, Copenhagen. http://www2.imm.dtu.dk/projects/dace/dace.pdf
Luo JQ, Liu F, McBean I (2015) Turbine blade row optimization through endwall contouring by an adjoint method. J Propuls Power 31:505–518
Martin JD, Simpson TW (2005) Use of kriging models to approximate deterministic computer models. AIAA J 43(4):853–863. https://doi.org/10.2514/1.8650
Myers RH, Montgomery DC (2002) Response surface methodology: process and product optimization using designed experiments, 2nd edn. Wiley Series in Probability and Statistics, John Wiley & Sons, Inc., New York
Myung IJ, Mark AP (1997) Applying Occam’s razor in modeling cognition: a Bayesian approach. Psychon Bull Rev 4(1):79–95
Namura N, Shimoyama K, Obayashi S (2017) Kriging surrogate model with coordinate transformation based on likelihood and gradient. J Glob Optim 68(4):827–849
Neath AA, Joseph EC (2012) The Bayesian information criterion: background, derivation, and applications. Wiley Interdisc Rev: Comput Stat 4(2):199–203
Park C, Haftka RT, Kim NH (2017) Remarks on multi-fidelity surrogates. Struct Multidiscip Optim 55(3):1–22
Rasmussen CE, Williams CK (2006) Gaussian processes for machine learning. MIT Press, London. http://www.gaussianprocess.org/gpml/
Shan SQ, Wang GG (2010) Survey of modeling and optimization strategies to solve high-dimensional design problems with computationally-expensive black-box functions. Struct Multidiscip Optim 41(2):219–241
Suzen YB, Huang PG (2005) Numerical simulation of unsteady wake/blade interactions in low-pressure turbine flows using an intermittency transport equation. J Turbomach 127(3):431–444
Viana FAC, Haftka RT, Steffen V (2009) Multiple surrogates: how cross validation errors can help us to obtain the best predictor. Struct Multidiscip Optim 39(4):439–457
Zhang Y, Schutte J, Meeker J, Palliyaguru U, Kim NH, Haftka RT (2017) Predicting B-basis allowable at untested points from experiments and simulations of plates with holes. In: 12th world congress on structural and multidisciplinary optimization, Braunschweig, Germany. https://www.researchgate.net/publication/318909364


Acknowledgments

The authors would like to express their appreciation for the financial support from the China Scholarship Council (201606280218) and the National Natural Science Foundation of China (51676149) for this work, undertaken as part of the first author’s Ph.D. project. The authors also gratefully acknowledge the use of the facilities of the Structural & Multidisciplinary Optimization Group of the University of Florida (http://www2.mae.ufl.edu/mdo/) in carrying out this work.

Author information

Authors and Affiliations

Xi’an Jiaotong University, No.28, Xianning West Road, Xi’an, 710049, China
Zhendong Guo, Liming Song & Jun Li
Department of Mechanical and Aerospace Engineering, University of Florida, Gainesville, FL, 32611-6250, USA
Chanyoung Park & Raphael T. Haftka


Corresponding author

Correspondence to Raphael T. Haftka.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

ESM 1

(RAR 5574 kb)

Appendices

Appendix 1: ML criterion for choosing between datasets

The ML of the DF in Forrester’s formulation (Forrester et al. 2007) applies to the case where the LF function is given. It can also be used to select among samples from alternative LF sources when fitting a surrogate to the limited HF samples. In that case the LF source is treated as an additional hyper-parameter, D_L; that is, D_L has to be chosen in addition to ρ and θ_d when fitting the co-kriging model by (4). The corresponding Bayesian posterior probability is formulated as:

$$ P\left({D}_L,\rho, {\boldsymbol{\uptheta}}_d|{\mathbf{y}}_H\right)= Likelihood\left({\mathbf{y}}_H|{\tilde{D}}_L,\tilde{\rho},{\tilde{\boldsymbol{\uptheta}}}_d\right)\cdot Prior\left({D}_L,\rho, {\boldsymbol{\uptheta}}_d\right)/m\left({\mathbf{y}}_H\right) $$

(A.1)

where m(y_H) is the marginal distribution of the dataset y_H, Prior(D_L, ρ, θ_d) is the prior probability of the hyper-parameters D_L, ρ and θ_d, and $ Likelihood\left({\mathbf{y}}_H|{\tilde{D}}_L,\tilde{\rho},{\tilde{\boldsymbol{\uptheta}}}_d\right) $ is the likelihood of the co-kriging model when $ {\tilde{D}}_L,\tilde{\rho},{\tilde{\boldsymbol{\uptheta}}}_d $ are given, which is equal to the formulation in (4). The term m(y_H) is constant with respect to the hyper-parameters D_L, ρ, θ_d, so for the purpose of selection with fixed HF samples it can be discarded. Moreover, we have no prior knowledge of the hyper-parameters, so it is defensible to simplify the justification by using the non-informative prior Prior(D_L, ρ, θ_d) = 1 (Neath and Joseph 2012). The posterior is then proportional to the likelihood, and hence the ML of the DF in (4) may still be useful for selecting between LF samples coming from alternative LF sources.
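The selection rule above can be sketched in a few lines of Python. Equation (4) is not reproduced in this section, so the sketch assumes the usual concentrated ln-likelihood of a constant-mean kriging model of the discrepancy d = y_H − ρ·y_L with a Gaussian correlation function, and it replaces the optimizer with a plain grid search over ρ and θ_d. The function names, the grids, and the two LF sources "A" and "B" (built from the well-known 1D Forrester test function) are illustrative assumptions, not the turbine data.

```python
import numpy as np

def concentrated_log_likelihood(x, d, theta, nugget=1e-10):
    """Concentrated ln-likelihood of a constant-mean kriging fit to d(x).

    Assumes the Gaussian correlation exp(-theta * ||xi - xj||^2); constants
    that do not affect the comparison are dropped.
    """
    n = len(d)
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=2)
    R = np.exp(-theta * sq_dist) + nugget * np.eye(n)
    chol = np.linalg.cholesky(R)

    def r_inv(b):
        # Solve R z = b through the Cholesky factor.
        return np.linalg.solve(chol.T, np.linalg.solve(chol, b))

    ones = np.ones(n)
    mu = (ones @ r_inv(d)) / (ones @ r_inv(ones))      # generalized least-squares mean
    resid = d - mu
    sigma2 = max((resid @ r_inv(resid)) / n, 1e-300)   # process variance estimate
    log_det_R = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * n * np.log(sigma2) - 0.5 * log_det_R

def select_lf_source(x_h, y_h, lf_at_xh, rhos, thetas):
    """Grid search over (D_L, rho, theta_d); returns (lnL, source, rho, theta)."""
    best = None
    for name, y_l in lf_at_xh.items():
        for rho in rhos:
            d = y_h - rho * y_l                         # discrepancy at the HF sites
            for theta in thetas:
                ll = concentrated_log_likelihood(x_h, d, theta)
                if best is None or ll > best[0]:
                    best = (ll, name, rho, theta)
    return best

# Illustrative 1D data (assumption): HF is the Forrester function, source "A"
# is its standard LF variant, source "B" is an unrelated oscillating model.
x_h = np.linspace(0.0, 1.0, 6)[:, None]
t = x_h[:, 0]
y_h = (6 * t - 2) ** 2 * np.sin(12 * t - 4)
lf_at_xh = {
    "A": 0.5 * y_h + 10 * (t - 0.5) - 5,
    "B": 10 * np.cos(5 * np.pi * t),
}
best = select_lf_source(x_h, y_h, lf_at_xh,
                        rhos=np.arange(0.5, 3.0, 0.5),
                        thetas=[1.0, 10.0, 100.0])
print("selected LF source:", best[1], "rho:", best[2], "theta:", best[3])
```

Because the HF samples are fixed, every candidate is scored on the same y_H, which is what allows the likelihood values of different LF sources to be compared directly.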

Appendix 2: Polynomial smoothing of turbine data

Kriging and co-kriging are sensitive to noise in the data, which also influences the selections made by CV and ML and hence complicates the problem. Therefore, polynomial regression is employed to smooth the datasets. The RMSE, the adjusted R² (see B.1), the mean absolute error, and the standard deviation (denoted by σ) of the polynomial regression (Myers and Montgomery 2002) are calculated to assess the goodness of fit for each dataset.

$$ adjusted\kern0.5em {R}^2=1-\left\{\sum \limits_{i=1}^n{\left({y}_i-{\hat{y}}_i\right)}^2/\left(n-p\right)\right\}/\left\{\sum \limits_{i=1}^n{\left({y}_i-\overline{y}\right)}^2/\left(n-1\right)\right\} $$

(B.1)

where n is the number of samples, p is the number of polynomial coefficients, and y_i and $ {\hat{y}}_i $ are the true and estimated function values at the ith sample, respectively. $ \overline{y} $ is the mean function value of the samples. In addition, the relative error of each polynomial regression is calculated by dividing σ by the corresponding function range. Table 9 shows the results for cubic polynomials, and Table 10 provides the regression coefficients of the cubic polynomials for the different flow models. The cubic regression predicts the function trend of the different flow models well, as the data noise is small.
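As an illustration of these goodness-of-fit measures, the following sketch fits a full cubic polynomial in two normalized variables by least squares and evaluates (B.1). The data are synthetic and the function names are hypothetical. Taking σ as the regression standard error sqrt(SSE/(n − p)) is also an assumption, since the text does not spell out its definition.

```python
import numpy as np

def cubic_design_matrix(x1, x2):
    """Columns x1^i * x2^j for all i + j <= 3 (10 terms, including the constant)."""
    return np.column_stack(
        [x1**i * x2**j for i in range(4) for j in range(4 - i)]
    )

def cubic_fit_metrics(x1, x2, y):
    """Least-squares cubic fit and the error measures discussed in the text."""
    A = cubic_design_matrix(x1, x2)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ coef
    n, p = A.shape
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - np.mean(y)) ** 2)
    sigma = np.sqrt(sse / (n - p))            # assumed definition of sigma
    return {
        "coef": coef,
        "p": p,
        "rmse": np.sqrt(sse / n),
        "mae": np.mean(np.abs(y - y_hat)),
        "sigma": sigma,
        "adj_r2": 1.0 - (sse / (n - p)) / (sst / (n - 1)),   # Eq. (B.1)
        "rel_error": sigma / (np.max(y) - np.min(y)),        # sigma / function range
    }

# Synthetic noisy data on a 7 x 7 grid of normalized design variables (assumption).
rng = np.random.default_rng(0)
g1, g2 = np.meshgrid(np.linspace(0, 1, 7), np.linspace(0, 1, 7), indexing="ij")
x1, x2 = g1.ravel(), g2.ravel()
y_clean = 1 + 2 * x1 - x2 + 0.5 * x1**2 * x2 - x2**3
y_noisy = y_clean + rng.normal(scale=0.01, size=y_clean.shape)

metrics = cubic_fit_metrics(x1, x2, y_noisy)
print("adjusted R^2:", metrics["adj_r2"], "relative error:", metrics["rel_error"])
```

The smoothed response would then be the fitted polynomial evaluated at the sample sites, A @ coef, rather than the raw noisy values.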

Table 9 Data smoothing by using cubic polynomial regressions


Table 10 Regression coefficients of the cubic polynomials of turbine stage efficiency (100%-Loss) for different flow models with normalized design variables


Appendix 3: HF Sampling strategy

The HF sampling strategy was devised to prevent poor designs of experiments for the turbine problem. It is not intended to serve as a general approach, as it is tailored to the specifics of this 2D problem, where samples were available on a grid. The HF sampling is based on the strategy of nearest neighbor sampling (NNS) (Park et al. 2017). The basic idea of NNS is shown in the lower left of Fig. 12a: First, m HF samples are generated independently by using Latin hypercube sampling (abbreviated as LHS). Second, each LHS sample (circles) is moved to its nearest LF site (squares). When the number of HF samples is smaller than or equal to the number of LF values in each dimension (e.g. 4 or 6 HF samples), the NNS strategy is used directly.

When the number of HF samples is larger, they are generated as follows. First, the HF samples are generated sequentially by NNS in the four shaded subspaces of Fig. 12. Second, some local samples may not meet the distance criterion d > 0.2 (see Fig. 12b); each violating sample is moved to an adjacent LF sample location, as shown by the arrows, chosen to maximize the sum of its distances to the neighboring HF samples, max{d₁ + d₂ + ⋯}. A similar fine-tuning strategy is also applied in the case of a small number of HF samples when the distance criterion is violated.

When the number of samples is a multiple of 4, e.g. 8, the samples can be evenly distributed over the four subspaces. When the number of samples is not a multiple of 4, e.g. 10, there should be 2 samples in two of the subspaces and 3 in the other two. The specific number in each subspace is determined randomly.
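The basic NNS step described at the start of this appendix (LHS followed by snapping to the nearest LF site) can be sketched as below. This covers only the direct NNS case. The subspace partition and the d > 0.2 fine tuning are not implemented, and two LHS points may occasionally snap to the same site. The 6 × 6 LF grid on normalized variables and the function names are assumptions made for illustration.

```python
import numpy as np

def latin_hypercube(m, dim, rng):
    """m points in [0, 1]^dim with exactly one point per stratum in each dimension."""
    strata = np.column_stack([rng.permutation(m) for _ in range(dim)])
    return (strata + rng.random((m, dim))) / m

def nearest_neighbor_sampling(lhs_points, lf_sites):
    """Return, for each LHS point, the index of its nearest LF site."""
    dist = np.linalg.norm(lhs_points[:, None, :] - lf_sites[None, :, :], axis=2)
    return np.argmin(dist, axis=1)

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 6)                      # hypothetical LF grid levels
lf_sites = np.array([(a, b) for a in grid for b in grid])

lhs = latin_hypercube(6, 2, rng)                     # step 1: independent LHS
idx = nearest_neighbor_sampling(lhs, lf_sites)       # step 2: move to nearest LF site
hf_sites = lf_sites[idx]
print(hf_sites)
```

Snapping to LF sites means every HF sample is collocated with an LF sample, so the discrepancy y_H − ρ·y_L can be evaluated directly at the HF locations.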

Appendix 4: Accuracy of trapezoidal integration

Tables 11 and 12 compare the trapezoidal integration-based RMSE with the RMSE calculated on a dense testing grid of 101 × 101 points. The latter can be regarded as sufficiently accurate owing to the large number of testing samples. Table 11 shows that the RMSE values estimated by trapezoidal integration are reasonably close to those from the 101 × 101 points. Further, Table 12 shows the rate at which the accuracy order changes when the trapezoidal integration-based RMSE is used; the accuracy order estimated by the trapezoidal integration-based RMSE is consistent with that from the 101 × 101 points. In other words, the trapezoidal integration-based RMSE is accurate enough to judge the selection success of CV and ML in multi-fidelity dataset selection.
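A minimal sketch of such a comparison follows, assuming a unit-square domain, an 11 × 11 coarse grid for the trapezoidal rule, and a synthetic error field. None of these are taken from the paper; they only illustrate how an integration-based RMSE on a coarse grid can be checked against a dense-grid estimate.

```python
import numpy as np

def trapezoid_weights(n):
    """Trapezoidal-rule weights for n equally spaced points on [0, 1]."""
    w = np.full(n, 1.0 / (n - 1))
    w[0] *= 0.5
    w[-1] *= 0.5
    return w

def trapezoid_rmse(err):
    """Square root of the 2D trapezoidal integral of err^2 over the unit square."""
    w1 = trapezoid_weights(err.shape[0])
    w2 = trapezoid_weights(err.shape[1])
    return np.sqrt(w1 @ (err ** 2) @ w2)

def sample_rmse(err):
    """Plain RMSE over all test points."""
    return np.sqrt(np.mean(err ** 2))

def error_field(n):
    """Synthetic surrogate error on an n x n grid (assumption, for illustration)."""
    x1, x2 = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n), indexing="ij")
    return np.sin(np.pi * x1) * x2

coarse = trapezoid_rmse(error_field(11))    # integration-based estimate
dense = sample_rmse(error_field(101))       # reference from 101 x 101 points
rel_diff = abs(coarse - dense) / dense
print("trapezoid:", coarse, "dense:", dense, "relative difference:", rel_diff)
```

Since the domain has unit area, the integral of the squared error equals its mean, so the two quantities estimate the same RMSE.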

Table 11 Averaged RMSE and relative errors of Trapezoidal integration-based RMSE w.r.t. that of 101 × 101 points


Table 12 The changing rate of accuracy order by using the Trapezoidal integration-based RMSE w.r.t that of 101 × 101 points



About this article

Cite this article

Guo, Z., Song, L., Park, C. et al. Analysis of dataset selection for multi-fidelity surrogates for a turbine problem. Struct Multidisc Optim 57, 2127–2142 (2018). https://doi.org/10.1007/s00158-018-2001-8


Received: 28 December 2017
Revised: 20 April 2018
Accepted: 01 May 2018
Published: 25 May 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s00158-018-2001-8
