Abstract
Structural reliability analysis aims at computing the failure probability with respect to a prescribed performance function. To efficiently estimate the structural failure probability, a novel two-stage meta-model importance sampling method based on the support vector machine (SVM) is proposed. Firstly, a quasi-optimal importance sampling density function is approximated by SVM. To construct the SVM model, a multi-point enrichment algorithm, which allows adding several training points in each iteration, is employed. The augmented failure probability and the quasi-optimal importance sampling samples are then obtained from the trained SVM model. Secondly, the current SVM model is further refined by selecting informative training points from the quasi-optimal importance sampling samples until it can accurately recognize the states of the samples, and the correction factor is estimated by the well-trained SVM model. Finally, the failure probability is obtained as the product of the augmented failure probability and the correction factor. The proposed method provides an algorithm that efficiently deals with multiple failure regions and rare events. Several examples illustrate the feasibility of the proposed method.
References
Alibrandi U, Alani AM, Ricciardi G (2015) A new sampling strategy for SVM-based response surface for structural reliability analysis. Probabilistic Eng Mech 41:1–12
Basudhar A (2015) Multi-objective optimization using adaptive explicit non-dominated region sampling. In: 11th world congress on structural and multidisciplinary optimization
Basudhar A, Missoum S (2007) Parallel update of failure domain boundaries constructed using support vector machines. In: 7th World Congress on Structural and Multidisciplinary Optimization, Seoul, Korea
Basudhar A, Missoum S (2008) Adaptive explicit decision functions for probabilistic design and optimization using support vector machines. Comput Struct 86(19–20):1904–1917
Basudhar A, Missoum S (2009) Local update of support vector machine decision boundaries. In: 50th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA Paper 2009-2189
Basudhar A, Missoum S (2010) An improved adaptive sampling scheme for the construction of explicit boundaries. Struct Multidiscip Optim 42(4):517–529
Basudhar A, Missoum S, Sanchez AH (2007) Limit state function identification using support vector machines for discontinuous responses and disjoint failure domains. Probabilistic Eng Mech 23(1):1–11
Basudhar A, Dribusch C, Lacaze S, Missoum S (2012) Constrained efficient global optimization with support vector machines. Struct Multidiscip Optim 46(2):201–221
Basudhar A, Witowski K, Gandikota I (2020) Sequential optimization & probabilistic analysis using adaptively refined constraints in LS-OPT®. In: 16th International LS-DYNA® Users Conference
Bichon BJ, Eldred MS, Swiler LP, Mahadevan S, McFarland JM (2008) Efficient global reliability analysis for nonlinear implicit performance functions. AIAA J 46:2459–2468
Bourinet JM (2016) Rare-event probability estimation with adaptive support vector regression surrogates. Reliab Eng Syst Saf 150:210–221
Bourinet JM, Deheeger F, Lemaire M (2011) Assessing small failure probabilities by combined subset simulation and support vector machines. Struct Saf 33(6):343–353
Cadini F, Santos F, Zio E (2014) An improved adaptive kriging-based importance technique for sampling multiple failure regions of low probability. Reliab Eng Syst Saf 131:109–117
Cheng K, Lu ZZ (2018) Adaptive sparse polynomial chaos expansions for global sensitivity analysis based on support vector regression. Comput Struct 194:86–96
Cheng K, Lu ZZ, Zhou YC, Shi Y, Wei YH (2017) Global sensitivity analysis using support vector regression. Appl Math Model 49:587–598
Derennes P, Morio J, Simatos F (2019) A nonparametric importance sampling estimator for moment independent importance measures. Reliab Eng Syst Saf 187:3–16
Dubourg V, Sudret B, Deheeger F (2013) Metamodel-based importance sampling for structural reliability analysis. Probabilistic Eng Mech 33:47–57
Echard B, Gayton N, Lemaire M (2011) AK-MCS: an active learning reliability method combining Kriging and Monte Carlo simulation. Struct Saf 33:145–154
Echard B, Gayton N, Lemaire M, Relun N (2013) A combined importance sampling and Kriging reliability method for small failure probabilities with time-demanding numerical methods. Reliab Eng Syst Saf 111:232–240
He W, Zeng Y, Li G (2020) An adaptive polynomial chaos expansion for high-dimensional reliability analysis. Struct Multidiscip Optim. https://doi.org/10.1007/s00158-020-02594-4
Hurtado JE (2004) An examination of methods for approximating implicit limit state functions from the viewpoint of statistical learning theory. Struct Saf 26(3):271–293
Hurtado JE (2007) Filtered importance sampling with support vector margin: a powerful method for structural reliability analysis. Struct Saf 29(1):2–15
Lacaze S, Missoum S (2014) A generalized “max-min” sample for surrogate update. Struct Multidiscip Optim 49(4):683–687
Ling CY, Lu ZZ, Zhu XM (2019) Efficient methods by active learning Kriging coupled with variance reduction based sampling methods for time-dependent failure probability. Reliab Eng Syst Saf 188:23–35
Ling CY, Lu ZZ, Sun B, Wang MJ (2020) An efficient method combining active learning Kriging and Monte Carlo simulation for profust failure probability. Fuzzy Sets Syst 387:89–107
MacKay D (1992) Information-based objective functions for active data selection. Neural Comput 4(4):590–604
Misaka T (2020) Image-based fluid data assimilation with deep neural network. Struct Multidiscip Optim. https://doi.org/10.1007/s00158-020-02537-z
Pan QJ, Dias D (2017) An efficient reliability method combining adaptive support vector machine and Monte Carlo simulation. Struct Saf 67:85–95
Rocco CM, Moreno JA (2002) Fast Monte Carlo reliability evaluation using support vector machine. Reliab Eng Syst Saf 76(3):237–243
Song H, Choi KK, Lee I, Zhao L, Lamb D (2013) Adaptive virtual support vector machine for reliability analysis for high-dimensional problems. Struct Multidiscip Optim 47(4):479–491
Tharwat A (2019) Parameter investigation of support vector machine classifier with kernel functions. Knowl Inf Syst. https://doi.org/10.1007/s10115-019-01335-4
Vapnik VN (2000) The nature of statistical learning theory. Springer Verlag, New York
Vapnik VN (1998) Statistical learning theory. Wiley, New York
Wang ZQ, Wang PF (2014) A maximum confidence enhancement based sequential sampling scheme for simulation-based design. J Mech Des 136(2):021006
Wang ZQ, Wang PF (2016) Accelerated failure identification sampling for probability analysis of rare events. Struct Multidiscip Optim 54(1):137–149
Xing J, Luo Y, Gao Z (2020) A global optimization strategy based on the Kriging surrogate model and parallel computing. Struct Multidiscip Optim 62:405–417
Funding
This work was supported by the National Natural Science Foundation of China (Grant no. NSFC 52075442), and National Science and Technology Major Project (Grant no. 2017-IV-0009-0046).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This manuscript is approved by all authors for publication. We declare that the work described is original research that has not been published previously and is not under consideration for publication elsewhere, in whole or in part.
Replication of results
The MATLAB codes used to generate the results are available in the Supplementary information.
Additional information
Responsible Editor: Erdem Acar
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
ESM 1 (RAR 1 kb)
Appendices
Appendix 1. Support vector machine
This section reviews the essentials of SVM: the linear SVM, the nonlinear SVM, and the imperfect (soft-margin) SVM.
1.1 Linear SVM
Given a two-class problem, suppose we have Nt sets of labeled training data \( \left\{{\mathbf{x}}_i^t,{y}_i^t\right\}\left(i=1,2,\cdots, {N}_t\right) \), where \( {\mathbf{x}}_i^t\in {R}^n \) is a training sample of inputs and \( {y}_i^t\in \left\{-1,+1\right\} \) is the sign (class label) of \( {\mathbf{x}}_i^t \). SVM aims at finding an optimal hyperplane, i.e., a decision function (also called the separating function), for which all vectors labeled “− 1” are located on one side and all vectors labeled “+ 1” on the other side. The optimal hyperplane is the one with the largest distance to the nearest training samples of either class (maximum margin).
Consider a possible hyperplane (the SVM decision boundary), given by (28), which divides the space into two half-spaces (as shown in Fig. 17): (1) the positive half-space, where the samples from the positive class (+ 1) are located, and (2) the negative half-space, where the samples from the negative class (− 1) are located (Tharwat 2019).
in which the weight vector w is perpendicular to the hyperplane and b is a scalar parameter which represents the bias or threshold.
The goal of SVM is to determine the values of w and b that orient the hyperplane to be as far as possible from the closest samples. Two hyperplanes (H1 and H2) parallel to the decision boundary H are defined as follows,
There are no data points between H1 and H2. Let d+ (d−) be the shortest distance from the decision boundary to the closest positive (negative) point. The distance between H1 and H2 (the margin) is d+ + d−, and d+ = d− = 1/‖w‖, thus the margin is 2/‖w‖.
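In standard SVM notation (with · denoting the dot product), the decision boundary H and the two margin hyperplanes H1 and H2 take the form:

```latex
H:\ \mathbf{w}\cdot\mathbf{x}+b=0,\qquad
H_1:\ \mathbf{w}\cdot\mathbf{x}+b=+1,\qquad
H_2:\ \mathbf{w}\cdot\mathbf{x}+b=-1 ,
```

so that the distance from H to each of H1 and H2 is 1/‖w‖, giving the margin 2/‖w‖ quoted above.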
All training points \( \left\{{\mathbf{x}}_i^t,{y}_i^t\right\}\left(i=1,2,\cdots, {N}_t\right) \) should satisfy the following constraints,
This constraint ensures that no sample lies inside the margin. Therefore, determining the optimal hyperplane with maximum margin reduces to finding the pair of hyperplanes that gives the maximum margin,
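In their standard form, the margin constraint and the resulting optimization problem read:

```latex
y_i^t\left(\mathbf{w}\cdot\mathbf{x}_i^t+b\right)\ge 1,\quad i=1,2,\cdots,N_t ,
\qquad
\min_{\mathbf{w},\,b}\ \frac{1}{2}\left\Vert\mathbf{w}\right\Vert^{2}
\quad\text{s.t.}\quad y_i^t\left(\mathbf{w}\cdot\mathbf{x}_i^t+b\right)\ge 1 .
```

Minimizing ‖w‖² is equivalent to maximizing the margin 2/‖w‖.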
Introducing Lagrange multipliers αi ≥ 0(i = 1, 2, ⋯, Nt), a Lagrangian function for the above optimization problem can be defined,
Now, L(w, b, α) must be minimized with respect to w and b under the condition that the derivatives of L(w, b, α) with respect to w and b vanish, with αi ≥ 0 (i = 1, 2, ⋯, Nt). Setting the gradient to zero gives the following conditions,
Substituting (33) into (32), the Wolfe dual formulation is obtained,
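In standard form, the stationarity conditions and the resulting Wolfe dual are:

```latex
\frac{\partial L}{\partial\mathbf{w}}=0\ \Rightarrow\ \mathbf{w}=\sum_{i=1}^{N_t}\alpha_i\,y_i^t\,\mathbf{x}_i^t,
\qquad
\frac{\partial L}{\partial b}=0\ \Rightarrow\ \sum_{i=1}^{N_t}\alpha_i\,y_i^t=0 ,
```

```latex
\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{N_t}\alpha_i
-\frac{1}{2}\sum_{i=1}^{N_t}\sum_{j=1}^{N_t}\alpha_i\alpha_j\,y_i^t y_j^t\,\mathbf{x}_i^t\cdot\mathbf{x}_j^t
\quad\text{s.t.}\quad \alpha_i\ge 0,\ \ \sum_{i=1}^{N_t}\alpha_i\,y_i^t=0 .
```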
When the maximal margin hyperplanes (H1 and H2) are found, only those sample points lying closest to the decision boundary (H) satisfy αi > 0; these points are termed support vectors (the “*” points shown in Fig. 17). That is, the Lagrange multipliers associated with the support vectors are positive, while all other samples have Lagrange multipliers equal to zero. An SVM model trained using only the support vectors is identical to the one obtained using all the data samples. Typically, the number of support vectors is much smaller than Nt (Basudhar and Missoum 2010).
The classification of any arbitrary point x to be predicted is determined by the following function,
where s(⋅) is the symbolic function, \( {\mathbf{x}}_j^{\ast}\left(j=1,2,\cdots, {N}_{\mathrm{SV}}\right) \) are NSV support vectors, \( {y}_j^{\ast}\left(j=1,2,\cdots, {N}_{\mathrm{SV}}\right) \) are the signs of \( {\mathbf{x}}_j^{\ast}\left(j=1,2,\cdots, {N}_{\mathrm{SV}}\right) \). \( {\alpha}_j^{\ast}\left(j=1,2,\cdots, {N}_{\mathrm{SV}}\right) \) represent the Lagrange multipliers corresponding to the support vectors.
For the primal L(w, b, α), the Karush-Kuhn-Tucker (KKT) conditions are
The KKT conditions are necessary and sufficient for w, b and α to be optimal solutions. It is noted that, while w is explicitly determined by the training procedure, the threshold b is found by using the KKT conditions.
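As a concrete illustration of the decision function above with a linear kernel, the following is a minimal sketch; the support vectors, labels, multipliers, and bias are hypothetical illustrative values, not results from the paper.

```python
def linear_decision(x, svs, ys, alphas, b):
    """Evaluate f(x) = sum_j alpha_j * y_j * (x_j . x) + b."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    return sum(a * y * dot(sv, x) for a, y, sv in zip(alphas, ys, svs)) + b

def classify(x, svs, ys, alphas, b):
    """Predicted class s(f(x)) in {-1, +1}."""
    return 1 if linear_decision(x, svs, ys, alphas, b) >= 0 else -1

# Two hypothetical support vectors on either side of the boundary x1 + x2 = 0;
# the multipliers satisfy sum_j alpha_j * y_j = 0 as required by (33).
svs = [(1.0, 1.0), (-1.0, -1.0)]
ys = [+1, -1]
alphas = [0.25, 0.25]
b = 0.0

print(classify((2.0, 3.0), svs, ys, alphas, b))  # prints 1
```

Note that each support vector lies exactly on its margin hyperplane here: f((1, 1)) = +1 and f((−1, −1)) = −1, consistent with the KKT conditions.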
1.2 Nonlinear SVM
This section presents how SVM works in the case of nonlinearly separable samples. The main idea is to map the input data into a higher-dimensional feature space where the problem is linearly separable, just as shown in Fig. 18.
Denote the nonlinear mapping function as Φ(⋅), the Lagrangian function in the higher-dimensional feature space is
Suppose \( \varPsi \left({\mathbf{x}}_i^t,{\mathbf{x}}_j^t\right)=\varPhi \left({\mathbf{x}}_i^t\right)\cdot \varPhi \left({\mathbf{x}}_j^t\right) \), i.e., the dot product in the higher-dimensional feature space defines a kernel function of the input space. Therefore, it is not necessary to be explicit about the mapping function Φ(⋅) as long as it is known that the kernel function \( \varPsi \left({\mathbf{x}}_i^t,{\mathbf{x}}_j^t\right) \) corresponds to a dot product in some higher-dimensional feature space.
There are many kernel functions that can be used, for example,
(1) Linear kernel \( \varPsi \left({\mathbf{x}}_i^t,{\mathbf{x}}_j^t\right)={\left({\mathbf{x}}_i^t\right)}^{\mathrm{T}}{\mathbf{x}}_j^t \)

(2) Polynomial kernel \( \varPsi \left({\mathbf{x}}_i^t,{\mathbf{x}}_j^t\right)={\left({\left({\mathbf{x}}_i^t\right)}^{\mathrm{T}}{\mathbf{x}}_j^t+1\right)}^q \)

(3) Gaussian kernel \( \varPsi \left({\mathbf{x}}_i^t,{\mathbf{x}}_j^t\right)=\exp \left(-\gamma {\left\Vert {\mathbf{x}}_i^t-{\mathbf{x}}_j^t\right\Vert}^2\right) \)
where q and γ are parameters to be specified by the user. The linear kernel is suitable for linear classification, whereas the polynomial and Gaussian kernels are applicable to nonlinear classification. It must be observed that the nature of the optimization problem in the SVM makes the problem relatively kernel-insensitive: in empirical applications in very high-dimensional spaces, it has been found that a large part of the support vectors determined with different kernels coincide (Hurtado 2004; Vapnik 2000).
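The three kernels can be sketched directly; the parameter values q = 2 and γ = 0.5 below are arbitrary illustrative choices, not values used in the paper.

```python
import math

def linear_kernel(xi, xj):
    # (x_i)^T x_j
    return sum(a * b for a, b in zip(xi, xj))

def polynomial_kernel(xi, xj, q=2):
    # ((x_i)^T x_j + 1)^q
    return (linear_kernel(xi, xj) + 1.0) ** q

def gaussian_kernel(xi, xj, gamma=0.5):
    # exp(-gamma * ||x_i - x_j||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)

xi, xj = (1.0, 0.0), (0.0, 1.0)
print(linear_kernel(xi, xj))      # prints 0.0
print(polynomial_kernel(xi, xj))  # (0 + 1)^2, prints 1.0
print(gaussian_kernel(xi, xj))    # exp(-0.5 * 2) = exp(-1)
```

Each kernel is symmetric in its arguments, as required of a dot product in the feature space.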
The prediction of classification of a point x is then expressed as follows,
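In standard form, the kernelized classifier reads:

```latex
\hat{y}(\mathbf{x})
= s\!\left(\sum_{j=1}^{N_{\mathrm{SV}}}\alpha_j^{\ast}\,y_j^{\ast}\,
\varPsi\!\left(\mathbf{x}_j^{\ast},\mathbf{x}\right)+b\right),
```

i.e., the linear decision function of (35) with the dot product replaced by the kernel Ψ.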
1.3 Imperfect SVM
SVM can be extended to allow for imperfect separation; that is, data points between H1 and H2 are permitted but penalized, with a finite penalty P.
Introduce the nonnegative slack variables ζi ≥ 0 so that
and add to the objective function in (31) a penalizing term, the problem is now formulated as
Using Lagrange multipliers and the Wolfe dual formulation, the problem becomes (Rocco and Moreno 2002),
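In standard soft-margin form (keeping the notation above), the relaxed constraints, the penalized primal problem, and the resulting dual read:

```latex
y_i^t\left(\mathbf{w}\cdot\mathbf{x}_i^t+b\right)\ge 1-\zeta_i,\qquad \zeta_i\ge 0 ,
\qquad
\min_{\mathbf{w},\,b,\,\boldsymbol{\zeta}}\ \frac{1}{2}\left\Vert\mathbf{w}\right\Vert^{2}
+P\sum_{i=1}^{N_t}\zeta_i ,
```

```latex
\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{N_t}\alpha_i
-\frac{1}{2}\sum_{i=1}^{N_t}\sum_{j=1}^{N_t}\alpha_i\alpha_j\,y_i^t y_j^t\,
\varPsi\!\left(\mathbf{x}_i^t,\mathbf{x}_j^t\right)
\quad\text{s.t.}\quad 0\le\alpha_i\le P,\ \ \sum_{i=1}^{N_t}\alpha_i\,y_i^t=0 .
```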
The only difference from the perfect-separation case is that αi (i = 1, 2, ⋯, Nt) are now bounded above by P. The soft-margin parameter P controls how much misclassification is tolerated and must be specified by the user. Increasing P enforces a stricter separation between classes: as P is reduced towards 0, misclassification becomes unimportant, whereas as P tends to infinity, no misclassification is allowed.
In summary, two parameters should be tuned by the user when applying SVM: the penalty P, which controls the trade-off between minimizing the training error and maximizing the classification margin, and the kernel parameter, which determines the distances between patterns in the transformed space, the dimension of that space, and the complexity of the classification model.
Appendix 2. Calculation of the variation coefficient
The estimator in (18) is defined as the product of two unbiased independent estimators. The calculation of the variation coefficient of the final estimator \( \hat{P}\left\{F\right\}={\hat{P}}_{\varepsilon}\left\{F\right\}{\hat{\alpha}}_C \) proceeds as follows.
First of all, according to its definition, the variance reads
Since the two estimators \( {\hat{P}}_{\varepsilon}\left\{F\right\} \) and \( {\hat{\alpha}}_C \) are independent, the variance also reads
According to the König–Huygens theorem, \( \mathrm{Var}\left[\hat{P}\left\{F\right\}\right] \) can be further elaborated
Taking advantage of the unbiasedness of the estimators, we can obtain the following result,
Then, the variation coefficient of the estimator is expressed as follows,
In practice, the usual target variation coefficient is smaller than 10%, so that
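Writing \( P_{\varepsilon}=\mathrm{E}\left[{\hat{P}}_{\varepsilon}\left\{F\right\}\right] \) and \( {\alpha}_C=\mathrm{E}\left[{\hat{\alpha}}_C\right] \), the steps above can be written out as:

```latex
\mathrm{Var}\!\left[\hat{P}\{F\}\right]
=\mathrm{E}\!\left[\hat{P}_{\varepsilon}^{2}\{F\}\right]\mathrm{E}\!\left[\hat{\alpha}_C^{2}\right]
-P_{\varepsilon}^{2}\,\alpha_C^{2}
=\mathrm{Var}\!\left[\hat{P}_{\varepsilon}\{F\}\right]\alpha_C^{2}
+\mathrm{Var}\!\left[\hat{\alpha}_C\right]P_{\varepsilon}^{2}
+\mathrm{Var}\!\left[\hat{P}_{\varepsilon}\{F\}\right]\mathrm{Var}\!\left[\hat{\alpha}_C\right],
```

```latex
\delta^{2}\!\left[\hat{P}\{F\}\right]
=\delta^{2}\!\left[\hat{P}_{\varepsilon}\{F\}\right]
+\delta^{2}\!\left[\hat{\alpha}_C\right]
+\delta^{2}\!\left[\hat{P}_{\varepsilon}\{F\}\right]\delta^{2}\!\left[\hat{\alpha}_C\right].
```

With both variation coefficients below 10%, the cross term is at most 10⁻⁴ and can be neglected, giving \( \delta\!\left[\hat{P}\{F\}\right]\approx\sqrt{\delta^{2}\!\left[\hat{P}_{\varepsilon}\{F\}\right]+\delta^{2}\!\left[\hat{\alpha}_C\right]} \).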
Rights and permissions
About this article
Cite this article
Ling, C., Lu, Z. Support vector machine-based importance sampling for rare event estimation. Struct Multidisc Optim 63, 1609–1631 (2021). https://doi.org/10.1007/s00158-020-02809-8