Uncertainty quantification: a minimum variance unbiased (joint) estimator of the non-normalized Sobol’ indices

  • Regular Article
  • Statistical Papers

Abstract

Uncertainty quantification is often followed by the computation of sensitivity indices of the input factors. Variance-based sensitivity analysis and multivariate sensitivity analysis (MSA) aim to apportion the variability of the model output(s) among the input factors and their interactions. Sobol’ indices (first-order and total indices), which quantify the effects of the input factor(s), serve as a practical tool for assessing the interactions among input factors, their order, and their magnitude. In this paper, we investigate a novel way of estimating both the first-order and total indices based on U-statistics, and we study the statistical properties of the new estimator. First, we provide a minimum variance unbiased estimator of the non-normalized Sobol’ indices, together with its optimal rate of convergence and its asymptotic distribution. Second, we derive a joint estimator of Sobol’ indices and establish its consistency and its asymptotic distribution. Third, we demonstrate the applicability of these results by means of numerical tests. The new estimator improves the estimation of Sobol’ indices for some degrees of the kernel.

References

  • Antoniadis A, Pasanisi A (2012) Modeling of computer experiments for uncertainty propagation and sensitivity analysis. Stat Comput 22(3):677–679. https://doi.org/10.1007/s11222-011-9282-8

  • Bolado-Lavin R, Castaings W, Tarantola S (2009) Contribution to the sample mean plot for graphical and numerical sensitivity analysis. Reliab Eng Syst Saf 94(6):1041–1049

  • Borgonovo E, Tarantola S, Plischke E, Morris MD (2014) Transformations and invariance in the sensitivity analysis of computer experiments. J R Stat Soc Ser B 76(5):925–947

  • Buzzard GT (2012) Global sensitivity analysis using sparse grid interpolation and polynomial chaos. Reliab Eng Syst Saf 107:82–89

  • Caflisch RE, Morokoff W, Owen AB (1997) Valuation of mortgage backed securities using Brownian bridges to reduce effective dimension. J Comput Financ 1:27–46

  • Chan K, Saltelli A, Tarantola S (2000) Winding stairs: a sampling tool to compute sensitivity indices. Stat Comput 10(3):187–196. https://doi.org/10.1023/A:1008950625967

  • Conti S, O’Hagan A (2010) Bayesian emulation of complex multi-output and dynamic computer models. J Stat Plan Inference 140(3):640–651

  • Dutang C, Savicky P (2013) randtoolbox: generating and testing random numbers. R package version 1:13

  • Fang S, Gertner GZ, Shinkareva S, Wang G, Anderson A (2003) Improved generalized Fourier amplitude sensitivity test (FAST) for model assessment. Stat Comput 13(3):221–226. https://doi.org/10.1023/A:1024266632666

  • Ferguson TS (1996) A course in large sample theory. Chapman-Hall, New York

  • Gamboa F, Janon A, Klein T, Lagnoux A (2014) Sensitivity indices for multivariate outputs. Comptes Rendus de l’Académie des Sciences 351(7–8):307–310

  • Ghanem R, Higdon D, Owhadi H (2017) Handbook of uncertainty quantification. Springer, Cham

  • Hoeffding W (1948a) A class of statistics with asymptotically normal distribution. Ann Math Stat 19:293–325

  • Hoeffding W (1948b) A non-parametric test for independence. Ann Math Stat 19:546–557

  • Homma T, Saltelli A (1996) Importance measures in global sensitivity analysis of nonlinear models. Reliab Eng Syst Saf 52:1–17

  • Jansen MJW (1999) Analysis of variance designs for model output. Comput Phys Commun 117:35–43

  • Jourdan A (2012) Global sensitivity analysis using complex linear models. Stat Comput 22(3):823–831. https://doi.org/10.1007/s11222-011-9239-y

  • Kucherenko S, Rodriguez-Fernandez M, Pantelides C, Shah N (2009) Monte Carlo evaluation of derivative-based global sensitivity measures. Reliab Eng Syst Saf 94:1135–1148

  • Kucherenko S, Feil B, Shah N, Mauntz W (2011) The identification of model effective dimensions using global sensitivity analysis. Reliab Eng Syst Saf 96(4):440–449

  • Kucherenko S, Tarantola S, Annoni P (2012) Estimation of global sensitivity indices for models with dependent variables. Comput Phys Commun 183(4):937–946

  • Kucherenko S, Delpuech B, Iooss B, Tarantola S (2015) Application of the control variate technique to estimation of total sensitivity indices. Reliab Eng Syst Saf 134:251–259

  • Lamboni M (2016a) Global sensitivity analysis: a generalized, unbiased and optimal estimator of total-effect variance. Stat Pap 59(1):361–386. https://doi.org/10.1007/s00362-016-0768-5

  • Lamboni M (2016b) Global sensitivity analysis: an efficient numerical method for approximating the total sensitivity index. Int J Uncertain Quant (accepted)

  • Lamboni M, Makowski D, Monod H (2008) Multivariate global sensitivity analysis for discrete-time models. Rapport technique 2008-3. INRA, UR341 Mathématiques et Informatique Appliquées, Jouy-en-Josas, France

  • Lamboni M, Makowski D, Lehuger S, Gabrielle B, Monod H (2009) Multivariate global sensitivity analysis for dynamic crop models. Field Crops Res 113:312–320

  • Lamboni M, Monod H, Makowski D (2011) Multivariate sensitivity analysis to measure global contribution of input factors in dynamic models. Reliab Eng Syst Saf 96:450–459

  • Lamboni M, Iooss B, Popelin AL, Gamboa F (2013) Derivative-based global sensitivity measures: general links with Sobol’ indices and numerical tests. Math Comput Simul 87:45–54

  • Lehmann EL (1951) Consistency and unbiasedness of certain nonparametric tests. Ann Math Stat 22:165–179

  • Lehmann EL (1999) Elements of large sample theory. Springer, New York

  • Mara TA, Joseph OR (2008) Comparison of some efficient methods to evaluate the main effect of computer model factors. J Stat Comput Simul 78(2):167–178. https://doi.org/10.1080/10629360600964454

  • Mara TA, Tarantola S (2012) Variance-based sensitivity indices for models with dependent inputs. Reliab Eng Syst Saf 107:115–121

  • Mara TA, Tarantola S, Annoni P (2015) Non-parametric methods for global sensitivity analysis of model output with dependent inputs. Environ Model Softw 72:173–183

  • Marrel A, Iooss B, Da Veiga S, Ribatet M (2012) Global sensitivity analysis of stochastic computer models with joint metamodels. Stat Comput 22(3):833–847. https://doi.org/10.1007/s11222-011-9274-8

  • Muehlenstaedt T, Roustant O, Carraro L, Kuhnt S (2012) Data-driven kriging models based on FANOVA-decomposition. Stat Comput 22(3):723–738. https://doi.org/10.1007/s11222-011-9259-7

  • Oakley JE, O’Hagan A (2004) Probabilistic sensitivity analysis of complex models: a Bayesian approach. J R Stat Soc Ser B (Stat Methodol) 66(3):751–769

  • Owen AB (2013a) Better estimation of small Sobol’ sensitivity indices. ACM Trans Model Comput Simul 23(2):11:1–11:17

  • Owen AB (2013b) Variance components and generalized Sobol’ indices. SIAM/ASA J Uncertain Quant 1(1):19–41

  • Plischke E, Borgonovo E, Smith CL (2013) Global sensitivity measures from given data. Eur J Oper Res 226(3):536–550

  • Pujol G, Iooss B, Janon A (2013) sensitivity: sensitivity analysis. R package version 1:7

  • Rao CR, Kleffe J (1988) Estimation of variance components and applications. North Holland, Amsterdam

  • Ratto M, Pagano A (2010) Using recursive algorithms for the efficient identification of smoothing spline ANOVA models. AStA Adv Stat Anal 94(4):367–388

  • Ratto M, Pagano A, Young P (2007) State dependent parameter metamodelling and sensitivity analysis. Comput Phys Commun 177(11):863–876

  • Saltelli A (2002) Making best use of model evaluations to compute sensitivity indices. Comput Phys Commun 145:280–297

  • Saltelli A, Tarantola S, Chan K (1999) Quantitative model independent methods for global sensitivity analysis of model output. Technometrics 41:39–56

  • Saltelli A, Chan K, Scott E (2000) Variance-based methods. Probability and statistics. Wiley, New York

  • Saltelli A, Annoni P, Azzini I, Campolongo F, Ratto M, Tarantola S (2010) Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Comput Phys Commun 181(2):259–270

  • Sobol IM (1993) Sensitivity analysis for non-linear mathematical models. Math Model Comput Exp 1:407–414

  • Sobol IM (2001) Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math Comput Simul 55:271–280

  • Sobol IM (2007) Global sensitivity analysis indices for the investigation of nonlinear mathematical models. Matematicheskoe Modelirovanie 19:23–24

  • Sobol IM, Kucherenko S (2009) Derivative based global sensitivity measures and the link with global sensitivity indices. Math Comput Simul 79:3009–3017

  • Storlie CB, Swiler LP, Helton JC, Sallaberry CJ (2009) Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models. Reliab Eng Syst Saf 94(11):1735–1763

  • Sudret B (2008) Global sensitivity analysis using polynomial chaos expansions. Reliab Eng Syst Saf 93(7):964–979

  • Sugiura N (1965) Multisample and multivariate nonparametric tests based on \(u\) statistics and their asymptotic efficiencies. Osaka J Math 2(2):385–426. http://projecteuclid.org/euclid.ojm/1200691466

Author information

Corresponding author

Correspondence to Matieyendou Lamboni.

Appendices

Appendix A: A useful summary of U-statistics theory

This section presents the main theorems about U-statistics used in this paper. While we consider only three independent samples, the general version of the theorems can be found in Ferguson (1996), Hoeffding (1948a, b) and Sugiura (1965).

Let \((X_1, \ldots , X_{n_1})\), \((Y_1, \ldots , Y_{n_2})\), and \((Z_1, \ldots , Z_{n_3})\) be three independent samples of sizes \(n_1, n_2, n_3\) from the distributions \(D_1\), \(D_2\), and \(D_3\), respectively. Suppose that our parameter of interest \(\theta \) is defined as follows:

$$\begin{aligned} \theta =\mathbb {E}\left[ K\left( X_1, \ldots , X_{p}; Y_1, \ldots , Y_{q}; Z_1, \ldots , Z_{r} \right) \right] , \end{aligned}$$
(6.1)

where \( K\left( \cdot \right) \) is a kernel with \(p+q+r\) arguments that is symmetric in each of its first, second, and third groups of arguments.

The U-statistic corresponding to \(\theta \) is given by

$$\begin{aligned} \widehat{\theta } = \frac{1}{\left( {\begin{array}{c}n_1\\ p\end{array}}\right) \left( {\begin{array}{c}n_2\\ q\end{array}}\right) \left( {\begin{array}{c}n_3\\ r\end{array}}\right) } \sum _{\begin{array}{c} 1\le i_1< i_2< \cdots< i_p \le n_1 \\ 1\le j_1< j_2< \cdots<j_q \le n_2 \\ 1\le k_1< k_2< \cdots <k_r \le n_3 \end{array}} K\left( X_{i_1}, \ldots , X_{i_p}; Y_{j_1}, \ldots , Y_{j_q}; Z_{k_1}, \ldots , Z_{k_r} \right) . \end{aligned}$$
(6.2)
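
As an illustration of (6.2), the following Python sketch evaluates a three-sample U-statistic by brute-force enumeration of all index subsets. The function name `u_statistic`, the kernel signature, and the toy kernel are assumptions made for this example and do not come from the paper. The enumeration cost grows combinatorially, so the sketch is only practical for small samples.

```python
from itertools import combinations
from math import comb


def u_statistic(kernel, xs, ys, zs, p, q, r):
    """Average `kernel` over all p-, q- and r-subsets of the three samples."""
    total = 0.0
    for x_sub in combinations(xs, p):
        for y_sub in combinations(ys, q):
            for z_sub in combinations(zs, r):
                total += kernel(x_sub, y_sub, z_sub)
    n_terms = comb(len(xs), p) * comb(len(ys), q) * comb(len(zs), r)
    return total / n_terms


# Hypothetical kernel with p = 2, q = 1, r = 1. It is symmetric in its two
# X arguments and is unbiased for Var(X) + E[Y] * E[Z].
def example_kernel(x, y, z):
    return 0.5 * (x[0] - x[1]) ** 2 + y[0] * z[0]


xs = [1.0, 2.0, 4.0]
ys = [1.0, 3.0]
zs = [2.0, 2.0]
theta_hat = u_statistic(example_kernel, xs, ys, zs, p=2, q=1, r=1)
```

Here the first term of the kernel averages to the usual sample variance of `xs`, and the second term averages to the product of the sample means of `ys` and `zs`.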

The estimator \(\widehat{\theta }\) is unbiased, and its variance is given by

$$\begin{aligned} \mathbb {V}\left[ \widehat{\theta }\right] = \sum _{i=0}^{p} \sum _{j=0}^{q} \sum _{k=0}^{r} \frac{\left( {\begin{array}{c}p\\ i\end{array}}\right) \left( {\begin{array}{c}n_1-p\\ p-i\end{array}}\right) \left( {\begin{array}{c}q\\ j\end{array}}\right) \left( {\begin{array}{c}n_2-q\\ q-j\end{array}}\right) \left( {\begin{array}{c}r\\ k\end{array}}\right) \left( {\begin{array}{c}n_3-r\\ r-k\end{array}}\right) }{\left( {\begin{array}{c}n_1\\ p\end{array}}\right) \left( {\begin{array}{c}n_2\\ q\end{array}}\right) \left( {\begin{array}{c}n_3\\ r\end{array}}\right) } \sigma _{i,j,k}^2, \end{aligned}$$
(6.3)

with \(\sigma _{i,j,k}^2= \mathbb {V}\Big [ \mathbb {E}\Big [K\Big (X_{1},\ldots , X_{p}; Y_{1}, \ldots , Y_{q}; Z_{1}, \ldots , Z_{r} \Big ) | X_{1}, \ldots , X_{i}; Y_{1}, \ldots , Y_{j}; Z_{1}, \ldots , Z_{k} \Big ]\Big ] \) and the convention \(\sigma _{0,0,0}^2=0\).

The U-statistic \(\widehat{\theta }\) is a function of the order statistics, as it is symmetric w.r.t. each of its three types of arguments. It is known that order statistics are sufficient and complete statistics for non-parametric families such as the class of distribution functions having finite fourth moments. By the Lehmann–Scheffé theorem, it follows that \(\widehat{\theta }\) is the unique uniformly minimum variance unbiased estimator of \(\theta \), given \(X_1, \ldots , X_{n_1}; Y_1, \ldots , Y_{n_2}; Z_1, \ldots , Z_{n_3}\). The estimator \(\widehat{\theta }\) is also asymptotically normal.

The same results can be derived for the multivariate U-statistics, and all elements can be found in Ferguson (1996), Lehmann (1999), Hoeffding (1948a, b) and Sugiura (1965).

Appendix B: A generalized estimator of the total sensitivity index

This section mainly recalls Corollary 2 from Lamboni (2016a).

For \(p\ge 2\), let \(\mathcal {X}=(\mathbf {X}^{(1)}_{u},\ldots , \mathbf {X}^{(p)}_{u})\) be \(p\) i.i.d. copies of \(\mathbf {X}_{u}\) and \(\mathcal {Y}=\mathbf {X}_{\sim u}\). We consider the following symmetric kernel:

$$\begin{aligned} K\left( \mathbf {X}_{u}^{(1)}, \ldots , \mathbf {X}_{u}^{(p)}, \mathbf {X}_{\sim u} \right) = \frac{1}{p^2(p-1)}\sum _{k=1}^p \left( \sum _{\begin{array}{c} j=1 \\ j\ne k \end{array}}^p [f(\mathbf {X}_{u}^{(k)}, \mathbf {X}_{\sim u}) - f(\mathbf {X}_{u}^{(j)}, \mathbf {X}_{\sim u})] \right) ^2. \end{aligned}$$
(6.4)

It is known that the expectation of \(K\left( \mathbf {X}_{u}^{(1)}, \ldots , \mathbf {X}_{u}^{(p)}, \mathbf {X}_{\sim u} \right) \) is the non-normalized total index, that is,

$$\begin{aligned} D_u^{\textit{tot}} =\mathbb {E}\left[ K\left( \mathbf {X}_{u}^{(1)}, \ldots , \mathbf {X}_{u}^{(p)}, \mathbf {X}_{\sim u} \right) \right] . \end{aligned}$$
(6.5)
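
One way to check (6.5) is sketched below. The sketch assumes that \(f\) is square integrable, that \(\mathbf {X}_{u}\) and \(\mathbf {X}_{\sim u}\) are independent, and that the non-normalized total index is defined in the usual way as \(D_u^{\textit{tot}}=\mathbb {E}\left[ \mathbb {V}\left[ f(\mathbf {X}) | \mathbf {X}_{\sim u}\right] \right] \). The shorthand \(f_k=f(\mathbf {X}_{u}^{(k)}, \mathbf {X}_{\sim u})\) and \(\bar{f}=\frac{1}{p}\sum _{k=1}^p f_k\) is introduced only for this sketch. The inner sum of the kernel simplifies to

$$\begin{aligned} \sum _{\begin{array}{c} j=1 \\ j\ne k \end{array}}^p \left( f_k - f_j\right) = p f_k - \sum _{j=1}^p f_j = p\left( f_k - \bar{f}\right) , \end{aligned}$$

so the kernel is the sample variance of \(f_1, \ldots , f_p\):

$$\begin{aligned} K\left( \mathbf {X}_{u}^{(1)}, \ldots , \mathbf {X}_{u}^{(p)}, \mathbf {X}_{\sim u} \right) = \frac{1}{p-1}\sum _{k=1}^p \left( f_k - \bar{f}\right) ^2. \end{aligned}$$

Conditionally on \(\mathbf {X}_{\sim u}\), the variables \(f_1, \ldots , f_p\) are i.i.d., and the sample variance is unbiased for their common conditional variance. Taking the expectation over \(\mathbf {X}_{\sim u}\) then gives

$$\begin{aligned} \mathbb {E}\left[ K\right] = \mathbb {E}\left[ \mathbb {E}\left[ K | \mathbf {X}_{\sim u}\right] \right] = \mathbb {E}\left[ \mathbb {V}\left[ f(\mathbf {X}) | \mathbf {X}_{\sim u}\right] \right] = D_u^{\textit{tot}}. \end{aligned}$$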

Given two independent samples \(\mathcal {X}_i= (\mathbf {X}^{(1)}_{i,u},\ldots , \mathbf {X}^{(p)}_{i,u})\) and \(\mathcal {Y}_i =\mathbf {X}_{i,\sim u}\), \(i=1, 2, \ldots , m\), from \(\mathcal {X}\) and \(\mathcal {Y}\) respectively, the U-statistic corresponding to the kernel in (6.4) is given by

$$\begin{aligned} \widehat{D}_{u}^{\textit{tot}} = \frac{1}{m p^2(p-1)} \sum _{i=1}^m \sum _{k=1}^p \left( \sum _{\begin{array}{c} j=1\\ j\ne k \end{array}}^p [f(\mathbf {X}_{i,u}^{(k)}, \mathbf {X}_{i,\sim u}) - f(\mathbf {X}_{i,u}^{(j)}, \mathbf {X}_{i,\sim u})] \right) ^2. \end{aligned}$$
(6.6)
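
The estimator (6.6) can be sketched in Python with NumPy as follows. This is a minimal illustration rather than the author's implementation: the function name `total_index_estimate`, the array layout, the callable signature of `f`, and the additive test model are all assumptions chosen for the example.

```python
import numpy as np


def total_index_estimate(f, x_u, x_rest):
    """Estimate the non-normalized total index following Eq. (6.6).

    x_u:    array of shape (m, p, d_u), p i.i.d. copies of X_u per run
    x_rest: array of shape (m, d_rest), one draw of X_{~u} per run
    f:      callable f(x_u_row, x_rest_row) -> float (hypothetical signature)
    """
    m, p = x_u.shape[0], x_u.shape[1]
    # Model evaluations f(X_{i,u}^{(k)}, X_{i,~u}), stored with shape (m, p).
    y = np.array([[f(x_u[i, k], x_rest[i]) for k in range(p)] for i in range(m)])
    # The inner sum over j != k of (y_k - y_j) equals p * y_k - sum_j y_j.
    inner = p * y - y.sum(axis=1, keepdims=True)
    # One kernel value per run i; their mean is the estimator in (6.6).
    kernel_values = (inner ** 2).sum(axis=1) / (p ** 2 * (p - 1))
    return kernel_values.mean(), kernel_values


rng = np.random.default_rng(0)
m, p = 2000, 3
x_u = rng.uniform(size=(m, p, 1))
x_rest = rng.uniform(size=(m, 1))
# Hypothetical additive test model: its total index for X_u is Var(X_u) = 1/12.
d_tot_hat, kernel_values = total_index_estimate(
    lambda xu, xr: xu[0] + 2.0 * xr[0], x_u, x_rest
)
```

Returning the per-run kernel values alongside their mean is a design choice of this sketch: the same values can be reused to estimate the variance of the kernel. Each run costs \(p\) model evaluations, so the total budget is \(mp\) evaluations.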

The properties of \(\widehat{D}_{u}^{\textit{tot}}\) are as follows:

(i) \(\widehat{D}_{u}^{\textit{tot}}\) defined in (6.6) is a minimum variance unbiased estimator (MVUE) of \(D_u^{\textit{tot}}\);

(ii) the variance of \(\widehat{D}_{u}^{\textit{tot}}\) is given by

$$\begin{aligned} \mathbb {V}(\widehat{D}_{u}^{\textit{tot}}) = \frac{\sigma _{p,1}^2 }{m}, \end{aligned}$$
(6.7)

with \(\sigma _{p,1}^2 \) the variance of the kernel. Moreover, we have

$$\begin{aligned} m\mathbb {E}\left( \widehat{D}_{u}^{\textit{tot}} - D_{u}^{\textit{tot}} \right) ^2 = \sigma _{p,1}^2; \end{aligned}$$
(6.8)

(iii) if \(m \rightarrow +\infty \), we have

$$\begin{aligned} \sqrt{m}\left( \widehat{D}_{u}^{\textit{tot}} - D_{u}^{\textit{tot}} \right) \xrightarrow {\mathcal {D}} \mathcal {N}\left( 0, \sigma _{p,1}^2 \right) . \end{aligned}$$
(6.9)
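
The asymptotic normality in (6.9), combined with the variance in (6.7), suggests an approximate confidence interval for \(D_u^{\textit{tot}}\). The Python sketch below is one possible way to compute it. Replacing \(\sigma _{p,1}^2\) with the sample variance of the per-run kernel values, the default quantile 1.96 (roughly a 95% level), the function name, and the illustrative input values are assumptions of this example, not results stated in the paper.

```python
import math

import numpy as np


def total_index_confidence_interval(kernel_values, z=1.96):
    """Approximate confidence interval for D_u^tot from per-run kernel values."""
    kernel_values = np.asarray(kernel_values, dtype=float)
    m = kernel_values.size
    # Point estimate: the mean of the m kernel values, as in (6.6).
    estimate = kernel_values.mean()
    # Plug-in estimate of sigma_{p,1}^2 (assumption): sample variance of the kernel values.
    sigma2_hat = kernel_values.var(ddof=1)
    # Standard error sqrt(sigma^2 / m) from (6.7), scaled by the normal quantile.
    half_width = z * math.sqrt(sigma2_hat / m)
    return estimate - half_width, estimate + half_width


# Made-up kernel values, used only to show the call.
lower, upper = total_index_confidence_interval([0.05, 0.10, 0.08, 0.12, 0.07])
```

The interval is only asymptotically valid, so it should be read with caution when \(m\) is small.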

Cite this article

Lamboni, M. Uncertainty quantification: a minimum variance unbiased (joint) estimator of the non-normalized Sobol’ indices. Stat Papers 61, 1939–1970 (2020). https://doi.org/10.1007/s00362-018-1010-4

  • DOI: https://doi.org/10.1007/s00362-018-1010-4
