Abstract
The FANOVA (or “Sobol’-Hoeffding”) decomposition of multivariate functions has been used for high-dimensional model representation and global sensitivity analysis. When the objective function f has no simple analytic form and is costly to evaluate, computing FANOVA terms may be unaffordable due to numerical integration costs. Several approximate approaches relying on Gaussian random field (GRF) models have been proposed to alleviate these costs, in which f is replaced by a (kriging) predictor or by conditional simulations. Here we focus on FANOVA decompositions of GRF sample paths, and we notably introduce an associated decomposition of the covariance kernel into \(4^{d}\) terms, called KANOVA. An interpretation in terms of tensor-product projections is obtained, and it is shown that projected kernels control both the sparsity of GRF sample paths and the dependence structure between FANOVA effects. Applications to simulated data show the relevance of the approach for designing new classes of covariance kernels dedicated to high-dimensional kriging.
References
Adler, R., Taylor, J.: Random Fields and Geometry. Springer, Boston (2007)
Antoniadis, A.: Analysis of variance on function spaces. Statistics 15, 59–71 (1984)
Berlinet, A., Thomas-Agnan, C.: Reproducing Kernel Hilbert Spaces in Probability and Statistics. Kluwer Academic Publishers, Boston (2004)
Chastaing, G., Le Gratiet, L.: ANOVA decomposition of conditional Gaussian processes for sensitivity analysis with dependent inputs. J. Stat. Comput. Simul. 85(11), 2164–2186 (2015)
Durrande, N., Ginsbourger, D., Roustant, O.: Additive covariance kernels for high-dimensional Gaussian process modeling. Ann. Fac. Sci. Toulous. Math. 21, 481–499 (2012)
Durrande, N., Ginsbourger, D., Roustant, O., Carraro, L.: ANOVA kernels and RKHS of zero mean functions for model-based sensitivity analysis. J. Multivar. Anal. 115, 57–67 (2013)
Dupuy, D., Helbert, C., Franco, J.: DiceDesign and DiceEval: two R packages for design and analysis of computer experiments. J. Stat. Softw. 65(11), 1–38 (2015)
Duvenaud, D.: Automatic model construction with Gaussian processes. Ph.D. thesis, Department of Engineering, University of Cambridge (2014)
Duvenaud, D., Nickisch, H., Rasmussen, C.E.: Additive Gaussian processes. In: Advances in Neural Information Processing Systems (NIPS) (2011)
Efron, B., Stein, C.: The jackknife estimate of variance. Ann. Stat. 9, 586–596 (1981)
Franco, J., Dupuy, D., Roustant, O., Damblin, G., Iooss, B.: DiceDesign: Designs of computer experiments. R package version 1.7 (2015)
Gikhman, I.I., Skorokhod, A.V.: The Theory of Stochastic Processes. Springer, Berlin (2004). Translated from the Russian by S. Kotz. Reprint of the 1974 edition
Hoeffding, W.: A class of statistics with asymptotically normal distributions. Ann. Math. Stat. 19, 293–325 (1948)
Jan, B., Bect, J., Vazquez, E., Lefranc, P.: Approche bayésienne pour l’estimation d’indices de Sobol. In: 45èmes Journées de Statistique (JdS 2013), Toulouse, France (2013)
Janon, A., Klein, T., Lagnoux, A., Nodet, M., Prieur, C.: Asymptotic normality and efficiency of two Sobol index estimators. ESAIM Probab. Stat. (2013)
Kaufman, C., Sain, S.: Bayesian functional ANOVA modeling using Gaussian process prior distributions. Bayesian Anal. 5, 123–150 (2010)
Krée, P.: Produits tensoriels complétés d’espaces de Hilbert. Séminaire Paul Krée Vol 1, No. 7 (1974–1975)
Kuelbs, J.: Expansions of vectors in a Banach space related to Gaussian measures. Proc. Am. Math. Soc. 27(2), 364–370 (1971)
Kuo, F.Y., Sloan, I.H., Wasilkowski, G.W., Wozniakowski, H.: On decompositions of multivariate functions. Math. Comput. 79, 953–966 (2010)
Le Gratiet, L., Cannamela, C., Iooss, B.: A Bayesian approach for global sensitivity analysis of (multi-fidelity) computer codes. SIAM/ASA J. Uncertain. Quantif. 2(1), 336–363 (2014)
Lenz, N.: Additivity and ortho-additivity in Gaussian random fields. Master’s thesis, Department of Mathematics and Statistics, University of Bern (2013). http://hal.archives-ouvertes.fr/hal-01063741
Marrel, A., Iooss, B., Laurent, B., Roustant, O.: Calculations of Sobol indices for the Gaussian process metamodel. Reliab. Eng. Syst. Saf. 94, 742–751 (2009)
Muehlenstaedt, T., Roustant, O., Carraro, L., Kuhnt, S.: Data-driven Kriging models based on FANOVA-decomposition. Stat. Comput. 22(3), 723–738 (2012)
Oakley, J., O’Hagan, A.: Probabilistic sensitivity analysis of complex models: a Bayesian approach. J. R. Stat. Soc. 66, 751–769 (2004)
Rajput, B.S., Cambanis, S.: Gaussian processes and Gaussian measures. Ann. Math. Stat. 43, 1944–1952 (1972)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge (2006)
Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M., Tarantola, S.: Global Sensitivity Analysis: The Primer. Wiley (2008)
Santner, T., Williams, B., Notz, W.: The Design and Analysis of Computer Experiments. Springer, New York (2003)
Sawa, T.: The exact moments of the least squares estimator for the autoregressive model. J. Econom. 8(2), 159–172 (1978)
Scheuerer, M.: A comparison of models and methods for spatial interpolation in statistics and numerical analysis. Ph.D. thesis, Georg-August-Universität Göttingen (2009)
Schuhmacher, D.: Distance estimates for Poisson process approximations of dependent thinnings. Electron. J. Probab. 10(5), 165–201 (2005)
Sobol’, I.: Multidimensional Quadrature Formulas and Haar Functions. Nauka, Moscow (1969). (In Russian)
Sobol’, I.: Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. Comput. Simul. 55(1–3), 271–280 (2001)
Steinwart, I., Scovel, C.: Mercer’s theorem on general domains: on the interaction between measures, kernels, and RKHSs. Constr. Approx. 35(3), 363–417 (2012)
Talagrand, M.: Regularity of Gaussian processes. Acta Math. 159(1–2), 99–149 (1987)
Tarieladze, V., Vakhania, N.: Disintegration of Gaussian measures and average-case optimal algorithms. J. Complex. 23(4–6), 851–866 (2007)
Touzani, S.: Response surface methods based on analysis of variance expansion for sensitivity analysis. Ph.D. thesis, Université de Grenoble (2011)
Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
Wahba, G.: Spline Models for Observational Data. SIAM, Philadelphia (1990)
Welch, W.J., Buck, R.J., Sacks, J., Wynn, H.P., Mitchell, T.J., Morris, M.D.: Screening, predicting, and computer experiments. Technometrics 34, 15–25 (1992)
Acknowledgments
The authors would like to thank Dario Azzimonti for proofreading, as well as the editors and an anonymous referee for their valuable comments and suggestions.
Appendices
Proofs
Theorem 1 (a) The first part and the concrete solution (6) follow directly from the corresponding statements in Sect. 2. Having established (6), it is easily seen that \([T_{{\mathfrak {u}}}\otimes T_{{\mathfrak {v}}}]k =T^{(1)}_{{\mathfrak {u}}} T^{(2)}_{{\mathfrak {v}}} k\) coincides with \(k_{{\mathfrak {u}},{\mathfrak {v}}}\).
(b) Under these conditions Mercer’s theorem applies (see [34] for an overview and recent extensions). So there exist a non-negative sequence \((\lambda _{i})_{i \in \mathbb {N}\backslash \{0\}}\) and continuous representatives \((\phi _{i})_{i \in \mathbb {N}\backslash \{0\}}\) of an orthonormal basis of \(\mathrm {L}^2(\nu )\) such that \( k(\mathbf {x}, \mathbf {y})=\sum _{i=1}^{\infty } \lambda _{i} \phi _{i}(\mathbf {x}) \phi _{i}(\mathbf {y}) \), \(\mathbf {x}, \mathbf {y} \in D\), where the convergence is absolute and uniform. Noting that \(T_{{\mathfrak {u}}}, T_{{\mathfrak {v}}}\) are also bounded as operators on continuous functions, applying \(T^{(1)}_{{\mathfrak {u}}} T^{(2)}_{{\mathfrak {v}}}\) from above yields that \(k_{{\mathfrak {u}},{\mathfrak {v}}}(\mathbf {x},\mathbf {y}) = \sum _{i=1}^{\infty } \lambda _{i}\, (T_{{\mathfrak {u}}}\phi _{i})(\mathbf {x})\, (T_{{\mathfrak {v}}}\phi _{i})(\mathbf {y})\), and hence
\[ \sum _{{\mathfrak {u}},{\mathfrak {v}}\subseteq I} \alpha _{{\mathfrak {u}}}\alpha _{{\mathfrak {v}}}\, k_{{\mathfrak {u}},{\mathfrak {v}}}(\mathbf {x},\mathbf {y}) = \sum _{i=1}^{\infty } \lambda _{i}\, \psi _{i}(\mathbf {x})\, \psi _{i}(\mathbf {y}), \]
where \(\psi _{i}= \sum _{{\mathfrak {u}}\subseteq I} \alpha _{{\mathfrak {u}}} (T_{{\mathfrak {u}}}\phi _{i})\). Thus the considered function is indeed s.p.d. \(\square \)
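To make the projection structure concrete, here is a minimal numerical sketch (ours, not part of the original text) for \(d=1\) with \(\nu \) the uniform measure on [0, 1], discretized by a midpoint rule. The kernel choice and grid size are illustrative assumptions; the four projected kernels are obtained by averaging or centring each argument of k separately, and they sum back to k as stated in part (a).

```python
import numpy as np

n = 400
x = np.linspace(0.5 / n, 1.0 - 0.5 / n, n)   # midpoint rule on [0, 1]

# Illustrative 1-d kernel (exponential, theta = 0.3); K[i, j] = k(x_i, x_j).
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3)

m1 = K.mean(axis=0)   # first argument integrated out: a function of y
m2 = K.mean(axis=1)   # second argument integrated out: a function of x
g = K.mean()          # double integral of k

# The 4^d = 4 projected kernels for d = 1 (T_0 = average, T_1 = id - average):
K00 = g * np.ones_like(K)                   # [T_0 (x) T_0] k
K01 = np.zeros_like(K) + (m1 - g)[None, :]  # [T_0 (x) T_1] k
K10 = np.zeros_like(K) + (m2 - g)[:, None]  # [T_1 (x) T_0] k
K11 = K - m2[:, None] - m1[None, :] + g     # [T_1 (x) T_1] k

assert np.allclose(K00 + K01 + K10 + K11, K)   # the terms sum back to k
```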
Corollary 1 Expand the product \(\prod _{l=1}^{d}(1+k_{l}^{(0)}(x_{l},y_{l}))\) and conclude by uniqueness of the KANOVA decomposition, noting that \(\int \prod _{l\in {\mathfrak {u}}}k_{l}^{(0)}(x_{l},y_{l})\nu _i(\mathrm {d}x_i) = \int \prod _{l\in {\mathfrak {u}}}k_{l}^{(0)}(x_{l},y_{l})\nu _j(\mathrm {d}y_j) = 0\) for any \({\mathfrak {u}}\subseteq I\) and any \(i,j \in {\mathfrak {u}}\). \(\square \)
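The key step of this proof, namely that each product term \(\prod _{l\in {\mathfrak {u}}}k_{l}^{(0)}\) is annihilated by integrating out any coordinate in \({\mathfrak {u}}\), can also be checked numerically. A short sketch under the same illustrative assumptions as above (uniform \(\nu \), midpoint grid); the centring construction of the zero-mean factors \(k_l^{(0)}\) is our own choice for the example:

```python
import numpy as np

n = 40
x = np.linspace(0.5 / n, 1.0 - 0.5 / n, n)

def centred(K):
    # k^(0): kernel doubly centred w.r.t. the uniform measure, i.e. with
    # zero mean in each argument (illustrative way to build zero-mean factors)
    return K - K.mean(axis=0, keepdims=True) - K.mean(axis=1, keepdims=True) + K.mean()

K1 = centred(np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3))
K2 = centred(np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.2) ** 2))

# Term of prod_l (1 + k_l^(0)) indexed by u = {1, 2}, axes (x1, x2, y1, y2):
term = K1[:, None, :, None] * K2[None, :, None, :]

# Integrating out any coordinate in u annihilates the term, which is the
# orthogonality property identifying it as the KANOVA component k_{u,u}:
assert np.allclose(term.mean(axis=0), 0.0, atol=1e-10)   # integrate x1 out
assert np.allclose(term.mean(axis=3), 0.0, atol=1e-10)   # integrate y2 out
```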
Theorem 2 Sample path continuity implies product-measurability of Z and of \(Z^{({\mathfrak {u}})}\), as can be shown by an approximation argument; see e.g. Prop. A.D. in [31]. Due to Theorem 3 in [35], the covariance kernel k is continuous, hence \(\int _{D} {\mathbb {E}}|Z_{\mathbf {x}}| \, \nu _{-{\mathfrak {u}}}(\mathrm {d}\mathbf {x}_{-{\mathfrak {u}}}) \le (\int _{D} k(\mathbf {x}, \mathbf {x}) \, \nu _{-{\mathfrak {u}}}(\mathrm {d}\mathbf {x}_{-{\mathfrak {u}}}))^{1/2} < \infty \) for any \({\mathfrak {u}}\subseteq I\), and by Cauchy–Schwarz \(\int _{D} \int _{D} {\mathbb {E}}|Z_{\mathbf {x}}Z_{\mathbf {y}}| \, \nu _{-{\mathfrak {u}}}(\mathrm {d}\mathbf {x}_{-{\mathfrak {u}}})\, \nu _{-{\mathfrak {v}}}(\mathrm {d}\mathbf {y}_{-{\mathfrak {v}}}) < \infty \) for any \({\mathfrak {u}},{\mathfrak {v}}\subseteq I\). Replacing f by Z in Formula (2), taking expectations and using Fubini’s theorem yields that \(Z^{({\mathfrak {u}})}\) is centred as well. Combining (2), Fubini’s theorem, and (6) yields
\[ {{\mathrm{Cov}}}\bigl (Z^{({\mathfrak {u}})}_{\mathbf {x}}, Z^{({\mathfrak {v}})}_{\mathbf {y}}\bigr ) = \bigl ([T_{{\mathfrak {u}}}\otimes T_{{\mathfrak {v}}}]k\bigr )(\mathbf {x},\mathbf {y}) = k_{{\mathfrak {u}},{\mathfrak {v}}}(\mathbf {x},\mathbf {y}). \]
It remains to show the joint Gaussianity of the \(Z^{({\mathfrak {u}})}\). First note that \(C_b(D,\mathbb {R}^r)\) is a separable Banach space for \(r \in {\mathbb {N}}\setminus \{0\}\). We may and do interpret Z as a random element of \(C_b(D)\), equipped with the \(\sigma \)-algebra \(\mathscr {B}^{D}\) generated by the evaluation maps \([C_b(D) \ni f \mapsto f(\mathbf {x}) \in \mathbb {R}]\). By Theorem 2 in [25], the distribution \({\mathbb {P}}Z^{-1}\) of Z is a Gaussian measure on \(\bigl ( C_b(D),\mathscr {B}(C_b(D)) \bigr )\). Since \(T_{{\mathfrak {u}}}\) is a bounded linear operator \(C_b(D) \rightarrow C_b(D)\), we obtain immediately that the “combined operator” \(\mathfrak {T} :C_b(D) \rightarrow C_b(D,\mathbb {R}^{2^d})\), defined by \((\mathfrak {T}(f))(\mathbf {x}) = (T_{{\mathfrak {u}}}f(\mathbf {x}))_{{\mathfrak {u}}\subseteq I}\), is also bounded and linear. Corollary 3.7 of [36] yields that the image measure \(({\mathbb {P}}Z^{-1}) \mathfrak {T}^{-1}\) is a Gaussian measure on \(C_b(D,\mathbb {R}^{2^d})\). This means that for every bounded linear operator \(\Lambda :C_b(D,\mathbb {R}^{2^d}) \rightarrow \mathbb {R}\) the image measure \((({\mathbb {P}}Z^{-1}) \mathfrak {T}^{-1}) \Lambda ^{-1}\) is a univariate normal distribution, i.e. \(\Lambda (\mathfrak {T}Z)\) is a Gaussian random variable. Thus, for all \(n \in \mathbb {N}\), \(\mathbf {x}^{(i)}\in D\) and \(a_i^{({\mathfrak {u}})} \in \mathbb {R}\), where \(1 \le i \le n\), \({\mathfrak {u}} \subseteq I\), we obtain that \(\sum _{i=1}^n \sum _{{\mathfrak {u}}\subseteq I} a_i^{({\mathfrak {u}})} (T_{{\mathfrak {u}}} Z)_{\mathbf {x}^{(i)}}\) is Gaussian, using that \([C_b(D) \ni f \mapsto f(\mathbf {x}) \in \mathbb {R}]\) is continuous (and linear) for every \(\mathbf {x} \in D\). We conclude that \(\mathfrak {T}Z = (Z_{\mathbf {x}}^{({\mathfrak {u}})}, {\mathfrak {u}}\subseteq I)_{\mathbf {x}\in D}\) is a vector-valued GRF. \(\square \)
Corollary 2 (a) If (i) holds, \([T_{{\mathfrak {u}}} \otimes T_{{\mathfrak {u}}}]k= T_{{\mathfrak {u}}}^{(2)} (T_{{\mathfrak {u}}}^{(1)}k)=\mathbf {0}\) by \((T_{{\mathfrak {u}}}^{(1)}k)(\bullet ,\mathbf {y}) = T_{{\mathfrak {u}}}(k(\bullet ,\mathbf {y}))\); thus (ii) holds. (ii) trivially implies (iii). Statement (iii) means that \({{\mathrm{Var}}}(Z^{({\mathfrak {u}})}_{\mathbf {x}}) = 0\), which implies that \(Z^{({\mathfrak {u}})}_{\mathbf {x}} = 0\) a.s., since \(Z^{({\mathfrak {u}})}\) is centred. (iv) follows by noting that \({\mathbb {P}}(Z^{({\mathfrak {u}})}_{\mathbf {x}} = 0)=1\) for all \(\mathbf {x} \in D\) implies \(\mathbb {P}( Z^{({\mathfrak {u}})}=\mathbf {0} ) =1\) by the fact that \(Z^{({\mathfrak {u}})}\) has continuous sample paths and is therefore separable. Finally, (iv) implies (i) because \(T_{{\mathfrak {u}}}(k(\bullet ,\mathbf {y}))={{\mathrm{Cov}}}(Z^{({\mathfrak {u}})}_{{\bullet }},Z_{\mathbf {y}})=\mathbf {0}\); see (18) for the first equality.
(b) For any \(m,n \in \mathbb {N}\) and \(\mathbf {x}_1,\ldots ,\mathbf {x}_m,\mathbf {y}_1,\ldots ,\mathbf {y}_n \in D\) we obtain by Theorem 2 that \(Z^{({\mathfrak {u}})}_{\mathbf {x}_1}, \ldots , Z^{({\mathfrak {u}})}_{\mathbf {x}_m}, Z^{({\mathfrak {v}})}_{\mathbf {y}_1}, \ldots , Z^{({\mathfrak {v}})}_{\mathbf {y}_n}\) are jointly normally distributed. Statement (i) is equivalent to saying that \({{\mathrm{Cov}}}(Z^{({\mathfrak {u}})}_{\mathbf {x}},Z^{({\mathfrak {v}})}_{\mathbf {y}}) = 0\) for all \(\mathbf {x}, \mathbf {y} \in D\). Thus \((Z^{({\mathfrak {u}})}_{\mathbf {x}_1}, \ldots , Z^{({\mathfrak {u}})}_{\mathbf {x}_m})\) and \((Z^{({\mathfrak {v}})}_{\mathbf {y}_1}, \ldots , Z^{({\mathfrak {v}})}_{\mathbf {y}_n})\) are independent. Since the sets
\[ \bigl \{ (f,g) \in C_b(D) \times C_b(D) \,:\, \bigl (f(\mathbf {x}_1),\ldots ,f(\mathbf {x}_m)\bigr ) \in A, \ \bigl (g(\mathbf {y}_1),\ldots ,g(\mathbf {y}_n)\bigr ) \in B \bigr \} \]
with \(m,n \in \mathbb {N}\), \(\mathbf {x}_1,\ldots ,\mathbf {x}_m,\mathbf {y}_1,\ldots ,\mathbf {y}_n \in D\), \(A \in \mathscr {B}(\mathbb {R}^m)\), \(B \in \mathscr {B}(\mathbb {R}^n)\) generate \(\mathscr {B}^D \otimes \mathscr {B}^D\) (and the system of such sets is stable under intersections), statement (ii) follows. The converse direction is straightforward. \(\square \)
Corollary 3 By Remark 2, there is a Gaussian white noise sequence \(\varepsilon =(\varepsilon _{i})_{i\in \mathbb {N}\backslash \{0\}}\) such that \(Z_{\mathbf {x}}=\sum _{i=1}^{\infty } \sqrt{\lambda _{i}} \varepsilon _{i} \phi _{i}(\mathbf {x})\) uniformly with probability 1. From \(Z^{({\mathfrak {u}})}_{\mathbf {x}}=\sum _{i=1}^{\infty } \sqrt{\lambda _{i}} \varepsilon _{i} T_{{\mathfrak {u}}}\phi _{i}(\mathbf {x})\), we obtain \(\Vert Z^{({\mathfrak {u}})} \Vert ^2=Q_{{\mathfrak {u}}}(\varepsilon , \varepsilon )\) with \(Q_{{\mathfrak {u}}}\) as defined in the statement. A similar calculation for the denominator of \(S_{{\mathfrak {u}}}(Z)\) leads to \(\sum _{{\mathfrak {v}}\ne \emptyset } Q_{{\mathfrak {v}}}(\varepsilon , \varepsilon )\).\(\square \)
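Corollary 3 says that Sobol indices of GRF sample paths are ratios of quadratic forms in a white noise sequence, hence genuinely random. The following simulation sketch (ours, not from the paper) illustrates this point for \(d=2\) with a product kernel; rather than truncating the Karhunen-Loève expansion, it samples paths on a midpoint grid and computes their FANOVA effects by grid averaging. Kernel parameters, grid size, and jitter are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 25                                       # grid points per dimension, d = 2
x = np.linspace(0.5 / n, 1.0 - 0.5 / n, n)   # midpoint rule for uniform nu

k1 = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3)           # exponential
k2 = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.15) ** 2)   # Gaussian
K = np.kron(k1, k2)                          # product kernel on the tensor grid
L = np.linalg.cholesky(K + 1e-6 * np.eye(n * n))   # jitter for stability

S = []
for _ in range(500):
    Z = (L @ rng.standard_normal(n * n)).reshape(n, n)  # Z[i, j] = Z(x1_i, x2_j)
    m = Z.mean()
    z1 = Z.mean(axis=1) - m                       # FANOVA main effect of x1
    z2 = Z.mean(axis=0) - m                       # FANOVA main effect of x2
    z12 = Z - z1[:, None] - z2[None, :] - m       # interaction term
    q = np.array([(z1**2).mean(), (z2**2).mean(), (z12**2).mean()])
    S.append(q / q.sum())                         # Sobol indices of this path

print(np.mean(S, axis=0), np.std(S, axis=0))      # the indices are random
```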
Additional Examples
Here we give useful expressions for computing the KANOVA decomposition of some tensor product kernels with respect to the uniform measure on \([0,1]^{d}\). For simplicity, we denote by k the one-dimensional kernels on which they are based (corresponding to the notation \(k_i\) in Example 2). The uniform measure on [0, 1] is denoted by \(\lambda \).
Example 5
(Exponential kernel) If \(k(x,y) = \exp \left( - \frac{ \vert x-y \vert }{\theta } \right) \), then:
- \(\int _0^1 k(., y)\, d\lambda = \theta \times \left[ 2 - k(0, y) - k(1,y) \right] \)
- \(\iint _{[0,1]^2} k(.,.)\, d(\lambda \otimes \lambda ) = 2 \theta (1 - \theta + \theta e^{-1 / \theta } )\)
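As a sanity check (not part of the original text), these closed forms are easy to compare against numerical quadrature; in the following sketch, the values of \(\theta \) and y are arbitrary choices:

```python
import numpy as np
from scipy.integrate import quad, dblquad

theta, y0 = 0.25, 0.3
k = lambda x, y: np.exp(-abs(x - y) / theta)

lhs, _ = quad(lambda x: k(x, y0), 0.0, 1.0, points=[y0])   # split at the kink
rhs = theta * (2.0 - k(0.0, y0) - k(1.0, y0))
assert np.isclose(lhs, rhs)

lhs2, _ = dblquad(k, 0.0, 1.0, 0.0, 1.0)
rhs2 = 2.0 * theta * (1.0 - theta + theta * np.exp(-1.0 / theta))
assert np.isclose(lhs2, rhs2)
```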
Example 6
(Matérn kernel, \(\nu =p+\frac{1}{2}\)) Define for \(\nu =p+\frac{1}{2}\) (\(p \in \mathbb {N}\)):
\[ k(x,y) = \frac{p!}{(2p)!} \sum _{i=0}^{p} \frac{(p+i)!}{i!\,(p-i)!} \left( \frac{2\sqrt{2\nu }\, \vert x-y \vert }{\theta } \right) ^{p-i} \exp \left( - \frac{\sqrt{2\nu }\, \vert x-y \vert }{\theta } \right) \]
Then, denoting \(\zeta _p = \frac{\theta }{\sqrt{2\nu }}\), we have:
\[ \int _0^1 k(., y)\, d\lambda = \frac{p!}{(2p)!}\, \zeta _p \left[ 2\, c_{p,0} - A_p \left( \frac{y}{\zeta _p} \right) - A_p \left( \frac{1 - y}{\zeta _p} \right) \right] \]
where \( A_p(u) = \left( \sum _{\ell =0}^p c_{p,\ell } u^\ell \right) e^{-u}\) with \(c_{p,\ell } = \frac{1}{\ell !} \sum _{i=0}^{p-\ell }{\frac{(p+i)!}{i!} 2^{p-i}}.\) This generalizes Example 5, corresponding to \(\nu =1/2\). Also, after absorbing constant factors into rescaled versions of \(A_p\), this result can be written more explicitly for the commonly selected value \(\nu =3/2\) (\(p=1, \zeta _1=\theta / \sqrt{3}\)):
- \(k(x,y) = \left( 1 + \frac{ \vert x-y \vert }{\zeta _1} \right) \exp \left( - \frac{ \vert x-y \vert }{\zeta _1}\right) \)
- \(\int _0^1 k(., y)\, d\lambda = \zeta _1 \times \left[ 4 - A_1 \left( \frac{y}{\zeta _1} \right) - A_1 \left( \frac{1 - y}{\zeta _1} \right) \right] \,\,\) with \(A_1(u) = (2+u)e^{-u}\)
- \(\iint _{[0,1]^2} k(.,.)\, d(\lambda \otimes \lambda ) = 2\zeta _1 \left[ 2 - 3 \zeta _1 + (1 + 3 \zeta _1 ) e^{ - 1/\zeta _1 } \right] \)
Similarly, for \(\nu =5/2\) (\(p=2, \zeta _2=\theta / \sqrt{5}\)):
- \(k(x,y) = \left( 1 + \frac{ \vert x-y \vert }{\zeta _2} + \frac{1}{3} \frac{ (x-y)^2 }{(\zeta _2)^2} \right) \exp \left( - \frac{ \vert x-y \vert }{\zeta _2} \right) \)
- \(\int _0^1 k(., y)\, d\lambda = \frac{1}{3} \zeta _2 \times \left[ 16 - A_2 \left( \frac{y}{\zeta _2} \right) - A_2 \left( \frac{1 - y}{\zeta _2} \right) \right] \,\,\) with \(A_2(u) = (8 + 5u + u^2) e^{-u}\)
- \(\iint _{[0,1]^2} k(.,.)\, d(\lambda \otimes \lambda ) = \frac{1}{3}\zeta _2 (16 - 30 \,\zeta _2) + \frac{2}{3} (1 + 7 \,\zeta _2 + 15 \, (\zeta _2)^2 ) e^{ - 1/\zeta _2 } \)
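As above, a quadrature check of the \(\nu =5/2\) formulas (ours; the parameter values are arbitrary):

```python
import numpy as np
from scipy.integrate import quad, dblquad

theta, y0 = 0.5, 0.7
zeta = theta / np.sqrt(5.0)          # zeta_2 for nu = 5/2

def k(x, y):
    r = abs(x - y) / zeta
    return (1.0 + r + r**2 / 3.0) * np.exp(-r)

A2 = lambda u: (8.0 + 5.0 * u + u**2) * np.exp(-u)

lhs, _ = quad(lambda x: k(x, y0), 0.0, 1.0, points=[y0])
rhs = zeta / 3.0 * (16.0 - A2(y0 / zeta) - A2((1.0 - y0) / zeta))
assert np.isclose(lhs, rhs)

lhs2, _ = dblquad(k, 0.0, 1.0, 0.0, 1.0)
rhs2 = (zeta / 3.0 * (16.0 - 30.0 * zeta)
        + 2.0 / 3.0 * (1.0 + 7.0 * zeta + 15.0 * zeta**2) * np.exp(-1.0 / zeta))
assert np.isclose(lhs2, rhs2)
```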
Example 7
(Gaussian kernel) If \(k(x,y) = \exp \left( - \frac{1}{2} \frac{(x-y)^2}{\theta ^2} \right) \), then
- \(\int _0^1 k(., y)\, d\lambda = \theta \sqrt{2\pi } \times \left[ \varPhi \left( \frac{1-y}{\theta } \right) + \varPhi \left( \frac{y}{\theta } \right) - 1 \right] \)
- \(\iint _{[0,1]^2} k(.,.)\, d(\lambda \otimes \lambda ) = 2 \theta ^2 (e^{-1/(2 \theta ^2)} -1 ) + \theta \sqrt{2 \pi } \times \left( 2 \varPhi \left( \frac{1}{\theta } \right) - 1 \right) \)
where \(\varPhi \) denotes the cdf of the standard normal distribution.
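A quadrature check of these formulas (ours; note the factor \(\theta ^2\) in front of the first term of the double integral, which the check below confirms; the parameter values are arbitrary):

```python
import numpy as np
from scipy.integrate import quad, dblquad
from scipy.stats import norm

theta, y0 = 0.3, 0.4
k = lambda x, y: np.exp(-0.5 * ((x - y) / theta) ** 2)

lhs, _ = quad(lambda x: k(x, y0), 0.0, 1.0)
rhs = theta * np.sqrt(2.0 * np.pi) * (norm.cdf((1.0 - y0) / theta)
                                      + norm.cdf(y0 / theta) - 1.0)
assert np.isclose(lhs, rhs)

lhs2, _ = dblquad(k, 0.0, 1.0, 0.0, 1.0)
rhs2 = (2.0 * theta**2 * (np.exp(-0.5 / theta**2) - 1.0)
        + theta * np.sqrt(2.0 * np.pi) * (2.0 * norm.cdf(1.0 / theta) - 1.0))
assert np.isclose(lhs2, rhs2)
```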
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ginsbourger, D., Roustant, O., Schuhmacher, D., Durrande, N., Lenz, N. (2016). On ANOVA Decompositions of Kernels and Gaussian Random Field Paths. In: Cools, R., Nuyens, D. (eds) Monte Carlo and Quasi-Monte Carlo Methods. Springer Proceedings in Mathematics & Statistics, vol 163. Springer, Cham. https://doi.org/10.1007/978-3-319-33507-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-33505-6
Online ISBN: 978-3-319-33507-0
eBook Packages: Mathematics and Statistics (R0)