Abstract
Multifidelity methods are widely used for estimating quantities of interest (QoI) in computational science by employing numerical simulations of differing costs and accuracies. Many methods approximate numerical-valued statistics, e.g., scalar statistics, that represent only limited information about the QoI. Further quantification of uncertainty, e.g., for risk assessment, failure probabilities, or confidence intervals, requires estimation of the full distribution. In this paper, we generalize the ideas in (Xu et al. in SIAM J Sci Comput 44(1):A150–A175, 2022) to develop a multifidelity method that approximates the full distribution of scalar-valued QoI. The main advantage of our approach compared to alternative methods is that we require no particular relationships among the high- and lower-fidelity models (e.g., a model hierarchy), and we do not assume any a priori knowledge of model statistics, including correlations and other cross-model statistics, before the procedure starts. Under suitable assumptions in the framework above, we achieve provable 1-Wasserstein metric convergence of an algorithmically constructed distributional emulator via an exploration–exploitation strategy. We also prove that crucial policy actions taken by our algorithm are budget-asymptotically optimal. Numerical experiments are provided to support our theoretical analysis.
Notes
For example, requiring that models with higher correlation relative to the high-fidelity model should also incur higher cost is one such model cost behavior.
A sequence of probability measures \(\{P_k\}\) defined on a metric space is called \(\delta \)-tight if for every \({\varepsilon }>0\), there exist a compact measurable set K and a sequence \(\delta _k\downarrow 0\) such that \(P_k(K^{\delta _k})>1-{\varepsilon }\) for every k, where \(K^{\delta _k}: = \{x: \text {dist}(x, K)<\delta _k\}\).
When the variance ratio between \({\varepsilon }_S\) and Y is small, \(Y\approx X_S^\top \beta _S\), so that adding the noise emulator has little impact on the accuracy of the resulting estimator. When the ratio is moderate, adding \({\varepsilon }_S\) as an independent component will degrade the quality of the estimator if the independence assumption is violated.
References
Beran, R., Le Cam, L., Millar, P.: Convergence of stochastic empirical measures. J. Multivar. Anal. 23(1), 159–168 (1987)
Bobkov, S., Ledoux, M.: One-Dimensional Empirical Measures, Order Statistics, and Kantorovich Transport Distances. Mem. Am. Math. Soc. 261(1259). American Mathematical Society (2019)
Bubeck, S., Cesa-Bianchi, N., et al.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends Mach. Learn. 5(1), 1–122 (2012)
Cambanis, S., Simons, G., Stout, W.: Inequalities for \({{\mathbb {E}}}\,k(X, Y)\) when the marginals are fixed. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 36(4), 285–294 (1976)
Chatterjee, S.: Lecture Notes on Stein’s Method. Stanford Lecture Notes (2007)
Cohen, A., DeVore, R.: Approximation of high-dimensional parametric PDEs. Acta Numer. 24, 1–159 (2015). https://doi.org/10.1017/S0962492915000033. (ISSN: 1474-0508)
Farcas, I.-G.: Context-aware model hierarchies for higher-dimensional uncertainty quantification. PhD thesis. Technische Universität München (2020)
Gibbs, A.L., Su, F.E.: On choosing and bounding probability metrics. Int. Stat. Rev. 70(3), 419–435 (2002). https://doi.org/10.1111/j.1751-5823.2002.tb00178.x
Giles, M.B.: Multilevel Monte Carlo path simulation. Oper. Res. 56(3), 607–617 (2008)
Giles, M.B., Nagapetyan, T., Ritter, K.: Multilevel Monte Carlo approximation of distribution functions and densities. SIAM/ASA J. Uncertain. Quantif. 3(1), 267–295 (2015). https://doi.org/10.1137/140960086
Giles, M.B., Nagapetyan, T., Ritter, K.: Adaptive multilevel Monte Carlo approximation of distribution functions. arXiv preprint arXiv:1706.06869 (2017)
Gorodetsky, A.A., Geraci, G., Eldred, M.S., Jakeman, J.D.: A generalized approximate control variate framework for multifidelity uncertainty quantification. J. Comput. Phys. 408, 109257 (2020). https://doi.org/10.1016/j.jcp.2020.109257
Hammersley, J.M., Handscomb, D.C.: Monte Carlo Methods. Methuen, London (1964)
Ishigami, T., Homma, T.: An importance quantification technique in uncertainty analysis for computer models. In: Proceedings of the First International Symposium on Uncertainty Modeling and Analysis. IEEE Computer Society Press (1990). https://doi.org/10.1109/isuma.1990.151285
Koenker, R.: Fundamentals of quantile regression. In: Quantile Regression, pp. 26–67. Cambridge University Press, Cambridge (2001). https://doi.org/10.1017/ccol0521845734.002
Krumscheid, S., Nobile, F.: Multilevel Monte Carlo Approximation of Functions. SIAM/ASA J. Uncertain. Quantif. 6(3), 1256–1293 (2018). https://doi.org/10.1137/17m1135566
Lai, T.L., Wei, C.Z.: Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. Ann. Stat. 10(1), 154–166 (1982). https://doi.org/10.1214/aos/1176345697
Lattimore, T., Szepesvári, C.: Bandit Algorithms. Cambridge University Press, Cambridge (2020)
Lu, D., Zhang, G., Webster, C., Barbier, C.: An improved multilevel Monte Carlo method for estimating probability distribution functions in stochastic oil reservoir simulations. Water Resour. Res. 52(12), 9642–9660 (2016). https://doi.org/10.1002/2016wr019475
Massart, P.: The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality. Ann. Probab. 18, 1269–1283 (1990)
Pan, X., Zhou, W.-X.: Multiplier bootstrap for quantile regression: non-asymptotic theory under random design. Inf. Inference J. IMA (2020). https://doi.org/10.1093/imaiai/iaaa006
Panaretos, V.M., Zemel, Y.: Statistical aspects of Wasserstein distances. Annu. Rev. Stat. Appl. 6, 405–431 (2019)
Peherstorfer, B.: Multifidelity Monte Carlo estimation with adaptive low-fidelity models. SIAM/ASA J. Uncertain. Quantif. 7(2), 579–603 (2019)
Peherstorfer, B., Willcox, K., Gunzburger, M.: Optimal model management for multifidelity Monte Carlo estimation. SIAM J. Sci. Comput. 38(5), A3163–A3194 (2016). https://doi.org/10.1137/15m1046472
Peherstorfer, B., Willcox, K., Gunzburger, M.: Survey of multifidelity methods in uncertainty propagation, inference, and optimization. SIAM Rev. 60(3), 550–591 (2018)
Qian, E., Peherstorfer, B., O’Malley, D., Vesselinov, V.V., Willcox, K.: Multifidelity Monte Carlo estimation of variance and sensitivity indices. SIAM/ASA J. Uncertain. Quantif. 6(2), 683–706 (2018). https://doi.org/10.1137/17m1151006
R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria (2020)
Robinson, P.M.: Root-N-consistent semiparametric regression. Econometrica 56(4), 931 (1988). https://doi.org/10.2307/1912705
Schaden, D., Ullmann, E.: On multilevel best linear unbiased estimators. SIAM/ASA J. Uncertain. Quantif. 8(2), 601–635 (2020). https://doi.org/10.1137/19m1263534
Schaden, D., Ullmann, E.: Asymptotic analysis of multilevel best linear unbiased estimators. SIAM/ASA J. Uncertain. Quantif. 9(3), 953–978 (2021)
Vershynin, R.: High-Dimensional Probability: An Introduction with Applications in Data Science, vol. 47. Cambridge University Press, Cambridge (2018)
Villani, C.: The metric side of optimal transportation. In: Topics in Optimal Transportation. Graduate Studies in Mathematics, vol. 58, pp. 205–235. American Mathematical Society (2003). https://doi.org/10.1090/gsm/058/08
Williams, D.: Probability with Martingales. Cambridge University Press, Cambridge (1991)
Xu, Y., Keshavarzzadeh, V., Kirby, R.M., Narayan, A.: A bandit-learning approach to multifidelity approximation. SIAM J. Sci. Comput. 44(1), A150–A175 (2022)
Acknowledgements
We would like to thank the referees for their time and helpful comments which significantly improved the presentation of the manuscript. Y. Xu and A. Narayan are partially supported by National Science Foundation DMS-1848508. A. Narayan is partially supported by the Air Force Office of Scientific Research award FA9550-20-1-0338. Y. Xu would like to thank Dr. Xiaoou Pan for clarifying a uniform consistency result in quantile regression. We also thank Dr. Ruijian Han for a careful reading of an early draft, and for providing several comments that improved the presentation of the manuscript.
Appendices
Proof of Lemma 4.1
Let \(|S| = s\). We first prove (28a). Recall the definition of Y and \(Y'\):
where \({\varepsilon }_S, {\widehat{{\varepsilon }}}_S\) are independent of \(X_S\) by Assumption 2.2.
Define the true empirical distribution of \({\varepsilon }_S\) using exploration samples as
By the additive property of the \(W_1\) metric under independence [22] and the triangle inequality, we can upper bound the average \(W_1\) distance between \(F_Y\) and \(F_{Y'}\) by conditioning on the exploration data:
For the first term in (59), note \({{\widehat{\beta }}}_S-\beta _S\) satisfies
Averaging out the randomness of exploration noise, we have that, almost surely,
where the first step uses Jensen’s inequality, and the last step follows from the law of large numbers and Assumption 2.3.
For the second term in (59), note that \({\widetilde{{\varepsilon }}}_S\) is the empirical distribution of \({\varepsilon }_S\) based on m exploration samples, which does not depend on \(Z_S\) under Assumption 2.2. Applying the nonasymptotic estimates on the convergence rate of empirical measures in Lemma 3.2 obtains
where \(J_1\) is defined in (10).
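For intuition (not part of the proof), the empirical-measure convergence invoked here can be observed numerically: in one dimension, the \(W_1\) distance between an \(m\)-sample empirical measure and the underlying law decays like \(m^{-1/2}\) for light-tailed distributions. A minimal sketch, using `scipy.stats.wasserstein_distance` with a large reference sample as a stand-in for the true law:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

def empirical_w1_to_truth(m, n_ref=200_000):
    """W1 distance between an m-sample empirical measure of N(0, 1)
    and a large reference sample standing in for the true law."""
    sample = rng.standard_normal(m)
    reference = rng.standard_normal(n_ref)
    return wasserstein_distance(sample, reference)

# Distances shrink roughly like m**(-1/2) as the sample size grows.
for m in (100, 1_000, 10_000):
    print(m, empirical_w1_to_truth(m))
```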
For the third term in (59), consider the natural coupling between \({\widehat{{\varepsilon }}}_S\) and \({\widetilde{{\varepsilon }}}_S\): \({\widetilde{{\varepsilon }}}^{\leftarrow }_S({\widetilde{\tau }}_\ell ) = {\widehat{{\varepsilon }}}^{\leftarrow }_S({\widehat{\tau }}_\ell )\), where \(^{\leftarrow }\) denotes the preimage of a map and
In this case,
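The coupling above is the one-dimensional monotone (quantile) coupling, which is \(W_1\)-optimal; for two empirical measures with the same number of atoms, it simply matches order statistics. An illustrative sketch (ours, not the paper's):

```python
import numpy as np

def w1_equal_size(x, y):
    """W1 between two empirical measures with the same number of atoms.

    The monotone (quantile) coupling matches the l-th order statistic of x
    with the l-th order statistic of y, and is W1-optimal in one dimension.
    """
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    if x.shape != y.shape:
        raise ValueError("expected the same number of atoms")
    return float(np.mean(np.abs(x - y)))
```

For equal-size samples this agrees exactly with a general-purpose \(W_1\) solver, since no mass needs to be split across atoms.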
Putting (61), (62), (64) together finishes the proof of (28a).
We next prove (28b). Conditioned on \(Z_S\) and \(Y_{\text {epr}}\), \(Y'\) is a random variable with bounded r-th moments for all \(r>2\). Appealing to Lemma 3.2 and averaging over the exploration noise, we have
where \(J_1\) is defined in (10). The desired result would follow if we can show that \({{\mathbb {E}}}[J_1(F_{Y'})|Z_S]\) converges to \(J_1(F_{Y})\) a.s. as \(m\rightarrow \infty \). To this end, we introduce the following intermediate random variables:
We will prove the desired result by verifying the following convergence statements respectively:
- (a) \(|{{\mathbb {E}}}[J_1(F_{Y})|Z_S]-{{\mathbb {E}}}[J_1(F_{Y''})|Z_S]|\rightarrow 0\) a.s.;
- (b) \(|{{\mathbb {E}}}[J_1(F_{Y''})|Z_S]-{{\mathbb {E}}}[J_1(F_{Y'''})|Z_S]|\rightarrow 0\) a.s.;
- (c) \(|{{\mathbb {E}}}[J_1(F_{Y'''})|Z_S]-{{\mathbb {E}}}[J_1(F_{Y'})|Z_S]|\rightarrow 0\) a.s.
Without loss of generality, we assume \(\text {supp}({\varepsilon }_S)\subseteq [-1, 1]\) and \(\Vert \beta _S\Vert _2 = 1\); the general case can be considered similarly by taking appropriate scaling involving a constant C in Assumption 2.5.
We introduce the following quantity for our analysis:
It is clear that \(K_m^*\) depends only on \(Z_S\). Under Assumption 2.4, \(X_{S,\ell }\)’s are i.i.d. sub-exponential random variables with uniformly bounded sub-exponential norm. By Lemma 3.6,
where the implicit constant is realization-dependent.
To prove (a), we condition on \(Z_S\) and \(Y_{\text {epr}}\). Using (13)–(15) in Lemma 3.5,
Under Assumption 2.4, since \(\Vert \beta _S\Vert _2 = 1\), for every y the map \(z\mapsto F_{X_S^\top \beta _S}(y-z)\) is \(C_{\text {Lip}}\)-Lipschitz. By the Kantorovich–Rubinstein duality (6),
Meanwhile, note \({\text {supp}}({\varepsilon }_S) = {\text {supp}}({\widetilde{{\varepsilon }}}_S)\subseteq [-1,1]\). This combined with the fact that \(X_S^\top \beta _S\) is sub-exponential implies that
where C is an absolute constant depending only on the sub-exponential norm of \(\Vert X_S\Vert _2\), and \(M_1\) is the 1-local maximum operator in Definition 3.4. Substituting (69) and (70) into (68) and applying a truncated estimate,
Since \(\sqrt{W_1({\varepsilon }_S, {\widetilde{{\varepsilon }}}_S)}\) is independent of \(Z_S\), taking expectation over the exploration noise together with Jensen’s inequality and Lemma 3.2 yields
To prove (b), note that conditioning on \(Z_S\) and \(Y_{\text {epr}}\), the difference between \({\widetilde{\tau }}_\ell \) and \({\widehat{\tau }}_\ell \) is bounded as follows:
where \({\widehat{\tau }}_\ell , {\widetilde{\tau }}_\ell \) are defined in (63). Moreover, since \(\text {supp}({\varepsilon }_S)\subseteq [-1, 1]\), \(|{\widetilde{\tau }}_\ell |\le 1\). This combined with (72) implies
The rest is similar to the proof of statement (a),
It is easy to verify using the Lipschitz assumption and the tail bound of \(X_S^\top \beta _S\) that
Thus,
Averaging out exploration noise and applying Jensen’s inequality,
To prove (c), recall from (73) that, conditioned on \(Z_S\) and \(Y_{\text {epr}}\), \({\text {supp}}({\widehat{{\varepsilon }}}_S)\subseteq [-r, r]\). Applying Lemma 3.5,
If \(\delta <1/2\), then \(1/2<\Vert {{\widehat{\beta }}}_S\Vert _2< 3/2\). In this case, the \(C_1, C_2, C_3\) in Lemma 3.7 are absolute constants. According to Lemma 3.7 with \(p = 1/2\),
where we used \(\log (1/\delta )<\delta ^{-1/6}\) when \(\delta \le 1/2\).
If \(\delta \ge 1/2\), the same result in Lemma 3.7 implies
Substituting (75) and (76) into (74) yields that
Taking expectation over the exploration noise and applying Jensen’s inequality,
(28b) is proved by combining statements (a), (b), (c).
A quantile regression framework
Quantile regression offers an alternative approach to simulating Y through a random coefficient interpretation [15]. For any \(S\subseteq [n]\) and \(\tau \in (0,1)\), we assume the conditional \(\tau \)-th quantile of Y on \(X_S\) satisfies
where \(\beta _S(\tau )\) is the \(\tau \)-th quantile coefficient vector. (77) is a standard quantile regression formulation, and can be used to model heteroscedastic noise effects.
Thus, (77) approximately equals
As opposed to (23), (78) provides a way to simulate Y based on \(X_S\) via inverse transform sampling:
In our case, \(X_{{\text {epr}}, \ell }, \ell \in [m]\) are i.i.d. samples, so (77) fits into the random-design quantile regression framework analyzed in [21], where the authors established a strong consistency result for \({\widehat{\beta }}_S(\tau )\) under suitable conditions. The consistency result can be further proven to hold uniformly for all \(\tau \in [\delta ,1-\delta ]\) for any fixed \(\delta >0\), which justifies the asymptotic behavior of the procedure in (78) as \(m, N_S\rightarrow \infty \).
In the quantile regression framework, obtaining the optimal choices for m and S is much harder than in the linear regression setup. The AETC-d-q algorithm in Sect. 6 implements (79) with m set as the adaptive exploration rate given by the AETC-d, S as the corresponding model output for exploitation, and U approximated via \(\frac{1}{K}\sum _{j\in [K]}\delta _{\frac{j}{K+1}}\) with \(K=100\).
Cite this article
Xu, Y., Narayan, A. Budget-limited distribution learning in multifidelity problems. Numer. Math. 153, 171–212 (2023). https://doi.org/10.1007/s00211-022-01337-5