Local Dvoretzky–Kiefer–Wolfowitz Confidence Bands

Maillard, Odalric-Ambrym

doi:10.3103/S1066530721010038


Published: 30 May 2022

Volume 30, pages 16–46, (2021)

Mathematical Methods of Statistics


Abstract

In this paper, we revisit the concentration inequalities for the supremum of the cumulative distribution function (CDF) of a real-valued continuous distribution as established by Dvoretzky, Kiefer, Wolfowitz and revisited later by Massart in two seminal papers. We focus on the concentration of the local supremum over a sub-interval, rather than on the full domain. That is, denoting $U$ the CDF of the uniform distribution over $[0,1]$ and $U_{n}$ its empirical version built from $n$ samples, we study $\mathbb{P}\Big{(}\sup_{u\in[\underline{u},\overline{u}]}U_{n}(u)-U(u)>\varepsilon\Big{)}$ for different values of $\underline{u},\overline{u}\in[0,1]$. Such local controls naturally appear for instance when studying estimation error of spectral risk-measures (such as the conditional value at risk), where $[\underline{u},\overline{u}]$ is typically $[0,\alpha]$ or $[1-\alpha,1]$ for a risk level $\alpha$, after reshaping the CDF $F$ of the considered distribution into $U$ by the general inverse transform $F^{-1}$. Extending a proof technique from Smirnov, we provide exact expressions of the local quantities $\mathbb{P}\Big{(}\sup_{u\in[\underline{u},\overline{u}]}U_{n}(u)-U(u)>\varepsilon\Big{)}$ and $\mathbb{P}\Big{(}\sup_{u\in[\underline{u},\overline{u}]}U(u)-U_{n}(u)>\varepsilon\Big{)}$ for each $n,\varepsilon,\underline{u},\overline{u}$. Interestingly these quantities, seen as a function of $\varepsilon$, can be easily inverted numerically into functions of the probability level $\delta$. Although not explicit, they can be computed and tabulated. We plot such expressions and compare them to the classical bound $\sqrt{\frac{\ln(1/\delta)}{2n}}$ provided by Massart's inequality. We then provide an application of such results to the control of generic functionals of the CDF, motivated by the case of the conditional value at risk.
Last, we extend the local concentration results holding individually for each $n$ to time-uniform concentration inequalities holding simultaneously for all $n$, revisiting a reflection inequality by James, which is of independent interest for the study of sequential decision making strategies.
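
As a point of reference for the comparison mentioned above, the classical bound can be evaluated directly. The sketch below is illustrative only: the function names `massart_epsilon` and `massart_delta` are hypothetical and do not come from the paper, and it simply inverts the one-sided tail level $\delta=\exp(-2n\varepsilon^{2})$ that underlies the expression $\sqrt{\ln(1/\delta)/(2n)}$.

```python
import math


def massart_epsilon(n: int, delta: float) -> float:
    """Deviation eps solving exp(-2 * n * eps**2) = delta (classical one-sided bound)."""
    if n <= 0 or not 0.0 < delta < 1.0:
        raise ValueError("need n >= 1 and 0 < delta < 1")
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def massart_delta(n: int, eps: float) -> float:
    """Tail level exp(-2 * n * eps**2) associated with a deviation eps."""
    return math.exp(-2.0 * n * eps ** 2)


# Example: deviation at level delta = 0.05 for a few sample sizes.
widths = {n: massart_epsilon(n, 0.05) for n in (10, 100, 1000)}
```

The width shrinks as $1/\sqrt{n}$, so multiplying $n$ by four halves it; the local quantities studied in the paper are meant to be compared against this baseline.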


Notes

The superscript $r$ stands for rewards, and $\ell$ for losses.

REFERENCES

[1] C. Acerbi, ‘‘Spectral measures of risk: A coherent representation of subjective risk aversion,’’ Journal of Banking and Finance 26 (7), 1505–1518 (2002).
[2] P. Billingsley, Convergence of probability measures (John Wiley and Sons, 1968).
[3] O. Cappé, A. Garivier, O.-A. Maillard, R. Munos, and G. Stoltz, ‘‘Kullback–Leibler Upper Confidence Bounds For Optimal Sequential Allocation,’’ Annals of Statistics 41 (3), 1516–1541 (2013).
[4] M. D. Donsker, ‘‘Justification and Extension of Doob’s Heuristic Approach to the Kolmogorov-Smirnov Theorems,’’ The Annals of Mathematical Statistics 23 (2), 277–281 (1952).
[5] R. M. Dudley, Uniform central limit theorems, Vol. 142 (Cambridge University Press, 1999).
[6] A. Dvoretzky, J. Kiefer, and J. Wolfowitz, ‘‘Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator,’’ The Annals of Mathematical Statistics, 642–669 (1956).
[7] B. R. James, ‘‘A functional law of the iterated logarithm for weighted empirical distributions,’’ The Annals of Probability 3 (5), 762–772 (1975).
[8] A. Kolmogorov, ‘‘Sulla determinazione empirica di una legge di distribuzione,’’ Giornale dell’Istituto Italiano degli Attuari 4, 83–91 (1933).
[9] A. N. Kolmogorov, ‘‘On Skorokhod convergence,’’ Theory of Probability and Its Applications 1 (2), 215–222 (1956).
[10] B. B. Mandelbrot, ‘‘The variation of certain speculative prices,’’ in Fractals and Scaling in Finance, pp. 371–418, Springer (1997).
[11] P. Massart, ‘‘The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality,’’ The Annals of Probability, 1269–1283 (1990).
[12] D. Pollard, Convergence of stochastic processes (Springer Science and Business Media, 1984).
[13] R. T. Rockafellar and S. Uryasev, ‘‘Optimization of conditional value-at-risk,’’ Journal of Risk 2, 21–42 (2000).
[14] G. R. Shorack and J. A. Wellner, Empirical processes with applications to statistics (Society for Industrial and Applied Mathematics, 2009).
[15] N. V. Smirnov, ‘‘Approximate laws of distribution of random variables from empirical data,’’ Uspekhi Matematicheskikh Nauk 10, 179–206 (1944).
[16] P. Thomas and E. Learned-Miller, ‘‘Concentration inequalities for conditional value at risk,’’ International Conference on Machine Learning, pp. 6225–6233 (2019).


ACKNOWLEDGMENTS

This work has been supported by CPER Nord-Pas-de-Calais/FEDER DATA advanced data science and technologies 2015-2020, the French Ministry of Higher Education and Research, Inria, the French Agence Nationale de la Recherche (ANR) under grant ANR-16-CE40-0002 (the BADASS project), the MEL, the I-Site ULNE regarding project R-PILOTE-19-004-APPRENF, and the Inria A.Ex. SR4SG project.

Author information

Authors and Affiliations

Université de Lille, Inria, CNRS, Centrale Lille, UMR 9189—CRIStAL, F-59000, Lille, France
Odalric-Ambrym Maillard


Corresponding author

Correspondence to Odalric-Ambrym Maillard.

Appendices

PROOFS OF THE MAIN RESULTS

Proof of Lemma 2. We let $[n]=\{1,\dots,n\}$ for all $n\in\mathbb{N}_{\star}$.

Left tail, step 1. Let us first recall the following remark by Smirnov [15, p. 10], showing that if $u_{(1)},\dots,u_{(n)}$ denote the order statistics of $n$ samples from the uniform distribution, then

$$\mathbb{P}(\sup_{u\in[0,1]}U_{n}(u)-u\leqslant\varepsilon)=\mathbb{P}\Big{(}\forall k\in[n],\quad U_{n}(u_{(k)})-\varepsilon\leqslant u_{(k)}\Big{)}$$

$${}=\mathbb{P}\Big{(}\forall k\in[n],\quad k/n-\varepsilon\leqslant u_{(k)}\Big{)}$$

$${}=n!\int\dots\int\mathbb{I}\bigg{\{}0\leqslant u_{1}\leqslant\dots u_{n}\leqslant 1;\forall k,u_{k}\geqslant k/n-\varepsilon\bigg{\}}du_{1}\dots du_{n}.$$

(1)

When restricting the supremum to $[\alpha,\beta]$, this equality needs to be modified. First of all, it holds that

$$\sup_{u\in[\alpha,\beta]}U_{n}(u)-u=\max\bigg{\{}U_{n}(v)-v:v\in\{\alpha\}\cup\{u_{(1)},\dots,u_{(n)}\}\cap[\alpha,\beta]\bigg{\}}.$$

Hence, we deduce that

$${\bigg{\{}\sup_{u\in[\alpha,\beta]}U_{n}(u)-u\leqslant\varepsilon\bigg{\}}}$$

$${}=\bigcap_{k\in[n]}\bigg{\{}u_{(k)}\in[\alpha,\beta]\implies U_{n}(u_{(k)})-\varepsilon\leqslant u_{(k)}\bigg{\}}\cap\bigg{\{}U_{n}(\alpha)\leqslant\varepsilon+\alpha\bigg{\}}$$

$${}=\bigcap_{k\in[n]}\bigg{\{}u_{(k)}\in[\alpha,\beta]\implies k/n-\varepsilon\leqslant u_{(k)}\bigg{\}}\cap\bigg{\{}\sum_{k=1}^{n}\mathbb{I}\{u_{(k)}\leqslant\alpha\}\leqslant n(\varepsilon+\alpha)\bigg{\}}$$

$${}=\bigcap_{k\in[n]}\bigg{\{}u_{(k)}\in[\alpha,\beta]\implies k/n-\varepsilon\leqslant u_{(k)}\bigg{\}}\cap\bigg{\{}u_{(k)}\leqslant\alpha<u_{(k+1)}\implies k/n\leqslant\varepsilon+\alpha\bigg{\}},$$

where we introduced the term $u_{(n+1)}=1$ in the last line and used that $\varepsilon+\alpha\geqslant 0$ to exclude the term $u_{(0)}=0$. Using the distribution of the order statistics, we deduce that

$${\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U_{n}(u)-U(u)\leqslant\varepsilon\Big{)}}=n!\int\dots\int\mathbb{I}\bigg{\{}0\leqslant u_{1}\leqslant\dots u_{n}\leqslant 1;$$

$$\forall k\in[n],\begin{cases}\text{if }u_{k}\in[\alpha,\beta]\quad\text{then }u_{k}\geqslant\frac{k}{n}-\varepsilon\\ \text{if }k>n(\varepsilon+\alpha)\quad\text{then }\alpha\notin[u_{k},u_{k+1})\\ \end{cases}\bigg{\}}du_{1}\dots du_{n}.$$
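
The reduction of the local supremum to a maximum over finitely many points can be sketched in code. This is a minimal illustration, not part of the paper: the helper name `local_sup_left_tail` is hypothetical, and the sample is assumed to lie in $[0,1]$ with $0\leqslant\alpha\leqslant\beta\leqslant 1$. Since $U_{n}$ is a right-continuous step function and $u\mapsto -u$ is decreasing, the supremum on $[\alpha,\beta]$ is attained at $\alpha$ or at a sample point inside the interval.

```python
from bisect import bisect_right


def local_sup_left_tail(sample, alpha, beta):
    """sup over u in [alpha, beta] of U_n(u) - u, using the finite candidate set
    {alpha} union {sample points lying in [alpha, beta]}."""
    xs = sorted(sample)
    n = len(xs)
    candidates = [alpha] + [x for x in xs if alpha <= x <= beta]
    # bisect_right counts the samples <= v, i.e. n * U_n(v).
    return max(bisect_right(xs, v) / n - v for v in candidates)


# Example with four points: the maximum is reached at the sample point 0.4.
example_value = local_sup_left_tail([0.1, 0.35, 0.4, 0.9], 0.3, 0.7)
```

Only the order statistics falling in $[\alpha,\beta]$ and the count of samples below $\alpha$ matter, which is exactly the structure of the two conditions in the indicator above.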

Left tail, step 2. Following [15], we introduce the notation $t_{k}=u_{n-k+1}$, the constant $\gamma_{k}=(n-k+1)/n-\varepsilon$ (non-negative for $k\leqslant n-\lfloor\varepsilon n\rfloor$), as well as $\beta_{k}=\min(\gamma_{k},\beta)$. We thus have the following rewriting

$$\bigg{\{}t\in[0,1]:\text{if }t\in[\alpha,\beta]\text{ then }t\geqslant\gamma_{k}\bigg{\}}=[0,\alpha]\cup[\min(\gamma_{k},\beta),1]=[0,\alpha]\cup[\beta_{k},1],$$

which further reduces to $[0,1]$ when $\gamma_{k}\leqslant\alpha$. We let $n_{\alpha,\varepsilon}=n(1-\varepsilon-\alpha)$, $\overline{n}_{\alpha,\varepsilon}=\lceil n_{\alpha,\varepsilon}\rceil$ and remark that $\gamma_{k}\leqslant\alpha$ iff $k\geqslant n_{\alpha,\varepsilon}+1$. Also, $\gamma_{k}\leqslant\alpha$ as soon as $k>\overline{n}_{\alpha,\varepsilon}$. Let us also note that $n-k+1>n(\varepsilon+\alpha)$ iff $k<n_{\alpha,\varepsilon}+1$, and that $\overline{n}_{\alpha,\varepsilon}<n_{\alpha,\varepsilon}+1$. This means in particular that if $k\geqslant\overline{n}_{\alpha,\varepsilon}+1$, then both conditions in the integral vanish (and so contribute a factor $1$ to the integrand), which gives

$${\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U_{n}(u)-U(u)\leqslant\varepsilon\Big{)}}=n!\int\dots\int\mathbb{I}\bigg{\{}0\leqslant t_{n}\leqslant\dots t_{1}\leqslant 1;$$

$$\forall k\in[n],t_{k}\in[0,\alpha]\cup[\beta_{k},1],\forall k\leqslant\overline{n}_{\alpha,\varepsilon},\alpha\notin[t_{k},t_{k-1})\bigg{\}}dt_{n}\dots dt_{1}$$

$${}=n!\int\dots\int\mathbb{I}\bigg{\{}0\leqslant t_{\overline{n}_{\alpha,\varepsilon}}\leqslant\dots t_{1}\leqslant 1;$$

$$\forall k\leqslant\overline{n}_{\alpha,\varepsilon},t_{k}\in[0,\alpha]\cup[\beta_{k},1]\text{ and }\alpha\notin[t_{k},t_{k-1})\bigg{\}}J_{n-\overline{n}_{\alpha,\varepsilon}}(t_{\overline{n}_{\alpha,\varepsilon}})dt_{\overline{n}_{\alpha,\varepsilon}}\dots dt_{1},$$

where we integrated out all terms for $k>\overline{n}_{\alpha,\varepsilon}$ into the short-hand notation $J_{k}(x)=\frac{x^{k}}{k!}$. In the integral, we note that if $t_{k}\leqslant\alpha$, then so must be all terms $t_{k^{\prime}}$ for $k^{\prime}\geqslant k$.

We now proceed with integration. Starting with $t_{1}$, we see that if $t_{1}\leqslant\alpha$, then this implies $\alpha\in[t_{1},t_{0})$, with the convention that $t_{0}=1$. Since this event is excluded by the indicator function, the corresponding terms are $0$, and it remains to integrate $t_{1}$ on $(\alpha,1]$, that is on $[\beta_{1},1]$.

Regarding $t_{2}$, if $t_{2}\leqslant\alpha<t_{1}$, this contradicts $\alpha\notin[t_{2},t_{1})$, hence it remains to integrate $t_{2}$ on $(\alpha,1]$, that is on $[\beta_{2},1]$. Proceeding similarly, for all $k\leqslant\overline{n}_{\alpha,\varepsilon}$ we obtain that

$$\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U_{n}(u)-U(u)\leqslant\varepsilon\Big{)}=n!\int\limits_{\beta_{1}}^{1}\int\limits_{\beta_{2}}^{t_{1}}\dots\int\limits_{\beta_{\overline{n}_{\alpha,\varepsilon}}}^{t_{\overline{n}_{\alpha,\varepsilon}-1}}J_{n-\overline{n}_{\alpha,\varepsilon}}(t_{\overline{n}_{\alpha,\varepsilon}})dt_{\overline{n}_{\alpha,\varepsilon}}\dots dt_{1}.$$

In order to compute the multiple integral, similarly to [15], we make use of the following variant of the Taylor expansion

$$f(x)=f(a_{1})+\sum_{\ell=1}^{n-k-1}f^{(\ell)}(a_{\ell+1})\int\limits_{a_{1}}^{x}\int\limits_{a_{2}}^{t_{1}}\dots\int\limits_{a_{\ell}}^{t_{\ell-1}}dt_{1}\dots dt_{\ell}$$

$${}+\int\limits_{a_{1}}^{x}\int\limits_{a_{2}}^{t_{1}}\dots\int\limits_{a_{n-k}}^{t_{n-k-1}}f^{(n-k)}(t_{n-k})dt_{1}\dots dt_{n-k},$$

which, using the $I_{k}$ notation, yields the following form

$$\int\limits_{a_{1}}^{x}\int\limits_{a_{2}}^{t_{1}}\dots\int\limits_{a_{n-k}}^{t_{n-k-1}}f^{(n-k)}(t_{n-k})dt_{1}\dots dt_{n-k}=f(x)-f(a_{1})-\sum_{\ell=1}^{n-k-1}f^{(\ell)}(a_{\ell+1})I_{\ell}(x;a_{1},\dots,a_{\ell}).$$

This is applied to the function $f(x)=x^{n}$, $k=n-\overline{n}_{\alpha,\varepsilon}$. Indeed, we then get $f^{(n-k)}(x)=\frac{n!x^{k}}{k!}=\frac{n!x^{n-\overline{n}_{\alpha,\varepsilon}}}{(n-\overline{n}_{\alpha,\varepsilon})!}=n!J_{n-\overline{n}_{\alpha,\varepsilon}}(x)$. This in turn yields

$$\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U_{n}(u)-U(u)\leqslant\varepsilon\Big{)}=1-\beta_{1}^{n}-\sum_{\ell=1}^{\overline{n}_{\alpha,\varepsilon}-1}\frac{n!}{(n-\ell)!}\beta_{\ell+1}^{n-\ell}I_{\ell}(1;\beta_{1},\dots\beta_{\ell})$$

$${}=1-\sum_{\ell=0}^{\overline{n}_{\alpha,\varepsilon}-1}\binom{n}{\ell}\beta_{\ell+1}^{n-\ell}\ell!I_{\ell}(1;\beta_{1},\dots\beta_{\ell}),$$

using the convention that $I_{0}(x;\emptyset)=1$. This completes the proof regarding the left tail concentration.
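
The sum obtained for the left tail can be evaluated numerically. The following is a minimal sketch under stated assumptions, not the author's implementation: it assumes $0\leqslant\alpha\leqslant\beta\leqslant 1$ and $\varepsilon>0$, the helper names `iterated_integral` and `local_left_tail_cdf` are hypothetical, and the iterated integrals $I_{\ell}$ are computed by repeated polynomial antidifferentiation with NumPy rather than by any closed form.

```python
import math

from numpy.polynomial import Polynomial


def iterated_integral(lower_limits, x=1.0):
    """I_l(x; a_1, ..., a_l): nested integral of 1 with lower limits a_1..a_l.

    Built from the innermost integral outwards; an empty list gives I_0 = 1.
    """
    poly = Polynomial([1.0])
    for a in reversed(lower_limits):
        antiderivative = poly.integ()
        poly = antiderivative - antiderivative(a)  # integral from a to the variable
    return float(poly(x))


def local_left_tail_cdf(n, eps, alpha, beta):
    """P(sup_{u in [alpha, beta]} U_n(u) - u <= eps) via the sum over l = 0..n_bar - 1."""
    n_bar = max(math.ceil(n * (1.0 - eps - alpha)), 0)
    # limits[k - 1] = beta_k = min((n - k + 1)/n - eps, beta)
    limits = [min((n - k + 1) / n - eps, beta) for k in range(1, n_bar + 1)]
    total = 0.0
    for l in range(n_bar):
        total += (math.comb(n, l) * limits[l] ** (n - l)
                  * math.factorial(l) * iterated_integral(limits[:l]))
    return 1.0 - total
```

For $n=1$ the sum has a single term and the value is $1-\min(1-\varepsilon,\beta)$ whenever $\alpha+\varepsilon<1$, which can be checked by hand. For $\alpha=0$, $\beta=1$ all limits reduce to $\gamma_{k}$ and the value agrees with the classical one-sided full-domain expression. Values of $n(1-\varepsilon-\alpha)$ that are exact integers are sensitive to floating-point rounding in the ceiling and may need care.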

Right tail. We proceed similarly for the right tail. First, using our notation we note that

$$\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)=\max\bigg{\{}\lim_{u\to v;u<v}U(u)-U_{n}(u):v\in\{\beta\}\cup\{u_{(1)},\dots,u_{(n)}\}\cap(\alpha,\beta]\bigg{\}}.$$

To be more precise, we let $(\eta_{k})_{k\in[n]}$ be arbitrarily small positive constants. We also let $\eta_{0}<\min_{k\in[n]}\eta_{k}$ and define $\overline{\eta}=\max_{k\in[n]}\eta_{k}$. We further introduce, for each $k\in[n]$, $u^{-}_{(k)}$ such that $u_{(k)}-\eta_{k}=u^{-}_{(k)}<u_{(k)}$ and $\beta^{-}$ such that $\beta-\eta_{0}=\beta^{-}<\beta$. Then, we introduce the notation

$$\sup^{\eta}_{u\in[\alpha,\beta]}U(u)-U_{n}(u)=\max\bigg{\{}U(v)-U_{n}(v):v\in\{\beta^{-}\}\cup\{u_{(1)}^{-},\dots,u_{(n)}^{-}\}\cap(\alpha,\beta]\bigg{\}}.$$

Before proceeding, we note that $\forall n\in\mathbb{N},\lim_{\overline{\eta}\to 0}\mathbb{P}(\min_{k\in[n]}u_{(k)}-u_{(k-1)}>\eta_{k})=1$. Indeed, it holds that

$$\mathbb{P}\big{(}\min_{k\in[n]}u_{(k)}-u_{(k-1)}\leqslant\eta_{k}\big{)}=\mathbb{P}\big{(}\exists{k\in[n]},\,\,u_{(k)}-u_{(k-1)}\leqslant\eta_{k}\big{)}$$

$${}\leqslant\mathbb{P}\big{(}\exists{i,j\in[n],i<j},\,\,|X_{i}-X_{j}|\leqslant\overline{\eta}\big{)}\leqslant\frac{n(n-1)}{2}\mathbb{P}(|X_{1}-X_{2}|\leqslant\overline{\eta})\leqslant n(n-1)\overline{\eta}.$$

In the following, we use a construction similar to that of [9] for Skorokhod convergence. Note that under the event that $\Omega_{n}=\big{\{}\min_{k\in[n]}u_{(k)}-u_{(k-1)}>\eta_{k}\big{\}}$ (where $u_{(0)}=0$) we have the following rewriting

$${\bigg{\{}\sup^{\eta}_{u\in[\alpha,\beta]}u-U_{n}(u)\leqslant\varepsilon\bigg{\}}}$$

$${}=\bigcap_{k\in[n]}\bigg{\{}u_{(k)}^{-}\in[\alpha,\beta]\implies u_{(k)}^{-}-\varepsilon\leqslant U_{n}(u_{(k)}^{-})\bigg{\}}\cap\bigg{\{}\beta^{-}-\varepsilon\leqslant U_{n}(\beta^{-})\bigg{\}}$$

$${}=\bigcap_{k\in[n]}\bigg{\{}u_{(k)}^{-}\in[\alpha,\beta]\implies u_{(k)}^{-}-\varepsilon\leqslant(k-1)/n\bigg{\}}\cap\bigg{\{}\sum_{k=1}^{n}\mathbb{I}\{u_{(k)}\leqslant\beta^{-}\}\geqslant n(\beta^{-}-\varepsilon)\bigg{\}}$$

$${}=\bigcap_{k\in[n]}\bigg{\{}u_{(k)}^{-}\in[\alpha,\beta]\implies u_{(k)}^{-}-\varepsilon\leqslant(k-1)/n\bigg{\}}$$

$${}\cap\bigg{\{}u_{(k-1)}\leqslant\beta^{-}<u_{(k)}\implies(k-1)/n\geqslant\beta^{-}-\varepsilon\bigg{\}}.$$

In the last line, we used that $\beta^{-}=\beta-\eta_{0}$ and $\eta_{0}<\min_{k\in[n]}\eta_{k}$ to rewrite $\sum_{k=1}^{n}\mathbb{I}\{u_{(k)}\leqslant\beta^{-}\}$, in terms of $u_{(k)}\leqslant\beta^{-}<u_{(k+1)}$. Then we shifted $k$ by $1$, and used the fact that $\beta\leqslant 1$ implies $1\geqslant\beta^{-}-\varepsilon$ in order to exclude the term $u_{(n+1)}=1$.

We let $\tilde{\beta}=1-\beta,\tilde{\alpha}=1-\alpha$, $\tau_{k}=1-u_{k}$ and introduce for all $k$ the constant $\rho_{k}=1-\varepsilon-k/n$ (non-negative for all $k\leqslant n(1-\varepsilon)$) as well as $\tilde{\alpha}_{k}=\min(\rho_{k-1},\tilde{\alpha})$. We also let $\tilde{\beta}^{+}=1-\beta^{-}$. Finally, we let $\tau_{k}<_{k}\tau_{k-1}$ if and only if $\tau_{k}+\eta_{k}<\tau_{k-1}$. Using the distribution of the order statistics together with these notations, we then naturally study the quantity $\lim_{\overline{\eta}\to 0}\mathbb{P}^{\eta}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}$ where

$${\mathbb{P}^{\eta}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}=\mathbb{P}\Big{(}\sup^{\eta}_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\cap\Omega_{n}\Big{)}}$$

$${}=n!\int\dots\int\mathbb{I}\bigg{\{}\Omega_{n}\cap 0\leqslant u_{1}\leqslant\dots u_{n}\leqslant 1;$$

$$\forall k\in[n],\begin{cases}\text{if }u_{k}^{-}\in[\alpha,\beta]&\text{then }u_{k}^{-}\leqslant\frac{k-1}{n}+\varepsilon\\ \text{if }k-1<n(\beta^{-}-\varepsilon)&\text{then }\beta^{-}\notin[u_{k-1},u_{k})\\ \end{cases}\bigg{\}}du_{1}\dots du_{n}$$

$${}=n!\int\dots\int\mathbb{I}\bigg{\{}0\leqslant\tau_{n}<_{n}\dots\tau_{1}<_{1}1;$$

$$\forall k\in[n],\begin{cases}\text{if }\tau_{k}^{+}\in[\tilde{\beta},\tilde{\alpha}]&\text{ then }\tau_{k}^{+}\geqslant\rho_{k-1}\\ \text{if }k-1<n(\beta^{-}-\varepsilon)&\text{ then }\tilde{\beta}^{+}\notin[\tau_{k},\tau_{k-1})\\ \end{cases}\bigg{\}}d\tau_{n}\dots d\tau_{1}.$$

Now, $[0,\tilde{\beta}]\cup[\min(\rho_{k-1},\tilde{\alpha}),1]$ reduces to $[0,1]$ when $\rho_{k-1}\leqslant\tilde{\beta}$. We let $n_{\beta,\varepsilon}=n(\beta-\varepsilon)$, $\overline{n}_{\beta,\varepsilon}=\lceil n_{\beta,\varepsilon}\rceil$ and remark that $\rho_{k-1}\leqslant\tilde{\beta}$ iff $k-1\geqslant n_{\beta,\varepsilon}$. We first deal with the case when $n_{\beta,\varepsilon}\in\mathbb{N}$. In this situation, provided that $\eta_{0}$ is sufficiently small, then $\overline{n}_{\beta^{-},\varepsilon}=\overline{n}_{\beta,\varepsilon}=n_{\beta,\varepsilon}$ and also, $k-1<n_{\beta^{-},\varepsilon}$ iff $k\leqslant\overline{n}_{\beta^{-},\varepsilon}$. If $k>\overline{n}_{\beta,\varepsilon}$, then $\rho_{k-1}\leqslant\tilde{\beta}$ and both restrictions disappear in the integral. In the general situation when $n_{\beta,\varepsilon}\notin\mathbb{N}$, then $\overline{n}_{\beta,\varepsilon}>n_{\beta,\varepsilon}$ and $\overline{n}_{\beta^{-},\varepsilon}=\overline{n}_{\beta,\varepsilon}$. Also, $k-1<n_{\beta^{-},\varepsilon}$ iff $k\leqslant\overline{n}_{\beta^{-},\varepsilon}$. If $k>\overline{n}_{\beta,\varepsilon}$ then $\rho_{k-1}\leqslant\tilde{\beta}$ and again restrictions disappear in the integral. We deduce that provided that $\overline{\eta}$ is sufficiently small,

$${\mathbb{P}^{\eta}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}}=n!\int\dots\int\mathbb{I}\bigg{\{}0\leqslant\tau_{n}<_{n}\dots\tau_{1}<_{1}1;$$

$$\forall k\in[n],\tau_{k}\in[0,\tilde{\beta}]\cup[\tilde{\alpha}_{k},1],\forall k\leqslant\overline{n}_{\beta^{-},\varepsilon},\tilde{\beta}^{+}\notin[\tau_{k},\tau_{k-1})\bigg{\}}d\tau_{n}\dots d\tau_{1}$$

$${}=n!\int\dots\int\mathbb{I}\bigg{\{}0\leqslant\tau_{\overline{n}_{\beta,\varepsilon}}<_{{\overline{n}_{\beta,\varepsilon}}}\dots\tau_{1}<_{1}1;$$

$$\quad\quad\forall k\leqslant\overline{n}_{\beta,\varepsilon},\tau_{k}\in[0,\tilde{\beta}]\cup[\tilde{\alpha}_{k},1]\text{ and }\tilde{\beta}^{+}\notin[\tau_{k},\tau_{k-1})\bigg{\}}J^{\eta}_{n-\overline{n}_{\beta,\varepsilon}}(\tau_{\overline{n}_{\beta,\varepsilon}})d\tau_{\overline{n}_{\beta,\varepsilon}}\dots d\tau_{1},$$

where we integrated out all terms for $k>\overline{n}_{\beta,\varepsilon}$ in the term $J^{\eta}_{m}(x)$, that satisfies $\lim_{\overline{\eta}\to 0}J^{\eta}_{m}(x)=\frac{x^{m}}{m!}$.

We now proceed with integration. Starting with $\tau_{1}$, we see that if $\tau_{1}\leqslant\tilde{\beta}$, then this implies $\tilde{\beta}\in[\tau_{1},\tau_{0})$. The case when $\tilde{\beta}^{+}\geqslant\tau_{0}=1$, that is $\beta^{-}\leqslant 0$, is excluded by the assumption that $\beta>0$. Hence, this in turn implies $\tilde{\beta}^{+}\in[\tau_{1},\tau_{0})$, provided that $\eta_{0}<\tau_{0}-\tilde{\beta}=\beta$. Since this event is excluded by the indicator function, the corresponding terms are $0$, and it remains to integrate $\tau_{1}$ on $(\tilde{\beta},1]$, that is on $[\tilde{\alpha}_{1},1]$. Regarding $\tau_{2}$, if $\tau_{2}\leqslant\tilde{\beta}^{+}<\tau_{1}$, this contradicts $\tilde{\beta}^{+}\notin[\tau_{2},\tau_{1})$, hence it remains to integrate $\tau_{2}$ on $(\tilde{\beta}^{+},1]$, that is on $[\tilde{\alpha}_{2},1]$. We proceed similarly for all $k\leqslant\overline{n}_{\beta,\varepsilon}$. We obtain that for $\overline{\eta}$ sufficiently small,

$${\mathbb{P}^{\eta}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}}$$

$${}=n!\int\limits_{\tilde{\alpha}_{1}}^{1}\int\limits_{\tilde{\alpha}_{2}}^{\tau_{1}}\dots\int\limits_{\tilde{\alpha}_{\overline{n}_{\beta,\varepsilon}}}^{\tau_{\overline{n}_{\beta,\varepsilon}-1}}\mathbb{I}\bigg{\{}0\leqslant\tau_{\overline{n}_{\beta,\varepsilon}}<_{{\overline{n}_{\beta,\varepsilon}}}\dots\tau_{1}<_{1}1\bigg{\}}J^{\eta}_{n-\overline{n}_{\beta,\varepsilon}}(\tau_{\overline{n}_{\beta,\varepsilon}})d\tau_{\overline{n}_{\beta,\varepsilon}}\dots d\tau_{1}.$$

Now, we remark that $\lim_{\overline{\eta}\to 0}\mathbb{I}\bigg{\{}0\leqslant\tau_{\overline{n}_{\beta,\varepsilon}}<_{{\overline{n}_{\beta,\varepsilon}}}\dots\tau_{1}<_{1}1\bigg{\}}=\mathbb{I}\bigg{\{}0\leqslant\tau_{\overline{n}_{\beta,\varepsilon}}\leqslant\dots\tau_{1}\leqslant 1\bigg{\}}$, and so

$$\lim_{\overline{\eta}\to 0}\mathbb{P}^{\eta}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}=n!\int\limits_{\tilde{\alpha}_{1}}^{1}\int\limits_{\tilde{\alpha}_{2}}^{\tau_{1}}\dots\int\limits_{\tilde{\alpha}_{\overline{n}_{\beta,\varepsilon}}}^{\tau_{\overline{n}_{\beta,\varepsilon}-1}}J_{n-\overline{n}_{\beta,\varepsilon}}(\tau_{\overline{n}_{\beta,\varepsilon}})d\tau_{\overline{n}_{\beta,\varepsilon}}\dots d\tau_{1}.$$

In order to compute the multiple integral, we resort to a Taylor expansion as for the left tail, and deduce that

$$\lim_{\overline{\eta}\to 0}\mathbb{P}^{\eta}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}=1-\sum_{\ell=0}^{\overline{n}_{\beta,\varepsilon}-1}\binom{n}{\ell}\tilde{\alpha}_{\ell+1}^{n-\ell}\ell!I_{\ell}(1;\tilde{\alpha}_{1},\dots\tilde{\alpha}_{\ell}).$$

It remains to note that $\lim_{\overline{\eta}\to 0}\mathbb{P}\Big{(}\sup^{\eta}_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\cap\Omega_{n}^{c}\Big{)}\leqslant\lim_{\overline{\eta}\to 0}\mathbb{P}\Big{(}\Omega_{n}^{c}\Big{)}=0$ and thus

$$\lim_{\overline{\eta}\to 0}\mathbb{P}\Big{(}\sup^{\eta}_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}=\lim_{\overline{\eta}\to 0}\mathbb{P}\Big{(}\sup^{\eta}_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\cap\Omega_{n}\Big{)}$$

$${}=\lim_{\overline{\eta}\to 0}\mathbb{P}^{\eta}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}.$$

This shows that the limit of $\mathbb{P}\Big{(}\sup^{\eta}_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}$ indeed exists and hence gives the value of $\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\leqslant\varepsilon\Big{)}$. $\Box$
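
The change of variables $\tau_{k}=1-u_{k}$ used for the right tail can be illustrated at the level of a single sample. The sketch below is an illustration under assumptions, not the paper's construction: the helper names are hypothetical, the sample is assumed to have no ties and no point exactly equal to $\alpha$ or $\beta$, and the left limits $U_{n}(v^{-})$ are computed exactly instead of through the shifts $\eta_{k}$. It checks that the right-tail supremum on $[\alpha,\beta]$ equals the left-tail supremum of the reflected sample on $[1-\beta,1-\alpha]$.

```python
from bisect import bisect_left, bisect_right


def local_sup_left_tail(sample, alpha, beta):
    """sup over [alpha, beta] of U_n(u) - u (candidates: alpha and sample points)."""
    xs = sorted(sample)
    n = len(xs)
    candidates = [alpha] + [x for x in xs if alpha <= x <= beta]
    return max(bisect_right(xs, v) / n - v for v in candidates)


def local_sup_right_tail(sample, alpha, beta):
    """sup over [alpha, beta] of u - U_n(u): the value at beta, together with the
    left limits at the sample points lying in (alpha, beta]."""
    xs = sorted(sample)
    n = len(xs)
    values = [beta - bisect_right(xs, beta) / n]
    # bisect_left counts the samples strictly below x, i.e. n * U_n(x^-).
    values += [x - bisect_left(xs, x) / n for x in xs if alpha < x <= beta]
    return max(values)


sample = [0.1, 0.35, 0.4, 0.9]
right = local_sup_right_tail(sample, 0.3, 0.7)
reflected = local_sup_left_tail([1.0 - x for x in sample], 1.0 - 0.7, 1.0 - 0.3)
```

Both quantities coincide on this sample (up to rounding), which is the sample-level counterpart of the fact that the right-tail expression has the same form as the left-tail one with $(\alpha,\beta)$ replaced by $(\tilde{\beta},\tilde{\alpha})$.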

Proof of Theorem 3. We now compute for $\ell\geqslant 1$ the quantity

$$I_{\ell}(x;\beta_{1},\dots,\beta_{\ell})=\int\limits_{\beta_{1}}^{x}\int\limits_{\beta_{2}}^{t_{1}}\dots\int\limits_{\beta_{\ell}}^{t_{\ell-1}}dt_{\ell}\dots dt_{1}.$$

We further let $n_{\beta}=n(1-\beta-\varepsilon)$, $\underline{n}_{\beta}=\lfloor n_{\beta}\rfloor$ and note that $\beta_{k}=\min((n-k+1)/n-\varepsilon,\beta)$ is equal to $\beta$ iff $k\leqslant n_{\beta}+1$. Also, $\beta_{k}=\beta$ as soon as $k\leqslant\underline{n}_{\beta}+1$. Last, $\gamma_{k}=(n-k+1)/n-\varepsilon$.

Case 1. When $n_{\beta}<0$, then $\beta_{k}=\gamma_{k}$ for all $k\geqslant 1$. In this case, since $\gamma_{k}-\gamma_{k-1}=-1/n$, we deduce that

$$I_{\ell}(1;\beta_{1},\dots,\beta_{\ell})=I_{\ell}(1;\gamma_{1},\dots,\gamma_{\ell})=\frac{(1-\gamma_{1})(1-\gamma_{1}+\ell/n)^{\ell-1}}{\ell!}$$

and hence since $\gamma_{1}=1-\varepsilon$ and $\gamma_{\ell+1}=1-\ell/n-\varepsilon$,

$$\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U_{n}(u)-U(u)<\varepsilon\Big{)}=1-\sum_{\ell=0}^{\overline{n}_{\alpha,\varepsilon}-1}\binom{n}{\ell}\bigg{(}\Big{(}1-\ell/n-\varepsilon\Big{)}^{n-\ell}-\alpha^{n-\ell}\bigg{)}\varepsilon(\varepsilon+\ell/n)^{\ell-1}.$$
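
The closed form used in Case 1 for $I_{\ell}(1;\gamma_{1},\dots,\gamma_{\ell})$ with equally spaced limits can be checked numerically. This is a small sanity check under stated assumptions, not a proof: the helper names `iterated_integral` and `equally_spaced_closed_form` are hypothetical, and the nested integral is evaluated by repeated polynomial antidifferentiation with NumPy.

```python
import math

from numpy.polynomial import Polynomial


def iterated_integral(lower_limits, x=1.0):
    """I_l(x; a_1, ..., a_l), computed from the innermost integral outwards."""
    poly = Polynomial([1.0])
    for a in reversed(lower_limits):
        antiderivative = poly.integ()
        poly = antiderivative - antiderivative(a)
    return float(poly(x))


def equally_spaced_closed_form(n, eps, l):
    """(1 - gamma_1)(1 - gamma_1 + l/n)^(l-1) / l!  with  gamma_1 = 1 - eps."""
    return eps * (eps + l / n) ** (l - 1) / math.factorial(l)


n, eps = 7, 0.15
# Pairs (nested integral, closed form) for l = 1..5 with gamma_k = (n - k + 1)/n - eps.
pairs = [
    (iterated_integral([(n - k + 1) / n - eps for k in range(1, l + 1)]),
     equally_spaced_closed_form(n, eps, l))
    for l in range(1, 6)
]
```

For $\ell=2$ the identity reads $\int_{\gamma_{1}}^{1}(t-\gamma_{1}+1/n)\,dt=\tfrac{1}{2}(1-\gamma_{1})(1-\gamma_{1}+2/n)$, which is easy to confirm by hand.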

Case 2. We now consider the general case when $\underline{n}_{\beta}\geqslant 0$. For instance if $n_{\beta}\geqslant 0$ but $\underline{n}_{\beta}=0$ (that is, $0\leqslant n_{\beta}<1$), then, we deduce that $\beta_{k}=\gamma_{k}$ for all $k\geqslant 2$, while $\beta_{1}=\beta$. Hence, we deduce that

$$I_{\ell}(1;\beta_{1},\dots,\beta_{\ell})=\begin{cases}\int\limits_{\beta}^{1}dt_{1}=(1-\beta)\quad\text{if }\ell=1\\ \int\limits_{\beta}^{1}I_{\ell-1}(t_{1};\gamma_{2},\dots,\gamma_{\ell})dt_{1}\quad\text{if }\ell>1.\end{cases}$$

Likewise, when $\underline{n}_{\beta}=1$, then we deduce that $\beta_{k}=\gamma_{k}$ for all $k\geqslant 3$, while $\beta_{1}=\beta_{2}=\beta$, and so

$$I_{\ell}(1;\beta_{1},\dots,\beta_{\ell})=\begin{cases}\int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}\dots\int\limits_{\beta}^{t_{\ell-1}}dt_{\ell}\dots dt_{1}=\frac{(1-\beta)^{\ell}}{\ell!}\quad\text{if }\ell\leqslant 2\\ \int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}I_{\ell-2}(t_{\underline{n}_{\beta}+1};\gamma_{\underline{n}_{\beta}+2},\dots,\gamma_{\ell})dt_{2}dt_{1}\quad\text{if }\ell>2.\end{cases}$$

More generally, for a generic $\underline{n}_{\beta}\geqslant 0$, we deduce (using the convention that $t_{0}=1$) that

$$I_{\ell}(1;\beta_{1},\dots,\beta_{\ell})=\begin{cases}\int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}\dots\int\limits_{\beta}^{t_{\ell-1}}dt_{\ell}\dots dt_{1}=\frac{(1-\beta)^{\ell}}{\ell!}\quad\text{if }\ell\leqslant\underline{n}_{\beta}+1\\ \int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}\dots\int\limits_{\beta}^{t_{\underline{n}_{\beta}}}I_{\ell-\underline{n}_{\beta}-1}(t_{\underline{n}_{\beta}+1};\gamma_{\underline{n}_{\beta}+2},\dots,\gamma_{\ell})dt_{\underline{n}_{\beta}+1}\dots dt_{1}\quad\text{if }\ell>\underline{n}_{\beta}+1.\end{cases}$$

Further, since $\gamma_{\ell}-\gamma_{\ell-1}=-1/n$ for all $\ell$, and introducing $\ell_{\beta}=\ell-\underline{n}_{\beta}-1$ we also have (see [15])

$$I_{\ell_{\beta}}(t;\gamma_{\underline{n}_{\beta}+2},\dots,\gamma_{\ell})=\frac{(t-\gamma_{\underline{n}_{\beta}+2})(t-\gamma_{\underline{n}_{\beta}+2}+\ell_{\beta}/n)^{\ell_{\beta}-1}}{\ell_{\beta}!}$$

$${}=\frac{(t-\beta+C_{\ell_{\beta}})^{\ell_{\beta}}}{\ell_{\beta}!}-\frac{1}{n}\frac{(t-\beta+C_{\ell_{\beta}})^{\ell_{\beta}-1}}{(\ell_{\beta}-1)!},$$

where in the second line, we also introduced $C_{\ell_{\beta}}=\beta-\gamma_{\underline{n}_{\beta}+2}+\ell_{\beta}/n=(\beta+\varepsilon)-(n-\ell)/n$. In particular, $C_{\ell_{\beta}}=(\ell-n_{\beta})/n>0$ for $\ell>\underline{n}_{\beta}+1$. From this expression, we deduce that if $\ell>\underline{n}_{\beta}+1$, then

$$I_{\ell}(1;\beta_{1},\dots,\beta_{\ell})=\int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}\dots\int\limits_{\beta}^{t_{\underline{n}_{\beta}}}\frac{(t_{\underline{n}_{\beta}+1}-\beta+C_{\ell_{\beta}})^{\ell_{\beta}}}{\ell_{\beta}!}dt_{\underline{n}_{\beta}+1}\dots dt_{1}$$

$${}-\frac{1}{n}\int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}\dots\int\limits_{\beta}^{t_{\underline{n}_{\beta}}}\frac{(t_{\underline{n}_{\beta}+1}-\beta+C_{\ell_{\beta}})^{\ell_{\beta}-1}}{(\ell_{\beta}-1)!}dt_{\underline{n}_{\beta}+1}\dots dt_{1}.$$

In order to compute both terms, we use the following identity, valid for given $k,\ell,C$,

$${\int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}\dots\int\limits_{\beta}^{t_{k}}\frac{(t_{k+1}-\beta+C)^{\ell}}{\ell!}dt_{k+1}\dots dt_{1}}$$

$${}=\int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}\dots\int\limits_{\beta}^{t_{k-1}}\frac{(t_{k}-\beta+C)^{\ell+1}}{(\ell+1)!}dt_{k}\dots dt_{1}-\frac{C^{\ell+1}}{(\ell+1)!}\underbrace{\int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}\dots\int\limits_{\beta}^{t_{k-1}}dt_{k}\dots dt_{1}}_{B_{k}}$$

$${}=\int\limits_{\beta}^{1}\int\limits_{\beta}^{t_{1}}\dots\int\limits_{\beta}^{t_{k-j-1}}\frac{(t_{k-j}-\beta+C)^{\ell+j+1}}{(\ell+j+1)!}dt_{k-j}\dots dt_{1}-\frac{C^{\ell+j+1}}{(\ell+j+1)!}B_{k-j}-\dots-\frac{C^{\ell+1}}{(\ell+1)!}B_{k}$$

$${}=\frac{(1-\beta+C)^{\ell+k+1}}{(\ell+k+1)!}-\sum_{j=0}^{k}\frac{C^{\ell+j+1}}{(\ell+j+1)!}\frac{(1-\beta)^{k-j}}{(k-j)!}.$$
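
The chain of equalities above can be verified numerically for particular values. The sketch below is only a check under stated assumptions: the function names `nested_integral_lhs` and `nested_integral_rhs` are hypothetical, and the left-hand side is obtained by integrating the polynomial $(t-\beta+C)^{\ell}/\ell!$ exactly $k+1$ times from $\beta$ with NumPy.

```python
import math

from numpy.polynomial import Polynomial


def nested_integral_lhs(beta, C, l, k):
    """(k+1)-fold nested integral from beta of (t - beta + C)^l / l!, evaluated at 1."""
    poly = Polynomial([C - beta, 1.0]) ** l / math.factorial(l)
    for _ in range(k + 1):
        antiderivative = poly.integ()
        poly = antiderivative - antiderivative(beta)
    return float(poly(1.0))


def nested_integral_rhs(beta, C, l, k):
    """Closed form: leading power minus the sum of boundary terms C^(l+j+1) * B_(k-j)."""
    head = (1.0 - beta + C) ** (l + k + 1) / math.factorial(l + k + 1)
    tail = sum(
        C ** (l + j + 1) / math.factorial(l + j + 1)
        * (1.0 - beta) ** (k - j) / math.factorial(k - j)
        for j in range(k + 1)
    )
    return head - tail


lhs = nested_integral_lhs(0.55, 0.15, 2, 3)
rhs = nested_integral_rhs(0.55, 0.15, 2, 3)
```

Each integration step produces one boundary term $C^{\ell+j+1}/(\ell+j+1)!$ multiplied by the remaining pure iterated integral $B_{k-j}=(1-\beta)^{k-j}/(k-j)!$, which is what the sum collects.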

Hence, we deduce that

$$I_{\ell}(1;\beta_{1},\dots,\beta_{\ell})=\frac{(1-\beta+C_{\ell_{\beta}})^{\ell_{\beta}+\underline{n}_{\beta}+1}}{(\ell_{\beta}+\underline{n}_{\beta}+1)!}-\frac{1}{n}\frac{(1-\beta+C_{\ell_{\beta}})^{\ell_{\beta}+\underline{n}_{\beta}}}{(\ell_{\beta}+\underline{n}_{\beta})!}$$

$${}-\sum_{j=0}^{\underline{n}_{\beta}}\frac{C_{\ell_{\beta}}^{\ell_{\beta}+j+1}}{(\ell_{\beta}+j+1)!}\frac{(1-\beta)^{\underline{n}_{\beta}-j}}{(\underline{n}_{\beta}-j)!}+\frac{1}{n}\sum_{j=0}^{\underline{n}_{\beta}}\frac{C_{\ell_{\beta}}^{\ell_{\beta}+j}}{(\ell_{\beta}+j)!}\frac{(1-\beta)^{\underline{n}_{\beta}-j}}{(\underline{n}_{\beta}-j)!}$$

$${}=\bigg{[}\frac{1-\beta+C_{\ell_{\beta}}}{\ell}-\frac{1}{n}\bigg{]}\frac{(1-\beta+C_{\ell_{\beta}})^{\ell-1}}{(\ell-1)!}$$

$${}-\sum_{j=0}^{\underline{n}_{\beta}}\bigg{[}\frac{C_{\ell_{\beta}}}{\ell-\underline{n}_{\beta}+j}-\frac{1}{n}\bigg{]}\frac{C_{\ell_{\beta}}^{\ell-\underline{n}_{\beta}+j-1}}{(\ell-\underline{n}_{\beta}+j-1)!}\frac{(1-\beta)^{\underline{n}_{\beta}-j}}{(\underline{n}_{\beta}-j)!}.$$

After reorganizing the terms, and remarking that $C_{\ell_{\beta}}=\beta-1+\varepsilon+\ell/n=(\ell-n_{\beta})/n$, we obtain that if $\ell>\underline{n}_{\beta}+1$, then

$$I_{\ell}(1;\beta_{1},\dots,\beta_{\ell})$$

$${}=\frac{1}{n}\bigg{[}\frac{\ell+n\varepsilon}{\ell}-1\bigg{]}\frac{(\ell/n+\varepsilon)^{\ell-1}}{(\ell-1)!}-\sum_{j=0}^{\underline{n}_{\beta}}\frac{1}{n}\bigg{[}\frac{\ell-n_{\beta}}{\ell-\underline{n}_{\beta}+j}-1\bigg{]}\frac{((\ell-n_{\beta})/n)^{\ell-\underline{n}_{\beta}+j-1}}{(\ell-\underline{n}_{\beta}+j-1)!}\frac{(1-\beta)^{\underline{n}_{\beta}-j}}{(\underline{n}_{\beta}-j)!}$$

$${}=\varepsilon\frac{(\ell/n+\varepsilon)^{\ell-1}}{\ell!}+\frac{1}{n(\ell-1)!}\sum_{j=0}^{\underline{n}_{\beta}}\frac{n_{\beta}-j}{\ell-j}\binom{\ell-1}{j}\bigg{(}\frac{\ell-n_{\beta}}{n}\bigg{)}^{\ell-j-1}(1-\beta)^{j}$$

$${}=\varepsilon\frac{(\ell/n+\varepsilon)^{\ell-1}}{\ell!}+\frac{1}{\ell!}\sum_{j=0}^{\underline{n}_{\beta}}\frac{n_{\beta}-j}{n}\binom{\ell}{j}\bigg{(}\frac{\ell-n_{\beta}}{n}\bigg{)}^{\ell-j-1}(1-\beta)^{j}.$$
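
The final expression of $I_{\ell}(1;\beta_{1},\dots,\beta_{\ell})$ for $\ell>\underline{n}_{\beta}+1$ can be compared with a direct evaluation of the nested integral. This is a sketch for checking purposes only: the helper names `iterated_integral` and `case2_closed_form` are hypothetical, and the parameter values in the example are chosen so that $n_{\beta}$ is not an integer, which avoids floating-point ambiguity in the floor.

```python
import math

from numpy.polynomial import Polynomial


def iterated_integral(lower_limits, x=1.0):
    """I_l(x; a_1, ..., a_l), computed from the innermost integral outwards."""
    poly = Polynomial([1.0])
    for a in reversed(lower_limits):
        antiderivative = poly.integ()
        poly = antiderivative - antiderivative(a)
    return float(poly(x))


def case2_closed_form(n, eps, beta, l):
    """Closed form of I_l(1; beta_1..beta_l) when l > floor(n_beta) + 1 >= 1."""
    n_beta = n * (1.0 - beta - eps)
    n_floor = math.floor(n_beta)
    if n_floor < 0 or l <= n_floor + 1:
        raise ValueError("closed form stated only for floor(n_beta) >= 0 and l > floor(n_beta) + 1")
    lead = eps * (l / n + eps) ** (l - 1)
    correction = sum(
        (n_beta - j) / n * math.comb(l, j)
        * ((l - n_beta) / n) ** (l - j - 1) * (1.0 - beta) ** j
        for j in range(n_floor + 1)
    )
    return (lead + correction) / math.factorial(l)


n, eps, beta = 10, 0.1, 0.55  # n_beta = 3.5, so beta_1 = ... = beta_4 = beta
checks = []
for l in (5, 6):
    limits = [min((n - k + 1) / n - eps, beta) for k in range(1, l + 1)]
    checks.append((iterated_integral(limits), case2_closed_form(n, eps, beta, l)))
```

For $\ell=5$ both evaluations give approximately $2.392\times 10^{-4}$, which can also be recovered by hand from $\int(s+0.05)$ integrated four times over $s\in[0,0.45]$.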

Combining all steps together, we deduce that if $\underline{n}_{\beta}\geqslant 0$ and $\lfloor n_{\beta}\rfloor+1\leqslant\overline{n}_{\alpha,\varepsilon}-1$, then

$${\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U_{n}(u)-U(u)<\varepsilon\Big{)}=1-\sum_{\ell=0}^{\underline{n}_{\beta}+1}\binom{n}{\ell}\beta_{\ell+1}^{n-\ell}(1-\beta)^{\ell}}$$

$${}-\sum_{\ell=\underline{n}_{\beta}+2}^{\overline{n}_{\alpha,\varepsilon}-1}\binom{n}{\ell}\bigg{(}1-\frac{\ell}{n}-\varepsilon\bigg{)}^{n-\ell}\varepsilon\bigg{(}\frac{\ell}{n}+\varepsilon\bigg{)}^{\ell-1}$$

$${}-\sum_{\ell=\underline{n}_{\beta}+2}^{\overline{n}_{\alpha,\varepsilon}-1}\binom{n}{\ell}\bigg{(}1-\frac{\ell}{n}-\varepsilon\bigg{)}^{n-\ell}\sum_{j=0}^{\underline{n}_{\beta}}\bigg{[}\frac{n_{\beta}-j}{n}\bigg{]}\binom{\ell}{j}\bigg{(}\frac{\ell-n_{\beta}}{n}\bigg{)}^{\ell-j-1}(1-\beta)^{j},$$

where $\beta_{k}=\min((n-k+1)/n-\varepsilon,\beta)$, $\overline{n}_{\alpha,\varepsilon}=\lceil n(1-\alpha-\varepsilon)\rceil$ and $\underline{n}_{\beta}=\lfloor n(1-\beta-\varepsilon)\rfloor$. Introducing the term $m_{\beta}=\min\{\lfloor n_{\beta}\rfloor+1,\overline{n}_{\alpha,\varepsilon}-1\}$, we get more generally when $n_{\beta}>0$,

$${\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U_{n}(u)-U(u)<\varepsilon\Big{)}=1-\sum_{\ell=0}^{m_{\beta}}\binom{n}{\ell}\Big{(}\min\Big{\{}1-\frac{\ell}{n}-\varepsilon,\beta\Big{\}}\Big{)}^{n-\ell}(1-\beta)^{\ell}}$$

$${}-\sum_{\ell=m_{\beta}+1}^{\overline{n}_{\alpha,\varepsilon}-1}\binom{n}{\ell}\bigg{(}1-\frac{\ell}{n}-\varepsilon\bigg{)}^{n-\ell}\bigg{[}\varepsilon\bigg{(}\frac{\ell}{n}+\varepsilon\bigg{)}^{\ell-1}+\sum_{j=0}^{m_{\beta}-1}\bigg{[}\frac{n_{\beta}-j}{n}\bigg{]}\binom{\ell}{j}\bigg{(}\frac{\ell-n_{\beta}}{n}\bigg{)}^{\ell-j-1}(1-\beta)^{j}\bigg{]}.$$

$\Box$
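Closed forms of this kind can be sanity-checked by simulation. The sketch below is only an illustration and not the paper's code: the names `local_sup_upper` and `mc_tail` are hypothetical. It relies on a single observation: $U_{n}(u)-u$ jumps up at each sample point and decreases in between, so its supremum over $[\alpha,\beta]$ is attained either at $\alpha$ or at a sample point lying in $[\alpha,\beta]$.

```python
import random
from bisect import bisect_right

def local_sup_upper(samples, alpha, beta):
    """sup over u in [alpha, beta] of U_n(u) - u, for samples in [0, 1]."""
    xs = sorted(samples)
    n = len(xs)
    # Value at the left endpoint: U_n(alpha) counts the samples <= alpha.
    best = bisect_right(xs, alpha) / n - alpha
    # U_n(u) - u decreases between sample points, so the only other
    # candidates are the sample points that fall inside [alpha, beta].
    for x in xs:
        if alpha <= x <= beta:
            best = max(best, bisect_right(xs, x) / n - x)
    return best

def mc_tail(n, eps, alpha, beta, n_runs=20000, seed=0):
    """Monte Carlo estimate of P(sup_{u in [alpha, beta]} U_n(u) - U(u) >= eps)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_runs):
        sample = [rng.random() for _ in range(n)]
        if local_sup_upper(sample, alpha, beta) >= eps:
            hits += 1
    return hits / n_runs
```

One minus the value returned by `mc_tail` can then be compared, up to Monte Carlo error, with the probability $\mathbb{P}\big(\sup_{u\in[\alpha,\beta]}U_{n}(u)-U(u)<\varepsilon\big)$ given above.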

Proof of Lemma 4. We let $\tilde{\beta}=1-\beta$, $\tilde{\alpha}=1-\alpha$, $\tau_{k}=1-u_{k}$ and consider for all $k$ the constant $\rho_{k}=1-\varepsilon-k/n$ (non-negative for all $k\leqslant n(1-\varepsilon)$). We recall that $\tilde{\alpha}_{k}=\min(\rho_{k-1},\tilde{\alpha})$. We let $n_{\beta,\varepsilon}=n(\beta-\varepsilon)$, $\overline{n}_{\beta,\varepsilon}=\lceil n_{\beta,\varepsilon}\rceil$ and remark that $\rho_{k}\leqslant\tilde{\beta}$ iff $k\geqslant n_{\beta,\varepsilon}$. In particular, $\rho_{k}\leqslant\tilde{\beta}$ as soon as $k-1>\overline{n}_{\beta,\varepsilon}$.

We now compute the quantity

$$I_{\ell}(x;\tilde{\alpha}_{1},\dots,\tilde{\alpha}_{\ell})=\int\limits_{\tilde{\alpha}_{1}}^{x}\int\limits_{\tilde{\alpha}_{2}}^{t_{1}}\dots\int\limits_{\tilde{\alpha}_{\ell}}^{t_{\ell-1}}dt_{\ell}\dots dt_{1},$$

where $\tilde{\alpha}_{k}=\min(\rho_{k-1},\tilde{\alpha})$. We further let $n_{\tilde{\alpha}}=n(\alpha-\varepsilon)$, $\underline{n}_{\tilde{\alpha}}=\lfloor n_{\tilde{\alpha}}\rfloor$ and note that $\tilde{\alpha}_{k}=\min(1-\varepsilon-(k-1)/n,\tilde{\alpha})$ is equal to $\tilde{\alpha}$ iff $k\leqslant n_{\tilde{\alpha}}+1$. In particular, $\tilde{\alpha}_{k}=\tilde{\alpha}$ as soon as $k\leqslant\underline{n}_{\tilde{\alpha}}+1$.

Case 1. When $n_{\tilde{\alpha}}<0$, then $\tilde{\alpha}_{k}=\rho_{k-1}$ for all $k\geqslant 1$. In this case, since $\rho_{k}-\rho_{k-1}=-1/n$, we deduce that

$$I_{\ell}(1;\tilde{\alpha}_{1},\dots,\tilde{\alpha}_{\ell})=I_{\ell}(1;\rho_{0},\dots,\rho_{\ell-1})=\frac{(1-\rho_{0})(1-\rho_{0}+\ell/n)^{\ell-1}}{\ell!}$$

and hence, since $\rho_{0}=1-\varepsilon$ and $\rho_{\ell}=1-\ell/n-\varepsilon$, we obtain

$$\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)\geqslant\varepsilon\Big{)}=\sum_{\ell=0}^{\overline{n}_{\beta,\varepsilon}-1}\binom{n}{\ell}\Big{(}1-\frac{\ell}{n}-\varepsilon\Big{)}^{n-\ell}\varepsilon\Big{(}\varepsilon+\frac{\ell}{n}\Big{)}^{\ell-1}.$$
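The Case 1 sum is straightforward to evaluate numerically. The following is a minimal sketch under stated assumptions: the function name `lower_tail_case1` is hypothetical, $0<\varepsilon<1$, $\beta\leqslant 1$, and the Case 1 condition $\alpha<\varepsilon$ holds, so that $\alpha$ does not enter the expression.

```python
from math import ceil, comb

def lower_tail_case1(n, eps, beta):
    """Evaluate sum_{l=0}^{ceil(n(beta-eps))-1} C(n,l) (1-l/n-eps)^(n-l) eps (eps+l/n)^(l-1).

    Assumes 0 < eps < 1, beta <= 1 and alpha < eps (Case 1).
    Floating-point rounding inside ceil is not handled: results may be off
    by one term when n * (beta - eps) is numerically an integer.
    """
    upper = ceil(n * (beta - eps)) - 1  # last index of the sum; empty sum if negative
    total = 0.0
    for l in range(upper + 1):
        total += comb(n, l) * (1 - l / n - eps) ** (n - l) * eps * (eps + l / n) ** (l - 1)
    return total
```

For instance, `lower_tail_case1(2, 0.25, 1.0)` evaluates to $0.6875$, which agrees with the direct computation $1-\mathbb{P}(X_{(1)}<1/4,\,X_{(2)}<3/4)=1-0.3125$ for two ordered uniform samples $X_{(1)}\leqslant X_{(2)}$.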

Case 2. We now consider the general case when $\underline{n}_{\tilde{\alpha}}\geqslant 0$. For instance if $n_{\tilde{\alpha}}\geqslant 0$ but $\underline{n}_{\tilde{\alpha}}=0$ (that is, $0\leqslant n_{\tilde{\alpha}}<1$), then, we deduce that $\tilde{\alpha}_{k}=\rho_{k-1}$ for all $k\geqslant 2$, while $\tilde{\alpha}_{1}=\tilde{\alpha}$. Hence, we deduce that

$$I_{\ell}(1;\tilde{\alpha}_{1},\dots,\tilde{\alpha}_{\ell})=\begin{cases}\int\limits_{\tilde{\alpha}}^{1}dt_{1}=(1-\tilde{\alpha})\quad\text{if }\ell=1\\ \int\limits_{\tilde{\alpha}}^{1}I_{\ell-1}(t_{1};\rho_{1},\dots,\rho_{\ell-1})dt_{1}\quad\text{if }\ell>1.\end{cases}$$

Likewise, when $\underline{n}_{\tilde{\alpha}}=1$, then we deduce that $\tilde{\alpha}_{k}=\rho_{k-1}$ for all $k\geqslant 3$, while $\tilde{\alpha}_{1}=\tilde{\alpha}_{2}=\tilde{\alpha}$, and so

$$I_{\ell}(1;\tilde{\alpha}_{1},\dots,\tilde{\alpha}_{\ell})=\begin{cases}\int\limits_{\tilde{\alpha}}^{1}\int\limits_{\tilde{\alpha}}^{t_{1}}\dots\int\limits_{\tilde{\alpha}}^{t_{\ell-1}}dt_{\ell}\dots dt_{1}=\frac{(1-\tilde{\alpha})^{\ell}}{\ell!}\quad\text{if }\ell\leqslant 2\\ \int\limits_{\tilde{\alpha}}^{1}\int\limits_{\tilde{\alpha}}^{t_{1}}I_{\ell-2}(t_{\underline{n}_{\tilde{\alpha}}+1};\rho_{\underline{n}_{\tilde{\alpha}}+1},\dots,\rho_{\ell-1})dt_{2}dt_{1}\quad\text{if }\ell>2.\end{cases}$$

More generally, for a generic $\underline{n}_{\tilde{\alpha}}\geqslant 0$, we deduce (using the convention that $t_{0}=1$) that

$$I_{\ell}(1;\tilde{\alpha}_{1},\dots,\tilde{\alpha}_{\ell})=\begin{cases}\int\limits_{\tilde{\alpha}}^{1}\int\limits_{\tilde{\alpha}}^{t_{1}}\dots\int\limits_{\tilde{\alpha}}^{t_{\ell-1}}dt_{\ell}\dots dt_{1}=\frac{(1-\tilde{\alpha})^{\ell}}{\ell!}\quad\text{if }\ell\leqslant\underline{n}_{\tilde{\alpha}}+1\\ \int\limits_{\tilde{\alpha}}^{1}\int\limits_{\tilde{\alpha}}^{t_{1}}\dots\int\limits_{\tilde{\alpha}}^{t_{\underline{n}_{\tilde{\alpha}}}}I_{\ell-\underline{n}_{\tilde{\alpha}}-1}(t_{\underline{n}_{\tilde{\alpha}}+1};\rho_{\underline{n}_{\tilde{\alpha}}+1},\dots,\rho_{\ell-1})dt_{\underline{n}_{\tilde{\alpha}}+1}\dots dt_{1}\quad\text{if }\ell>\underline{n}_{\tilde{\alpha}}+1.\end{cases}$$

Further, since $\rho_{\ell}-\rho_{\ell-1}=-1/n$ for all $\ell$, and introducing $\ell_{\tilde{\alpha}}=\ell-\underline{n}_{\tilde{\alpha}}-1$ we also have (see [15])

$$I_{\ell_{\tilde{\alpha}}}(t;\rho_{\underline{n}_{\tilde{\alpha}}+1},\dots,\rho_{\ell-1})=\frac{(t-\rho_{\underline{n}_{\tilde{\alpha}}+1})(t-\rho_{\underline{n}_{\tilde{\alpha}}+1}+\ell_{\tilde{\alpha}}/n)^{\ell_{\tilde{\alpha}}-1}}{\ell_{\tilde{\alpha}}!}$$

$${}=\frac{(t-{\tilde{\alpha}}+C_{\ell_{\tilde{\alpha}}})^{\ell_{\tilde{\alpha}}}}{\ell_{\tilde{\alpha}}!}-\frac{1}{n}\frac{(t-{\tilde{\alpha}}+C_{\ell_{\tilde{\alpha}}})^{\ell_{\tilde{\alpha}}-1}}{(\ell_{\tilde{\alpha}}-1)!},$$

where in the second line, we also introduced $C_{\ell_{\tilde{\alpha}}}={\tilde{\alpha}}-\rho_{\underline{n}_{\tilde{\alpha}}+1}+\ell_{\tilde{\alpha}}/n=(1-\alpha+\varepsilon)-(n-\ell)/n$.
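The passage from the first to the second line is a one-line factorization. Writing $A=t-\tilde{\alpha}+C_{\ell_{\tilde{\alpha}}}$ (a shorthand used only in this remark), the definition of $C_{\ell_{\tilde{\alpha}}}$ gives $t-\rho_{\underline{n}_{\tilde{\alpha}}+1}+\ell_{\tilde{\alpha}}/n=A$ and $t-\rho_{\underline{n}_{\tilde{\alpha}}+1}=A-\ell_{\tilde{\alpha}}/n$, so that

$$\frac{(A-\ell_{\tilde{\alpha}}/n)A^{\ell_{\tilde{\alpha}}-1}}{\ell_{\tilde{\alpha}}!}=\frac{A^{\ell_{\tilde{\alpha}}}}{\ell_{\tilde{\alpha}}!}-\frac{\ell_{\tilde{\alpha}}}{n}\frac{A^{\ell_{\tilde{\alpha}}-1}}{\ell_{\tilde{\alpha}}!}=\frac{A^{\ell_{\tilde{\alpha}}}}{\ell_{\tilde{\alpha}}!}-\frac{1}{n}\frac{A^{\ell_{\tilde{\alpha}}-1}}{(\ell_{\tilde{\alpha}}-1)!}.$$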

In particular, it holds that $C_{\ell_{\tilde{\alpha}}}=(\ell-n_{\tilde{\alpha}})/n>0$ for $\ell>\underline{n}_{\tilde{\alpha}}+1$. From this expression, we deduce that if $\ell>\underline{n}_{\tilde{\alpha}}+1$, then

$$I_{\ell}(1;{\tilde{\alpha}}_{1},\dots,{\tilde{\alpha}}_{\ell})=\int\limits_{{\tilde{\alpha}}}^{1}\int\limits_{{\tilde{\alpha}}}^{t_{1}}\dots\int\limits_{{\tilde{\alpha}}}^{t_{\underline{n}_{\tilde{\alpha}}}}\frac{(t_{\underline{n}_{\tilde{\alpha}}+1}-{\tilde{\alpha}}+C_{\ell_{\tilde{\alpha}}})^{\ell_{\tilde{\alpha}}}}{\ell_{\tilde{\alpha}}!}dt_{\underline{n}_{\tilde{\alpha}}+1}\dots dt_{1}$$

$${}-\frac{1}{n}\int\limits_{{\tilde{\alpha}}}^{1}\int\limits_{{\tilde{\alpha}}}^{t_{1}}\dots\int\limits_{{\tilde{\alpha}}}^{t_{\underline{n}_{\tilde{\alpha}}}}\frac{(t_{\underline{n}_{\tilde{\alpha}}+1}-{\tilde{\alpha}}+C_{\ell_{\tilde{\alpha}}})^{\ell_{\tilde{\alpha}}-1}}{(\ell_{\tilde{\alpha}}-1)!}dt_{\underline{n}_{\tilde{\alpha}}+1}\dots dt_{1}.$$

Hence, we deduce that

$$I_{\ell}(1;{\tilde{\alpha}}_{1},\dots,{\tilde{\alpha}}_{\ell})=\frac{(1-{\tilde{\alpha}}+C_{\ell_{\tilde{\alpha}}})^{\ell_{\tilde{\alpha}}+\underline{n}_{\tilde{\alpha}}+1}}{(\ell_{\tilde{\alpha}}+\underline{n}_{\tilde{\alpha}}+1)!}-\frac{1}{n}\frac{(1-{\tilde{\alpha}}+C_{\ell_{\tilde{\alpha}}})^{\ell_{\tilde{\alpha}}+\underline{n}_{\tilde{\alpha}}}}{(\ell_{\tilde{\alpha}}+\underline{n}_{\tilde{\alpha}})!}$$

$${}-\sum_{j=0}^{\underline{n}_{\tilde{\alpha}}}\frac{C_{\ell_{\tilde{\alpha}}}^{\ell_{\tilde{\alpha}}+j+1}}{(\ell_{\tilde{\alpha}}+j+1)!}\frac{(1-{\tilde{\alpha}})^{\underline{n}_{\tilde{\alpha}}-j}}{(\underline{n}_{\tilde{\alpha}}-j)!}+\frac{1}{n}\sum_{j=0}^{\underline{n}_{\tilde{\alpha}}}\frac{C_{\ell_{\tilde{\alpha}}}^{\ell_{\tilde{\alpha}}+j}}{(\ell_{\tilde{\alpha}}+j)!}\frac{(1-{\tilde{\alpha}})^{\underline{n}_{\tilde{\alpha}}-j}}{(\underline{n}_{\tilde{\alpha}}-j)!}$$

$${}=\bigg{[}\frac{1-{\tilde{\alpha}}+C_{\ell_{\tilde{\alpha}}}}{\ell}-\frac{1}{n}\bigg{]}\frac{(1-{\tilde{\alpha}}+C_{\ell_{\tilde{\alpha}}})^{\ell-1}}{(\ell-1)!}$$

$${}-\sum_{j=0}^{\underline{n}_{\tilde{\alpha}}}\bigg{[}\frac{C_{\ell_{\tilde{\alpha}}}}{\ell-\underline{n}_{\tilde{\alpha}}+j}-\frac{1}{n}\bigg{]}\frac{C_{\ell_{\tilde{\alpha}}}^{\ell-\underline{n}_{\tilde{\alpha}}+j-1}}{(\ell-\underline{n}_{\tilde{\alpha}}+j-1)!}\frac{(1-{\tilde{\alpha}})^{\underline{n}_{\tilde{\alpha}}-j}}{(\underline{n}_{\tilde{\alpha}}-j)!},$$

where we used that $\ell_{\tilde{\alpha}}+\underline{n}_{\tilde{\alpha}}+1=\ell$. After reorganizing the terms, and remarking that $C_{\ell_{\tilde{\alpha}}}=\varepsilon-\alpha+\ell/n=(\ell-n_{\tilde{\alpha}})/n$, we obtain that if $\ell>\underline{n}_{\tilde{\alpha}}+1$, then

$$I_{\ell}(1;{\tilde{\alpha}}_{1},\dots,{\tilde{\alpha}}_{\ell})$$

$${}=\frac{1}{n}\bigg{[}\frac{\ell+n\varepsilon}{\ell}-1\bigg{]}\frac{(\ell/n+\varepsilon)^{\ell-1}}{(\ell-1)!}-\sum_{j=0}^{\underline{n}_{\tilde{\alpha}}}\frac{1}{n}\bigg{[}\frac{\ell-n_{\tilde{\alpha}}}{\ell-\underline{n}_{\tilde{\alpha}}+j}-1\bigg{]}\frac{((\ell-n_{\tilde{\alpha}})/n)^{\ell-\underline{n}_{\tilde{\alpha}}+j-1}}{(\ell-\underline{n}_{\tilde{\alpha}}+j-1)!}\frac{(1-{\tilde{\alpha}})^{\underline{n}_{\tilde{\alpha}}-j}}{(\underline{n}_{\tilde{\alpha}}-j)!}$$

$${}=\varepsilon\frac{(\ell/n+\varepsilon)^{\ell-1}}{\ell!}+\frac{1}{n}\sum_{j=0}^{\underline{n}_{\tilde{\alpha}}}\bigg{(}n_{\tilde{\alpha}}-j\bigg{)}\bigg{(}\frac{\ell-n_{\tilde{\alpha}}}{n}\bigg{)}^{\ell-j-1}(1-{\tilde{\alpha}})^{j}\frac{1}{\ell!}\binom{\ell}{j}$$

$${}=\varepsilon\frac{(\ell/n+\varepsilon)^{\ell-1}}{\ell!}+\frac{1}{\ell!}\sum_{j=0}^{\underline{n}_{\tilde{\alpha}}}\frac{n_{\tilde{\alpha}}-j}{n}\binom{\ell}{j}\bigg{(}\frac{\ell-n_{\tilde{\alpha}}}{n}\bigg{)}^{\ell-j-1}(1-{\tilde{\alpha}})^{j}.$$
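The closed form for $I_{\ell}(1;\tilde{\alpha}_{1},\dots,\tilde{\alpha}_{\ell})$ can be checked exactly on small instances with rational arithmetic. The sketch below is an illustration only: the helper names `iterated_integral`, `thresholds` and `closed_form_case2` are hypothetical. It integrates the nested integral as a polynomial, from the innermost variable outwards, and compares the result with the last display, using $1-\tilde{\alpha}=\alpha$.

```python
from fractions import Fraction
from math import comb, factorial, floor

def iterated_integral(x, lowers):
    """I_l(x; a_1..a_l) = int_{a_1}^{x} int_{a_2}^{t_1} ... int_{a_l}^{t_{l-1}} dt_l ... dt_1.

    The integrand is kept as a polynomial (coefficients, lowest degree first)
    and integrated exactly, starting from the innermost integral.
    """
    poly = [Fraction(1)]
    for a in reversed(lowers):
        # Antiderivative with zero constant term ...
        anti = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]
        # ... then shift the constant so that the result vanishes at the lower limit a.
        anti[0] = -sum(c * a ** k for k, c in enumerate(anti))
        poly = anti
    return sum(c * x ** k for k, c in enumerate(poly))

def thresholds(n, eps, alpha, l):
    """Lower limits alpha~_k = min(rho_{k-1}, 1 - alpha), with rho_k = 1 - eps - k/n."""
    return [min(1 - eps - Fraction(k - 1, n), 1 - alpha) for k in range(1, l + 1)]

def closed_form_case2(n, eps, alpha, l):
    """Right-hand side of the last display, for l > floor(n (alpha - eps)) + 1."""
    n_a = n * (alpha - eps)   # n_{alpha~} in the text
    n_low = floor(n_a)        # its integer part, underline{n}_{alpha~}
    first = eps * (Fraction(l, n) + eps) ** (l - 1)
    second = sum((n_a - j) / n * comb(l, j) * ((l - n_a) / n) ** (l - j - 1) * alpha ** j
                 for j in range(n_low + 1))
    return (first + second) / factorial(l)

# Small instance: n = 4, eps = 1/10, alpha = 2/5, l = 3 (so floor(n (alpha - eps)) = 1).
n, eps, alpha, l = 4, Fraction(1, 10), Fraction(2, 5), 3
lhs = iterated_integral(Fraction(1), thresholds(n, eps, alpha, l))
rhs = closed_form_case2(n, eps, alpha, l)
assert lhs == rhs == Fraction(2, 75)
```

Using `Fraction` rather than floats makes the comparison an exact identity on the chosen instance, which is a check of the algebra and not a proof of the general statement.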

Combining all steps together, we deduce that if $\underline{n}_{\tilde{\alpha}}\geqslant 0$ and $\lfloor n_{\tilde{\alpha}}\rfloor+1\leqslant\overline{n}_{\beta,\varepsilon}-1$, then

$${\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)<\varepsilon\Big{)}=1-\sum_{\ell=0}^{\underline{n}_{\tilde{\alpha}}+1}\binom{n}{\ell}\tilde{\alpha}_{\ell+1}^{n-\ell}(1-{\tilde{\alpha}})^{\ell}}$$

$${}-\sum_{\ell=\underline{n}_{\tilde{\alpha}}+2}^{\overline{n}_{\beta,\varepsilon}-1}\binom{n}{\ell}\bigg{(}1-\frac{\ell}{n}-\varepsilon\bigg{)}^{n-\ell}\varepsilon\bigg{(}\frac{\ell}{n}+\varepsilon\bigg{)}^{\ell-1}$$

$${}-\sum_{\ell=\underline{n}_{\tilde{\alpha}}+2}^{\overline{n}_{\beta,\varepsilon}-1}\binom{n}{\ell}\bigg{(}1-\frac{\ell}{n}-\varepsilon\bigg{)}^{n-\ell}\sum_{j=0}^{\underline{n}_{\tilde{\alpha}}}\bigg{[}\frac{n_{\tilde{\alpha}}-j}{n}\bigg{]}\binom{\ell}{j}\bigg{(}\frac{\ell-n_{\tilde{\alpha}}}{n}\bigg{)}^{\ell-j-1}(1-{\tilde{\alpha}})^{j},$$

where ${\tilde{\alpha}}_{k}=\min((n-(k-1))/n-\varepsilon,\tilde{\alpha})$, $\overline{n}_{\beta,\varepsilon}=\lceil n(\beta-\varepsilon)\rceil$ and $\underline{n}_{\tilde{\alpha}}=\lfloor n(\alpha-\varepsilon)\rfloor$. Introducing the term $m_{\beta}=\min\{\lfloor n_{\tilde{\alpha}}\rfloor+1,\overline{n}_{\beta,\varepsilon}-1\}$, we get more generally when $n_{\tilde{\alpha}}>0$,

$${\mathbb{P}\Big{(}\sup_{u\in[\alpha,\beta]}U(u)-U_{n}(u)<\varepsilon\Big{)}=1-\sum_{\ell=0}^{m_{\beta}}\binom{n}{\ell}\Big{(}\min\Big{\{}1-\frac{\ell}{n}-\varepsilon,\tilde{\alpha}\Big{\}}\Big{)}^{n-\ell}(1-{\tilde{\alpha}})^{\ell}}$$

$${}-\sum_{\ell=m_{\beta}+1}^{\overline{n}_{\beta,\varepsilon}-1}\binom{n}{\ell}\bigg{(}1-\frac{\ell}{n}-\varepsilon\bigg{)}^{n-\ell}\bigg{[}\varepsilon\bigg{(}\frac{\ell}{n}+\varepsilon\bigg{)}^{\ell-1}+\sum_{j=0}^{m_{\beta}-1}\bigg{[}\frac{n_{\tilde{\alpha}}-j}{n}\bigg{]}\binom{\ell}{j}\bigg{(}\frac{\ell-n_{\tilde{\alpha}}}{n}\bigg{)}^{\ell-j-1}(1-{\tilde{\alpha}})^{j}\bigg{]}.$$

$\Box$

MONTE CARLO SIMULATIONS OF THE CONFIDENCE BOUNDS

TECHNICAL DETAILS REGARDING THE CVAR

Proposition 7 is a consequence of the following more general results.

Proposition 17 (conditional value at risk). Any solution $x^{\star}$ to the following problem

$$\mathsf{CVaR}^{l}_{1-\alpha}(\nu)=\inf_{x\in\mathbb{R}}\bigg{\{}x+\frac{1}{\alpha}\mathbb{E}[\max(X-x,0)]\bigg{\}}$$

must satisfy $1-\alpha\in[F(x^{\star})-\mathbb{P}(X=x^{\star}),F(x^{\star})]$. Further, it holds that

$$\mathsf{CVaR}^{l}_{1-\alpha}(\nu)=\frac{1}{\alpha}\bigg{(}\mathbb{E}\Big{[}X\mathbb{I}\{X>x^{\star}\}\Big{]}+x^{\star}\Big{(}F(x^{\star})-(1-\alpha)\Big{)}\bigg{)}.$$

Proof of Proposition 17. Let us introduce the function $H:x\mapsto x+\frac{1}{\alpha}\mathbb{E}[\max(X-x,0)]$. This is a convex function. Let $\partial H(x)$ denote its subdifferential at point $x$. In particular, for $y\in\partial H(x)$, we must have $\forall x^{\prime},H(x^{\prime})\geqslant H(x)+y(x^{\prime}-x)$, and $x$ is a minimizer of $H$ if $0\in\partial H(x)$. Using Minkowski set notations, we first have

$$\partial H(x)=\{1\}+\frac{1}{\alpha}\partial\mathbb{E}[\max(X-x,0)],$$

hence we focus on computing $\partial\mathbb{E}[(X-x)\mathbb{I}\{X>x\}]$. To this end, we look at the $y$ such that

$$\forall x^{\prime},\mathbb{E}[(X-x^{\prime})\mathbb{I}\{X>x^{\prime}\}]\geqslant\mathbb{E}[(X-x)\mathbb{I}\{X>x\}]+y(x^{\prime}-x)$$

$$\text{i.e. }\forall x^{\prime},-(x^{\prime}-x)\mathbb{E}[\mathbb{I}\{X>x\}]+\mathbb{E}[(X-x^{\prime})(\mathbb{I}\{X>x^{\prime}\}-\mathbb{I}\{X>x\})]\geqslant y(x^{\prime}-x)$$

Remarking that if $x>x^{\prime}$, then $\mathbb{I}\{X>x^{\prime}\}-\mathbb{I}\{X>x\}=\mathbb{I}\{X\in(x^{\prime},x]\}$, while if $x^{\prime}>x$ then $\mathbb{I}\{X>x^{\prime}\}-\mathbb{I}\{X>x\}=-\mathbb{I}\{X\in(x,x^{\prime}]\}$, and reorganizing the terms, this means we must have

$$\forall x^{\prime}>x,-\mathbb{E}[\mathbb{I}\{X>x\}]-\mathbb{E}[\frac{(X-x^{\prime})}{x^{\prime}-x}\mathbb{I}\{X\in(x,x^{\prime}]\}]\geqslant y$$

$$\forall x^{\prime}<x,-\mathbb{E}[\mathbb{I}\{X>x\}]+\mathbb{E}[\frac{(X-x^{\prime})}{x^{\prime}-x}\mathbb{I}\{X\in(x^{\prime},x]\}]\leqslant y$$

Further, note that if $x^{\prime}>x$, then $\frac{(X-x^{\prime})}{x^{\prime}-x}\mathbb{I}\{X\in(x,x^{\prime}]\}\in(-\mathbb{I}\{X\in(x,x^{\prime}]\},0]$, while if $x>x^{\prime}$, then $\frac{(X-x^{\prime})}{x^{\prime}-x}\mathbb{I}\{X\in(x^{\prime},x]\}\in[-\mathbb{I}\{X\in(x^{\prime},x]\},0)$. Hence, we deduce that such $y$ must satisfy

$$\inf_{x^{\prime}:x^{\prime}>x}-\mathbb{P}(X>x)+\mathbb{P}(X\in(x,x^{\prime}])\geqslant y\geqslant\sup_{x^{\prime}:x^{\prime}<x}-\mathbb{P}(X>x)-\mathbb{P}(X\in(x^{\prime},x]).$$

Hence, $-\mathbb{P}(X>x)\geqslant y\geqslant-\mathbb{P}(X\geqslant x)$, $\partial\mathbb{E}[(X-x)\mathbb{I}\{X>x\}]\subset[-\mathbb{P}(X\geqslant x),-\mathbb{P}(X>x)]=[F(x)-1-\mathbb{P}(X=x),F(x)-1]$, from which we deduce that

$$\partial H(x)\subset\bigg{[}\frac{1}{\alpha}(F(x)-\mathbb{P}(X=x)-(1-\alpha)),\frac{1}{\alpha}(F(x)-(1-\alpha))\bigg{]}.$$

This means that a minimum $x^{\star}$ of $H$ should at least satisfy that $1-\alpha\in[F(x^{\star})-\mathbb{P}(X=x^{\star}),F(x^{\star})]$. Finally, the value of the optimization is given by

$$x^{\star}+\frac{1}{\alpha}\mathbb{E}[X\mathbb{I}\{X>x^{\star}\}]-\frac{x^{\star}}{\alpha}(1-F(x^{\star}))=\frac{1}{\alpha}\mathbb{E}\Big{[}X\mathbb{I}\{X>x^{\star}\}\Big{]}+\frac{x^{\star}}{\alpha}\Big{(}F(x^{\star})-(1-\alpha)\Big{)}.$$

$\Box$
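Proposition 17 can be illustrated on an empirical distribution, where every quantity is a finite sum. The sketch below is a minimal example under that assumption; the function names `cvar_upper_opt` and `cvar_upper_closed` are hypothetical. Since $H$ is convex and piecewise linear with kinks at the sample points, a minimizer can be searched among the samples (for $\alpha\in(0,1]$).

```python
def cvar_upper_opt(samples, alpha):
    """Minimize x + (1/alpha) * mean(max(X - x, 0)) over the sample points.

    Returns (x_star, minimal value) for the empirical distribution of `samples`.
    """
    n = len(samples)

    def H(x):
        return x + sum(max(s - x, 0.0) for s in samples) / (alpha * n)

    x_star = min(samples, key=H)
    return x_star, H(x_star)

def cvar_upper_closed(samples, alpha, x_star):
    """(1/alpha) * (E[X 1{X > x*}] + x* (F(x*) - (1 - alpha))), empirical version."""
    n = len(samples)
    tail_part = sum(s for s in samples if s > x_star) / n   # E[X 1{X > x*}]
    F = sum(1 for s in samples if s <= x_star) / n          # F(x*)
    return (tail_part + x_star * (F - (1 - alpha))) / alpha
```

With `samples = [1.0, 2.0, 3.0, 4.0]` and `alpha = 0.5`, both functions return the value $3.5$, the mean of the two largest samples, and the closed form gives the same value at either minimizer $x^{\star}=2$ or $x^{\star}=3$.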

Proposition 18 (expected shortfall). Any solution $x^{\star}$ to the following problem

$$\mathsf{CVaR}_{\alpha}(\nu)=\sup_{x\in\mathbb{R}}\bigg{\{}\frac{1}{\alpha}\mathbb{E}[\min(X-x,0)]+x\bigg{\}}$$

must satisfy $\alpha\in[F(x^{\star})-\mathbb{P}(X=x^{\star}),F(x^{\star})]$. Further, it holds that

$$\mathsf{CVaR}_{\alpha}(\nu)=\frac{1}{\alpha}\bigg{(}\mathbb{E}\Big{[}X\mathbb{I}\{X<x^{\star}\}\Big{]}+x^{\star}\Big{(}\alpha-F(x^{\star})+\mathbb{P}(X=x^{\star})\Big{)}\bigg{)}.$$

Proof of Proposition 18. Let us introduce the function $H:x\mapsto\frac{1}{\alpha}\mathbb{E}[\min(X-x,0)]+x$. This is a concave function. Let $\partial H(x)$ denote its superdifferential at point $x$. In particular, for $y\in\partial H(x)$, we must have $\forall x^{\prime},H(x^{\prime})\leqslant H(x)+y(x^{\prime}-x)$, and $x$ is a maximizer of $H$ if $0\in\partial H(x)$. Using Minkowski set notations, we first have

$$\partial H(x)=\frac{1}{\alpha}\partial\mathbb{E}[\min(X-x,0)]+\{1\},$$

hence we focus on computing $\partial\mathbb{E}[(X-x)\mathbb{I}\{X<x\}]$. To this end, we look at the $y$ such that

$$\forall x^{\prime},\mathbb{E}[(X-x^{\prime})\mathbb{I}\{X<x^{\prime}\}]\leqslant\mathbb{E}[(X-x)\mathbb{I}\{X<x\}]+y(x^{\prime}-x)$$

i.e.

$$\forall x^{\prime},-(x^{\prime}-x)\mathbb{E}[\mathbb{I}\{X<x\}]+\mathbb{E}[(X-x^{\prime})(\mathbb{I}\{X<x^{\prime}\}-\mathbb{I}\{X<x\})]\leqslant y(x^{\prime}-x).$$

Remarking that if $x>x^{\prime}$, then $\mathbb{I}\{X<x^{\prime}\}-\mathbb{I}\{X<x\}=-\mathbb{I}\{X\in[x^{\prime},x)\}$, while if $x^{\prime}>x$ then $\mathbb{I}\{X<x^{\prime}\}-\mathbb{I}\{X<x\}=\mathbb{I}\{X\in[x,x^{\prime})\}$, and reorganizing the terms, this means we must have

$$\forall x^{\prime}>x,-\mathbb{E}[\mathbb{I}\{X<x\}]+\mathbb{E}[\frac{(X-x^{\prime})}{x^{\prime}-x}\mathbb{I}\{X\in[x,x^{\prime})\}]\leqslant y,$$

$$\forall x^{\prime}<x,-\mathbb{E}[\mathbb{I}\{X<x\}]-\mathbb{E}[\frac{(X-x^{\prime})}{x^{\prime}-x}\mathbb{I}\{X\in[x^{\prime},x)\}]\geqslant y.$$

Further, note that if $x^{\prime}>x$, then $\frac{(X-x^{\prime})}{x^{\prime}-x}\mathbb{I}\{X\in[x,x^{\prime})\}\in(-\mathbb{I}\{X\in[x,x^{\prime})\},0]$, while if $x>x^{\prime}$, then $\frac{(X-x^{\prime})}{x^{\prime}-x}\mathbb{I}\{X\in[x^{\prime},x)\}\in[-\mathbb{I}\{X\in[x^{\prime},x)\},0)$. Hence, we deduce that such $y$ must satisfy

$$\sup_{x^{\prime}:x^{\prime}>x}-\mathbb{P}(X<x)-\mathbb{P}(X\in[x,x^{\prime}))\leqslant y\leqslant\inf_{x^{\prime}:x^{\prime}<x}-\mathbb{P}(X<x)+\mathbb{P}(X\in[x^{\prime},x)).$$

Hence, $-\mathbb{P}(X\leqslant x)\leqslant y\leqslant-\mathbb{P}(X<x)$, $\partial\mathbb{E}[(X-x)\mathbb{I}\{X<x\}]\subset[-\mathbb{P}(X\leqslant x),-\mathbb{P}(X<x)]=[-F(x),-F(x)+\mathbb{P}(X=x)]$, from which we deduce that

$$\partial H(x)\subset\bigg{[}\frac{1}{\alpha}(-F(x)+\alpha),\frac{1}{\alpha}(-F(x)+\mathbb{P}(X=x)+\alpha)\bigg{]}.$$

This means that a maximizer $x^{\star}$ of $H$ should at least satisfy that $\alpha\in[F(x^{\star})-\mathbb{P}(X=x^{\star}),F(x^{\star})]$. Finally, the value of the optimization is given by

$${x^{\star}+\frac{1}{\alpha}\mathbb{E}[X\mathbb{I}\{X<x^{\star}\}]-\frac{x^{\star}}{\alpha}(F(x^{\star})-\mathbb{P}(X=x^{\star}))}$$

$${}=\frac{1}{\alpha}\mathbb{E}\Big{[}X\mathbb{I}\{X<x^{\star}\}\Big{]}+\frac{x^{\star}}{\alpha}\Big{(}\alpha-F(x^{\star})+\mathbb{P}(X=x^{\star})\Big{)}.$$

$\Box$
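The lower-tail statement of Proposition 18 admits the same kind of check on an empirical distribution. In the sketch below, which is illustrative only, the function name `expected_shortfall` is hypothetical; it returns the maximizer, the optimal value, the closed-form value, and the interval $[F(x^{\star})-\mathbb{P}(X=x^{\star}),F(x^{\star})]$ that should contain $\alpha$. Since $H$ is concave and piecewise linear with kinks at the sample points, the search is again restricted to the samples.

```python
def expected_shortfall(samples, alpha):
    """Maximize x + (1/alpha) * mean(min(X - x, 0)) over the sample points.

    Returns (x_star, optimal value, closed-form value, (F(x*) - P(X = x*), F(x*)))
    for the empirical distribution of `samples`.
    """
    n = len(samples)

    def H(x):
        return x + sum(min(s - x, 0.0) for s in samples) / (alpha * n)

    x_star = max(samples, key=H)
    F = sum(1 for s in samples if s <= x_star) / n       # F(x*)
    atom = sum(1 for s in samples if s == x_star) / n    # P(X = x*)
    below = sum(s for s in samples if s < x_star) / n    # E[X 1{X < x*}]
    closed = (below + x_star * (alpha - F + atom)) / alpha
    return x_star, H(x_star), closed, (F - atom, F)
```

With `samples = [1.0, 2.0, 3.0, 4.0]` and `alpha = 0.5`, the optimal and closed-form values both equal $1.5$, the mean of the two smallest samples.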

Proposition 19 (integrated and optimization forms). Let $X$ be a real-valued random variable with distribution $\nu$ and CDF $F$. Let $a,b\in\overline{\mathbb{R}}$ be such that $\mathbb{P}_{\nu}(a\leqslant X\leqslant b)=1$. Let $\alpha\in[0,1]$ and $x^{\star}$ be any solution to the optimization problem $\mathsf{CVaR}_{1-\alpha}(\nu)$. Let $(\chi_{i})_{i\in\mathbb{Z}}$ denote the discontinuity points of $F$ (empty when $F$ is continuous), and let $\kappa=1-\alpha$. Then, if $a\geqslant 0$, the following rewriting holds

$$\mathsf{CVaR}_{\kappa}(\nu)=\frac{1}{\alpha}\int\limits_{a}^{b}\bigg{[}\alpha-\max\big{(}F(x)-\kappa,F(x^{\star})-\kappa\big{)}\bigg{]}dx+\frac{a}{\alpha}$$

$${}+\frac{1}{\alpha}\sum_{i\in\mathbb{Z}}\mathbb{P}(X=\chi_{i})\mathbb{I}\{\chi_{i}>x^{\star}\}+\frac{x^{\star}}{\alpha}(F(x^{\star})-\kappa).$$

In particular, if $X$ is continuous, $a\geqslant 0$ and $b<\infty$, then

$$\mathsf{CVaR}_{\kappa}(\nu)=(b-a)-\frac{1}{\alpha}\left(\int\limits_{a}^{b}\max(F(x)-\kappa,0)dx-a\right).$$

Proof of Proposition 19. Indeed, we first have that

$$\mathsf{CVaR}_{\kappa}(\nu)=\frac{1}{\alpha}\bigg{(}\mathbb{E}\Big{[}X\mathbb{I}\{X>x^{\star}\}\Big{]}+x^{\star}\Big{(}F(x^{\star})-(\kappa)\Big{)}\bigg{)}.$$

Now, if $a\geqslant 0$, then $Y=X\mathbb{I}\{X>x^{\star}\}$ is a non-negative random variable, hence we can use the following rewriting $\mathbb{E}[Y]=\int\limits_{0}^{b}\mathbb{P}(Y\geqslant y)dy=a+\int\limits_{a}^{b}\mathbb{P}(Y\geqslant y)dy$. Hence,

$$\mathbb{E}[Y]=a+\int\limits_{a}^{b}\mathbb{P}\bigg{(}X\mathbb{I}\{X>x^{\star}\}\geqslant x\bigg{)}dx$$

$${}=a+\int\limits_{a}^{b}\mathbb{P}\bigg{(}X\geqslant x\bigg{)}\mathbb{I}\{x>x^{\star}\}+\mathbb{P}\bigg{(}X>x^{\star}\bigg{)}\mathbb{I}\{x\leqslant x^{\star}\}dx$$

$${}=a+\int\limits_{a}^{b}(1-F(x))\mathbb{I}\{x>x^{\star}\}+(1-F(x^{\star}))\mathbb{I}\{x\leqslant x^{\star}\}dx+\int\limits_{a}^{b}\mathbb{P}(X=x)\mathbb{I}\{x>x^{\star}\}$$

$${}=a+\int\limits_{a}^{b}1-F(x)\mathbb{I}\{x>x^{\star}\}-F(x^{\star})\mathbb{I}\{x\leqslant x^{\star}\}dx+\sum_{i\in\mathbb{Z}}\mathbb{P}(X=\chi_{i})\mathbb{I}\{\chi_{i}>x^{\star}\}$$

$${}=a+\int\limits_{a}^{b}1-\max(F(x),F(x^{\star}))dx+\sum_{i\in\mathbb{Z}}\mathbb{P}(X=\chi_{i})\mathbb{I}\{\chi_{i}>x^{\star}\},$$

where the last line is by monotonicity of $F$. We conclude by remarking that $1-\max(F(x),F(x^{\star}))=\alpha-\max\big{(}F(x)-\kappa,F(x^{\star})-\kappa\big{)}$. $\Box$
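The continuous-case formula of Proposition 19 can be evaluated by numerical quadrature. The following is a minimal sketch, assuming a continuous CDF supplied as a Python callable and a simple midpoint rule; the function name `cvar_upper_integrated` and the parameter `n_grid` are hypothetical, and the examples take $a=0$.

```python
def cvar_upper_integrated(cdf, a, b, alpha, n_grid=100_000):
    """(b - a) - (1/alpha) * (int_a^b max(F(x) - kappa, 0) dx - a), with kappa = 1 - alpha.

    The integral is approximated with the midpoint rule on n_grid cells.
    """
    kappa = 1.0 - alpha
    h = (b - a) / n_grid
    integral = h * sum(max(cdf(a + (i + 0.5) * h) - kappa, 0.0) for i in range(n_grid))
    return (b - a) - (integral - a) / alpha
```

For the uniform distribution on $[0,1]$, that is `cdf = lambda x: x` with `a = 0.0`, `b = 1.0`, the result should be close to $1-\alpha/2$, the mean of the upper $\alpha$-tail.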

Proposition 20 (integrated and optimization forms). Let $X$ be a real-valued random variable with distribution $\nu$ and CDF $F$. Let $a,b\in\overline{\mathbb{R}}$ be such that $\mathbb{P}_{\nu}(a\leqslant X\leqslant b)=1$. Let $\alpha\in[0,1]$ and $x^{\star}$ be any solution to the optimization problem $\mathsf{CVaR}_{\alpha}(\nu)$. Let $(\chi_{i})_{i\in\mathbb{Z}}$ denote the discontinuity points of $F$ (empty when $F$ is continuous). Then, if $a\geqslant 0$, the following rewriting holds

$$\mathsf{CVaR}_{\alpha}(\nu)=\frac{1}{\alpha}\int\limits_{a}^{b}(F(x^{\star})-F(x))_{+}dx+\frac{a}{\alpha}$$

$${}+\frac{1}{\alpha}\sum_{i\in\mathbb{Z}}(p_{i}-\mathbb{P}(X=x^{\star}))\mathbb{I}\{\chi_{i}<x^{\star}\}+\frac{x^{\star}}{\alpha}\Big{(}\alpha-F(x^{\star})+\mathbb{P}(X=x^{\star})\Big{)}.$$

In particular, if $X$ is continuous, $a\geqslant 0$ and $b<\infty$, then

$$\mathsf{CVaR}_{\alpha}(\nu)=\frac{1}{\alpha}\int\limits_{a}^{b}(\alpha-F(x))_{+}dx+\frac{a}{\alpha}.$$

Proof of Proposition 20. Indeed, we first have that

$$\mathsf{CVaR}_{\alpha}(\nu)=\frac{1}{\alpha}\bigg{(}\mathbb{E}\Big{[}X\mathbb{I}\{X<x^{\star}\}\Big{]}+x^{\star}\Big{(}\alpha-F(x^{\star})+\mathbb{P}(X=x^{\star})\Big{)}\bigg{)}.$$

Now, if $a\geqslant 0$, then $Y=X\mathbb{I}\{X<x^{\star}\}$ is a non-negative random variable, hence we can use the following rewriting $\mathbb{E}[Y]=\int_{0}^{b}\mathbb{P}(Y\geqslant y)dy=a+\int\limits_{a}^{b}\mathbb{P}(Y\geqslant y)dy$. Hence,

$$\mathbb{E}[Y]=a+\int\limits_{a}^{b}\mathbb{P}\bigg{(}X\mathbb{I}\{X<x^{\star}\}\geqslant x\bigg{)}dx$$

$${}=a+0\times\mathbb{I}\{x\geqslant x^{\star}\}+\int\limits_{a}^{b}\mathbb{P}\bigg{(}x^{\star}>X\geqslant x\bigg{)}\mathbb{I}\{x<x^{\star}\}dx$$

$${}=a+\int\limits_{a}^{b}(F(x^{\star})-\mathbb{P}(X=x^{\star})-F(x)+\mathbb{P}(X=x))\mathbb{I}\{x<x^{\star}\}dx$$

$${}=a+\int\limits_{a}^{b}(F(x^{\star})-F(x))_{+}dx+\sum_{i\in\mathbb{Z}}(p_{i}-\mathbb{P}(X=x^{\star}))\mathbb{I}\{\chi_{i}<x^{\star}\}.$$

Hence, we deduce that

$$\mathsf{CVaR}_{\alpha}(\nu)=\frac{1}{\alpha}\int\limits_{a}^{b}(F(x^{\star})-F(x))_{+}dx+\frac{a}{\alpha}$$

$${}+\frac{1}{\alpha}\sum_{i\in\mathbb{Z}}(p_{i}-\mathbb{P}(X=x^{\star}))\mathbb{I}\{\chi_{i}<x^{\star}\}+\frac{x^{\star}}{\alpha}\Big{(}\alpha-F(x^{\star})+\mathbb{P}(X=x^{\star})\Big{)}.$$

$\Box$
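For a continuous distribution with $a=0$, the integrated form $\frac{1}{\alpha}\int_{0}^{b}(\alpha-F(x))_{+}dx$ can be compared with a plain empirical lower-tail mean. The sketch below is only an illustration under these assumptions: the names `es_integrated` and `es_empirical` are hypothetical, the integral uses a midpoint rule, and the empirical estimate averages the $\lfloor\alpha n\rfloor$ smallest samples.

```python
import random

def es_integrated(cdf, b, alpha, n_grid=100_000):
    """(1/alpha) * int_0^b (alpha - F(x))_+ dx, i.e. the continuous case with a = 0."""
    h = b / n_grid
    return h * sum(max(alpha - cdf((i + 0.5) * h), 0.0) for i in range(n_grid)) / alpha

def es_empirical(samples, alpha):
    """Average of the floor(alpha * n) smallest samples (at least one sample)."""
    k = max(1, int(alpha * len(samples)))
    return sum(sorted(samples)[:k]) / k

# Uniform samples on [0, 1], so that F(x) = x on the support.
rng = random.Random(0)
samples = [rng.random() for _ in range(20_000)]
integrated = es_integrated(lambda x: x, 1.0, 0.2)
empirical = es_empirical(samples, 0.2)
```

For the uniform distribution both quantities should be close to $\alpha/2$, the first up to quadrature error and the second up to sampling error.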

OTHER RESULT

We provide below for the interested reader some examples of functions $g$ satisfying $\sum\limits_{t=1}^{\infty}\frac{1}{g(t)}\leqslant 1$.

Lemma 21 (controlled sums). The following functions $g$ satisfy $\sum\limits_{t=1}^{\infty}\frac{1}{g(t)}\leqslant 1$.

$g(t)=3t^{3/2}$, $g(t)=t(t+1)$, $g(t)=\frac{(t+1)\ln^{2}(t+1)}{\ln(2)}$, $g(t)=\frac{(t+2)\ln(t+2)(\ln\ln(t+2))^{2}}{\ln\ln(3)}$.
For each $m\in\mathbb{N}$, $g_{m}(t)=C_{m}(\overline{\ln}^{\bigcirc{m}}(t))^{2}\prod_{i=0}^{m-1}\overline{\ln}^{\bigcirc{i}}(t)$, where $f^{\bigcirc{m}}$ denotes the $m$-fold composition of function $f$, $\overline{\ln}(t)=\max\{\ln(t),1\}$, and we introduced the constants $C_{1}=2+\ln(2)+1/e$, $C_{2}=2.03+\ln(e^{e}-1)$ as well as $C_{m}=2+\ln\Big{(}\exp^{\bigcirc{m}}(1)\Big{)}$ for $m\geqslant 3$.

Proof of Lemma 21. Note that $\overline{\ln}^{\bigcirc{m}}(t)=\ln^{\bigcirc{m}}(t)$ for $t\geqslant\exp^{\bigcirc{m}}(1)$ and equals $1$ otherwise. Using that $g(t)=C_{m}(\overline{\ln}^{\bigcirc{m}}(t))^{2}\times\prod_{i=0}^{m-1}\overline{\ln}^{\bigcirc{i}}(t)$, and that $t\mapsto-\frac{1}{\ln^{\bigcirc{m}}(t)}$ has derivative $t\mapsto\frac{1}{(\ln^{\bigcirc{m}}(t))^{2}\prod_{i=0}^{m-1}\ln^{\bigcirc{i}}(t)}$, we obtain

$$\sum_{t=1}^{\infty}\frac{C_{m}}{g(t)}=\sum_{t=1}^{\lceil\exp^{\bigcirc{m}}(1)\rceil-1}\frac{1}{t}+\sum_{t=\lceil\exp^{\bigcirc{m}}(1)\rceil}^{\infty}\frac{1}{(\ln^{\bigcirc{m}}(t))^{2}\prod_{i=0}^{m-1}\ln^{\bigcirc{i}}(t)}$$

$${}\leqslant 1+\ln\Big{(}\lceil\exp^{\bigcirc{m}}(1)\rceil-1\Big{)}+\frac{1}{(\ln^{\bigcirc{m}}(\lceil\exp^{\bigcirc{m}}(1)\rceil))^{2}\prod_{i=0}^{m-1}\ln^{\bigcirc{i}}(\lceil\exp^{\bigcirc{m}}(1)\rceil)}$$

$${}+\frac{1}{\ln^{\bigcirc{m}}\Big{(}\lceil\exp^{\bigcirc{m}}(1)\rceil\Big{)}}=2+\ln\Big{(}\lceil\exp^{\bigcirc{m}}(1)\rceil-1\Big{)}+\frac{1}{\prod_{i=0}^{m-1}\exp^{(m-i)}(1)}.$$

$\Box$
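For the two polynomial examples, the condition $\sum_{t\geqslant 1}1/g(t)\leqslant 1$ can be checked numerically. The sketch below is an illustration with hypothetical function names: for $g(t)=3t^{3/2}$ it adds an integral bound on the tail to a partial sum, which yields an upper bound on the full series up to floating-point rounding, and for $g(t)=t(t+1)$ it uses the telescoping partial sums. The logarithmic examples converge too slowly for partial sums to be informative and are not covered here.

```python
from math import sqrt

def upper_bound_power(n_terms=10_000):
    """Upper bound on sum_{t>=1} 1 / (3 t^{3/2}): partial sum plus a tail bound.

    Since s -> s^{-3/2} is decreasing,
    sum_{t > N} t^{-3/2} <= int_N^inf s^{-3/2} ds = 2 / sqrt(N).
    """
    partial = sum(1.0 / (3.0 * t ** 1.5) for t in range(1, n_terms + 1))
    return partial + 2.0 / (3.0 * sqrt(n_terms))

def partial_sum_telescoping(n_terms=10_000):
    """Partial sum of 1 / (t (t + 1)), which telescopes to 1 - 1 / (N + 1)."""
    return sum(1.0 / (t * (t + 1)) for t in range(1, n_terms + 1))
```

The value returned by `upper_bound_power()` stays below $1$, and `partial_sum_telescoping(N)` increases to $1$ as `N` grows, consistent with the first two items of the lemma.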

About this article

Cite this article

Maillard, OA. Local Dvoretzky–Kiefer–Wolfowitz Confidence Bands. Math. Meth. Stat. 30, 16–46 (2021). https://doi.org/10.3103/S1066530721010038

Received: 14 June 2021
Revised: 12 July 2021
Accepted: 16 January 2022
Published: 30 May 2022
Issue Date: January 2021
DOI: https://doi.org/10.3103/S1066530721010038
