A Lemmas
The first lemma is due to Greblicki and Pawlak (1985) (Lemma 1 in that paper) and is restated without proof as Lemma 1 below:
Lemma 1
(Greblicki and Pawlak (1985))
$$\begin{aligned} \lim _{N \rightarrow \infty }\sum _{k=0}^{N} a_{k}h_{k}(x) = f(x) \end{aligned}$$
at every differentiability point of f. If \(f \in L_{p}\), \(p>1\), the convergence holds for almost all \(x \in \mathbb {R}\).
The second lemma is due to Liebscher (1990) (Lemma 5 in that paper) and is presented without proof as Lemma 2 below:
Lemma 2
(Liebscher (1990)) For the Hermite series estimators (3):
$$\begin{aligned} \sum _{k=0}^{N} E\left( \hat{a}_{k}-a_{k}\right) ^{2}=O\left( \frac{N^{1/2}}{n}\right) . \end{aligned}$$
The third lemma is due to Greblicki and Pawlak (1984) (it follows from equation (15) in Theorem 4 of that paper); we restate it without proof as Lemma 3 below:
Lemma 3
(Greblicki and Pawlak 1984) For the Hermite series estimators (3), if \(E|X|^{s} < \infty \), \(s> \frac{8(r+1)}{3(2r+1)}\), then:
$$\begin{aligned} \sum _{k=0}^{N} \left( \hat{a}_{k}-a_{k}\right) ^{2} = O(n^{-2r/(2r+1)}\log n) \, \text{ a.s. } \end{aligned}$$
Finally, we present an important novel result with proof in Lemma 4 below. We will make use of Lemma 4 several times in this article.
Lemma 4
$$\begin{aligned} \int _{-\infty }^{x}|h_{k}(t)|\, dt \le 2 c_{1} (k+1)^{-\frac{1}{4}} + 12d_{1} (k+1)^{\frac{1}{2}}, \end{aligned}$$
for all \(x \in \mathbb {R}\) and all \(k \ge 0\), where \(c_{1}\) and \(d_{1}\) are positive constants.
Proof
$$\begin{aligned} \int _{-\infty }^{x}|h_{k}(t)|\, dt&\le \int _{-\infty }^{\infty }|h_{k}(t)|\, dt\\ &= \int _{-\infty }^{-1}|h_{k}(t)|\, dt +\int _{-1}^{1}|h_{k}(t)|\, dt +\int _{1}^{\infty }|h_{k}(t)|\, dt \\ &= \int _{-1}^{1}|h_{k}(t)|\, dt +2\int _{1}^{\infty }|h_{k}(t)|\, dt \\ &\le 2c_{1} (k+1)^{-\frac{1}{4}} + \frac{2d_{1}}{b} (k+1)^{\frac{5}{12}+\frac{b}{2}}, \quad b>0. \end{aligned}$$
This follows from the inequalities implied by Theorem 8.91.3 of Szego (1975), namely \(\max _{|x|\le a} |h_{k}(x)| \le c_{a} (k+1)^{-\frac{1}{4}}\) and \(\max _{|x|\ge a}|h_{k}(x)||x|^{\lambda } \le d_{a} (k+1)^{s} \), where \(c_{a}\) and \(d_{a}\) are positive constants depending only on a, \(s=\max (\frac{\lambda }{2} - \frac{1}{12}, -\frac{1}{4})\), and we have set \(\lambda =1+b\), \(b>0\). In addition, we have made use of \(\int _{1}^{\infty } x^{-1-b}\, dx = \frac{1}{b}\), \(b>0\). For concreteness we have set \(b=\frac{1}{6}\), which yields the constant \(12d_{1}\) and the exponent \(\frac{1}{2}\) in the statement of the lemma. \(\square \)
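As an informal sanity check on Lemma 4, the \(L_1\) growth of the Hermite functions can be examined numerically. The sketch below (assuming NumPy; the envelope constant 3 is a convenient stand-in for the unspecified \(c_1\) and \(d_1\)) evaluates \(h_k\) via its three-term recurrence and checks that \(\int _{-\infty }^{\infty }|h_k(t)|\, dt\) stays below \(3(k+1)^{1/2}\) for \(k \le 60\):

```python
# Numerical sanity check of Lemma 4 (illustrative only: c_1 and d_1 are the
# unspecified constants of the lemma, so we test against the convenient
# envelope 3 * (k + 1)**(1/2), an assumption of this sketch).
import numpy as np

def hermite_functions(max_k, x):
    """Orthonormal Hermite functions h_0, ..., h_{max_k} on the grid x,
    computed with the stable three-term recurrence."""
    x = np.asarray(x, dtype=float)
    h = np.zeros((max_k + 1, x.size))
    h[0] = np.pi ** (-0.25) * np.exp(-x ** 2 / 2)
    if max_k >= 1:
        h[1] = np.sqrt(2.0) * x * h[0]
    for k in range(1, max_k):
        h[k + 1] = (np.sqrt(2.0 / (k + 1)) * x * h[k]
                    - np.sqrt(k / (k + 1.0)) * h[k - 1])
    return h

max_k = 60
x = np.linspace(-15.0, 15.0, 60001)   # h_k is negligible beyond |x| ~ sqrt(2k+1)
dx = x[1] - x[0]
l1_norms = np.sum(np.abs(hermite_functions(max_k, x)), axis=1) * dx

# int_{-inf}^{x} |h_k(t)| dt <= ||h_k||_1, which should stay inside the envelope.
k = np.arange(max_k + 1)
print(bool(np.all(l1_norms <= 3.0 * (k + 1) ** 0.5)))
```

The computed norms appear to grow far more slowly than the \((k+1)^{1/2}\) envelope, consistent with the lemma being a deliberately loose but sufficient bound.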
B Proofs of propositions and theorems
B.1 Proof of Proposition 1
Proof
$$\begin{aligned} \left| E[\hat{F}_{N}(x)] - F(x) \right| &= \left| E\left[ \int _{-\infty }^{x} \hat{f}_{N}(t)\, dt\right] - \int _{-\infty }^{x} f(t)\, dt \right| \\ &\le \int _{-\infty }^{x}\sum _{k=N+1}^{\infty } |a_k| |h_{k}(t)|\, dt. \end{aligned}$$
This follows from (3), (4), the fact that \(E(\hat{a}_k)=a_k\) and Lemma 1. By the monotone convergence theorem we have,
$$\begin{aligned} \int _{-\infty }^{x}\sum _{k=N+1}^{\infty } |a_k| |h_{k}(t)| dt = \sum _{k=N+1}^{\infty } |a_k| \int _{-\infty }^{x}|h_{k}(t)| dt. \end{aligned}$$
Utilising Lemma 4 we have:
$$\begin{aligned}&\sum _{k=N+1}^{\infty } |a_k| \int _{-\infty }^{x}|h_{k}(t)| dt \\&\quad \le 2c_{1}\sum _{k=N+1}^{\infty } |a_k| (k+1)^{-\frac{1}{4}} + 12d_{1} \sum _{k=N+1}^{\infty } |a_k| (k+1)^{\frac{1}{2}} \\&\quad \le 2c_{1}\sum _{k=N+1}^{\infty } |b_{k+r}| (k+1)^{-\frac{1}{4}-\frac{r}{2}} + 12d_{1} \sum _{k=N+1}^{\infty } |b_{k+r}| (k+1)^{\frac{1}{2} - \frac{r}{2}} \\&\quad \le 2c_{1}||\left( x-\frac{d}{dx}\right) ^r f(x)|| \sqrt{\sum _{k=N+1}^{\infty } (k+1)^{-\frac{1}{2}-r}} \\&\qquad + 12d_{1} ||\left( x-\frac{d}{dx}\right) ^r f(x)|| \sqrt{\sum _{k=N+1}^{\infty } (k+1)^{1- r}}, \end{aligned}$$
where we have also used the fact that, by assumption, \((x-\frac{d}{dx})^r f(x) \in L_{2}\) and Walter (1977) has shown \(a_{k}^{2} \le \frac{b_{k+r}^{2}}{(k+1)^r}\), where \(b_k\) is the k-th coefficient of the expansion of \((x-\frac{d}{dx})^r f(x)\). In addition, we have utilised Parseval’s theorem, \(||(x-\frac{d}{dx})^r f(x)||^{2} = \sum _{k=0}^{\infty } b_{k}^{2}\), and the Cauchy–Schwarz inequality in the last line. Using the well-known properties of the Hurwitz zeta function, \(\zeta (s,a) = \sum _{k=0}^{\infty }(k+a)^{-s}\) (see DLMF 2017, 25.11.43), we have:
$$\begin{aligned} \sum _{k=N+1}^{\infty } |a_k| \int _{-\infty }^{x}|h_{k}(t)| dt = O(N^{-r/2 +1}), \end{aligned}$$
(7)
completing the proof. \(\square \)
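For completeness, the order in (7) can be traced through the Hurwitz zeta asymptotic \(\zeta (s,a) \sim \frac{a^{1-s}}{s-1}\) as \(a \rightarrow \infty \) (DLMF 2017, 25.11.43):
$$\begin{aligned} \sum _{k=N+1}^{\infty } (k+1)^{-\frac{1}{2}-r} = \zeta \left( r+\tfrac{1}{2}, N+2\right) = O\left( N^{\frac{1}{2}-r}\right) , \qquad \sum _{k=N+1}^{\infty } (k+1)^{1-r} = \zeta \left( r-1, N+2\right) = O\left( N^{2-r}\right) , \end{aligned}$$
so the two square-root terms in the bound are \(O(N^{\frac{1}{4}-\frac{r}{2}})\) and \(O(N^{1-\frac{r}{2}})\) respectively; the second dominates, which gives (7).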
B.2 Proof of Proposition 2
Proof
It is easy to see that
$$\begin{aligned} \left| \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right| &= \left| \sum _{k=0}^{N} (\hat{a}_{k}-a_{k}) \int _{-\infty }^{x}h_k(t)\,dt \right| \\ &\le \sqrt{\sum _{k=0}^{N} \left( \hat{a}_{k}-a_{k}\right) ^{2}} \sqrt{\sum _{k=0}^{N}\left| \int _{-\infty }^{x}h_k(t)\,dt\right| ^2}. \end{aligned}$$
Now by virtue of Lemma 4 we have:
$$\begin{aligned} \left[ \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right] ^{2} = \sum _{k=0}^{N} \left( \hat{a}_{k}-a_{k}\right) ^{2} O\left( N^{2}\right) . \end{aligned}$$
(8)
Making use of Lemma 2 we have,
$$\begin{aligned} E\left[ \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right] ^{2} = O\left( \frac{N^{\frac{5}{2}}}{n}\right) . \end{aligned}$$
(9)
\(\square \)
B.3 Proof of Theorem 3
Proof
We begin by restating the definition of the rate of almost sure convergence provided in Greblicki and Pawlak (1984): for a sequence of random variables \(Y_{n}\), we say that \(Y_{n}=O(a_{n})\) almost surely if \(\frac{\beta _{n} Y_{n}}{a_n} \rightarrow 0\) almost surely as \(n \rightarrow \infty \), for all non-negative sequences \(\{\beta _{n}\}\) convergent to zero. Now,
$$\begin{aligned} \left| \hat{F}_{N}(x) - F(x) \right| \le \left| E[\hat{F}_{N}(x)] - F(x) \right| + \left| \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right| . \end{aligned}$$
By Proposition 1,
$$\begin{aligned}&\left| E[\hat{F}_{N}(x)] - F(x) \right| = O(N^{-r/2 +1}) \\&\quad =O(n^{-(r-2)/(2r+1)}). \end{aligned}$$
In addition, via (8) we have:
$$\begin{aligned} \left| \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right| =\sqrt{\sum _{k=0}^{N} \left( \hat{a}_{k}-a_{k}\right) ^{2}} O\left( N\right) . \end{aligned}$$
We make use of Lemma 3 to obtain,
$$\begin{aligned} \left| \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right| = O(n^{-(r-2)/(2r+1)} \log n) \, a.s., \end{aligned}$$
and finally:
$$\begin{aligned} \left| \hat{F}_{N}(x) - F(x) \right| =O(n^{-(r-2)/(2r+1)} \log n) \, a.s. \end{aligned}$$
\(\square \)
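Both conversions from rates in N to rates in n above can be checked directly, assuming the truncation rate \(N = O(n^{2/(2r+1)})\) implicit in the displays:
$$\begin{aligned} N^{-\frac{r}{2}+1} = O\left( n^{\frac{2}{2r+1}\left( 1-\frac{r}{2}\right) }\right) = O\left( n^{-(r-2)/(2r+1)}\right) , \end{aligned}$$
and, combining Lemma 3 with (8),
$$\begin{aligned} \sqrt{O\left( n^{-2r/(2r+1)}\log n\right) }\, O\left( n^{2/(2r+1)}\right) = O\left( n^{-(r-2)/(2r+1)}\sqrt{\log n}\right) = O\left( n^{-(r-2)/(2r+1)}\log n\right) . \end{aligned}$$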
B.4 Proof of Theorem 4
Proof
It suffices to prove \(\sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon \right) < \infty \) for all \(\epsilon > 0\) (by the Borel–Cantelli lemma). Via the law of total probability we have,
$$\begin{aligned}&\sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)|> \epsilon \right) \\&\quad = \sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)|> \epsilon \big | N(n)> cn^{\gamma }\right) P\left( N(n)> cn^{\gamma } \right) \\&\qquad + \sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon \big | N(n) \le cn^{\gamma } \right) P\left( N(n) \le cn^{\gamma } \right) ,\, \end{aligned}$$
where c is a constant. By the assumption that \(\sum _{n=1}^{\infty } P\left( \frac{N(n)}{n^{\gamma }} > \epsilon \right) < \infty \) for all \(\epsilon > 0\), and since the conditional probabilities are bounded by one, the first term is finite. It remains to show that \(\sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon \big | N(n) \le cn^{\gamma } \right) < \infty \) for all \(\epsilon > 0\). By the conditional Markov inequality we have:
$$\begin{aligned}&P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon | N(n) = q(n) \right) \\&\quad \le \epsilon ^{-p} E \left| \int _{-\infty }^{x} \sum _{k=0}^{q(n)} (\hat{a}_k-a_k) h_{k}(t) dt - \int _{-\infty }^{x} \sum _{k=q(n)+1}^{\infty } a_k h_k(t) dt \right| ^p, \end{aligned}$$
for all \(\epsilon > 0\). Using the fact that \(|f+g|^{p} \le 2^{p-1} (|f|^{p} +|g|^{p})\) along with the Hölder inequality, Lemma 4 and Proposition 1 we have,
$$\begin{aligned}&P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon | N(n) = q(n) \right) \\&\quad \le \epsilon ^{-p} 2^{p-1} \left( \sum _{k=0}^{q(n)} E|\hat{a}_k-a_k |^{p}\right) \left( \sum _{k=0}^{q(n)} \left( \int _{-\infty }^{x}|h_{k}(t)|\, dt\right) ^{p/(p-1)}\right) ^{p-1}\\&\qquad + \epsilon ^{-p} 2^{p-1} \left| \int _{-\infty }^{x} \sum _{k=q(n)+1}^{\infty } a_k h_k(t)\, dt \right| ^p \\&\quad \le \epsilon ^{-p} 2^{p-1} b_{1} \left( \sum _{k=0}^{q(n)} E|\hat{a}_k-a_k |^{p}\right) \left( \sum _{k=0}^{q(n)} (k+1)^{\frac{p}{2(p-1)}}\right) ^{p-1} \\&\qquad + \epsilon ^{-p} 2^{p-1} b_{2}\, q(n)^{-rp/2 + p}, \end{aligned}$$
for all \(\epsilon > 0\), where \(b_{1}, b_{2}\) are positive constants. Now, the results of Dharmadhikari and Jogdeo (1969) (Theorem 2 in that paper) for independent, zero-mean random variables \(X_{i}\) imply that:
$$\begin{aligned} E\left| \sum _{i=1}^{n} X_{i} \right| ^{\nu } \le F_{\nu } n^{\nu /2 - 1}\sum _{i=1}^{n} E\left| X_{i}\right| ^{\nu }, \end{aligned}$$
where \(\nu \ge 2\) and \(F_{\nu }\) is a constant depending only on \(\nu \). Thus we have \(E|\hat{a}_k-a_k |^{p} = n^{-p} E|\sum _{i=1}^{n} (h_k(\mathbf {x_i}) - a_k)|^{p} \le F_{p} n^{-p/2-1}\sum _{i=1}^{n} E|h_k(\mathbf {x_i}) - a_k|^{p} \), where \(F_{p}\) is a constant depending only on p. Also noting that \(\max _{x} |h_{k}(x)| \le C (k+1)^{-1/12}\), where C is a positive constant (implied by Theorem 8.91.3 of Szego 1975), we have:
$$\begin{aligned}&P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon | N(n) = q(n) \right) \\&\quad \le \epsilon ^{-p} 2^{p-1} b_{3} n^{-p/2} q(n)^{-p/12+1} q(n)^{3p/2 -1}+ \epsilon ^{-p} 2^{p-1} b_{2} q(n)^{-rp/2 + p}, \end{aligned}$$
for all \(\epsilon > 0\), where \(b_{3}\) depends only on p. For \(q(n) = O(n^{\gamma })\) the first term is \(O(n^{-p/2 + \frac{17\gamma p}{12}})\) and the second is \(O(n^{-\gamma p (r/2-1)})\); when \(r>2\) and \(0< \gamma < 6/17\) we can choose p large enough that both exponents are less than \(-1\), so that \(\sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon | N(n) = q(n) \right) < \infty \) for all \(\epsilon > 0\). \(\square \)
B.5 Proof of Proposition 3
Proof
The fixed-N Hermite series estimator (4) (equal to (5)) can be represented as:
$$\begin{aligned} T(x,\hat{F}_{n}) =\int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} d_{N}(t,y) dy \right] d\hat{F}_{n}(t), \end{aligned}$$
where \(d_{N}(t,y) = \sum _{k=0}^{N} h_{k}(t) h_{k}(y)\). The influence function and empirical influence function are:
$$\begin{aligned} IF(x,x';T,F) &= \int _{-\infty }^{x} d_{N}(x',y)\, dy - \int _{-\infty }^{\infty }\int _{-\infty }^{x} d_{N}(t,y)\, dy\, dF(t),\\ IF(x,x';T,\hat{F}_{n}) &= \int _{-\infty }^{x} d_{N}(x',y)\, dy -\int _{-\infty }^{\infty } \int _{-\infty }^{x} d_{N}(t,y)\, dy\, d\hat{F}_{n}(t). \end{aligned}$$
Now, for fixed N,
$$\begin{aligned} \left| \int _{-\infty }^{x} d_{N}(t,y)\, dy \right| &\le \sum _{k=0}^{N} |h_{k}(t)| \int _{-\infty }^{x} |h_{k}(y)|\, dy \nonumber \\ &\le u_1 \sum _{k=0}^{N} (k+1)^{-1/12-1/4} + v_{1} \sum _{k=0}^{N} (k+1)^{1/2-1/12} \nonumber \\ &< \infty , \end{aligned}$$
(10)
where \(u_1\) and \(v_1\) are positive constants. The result (10) follows from Lemma 4 and the fact that \(\max _{t} |h_{k}(t)| \le C (k+1)^{-1/12}\). Thus the gross-error sensitivities satisfy \(\sup _{x'} |IF(x,x';T,F)|< \infty \) and \(\sup _{x'} |IF(x,x';T,\hat{F}_{n})| < \infty \), and the fixed-N Hermite series cumulative distribution function estimator is Bias-robust. \(\square \)
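The boundedness in (10) can also be observed numerically. The sketch below (assuming NumPy; \(N=10\) and \(x=0.5\) are illustrative choices of this sketch) computes \(\psi (t) = \int _{-\infty }^{x} d_{N}(t,y)\, dy\) on a wide grid of t and confirms that its supremum is finite and small:

```python
# Numerical illustration of Proposition 3: for fixed N, the influence-function
# kernel psi(t) = int_{-inf}^{x} d_N(t, y) dy is uniformly bounded in t.
# (N = 10 and x = 0.5 are illustrative choices of this sketch.)
import numpy as np

def hermite_functions(max_k, x):
    """Orthonormal Hermite functions h_0, ..., h_{max_k} via the recurrence."""
    x = np.asarray(x, dtype=float)
    h = np.zeros((max_k + 1, x.size))
    h[0] = np.pi ** (-0.25) * np.exp(-x ** 2 / 2)
    if max_k >= 1:
        h[1] = np.sqrt(2.0) * x * h[0]
    for k in range(1, max_k):
        h[k + 1] = (np.sqrt(2.0 / (k + 1)) * x * h[k]
                    - np.sqrt(k / (k + 1.0)) * h[k - 1])
    return h

N, x_point = 10, 0.5
y = np.linspace(-12.0, x_point, 50001)
dy = y[1] - y[0]
c = np.sum(hermite_functions(N, y), axis=1) * dy   # c_k = int_{-inf}^{x} h_k(y) dy

t = np.linspace(-50.0, 50.0, 20001)
psi = c @ hermite_functions(N, t)                  # psi(t) = sum_k c_k h_k(t)
sup_psi = float(np.max(np.abs(psi)))
print(np.isfinite(sup_psi) and sup_psi < 20.0)
```

Unlike the Gram–Charlier case treated in Proposition 5 below, the supremum here does not grow as the t-grid widens, because the Hermite functions themselves decay.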
B.6 Proof of Proposition 4
Proof
The kernel distribution function estimator is defined as:
$$\begin{aligned} \hat{F}(x) = \frac{1}{n} \sum _{i=1}^{n} \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{\mathbf {x}_{i}-y}{h}\right) dy. \end{aligned}$$
This has the representation:
$$\begin{aligned} T(x,\hat{F}_{n}) = \int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{t-y}{h}\right) dy\right] d\hat{F}_{n}(t), \end{aligned}$$
where \(\hat{F}_{n}\) is the empirical distribution function. The influence function and empirical influence function are easily seen to be:
$$\begin{aligned} IF(x,x';T,F) &= \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{x'-y}{h}\right) dy\right] - \int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{t-y}{h}\right) dy\right] dF(t),\\ IF(x,x';T,\hat{F}_{n}) &= \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{x'-y}{h}\right) dy\right] - \int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{t-y}{h}\right) dy\right] d\hat{F}_{n}(t). \end{aligned}$$
Since \(\int _{-\infty }^{\infty } K(u) du = 1\), it is clear that \(\sup _{x'} |IF(x,x';T,F)| \le 2 < \infty \), \(\sup _{x'}|IF(x,x';T,\hat{F}_{n})| \le 2 < \infty \), and thus the smooth kernel distribution function estimator is Bias-robust. \(\square \)
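A short numerical illustration of this bound (assuming a Gaussian kernel K, bandwidth \(h=0.5\), and a small synthetic sample, all choices of this sketch):

```python
# A minimal numerical illustration of Proposition 4, assuming a Gaussian
# kernel K (so the inner integral is the standard normal CDF), bandwidth
# h = 0.5, and a synthetic sample; all of these are choices of this sketch.
import numpy as np
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF = int_{-inf}^{z} K(u) du for the Gaussian kernel."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

rng = np.random.default_rng(0)
sample = rng.normal(size=50)
h, x = 0.5, 0.3

def empirical_IF(x_prime):
    """Empirical influence function of the kernel CDF estimator at x."""
    return Phi((x - x_prime) / h) - np.mean([Phi((x - t) / h) for t in sample])

# Each Phi value lies in [0, 1], so |IF| <= 2 uniformly in x'.
sup_if = max(abs(empirical_IF(xp)) for xp in np.linspace(-100.0, 100.0, 2001))
print(sup_if <= 2.0)
```

In fact, since the influence function is a difference of two quantities in \([0,1]\), the supremum observed is well below the crude bound of 2.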
B.7 Proof of Proposition 5
Proof
Suppose a density function, f(x), can be expanded formally as:
$$\begin{aligned} f(x)= & {} \sum _{k=0}^{\infty } c_{k} He_{k}(x) \phi (x),\\ c_{k}= & {} \frac{1}{k!} \int _{-\infty }^{\infty } f(x) He_{k}(x) dx, \end{aligned}$$
where \(\phi (x)=\frac{e^{-x^2/2}}{\sqrt{2\pi }}\) and \(He_{k}(x)\) are the Chebyshev-Hermite polynomials (following the notation of Szego (1975)). The truncated expansion has the form:
$$\begin{aligned} f(x)=\sum _{k=0}^{N} c_{k} He_{k} (x) \phi (x), \end{aligned}$$
which in practice is usually truncated further to obtain:
$$\begin{aligned} f(x)=\phi (x) \left( 1+\frac{1}{2} (\mu _{2}-1)He_{2}(x) + \frac{1}{6} \mu _{3} He_{3}(x) +\frac{1}{24} (\mu _{4}-6\mu _{2}+3)He_{4}(x)\right) , \end{aligned}$$
where \(\mu _{2},\mu _{3},\mu _{4}\) are non-central moments. This is the Gram–Charlier series of Type A (Kendall et al. 1987). A natural cumulative distribution function estimator based on the Gram–Charlier series is:
$$\begin{aligned} \hat{F}_{N}(x) = \sum _{k=0}^{N} \hat{c}_{k} \int _{-\infty }^{x} He_{k}(y) \phi (y) dy. \end{aligned}$$
(11)
This has the representation
$$\begin{aligned} T(x,\hat{F}_{n}) = \int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} \sum _{k=0}^{N}\frac{1}{k!} He_{k}(t)He_{k}(y) \phi (y) dy \right] d\hat{F}_{n}(t). \end{aligned}$$
Now:
$$\begin{aligned} IF(x,x';T,F) &= \sum _{k=0}^{N} \frac{1}{k!} He_{k}(x')\int _{-\infty }^{x} He_{k}(y) \phi (y)\, dy \\&\quad - \int _{-\infty }^{\infty } \int _{-\infty }^{x}\sum _{k=0}^{N}\frac{1}{k!} He_{k}(t)He_{k}(y) \phi (y)\, dy\, dF(t),\\ IF(x,x';T,\hat{F}_{n}) &= \sum _{k=0}^{N} \frac{1}{k!} He_{k}(x')\int _{-\infty }^{x} He_{k}(y) \phi (y)\, dy \\&\quad - \int _{-\infty }^{\infty }\int _{-\infty }^{x}\sum _{k=0}^{N}\frac{1}{k!} He_{k}(t)He_{k}(y) \phi (y)\, dy\, d\hat{F}_{n}(t). \end{aligned}$$
Since \(He_{k}(x')\) is unbounded in \(x'\) for \(k \ge 1\), whereas the second terms of \(IF(x,x';T,F)\) and \(IF(x,x';T,\hat{F}_{n})\) are bounded, the gross-error sensitivities \(\sup _{x'} |IF(x,x';T,F)|\) and \(\sup _{x'} |IF(x,x';T,\hat{F}_{n})|\) are not bounded, and thus the CDF estimator (11) is not Bias-robust. \(\square \)
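The unboundedness can be made concrete numerically. The sketch below (assuming NumPy; \(N=4\) and \(x=0.5\) are illustrative choices of this sketch) evaluates the \(x'\)-dependent first term of the influence function and shows that it grows polynomially, like \(He_{N}(x') \sim (x')^{N}\):

```python
# A minimal numerical illustration of Proposition 5: the leading (x'-dependent)
# term of the influence function of the Gram-Charlier CDF estimator grows
# without bound in the contamination point x'. (N = 4 and x = 0.5 are
# illustrative choices of this sketch.)
import numpy as np
from numpy.polynomial import hermite_e as He  # probabilists' (Chebyshev-Hermite) polynomials
from math import factorial

N, x = 4, 0.5
y = np.linspace(-12.0, x, 100001)
phi = np.exp(-y ** 2 / 2) / np.sqrt(2 * np.pi)
dy = y[1] - y[0]

# I_k = int_{-inf}^{x} He_k(y) phi(y) dy, computed numerically.
I = [np.sum(He.hermeval(y, np.eye(N + 1)[k]) * phi) * dy for k in range(N + 1)]

def leading_term(x_prime):
    """First term of IF(x, x'; T, F): sum_k He_k(x') I_k / k!"""
    return sum(He.hermeval(x_prime, np.eye(N + 1)[k]) * I[k] / factorial(k)
               for k in range(N + 1))

# The term grows like He_N(x') ~ x'^N, so sup_{x'} |IF| is infinite.
values = [abs(leading_term(xp)) for xp in (10.0, 100.0, 1000.0)]
print(values[0] < values[1] < values[2])
```

The remaining (second) terms of the influence functions are constants in \(x'\), so they cannot offset this polynomial growth.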