
Goodness-of-Fit Testing for Hölder Continuous Densities Under Local Differential Privacy

Conference paper
Foundations of Modern Statistics (FMS 2019)

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 425)


Abstract

We address the problem of goodness-of-fit testing for Hölder continuous densities under local differential privacy constraints. We study minimax separation rates when only non-interactive privacy mechanisms are allowed and when sequentially interactive mechanisms may also be used for privatisation. We propose privacy mechanisms and associated testing procedures whose analysis yields upper bounds on the minimax rates, and we complement these results with lower bounds. By comparing these bounds, we show that the proposed privacy mechanisms and tests are optimal up to at most a logarithmic factor for several choices of \(f_0\), including uniform, normal, Beta, Cauchy, Pareto and exponential densities. In particular, we observe that the rates deteriorate in the private setting compared to the non-private one. Moreover, we show that sequentially interactive mechanisms improve upon the results obtained with non-interactive privacy mechanisms alone.

Financial support from GENES and from the French ANR grant ANR-18-EURE-0004.

Financial support from GENES and the French National Research Agency (ANR) under the grant Labex Ecodec (ANR-11-LABEX-0047).


References

  1. Acharya, J., Canonne, C.L., Freitag, C., Tyagi, H.: Test without trust: optimal locally private distribution testing. In: Proceedings of Machine Learning Research, vol. 89, pp. 2067–2076 (2019)

  2. Acharya, J., Sun, Z., Zhang, H.: Differentially private testing of identity and closeness of discrete distributions. In: Advances in Neural Information Processing Systems, pp. 6878–6891 (2018)

  3. Aliakbarpour, M., Diakonikolas, I., Rubinfeld, R.: Differentially private identity and equivalence testing of discrete distributions. In: International Conference on Machine Learning, pp. 169–178 (2018)

  4. Balakrishnan, S., Wasserman, L.: Hypothesis testing for densities and high-dimensional multinomials: sharp local minimax rates. Ann. Stat. 47(4), 1893–1927 (2019)

  5. Berrett, T.B., Butucea, C.: Classification under local differential privacy. Annales de l'ISUP 63, 191–205 (2019)

  6. Berrett, T.B., Butucea, C.: Locally private non-asymptotic testing of discrete distributions is faster using interactive mechanisms. In: Advances in Neural Information Processing Systems (NeurIPS 2020)

  7. Boucheron, S., Lugosi, G., Massart, P.: Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press (2013)

  8. Butucea, C., Dubois, A., Kroll, M., Saumard, A.: Local differential privacy: elbow effect in optimal density estimation and adaptation over Besov ellipsoids. Bernoulli 26(3), 1727–1764 (2020)

  9. Butucea, C., Rohde, A., Steinberger, L.: Interactive versus noninteractive locally differentially private estimation: two elbows for the quadratic functional (2020). arXiv:2003.04773 [math.ST]

  10. Cai, B., Daskalakis, C., Kamath, G.: Priv'IT: private and sample efficient identity testing. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 635–644 (2017)

  11. Chhor, J., Carpentier, A.: Sharp local minimax rates for goodness-of-fit testing in large random graphs, multivariate Poisson families and multinomials (2020). arXiv:2012.13766

  12. Diakonikolas, I., Kane, D.M.: A new approach for testing properties of discrete distributions. In: IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS 2016), pp. 685–694. IEEE (2016)

  13. Duchi, J.C., Jordan, M.I., Wainwright, M.J.: Minimax optimal procedures for locally private estimation. J. Am. Stat. Assoc. 113(521), 182–201 (2018)

  14. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Theory of Cryptography Conference, pp. 265–284. Springer (2006)

  15. Gaboardi, M., Lim, H., Rogers, R., Vadhan, S.: Differentially private chi-squared hypothesis testing: goodness of fit and independence testing. In: International Conference on Machine Learning, pp. 2111–2120 (2016)

  16. Gaboardi, M., Rogers, R.: Local private hypothesis testing: chi-square tests. In: International Conference on Machine Learning, pp. 1626–1635 (2018)

  17. Joseph, M., Mao, J., Neel, S., Roth, A.: The role of interactivity in local differential privacy. In: IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS 2019). IEEE (2019)

  18. Kairouz, P., Oh, S., Viswanath, P.: Extremal mechanisms for local differential privacy. In: Advances in Neural Information Processing Systems, pp. 2879–2887 (2014)

  19. Kairouz, P., Oh, S., Viswanath, P.: Extremal mechanisms for local differential privacy. J. Mach. Learn. Res. 17(1), 492–542 (2016)

  20. Lam-Weil, J., Laurent, B., Loubes, J.-M.: Minimax optimal goodness-of-fit testing for densities under a local differential privacy constraint (2020). arXiv:2002.04254

  21. Rohde, A., Steinberger, L.: Geometrizing rates of convergence under local differential privacy constraints. Ann. Stat. 48(5), 2646–2670 (2020)

  22. Serfling, R.J.: Approximation Theorems of Mathematical Statistics. Wiley (1980)

  23. Sheffet, O.: Locally private hypothesis testing. In: International Conference on Machine Learning, pp. 4605–4614. PMLR (2018)

  24. Spokoiny, V.: Adaptive hypothesis testing using wavelets. Ann. Stat. 24(6), 2477–2498 (1996)

  25. Valiant, G., Valiant, P.: An automatic inequality prover and instance optimal identity testing. SIAM J. Comput. 46(1), 429–455 (2017)

  26. Wang, Y., Lee, J., Kifer, D.: Revisiting differentially private hypothesis tests for categorical data (2015). arXiv:1511.03376

  27. Wasserman, L., Zhou, S.: A statistical framework for differential privacy. J. Am. Stat. Assoc. 105(489), 375–389 (2010)

Author information

Correspondence to Cristina Butucea.

Appendices

Proofs of Sect. 3

1.1 Proof of Proposition 3.2

Let \(i\in \llbracket 1,n\rrbracket \). Set \(\sigma :=2\Vert \psi \Vert _\infty /(\alpha h)\). The conditional density of \(Z_i\) given \(X_i=y\) can be written as

$$\begin{aligned} q^{Z_i\mid X_i=y}(z)=\prod _{j=1}^N \frac{1}{2\sigma }\exp \left( -\frac{\vert z_j-\psi _h(x_j-y)\vert }{\sigma } \right) . \end{aligned}$$

Thus, by the reverse and the ordinary triangle inequality,

$$\begin{aligned} \frac{q^{Z_i\mid X_i=y}(z)}{q^{Z_i\mid X_i=y'}(z)}&=\prod _{j=1}^N \exp \left( \frac{\vert z_j-\psi _h(x_j-y')\vert -\vert z_j-\psi _h(x_j-y)\vert }{\sigma } \right) \\&\le \prod _{j=1}^N \exp \left( \frac{\vert \psi _h(x_j-y')-\psi _h(x_j-y)\vert }{\sigma } \right) \\&\le \exp \left( \frac{1}{\sigma h}\sum _{j=1}^{N}\left| \psi \left( \frac{x_j-y'}{h}\right) -\psi \left( \frac{x_j-y}{h}\right) \right| \right) \\&\le \exp \left( \frac{1}{\sigma h}\sum _{j=1}^{N}\left[ \left| \psi \left( \frac{x_j-y'}{h}\right) \right| +\left| \psi \left( \frac{x_j-y}{h}\right) \right| \right] \right) \\&\le \exp \left( \frac{2\Vert \psi \Vert _\infty }{\sigma h} \right) \\&\le \exp (\alpha ), \end{aligned}$$

where the second to last inequality follows from the fact that, for fixed \(y\), the quantity \(\psi ((x_j-y)/h)\) is non-zero for at most one index \(j\in \llbracket 1,N\rrbracket \); this is a consequence of Assumption 3.1. This proves that \(Z_i\) is an \(\alpha \)-locally differentially private view of \(X_i\) for all \(i\in \llbracket 1,n\rrbracket \).
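For concreteness, here is a minimal Python sketch of this first channel (function and variable names are ours; the triangular kernel is only a placeholder for a kernel satisfying Assumption 3.1, with \(\Vert \psi \Vert _\infty =1\), and `centers` stands for the points \(x_1,\ldots ,x_N\)):

```python
import numpy as np

rng = np.random.default_rng(0)

def psi(u):
    # Placeholder kernel supported on [-1, 1]; any bounded kernel with the
    # disjoint-support property of Assumption 3.1 would do.
    return np.maximum(1.0 - np.abs(u), 0.0)

def privatize_kernel(x, centers, h, alpha, psi_sup=1.0):
    """Channel of Proposition 3.2 for i in [1, n]: release
    Z_ij = psi_h(x_j - X_i) + sigma * W_ij, sigma = 2*||psi||_inf/(alpha*h),
    with W_ij standard Laplace, so that the conditional density of Z_i given
    X_i = y is the product of Laplace densities displayed above."""
    sigma = 2.0 * psi_sup / (alpha * h)
    signal = psi((centers - x) / h) / h   # psi_h(x_j - X_i), psi_h(u) = psi(u/h)/h
    return signal + rng.laplace(scale=sigma, size=centers.size)
```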

Consider now \(i\in \llbracket n+1,2n \rrbracket \). For all \(j\in \llbracket 1,N\rrbracket \) it holds

$$\begin{aligned} \frac{\mathbb P\left( Z_i=c_\alpha \mid X_i\notin B \right) }{\mathbb P\left( Z_i=c_\alpha \mid X_i \in B_j \right) }=1+\frac{1}{c_\alpha }= \frac{2e^\alpha }{e^\alpha +1}. \end{aligned}$$

Since \(2 \le e^\alpha +1\le 2e^\alpha \), we obtain

$$\begin{aligned} e^{-\alpha }\le 1 \le \frac{\mathbb P\left( Z_i=c_\alpha \mid X_i\notin B \right) }{\mathbb P\left( Z_i=c_\alpha \mid X_i \in B_j \right) }\le e^\alpha . \end{aligned}$$

It also holds

$$\begin{aligned} \frac{\mathbb P\left( Z_i=-c_\alpha \mid X_i\notin B \right) }{\mathbb P\left( Z_i=-c_\alpha \mid X_i \in B_j \right) }=1-\frac{1}{c_\alpha }= \frac{2}{e^\alpha +1} \in [e^{-\alpha }, e^{\alpha }]. \end{aligned}$$

Now, for all \((j,k)\in \llbracket 1, N\rrbracket ^2\) it holds

$$\begin{aligned} \frac{\mathbb P\left( Z_i=c_\alpha \mid X_i \in B_k \right) }{\mathbb P\left( Z_i=c_\alpha \mid X_i \in B_j \right) }=\frac{\mathbb P\left( Z_i=-c_\alpha \mid X_i\in B_k \right) }{\mathbb P\left( Z_i=-c_\alpha \mid X_i \in B_j \right) }=1\in [e^{-\alpha }, e^\alpha ]. \end{aligned}$$

This proves that \(Z_i\) is an \(\alpha \)-locally differentially private view of \(X_i\) for all \(i\in \llbracket n+1,2n\rrbracket \).
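A sketch of this binary channel, with \(c_\alpha =(e^\alpha +1)/(e^\alpha -1)\) (this value is recovered from the likelihood ratio \(1+1/c_\alpha =2e^\alpha /(e^\alpha +1)\) above); encoding \(B\) by its endpoints is an assumption of the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize_tail(x, B_low, B_high, alpha):
    """Channel of Proposition 3.2 for i in [n+1, 2n]: Z_i = +/- c_alpha with
    P(Z_i = c_alpha) = (1 + 1/c_alpha)/2 if X_i falls outside B = [B_low, B_high]
    and 1/2 otherwise, so that E[Z_i] = P(X_i not in B)."""
    c_alpha = (np.exp(alpha) + 1.0) / (np.exp(alpha) - 1.0)
    outside = (x < B_low) or (x > B_high)
    p_plus = 0.5 * (1.0 + 1.0 / c_alpha) if outside else 0.5
    return c_alpha if rng.random() < p_plus else -c_alpha
```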

1.2 Proof of Theorem 3.4

Proof of Proposition 3.3

1. Equality (4) follows from the independence of \(Z_i\) and \(Z_k\) for \(i\ne k\) and from \(\mathbb E[Z_{ij}]=\psi _h*f(x_j)\). We now prove (5). Set \(a_{h,j}:=\psi _h*f(x_j)\) and let us define

$$\begin{aligned} \widehat{U}_B=\frac{1}{n(n-1)}\sum _{i\ne k}\sum _{j=1}^N\left( Z_{ij}-a_{h,j} \right) \left( Z_{kj}-a_{h,j} \right) , \end{aligned}$$
$$\begin{aligned} \widehat{V}_B=\frac{2}{n}\sum _{i=1}^n\sum _{j=1}^N\left( a_{h,j}-f_0(x_j) \right) \left( Z_{ij}-a_{h,j} \right) , \end{aligned}$$

and observe that we have

$$\begin{aligned} S_B=\widehat{U}_B+\widehat{V}_B+\sum _{j=1}^N(a_{h,j}-f_0(x_j))^2. \end{aligned}$$

Note that \(\textrm{Cov}(\widehat{U}_B,\widehat{V}_B)=0\). We thus have

$$\begin{aligned} \text {Var}(S_B)=\text {Var}(\widehat{U}_B)+\text {Var}(\widehat{V}_B), \end{aligned}$$

and we will bound from above \(\text {Var}(\widehat{U}_B)\) and \(\text {Var}(\widehat{V}_B)\) separately. We begin with \(\text {Var}(\widehat{V}_B)\). Since \(\widehat{V}_B\) is centered, it holds

$$\begin{aligned} \text {Var}(\widehat{V}_B)&=\mathbb E[\widehat{V}_B^2]\\&=\frac{4}{n^2}\sum _{i=1}^n\sum _{j=1}^N\sum _{t=1}^n\sum _{k=1}^N \left( a_{h,j}-f_0(x_j) \right) \left( a_{h,k}-f_0(x_k) \right) \\ {}&\qquad \mathbb E\left[ \left( Z_{ij}-a_{h,j} \right) \left( Z_{tk}-a_{h,k} \right) \right] . \end{aligned}$$

Note that if \(t\ne i\), the independence of \(Z_i\) and \(Z_t\) yields

$$\begin{aligned} \mathbb E\left[ \left( Z_{ij}-a_{h,j} \right) \left( Z_{tk}-a_{h,k} \right) \right] =0. \end{aligned}$$

Moreover, since the \(W_{ij}\), \(j=1,\ldots , N\) are independent of \(X_i\) and \(\mathbb E[W_{ij}]=0\) we have

$$\begin{aligned} \mathbb E\left[ \left( Z_{ij}-a_{h,j} \right) \left( Z_{ik}-a_{h,k} \right) \right]&=\mathbb E[ \left( \psi _h\left( x_j-X_i\right) +\frac{2\Vert \psi \Vert _\infty }{\alpha h} W_{ij}-a_{h,j}\right) \left( \psi _h\left( x_k-X_i\right) \right. \\ {}&\left. \quad +\frac{2\Vert \psi \Vert _\infty }{\alpha h} W_{ik}-a_{h,k}\right) ]\\&=\mathbb E\left[ \psi _h\left( x_j-X_i\right) \psi _h\left( x_k-X_i\right) \right] -a_{h,k}\mathbb E\left[ \psi _h\left( x_j-X_i\right) \right] \\ {}&\quad +\frac{4\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\mathbb E\left[ W_{ij}W_{ik} \right] -a_{h,j}\mathbb E\left[ \psi _h\left( x_k-X_i\right) \right] +a_{h,j}a_{h,k}\\&=\left[ \int \left( \psi _h\left( x_j-y\right) \right) ^2f(y)\textrm{d}y +\frac{8\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\right] I(j=k)-a_{h,j}a_{h,k}, \end{aligned}$$

where the last equality is a consequence of Assumption 3.1. We thus obtain

$$\begin{aligned} \text {Var}(\widehat{V}_B)&=\frac{4}{n}\sum _{j=1}^N\left( a_{h,j}-f_0(x_j) \right) ^2\left[ \int ( \psi _h\left( x_j-y\right) )^2f(y)\textrm{d}y +\frac{8\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\right] \\&-\frac{4}{n}\sum _{j=1}^N\sum _{k=1}^N\left( a_{h,j}-f_0(x_j) \right) \left( a_{h,k}-f_0(x_k) \right) a_{h,j}a_{h,k}\\&=\frac{4}{n}\sum _{j=1}^N\left( a_{h,j}-f_0(x_j) \right) ^2\left[ \int (\psi _h\left( x_j-y\right) )^2f(y)\textrm{d}y +\frac{8\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\right] \\&\quad -\frac{4}{n}\left( \sum _{j=1}^N\left( a_{h,j}-f_0(x_j) \right) a_{h,j}\right) ^2\\&\le \frac{4}{n}\sum _{j=1}^N\left( a_{h,j}-f_0(x_j) \right) ^2\left[ \int (\psi _h\left( x_j-y\right) )^2f(y)\textrm{d}y +\frac{8\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\right] . \end{aligned}$$

Now, \(\int (\psi _h\left( x_j-y\right) )^2f(y)\textrm{d}y\le \Vert \psi _h\Vert _\infty ^2\le \Vert \psi \Vert _\infty ^2/h^2\le \Vert \psi \Vert _\infty ^2/(\alpha ^2h^2)\) if \(\alpha \in (0,1]\). We finally obtain

$$\begin{aligned} \text {Var}(\widehat{V}_B)\le \frac{36\Vert \psi \Vert _\infty ^2}{n\alpha ^2 h^2}\sum _{j=1}^N\left( a_{h,j}-f_0(x_j) \right) ^2. \end{aligned}$$

We now bound from above \(\text {Var}(\widehat{U}_B)\). One can rewrite \(\widehat{U}_B\) as

$$\begin{aligned} \widehat{U}_B=\frac{1}{n(n-1)}\sum _{i\ne k}h(Z_i,Z_k), \end{aligned}$$

where

$$\begin{aligned} h(Z_i,Z_k)=\sum _{j=1}^N\left( Z_{ij}-a_{h,j} \right) \left( Z_{kj}-a_{h,j} \right) . \end{aligned}$$

Using a result for the variance of a U-statistic (see for instance Lemma A, p. 183 in [22]), we have

$$\begin{aligned} \left( {\begin{array}{c}n\\ 2\end{array}}\right) \text {Var}(\widehat{U}_B)=2(n-2)\zeta _1+\zeta _2, \end{aligned}$$

where

$$\begin{aligned} \zeta _1=\text {Var}\left( \mathbb E\left[ h(Z_1,Z_2)\mid Z_1 \right] \right) , \text { and } \zeta _2=\text {Var}\left( h(Z_1,Z_2) \right) . \end{aligned}$$

We have \(\zeta _1=0\) since \(\mathbb E\left[ h(Z_1,Z_2)\mid Z_1 \right] =0\) and thus

$$\begin{aligned} \text {Var}(\widehat{U}_B)=\frac{2}{n(n-1)}\text {Var}\left( h(Z_1,Z_2) \right) . \end{aligned}$$

Write

$$\begin{aligned} h(Z_1,Z_2)&= \sum _{j=1}^N \left( \psi _h\left( x_j-X_1\right) +\frac{2\Vert \psi \Vert _\infty }{\alpha h} W_{1j}-a_{h,j}\right) \left( \psi _h\left( x_j-X_2\right) +\frac{2\Vert \psi \Vert _\infty }{\alpha h} W_{2j}-a_{h,j}\right) \\&=\sum _{j=1}^N\left( \psi _h\left( x_j-X_1\right) -a_{h,j}\right) \left( \psi _h\left( x_j-X_2\right) -a_{h,j}\right) +\frac{4\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\sum _{j=1}^N W_{1j}W_{2j} \\&\quad +\frac{2\Vert \psi \Vert _\infty }{\alpha h}\sum _{j=1}^N W_{1j}(\psi _h(x_j-X_2)-a_{h,j})\\ {}&\quad +\frac{2\Vert \psi \Vert _\infty }{\alpha h}\sum _{j=1}^N W_{2j}(\psi _h(x_j-X_1)-a_{h,j})\\&=: \tilde{T}_1+\tilde{T}_2+\tilde{T}_3+\tilde{T}_4. \end{aligned}$$

We thus have \(\text {Var}(h(Z_1,Z_2))=\sum _{i=1}^4 \text {Var}(\tilde{T}_i)+2\sum _{i<j}\textrm{Cov}(\tilde{T}_i,\tilde{T}_j)\). Observe that \(\textrm{Cov}(\tilde{T}_i,\tilde{T}_j)=0\) for \(i<j\) and \(\text {Var}(\tilde{T}_3)=\text {Var}(\tilde{T}_4)\). We thus have

$$\begin{aligned} \text {Var}(h(Z_1,Z_2))=\text {Var}(\tilde{T}_1)+\text {Var}(\tilde{T}_2)+2\text {Var}(\tilde{T}_3). \end{aligned}$$

The independence of the random variables \((W_{ij})_{i,j}\) yields

$$\begin{aligned} \text {Var}(\tilde{T}_2)=\frac{64\Vert \psi \Vert _\infty ^4N}{\alpha ^4 h^4}. \end{aligned}$$

The independence of the random variables \((W_{ij})_{i,j}\) and their independence with \(X_2\) yield

$$\begin{aligned} \text {Var}(\tilde{T}_3)&=\mathbb E\left[ \tilde{T}_3^2 \right] \\&=\frac{4\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\mathbb E\left[ \sum _{j=1}^N W_{1j}(\psi _h(x_j-X_2)-a_{h,j})\sum _{k=1}^N W_{1k}(\psi _h(x_k-X_2)-a_{h,k}) \right] \\&=\frac{4\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\sum _{j=1}^N\sum _{k=1}^N\mathbb E\left[ W_{1j}W_{1k}\right] \mathbb E\left[ (\psi _h(x_j-X_2)-a_{h,j})(\psi _h(x_k-X_2)-a_{h,k}) \right] \\&=\frac{8\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\sum _{j=1}^N\mathbb E\left[ (\psi _h(x_j-X_2)-a_{h,j})^2 \right] \\&\le \frac{8\Vert \psi \Vert _\infty ^2}{\alpha ^2 h^2}\sum _{j=1}^N\mathbb E\left[ (\psi _h(x_j-X_2))^2 \right] . \end{aligned}$$

Now, since \(y\mapsto \psi _h(x_j-y)\) is null outside \(B_j\) (consequence of Assumption 3.1), it holds

$$\begin{aligned} \sum _{j=1}^N\mathbb E\left[ (\psi _h(x_j-X_2))^2 \right] =\sum _{j=1}^N\int _{B_j}\left( \psi _h(x_j-y) \right) ^2f(y) dy\le \Vert \psi _h\Vert _\infty ^2\sum _{j=1}^N\int _{B_j}f\le \Vert \psi _h\Vert _\infty ^2, \end{aligned}$$

and thus

$$\begin{aligned} \text {Var}(\tilde{T}_3)\le \frac{8\Vert \psi \Vert _\infty ^4}{\alpha ^2 h^4}. \end{aligned}$$

By independence of \(X_1\) and \(X_2\), it holds \(\mathbb E[\tilde{T}_1]=0\), and

$$\begin{aligned} \text {Var}(\tilde{T}_1)&=\mathbb E\left[ \tilde{T}_1^2\right] \\&=\sum _{j=1}^N\sum _{k=1}^N\mathbb E\left[ \left( \psi _h\left( x_j-X_1\right) -a_{h,j}\right) \left( \psi _h\left( x_j-X_2\right) -a_{h,j}\right) \right. \\ {}&\qquad \left. \left( \psi _h\left( x_k-X_1\right) -a_{h,k}\right) \left( \psi _h\left( x_k-X_2\right) -a_{h,k}\right) \right] \\&=\sum _{j=1}^N\sum _{k=1}^N\mathbb E\left[ \left( \psi _h\left( x_j-X_1\right) -a_{h,j}\right) \left( \psi _h\left( x_k-X_1\right) -a_{h,k}\right) \right] \\ {}&\qquad \mathbb E\left[ \left( \psi _h\left( x_j-X_2\right) -a_{h,j}\right) \left( \psi _h\left( x_k-X_2\right) -a_{h,k}\right) \right] \\&= \sum _{j=1}^N\sum _{k=1}^N\left[ \int \psi _h(x_j-y)\psi _h(x_k-y)f(y)\textrm{d}y- a_{h,j}a_{h,k} \right] ^2\\&=\sum _{j=1}^N\sum _{k=1}^N \left( \int \psi _h(x_j-y)\psi _h(x_k-y)f(y)\textrm{d}y \right) ^2 \\ {}&\qquad -2\sum _{j=1}^N\sum _{k=1}^Na_{h,j}a_{h,k}\int \psi _h(x_j-y)\psi _h(x_k-y)f(y)\textrm{d}y\\&+\sum _{j=1}^N\sum _{k=1}^Na_{h,j}^2a_{h,k}^2. \end{aligned}$$

Assumption 3.1 yields \(\int \psi _h(x_j-y)\psi _h(x_k-y)f(y)\textrm{d}y=0\) if \(j\ne k\). We thus obtain

$$\begin{aligned} \text {Var}(\tilde{T}_1)&=\sum _{j=1}^N\left( \int (\psi _h(x_j-y))^2f(y)\textrm{d}y \right) ^2\\&\quad -2\sum _{j=1}^Na_{h,j}^2\int \left( \psi _h(x_j-y)\right) ^2f(y)\textrm{d}y+\left( \sum _{j=1}^Na_{h,j}^2\right) ^2. \end{aligned}$$

Now, since \(y\mapsto \psi _h(x_j-y)\) is null outside \(B_j\) (consequence of Assumption 3.1), observe that

$$\begin{aligned}&\sum _{j=1}^N\left( \int (\psi _h(x_j-y))^2f(y)\textrm{d}y \right) ^2\le \frac{\Vert \psi \Vert _\infty ^4}{h^4}\sum _{j=1}^N\left( \int _{B_j}f \right) ^2\\ {}&\qquad \le \frac{\Vert \psi \Vert _\infty ^4}{h^4}\sum _{j=1}^N\int _{B_j}f \le \frac{\Vert \psi \Vert _\infty ^4}{h^4}, \end{aligned}$$

and

$$\begin{aligned} \left( \sum _{j=1}^Na_{h,j}^2\right) ^2&=\left( \sum _{j=1}^N \left( \int \psi _h(x_j-y)f(y)\textrm{d}y \right) ^2\right) ^2 \\ {}&\le \frac{\Vert \psi \Vert _\infty ^4}{h^4}\left[ \sum _{j=1}^N\left( \int _{B_j}f \right) ^2\right] ^2\le \frac{\Vert \psi \Vert _\infty ^4}{h^4}, \end{aligned}$$

yielding \(\text {Var}(\tilde{T}_1)\le 2\frac{\Vert \psi \Vert _\infty ^4}{h^4}\). We thus have

$$\begin{aligned} \text {Var}(\widehat{U}_B)\le \frac{2}{n(n-1)}\left[ 2\frac{\Vert \psi \Vert _\infty ^4}{h^4}+ \frac{64\Vert \psi \Vert _\infty ^4N}{\alpha ^4 h^4}+ \frac{16\Vert \psi \Vert _\infty ^4}{\alpha ^2 h^4} \right] \le \frac{164\Vert \psi \Vert _\infty ^4N}{n(n-1)\alpha ^4 h^4}. \end{aligned}$$

Finally,

$$\begin{aligned} \text {Var}(S_B)\le \frac{36\Vert \psi \Vert _\infty ^2}{n\alpha ^2 h^2}\sum _{j=1}^N\left( a_{h,j}-f_0(x_j) \right) ^2+\frac{164\Vert \psi \Vert _\infty ^4N}{n(n-1)\alpha ^4 h^4}. \end{aligned}$$
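The decomposition above corresponds to the U-statistic \(S_B=\frac{1}{n(n-1)}\sum _{i\ne k}\sum _{j=1}^N(Z_{ij}-f_0(x_j))(Z_{kj}-f_0(x_j))\). A sketch of its computation (names are ours), using \(\sum _{i\ne k}y_iy_k=(\sum _iy_i)^2-\sum _iy_i^2\) coordinatewise:

```python
import numpy as np

def S_B(Z, f0_vals):
    """S_B = 1/(n(n-1)) sum_{i != k} sum_j (Z_ij - f0(x_j))(Z_kj - f0(x_j)),
    consistent with the decomposition S_B = U_B + V_B + sum_j (a_hj - f0(x_j))^2.
    Z: (n, N) array of privatized vectors; f0_vals: (N,) array of f0(x_j)."""
    n = Z.shape[0]
    Y = Z - f0_vals                 # center each coordinate at f0(x_j)
    col_sums = Y.sum(axis=0)        # sum over i, for each bin j
    return float(((col_sums ** 2).sum() - (Y ** 2).sum()) / (n * (n - 1)))
```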

2. For all \(i\in \llbracket n+1,2n\rrbracket \) it holds

$$\begin{aligned} \mathbb E_{Q_{f}^n}[Z_i]&=\mathbb E\left[ Z_i\mid X_i\notin B \right] \mathbb P\left( X_i \notin B \right) +\sum _{j=1}^N\mathbb E\left[ Z_i\mid X_i\in B_j \right] \mathbb P\left( X_i\in B_j \right) \\&=\left[ c_\alpha \cdot \frac{1}{2}\left( 1+\frac{1}{c_\alpha } \right) -c_\alpha \cdot \frac{1}{2}\left( 1-\frac{1}{c_\alpha } \right) \right] \mathbb P\left( X_i \notin B \right) \\ {}&\qquad +\sum _{j=1}^N\left[ c_\alpha \cdot \frac{1}{2}-c_\alpha \cdot \frac{1}{2} \right] \mathbb P\left( X_i\in B_j \right) \\&=\mathbb P\left( X_i \notin B \right) . \end{aligned}$$

This yields \(\mathbb E_{Q_{f}^n}[T_B] =\int _{\overline{B}} (f-f_0)\), and using the independence of the \(Z_i\), \(i=n+1,\ldots ,2n\) we obtain

$$\begin{aligned} \text {Var}_{Q_{f}^n}[T_B]=\frac{1}{n^2}\sum _{i=n+1}^{2n}\text {Var}(Z_i)=\frac{1}{n^2}\sum _{i=n+1}^{2n}\left[ \mathbb E[Z_i^2]-\mathbb E[Z_i]^2 \right] =\frac{1}{n}\left( c_\alpha ^2-\left( \int _{\overline{B}} f \right) ^2 \right) . \end{aligned}$$

\(\square \)

We can now prove Theorem 3.4. We first prove that the choice of \(t_1\) and \(t_2\) in (3) gives \(\mathbb P_{Q_{f_0}^n}(\Phi =1)\le \gamma /2\). Since \(\mathbb E_{Q_{f_0}^n}[T_B]=0\), Chebyshev’s inequality and Proposition 3.3 yield for \(\alpha \in (0,1]\)

$$\begin{aligned} \mathbb P_{Q_{f_0}^n}(T_B \ge t_2)\le \mathbb P_{Q_{f_0}^n}(\vert T_B \vert \ge t_2)\le \frac{\text {Var}_{Q_{f_0}^n}(T_B)}{t_2^2}\le \frac{c_\alpha ^2 }{n t_2^2}\le \frac{5 }{n \alpha ^2 t_2^2}= \frac{\gamma }{4}. \end{aligned}$$

If \(t_1>\mathbb E_{Q_{f_0}^n}[S_B]= \sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2\), then Chebyshev's inequality and Proposition 3.3 yield

$$\begin{aligned} \mathbb P_{Q_{f_0}^n}(S_B \ge t_1)&\le \mathbb P_{Q_{f_0}^n}(\vert S_B -\mathbb E_{Q_{f_0}^n}[S_B]\vert \ge t_1-\mathbb E_{Q_{f_0}^n}[S_B])\\&\le \frac{\text {Var}_{Q_{f_0}^n}(S_B)}{(t_1-\mathbb E_{Q_{f_0}^n}[S_B])^2}\\&\le \frac{\frac{36\Vert \psi \Vert _\infty ^2}{n \alpha ^2 h^2}\sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2}{\left( t_1-\sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2\right) ^2}\\ {}&\qquad +\frac{\frac{164\Vert \psi \Vert _\infty ^4N}{n(n-1)\alpha ^4h^4}}{\left( t_1-\sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2\right) ^2}. \end{aligned}$$

Observe that

$$\begin{aligned} t_1&\ge \sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2\\&\qquad +\max \left\{ \sqrt{ \frac{288\Vert \psi \Vert _\infty ^2}{\gamma n\alpha ^2 h^2} \sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2}, \sqrt{\frac{1312\Vert \psi \Vert _\infty ^4N}{\gamma n(n-1)\alpha ^4 h^4} }\right\} . \end{aligned}$$

Indeed for \(f\in H(\beta ,L)\) with \(\beta \le 1\) it holds \(\left| [\psi _h*f](x_j)-f(x_j) \right| \le LC_\beta h^{\beta }\) for all \(j\in \llbracket 1,N\rrbracket \) where \(C_\beta =\int _{-1}^1\vert u\vert ^\beta \vert \psi (u)\vert \textrm{d}u\), and thus using \(ab\le a^2/2+b^2 /2\) we obtain

$$\begin{aligned}&\sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2\\&\quad +\max \left\{ \sqrt{ \frac{288\Vert \psi \Vert _\infty ^2}{\gamma n\alpha ^2 h^2} \sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2}, \sqrt{\frac{1312\Vert \psi \Vert _\infty ^4N}{\gamma n(n-1)\alpha ^4 h^4} }\right\} \\&\le L_0^2C_\beta ^2Nh^{2\beta }+\max \left\{ \frac{1}{2}L_0^2C_\beta ^2Nh^{2\beta }+\frac{144\Vert \psi \Vert _\infty ^2}{\gamma n\alpha ^2 h^2}, \sqrt{\frac{1312\Vert \psi \Vert _\infty ^4N}{\gamma n(n-1)\alpha ^4 h^4} }\right\} \\&\le \frac{3}{2}L_0^2C_\beta ^2Nh^{2\beta }+\frac{144\Vert \psi \Vert _\infty ^2}{\gamma n\alpha ^2 h^2}+ \frac{52\Vert \psi \Vert _\infty ^2\sqrt{N}}{\sqrt{\gamma }n\alpha ^2h^2}\\&\le \frac{3}{2}L_0^2C_\beta ^2Nh^{2\beta }+ \frac{196\Vert \psi \Vert _\infty ^2\sqrt{N}}{\gamma n\alpha ^2h^2}=t_1. \end{aligned}$$

Then it holds

$$\begin{aligned} \mathbb P_{Q_{f_0}^n}(S_B \ge t_1)&\le \frac{\frac{36\Vert \psi \Vert _\infty ^2}{n \alpha ^2 h^2}\sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2}{(t_1-\sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2)^2}\\ {}&\quad +\frac{\frac{164\Vert \psi \Vert _\infty ^4N}{n(n-1)\alpha ^4h^4}}{(t_1-\sum _{j=1}^N\left( [\psi _h*f_0](x_j)-f_0(x_j) \right) ^2)^2}\\&\le \frac{\gamma }{8}+\frac{\gamma }{8}\le \frac{\gamma }{4}, \end{aligned}$$

and thus

$$\begin{aligned} \mathbb P_{Q_{f_0}^n}(\Phi =1)\le \mathbb P_{Q_{f_0}^n}(T_B \ge t_2) +\mathbb P_{Q_{f_0}^n}(S_B \ge t_1)\le \frac{\gamma }{2}. \end{aligned}$$
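Putting the two statistics together, a sketch of the test \(\Phi \) (reusing `S_B` from the sketch above; `t1` is the threshold of (3), `t2` is recovered from the type I error computation \(5/(n\alpha ^2t_2^2)=\gamma /4\), and `f0_mass_outside_B` stands for \(\int _{\overline{B}}f_0\)):

```python
import numpy as np

def gof_test(Z1, Z2, f0_vals, f0_mass_outside_B, t1, gamma, alpha):
    """Phi = 1 iff S_B >= t1 or T_B >= t2, where T_B = mean(Z2) - int_{B^c} f0
    has mean int_{B^c} (f - f0) under Q_f^n as shown in Proposition 3.3.
    Z1: (n, N) output of the kernel channel; Z2: (n,) output of the tail channel."""
    n = Z2.size
    t2 = np.sqrt(20.0 / (gamma * n * alpha ** 2))
    T = Z2.mean() - f0_mass_outside_B
    return int(S_B(Z1, f0_vals) >= t1 or T >= t2)
```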

We now exhibit \(\rho _1, \rho _2>0\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \int _{B}\vert f-f_0\vert \ge \rho _1 \Rightarrow \mathbb P_{Q_f^n}(S_B< t_1)\le \gamma /2\\ \int _{\bar{B}}\vert f-f_0\vert \ge \rho _2 \Rightarrow \mathbb P_{Q_f^n}(T_B< t_2)\le \gamma /2. \end{array}\right. } \end{aligned}$$

In this case, for all \(f\in H(\beta ,L)\) satisfying \(\Vert f-f_0\Vert _1\ge \rho _1+\rho _2\) it holds

$$\begin{aligned} \mathbb P_{Q_{f_0}^n}(\Phi =1)+\mathbb P_{Q_f^n}(\Phi =0)\le \frac{\gamma }{2}+ \min \left\{ \mathbb P_{Q_f^n}( S_B< t_1), \mathbb P_{Q_f^n}(T_B< t_2)\right\} \le \frac{\gamma }{2}+\frac{\gamma }{2}=\gamma , \end{aligned}$$

since \(\int _B\vert f-f_0\vert +\int _{\bar{B}}\vert f-f_0\vert =\Vert f-f_0\Vert _1\ge \rho _1+\rho _2\) implies \(\int _{B}\vert f-f_0\vert \ge \rho _1\) or \( \int _{\bar{B}}\vert f-f_0\vert \ge \rho _2\). Consequently, \(\rho _1+\rho _2\) will provide an upper bound on \(\mathcal E_{n,\alpha }^{\text {NI}}(f_0,\gamma )\).

If \(\int _{\overline{B}} (f-f_0)=\mathbb E_{Q_{f}^n}[T_B]>t_2\) then Chebyshev's inequality yields

$$\begin{aligned} \mathbb P_{Q_{f}^n}(T_B < t_2)&=\mathbb P_{Q_{f}^n}\left( \mathbb E_{Q_{f}^n}[T_B]-T_B> \mathbb E_{Q_{f}^n}[T_B]-t_2\right) \\&\le \mathbb P_{Q_{f}^n}\left( \left| \mathbb E_{Q_{f}^n}[T_B]- T_B \right| > \mathbb E_{Q_{f}^n}[ T_B]-t_2\right) \\&\le \frac{\text {Var}_{Q_{f}^n}(T_B)}{\left( \mathbb E_{Q_{f}^n}[ T_B]-t_2\right) ^2}\\&\le \frac{c_\alpha ^2 }{n \left( \int _{\overline{B}} (f-f_0)-t_2 \right) ^2}. \end{aligned}$$

Now, observe that

$$\begin{aligned} \int _{\bar{B}}(f-f_0)\ge \int _{\bar{B}}\vert f-f_0\vert -2\int _{\bar{B}}f_0. \end{aligned}$$

Thus, setting

$$\begin{aligned} \rho _2=2\int _{\bar{B}}f_0+\left( 1+\frac{1}{\sqrt{2}}\right) t_2, \end{aligned}$$

we obtain that \(\int _{\bar{B}}\vert f-f_0\vert \ge \rho _2\) implies

$$\begin{aligned} \mathbb P_{Q_{f}^n}(T_B < t_2)\le \frac{2c_\alpha ^2}{n t_2^2}\le \frac{10}{n\alpha ^2 t_2^2}=\frac{\gamma }{2}. \end{aligned}$$

We now exhibit \(\rho _1\) such that \(\int _B\vert f-f_0\vert \ge \rho _1\) implies \(\mathbb P_{Q_{f}^n}(S_B < t_1)\le \gamma /2.\) First note that if the following relation holds

$$\begin{aligned} \mathbb E_{Q_{f}^n}[S_B]= \sum _{j=1}^N\left| [\psi _h*f](x_j)-f_0(x_j) \right| ^2 \ge t_1 + \sqrt{\frac{2\text {Var}_{Q_{f}^n}(S_B)}{\gamma }}, \end{aligned}$$
(16)

then Chebyshev's inequality yields

$$\begin{aligned} \mathbb P_{Q_{f}^n}(S_B < t_1)\le \mathbb P_{Q_{f}^n}\left( S_B\le \mathbb E_{Q_{f}^n}[S_B]-\sqrt{\frac{2\text {Var}_{Q_{f}^n}(S_B)}{\gamma }} \right) \le \frac{\gamma }{2}. \end{aligned}$$

Using \(\sqrt{a+b}\le \sqrt{a}+\sqrt{b}\) for all \(a,b>0\) and \(ab\le a^2/2+b^2/2\) we have

$$\begin{aligned} \sqrt{\frac{2\text {Var}_{Q_{f}^n}(S_B)}{\gamma }}&\le \sqrt{\frac{72\Vert \psi \Vert _\infty ^2}{\gamma n \alpha ^2 h^2}\sum _{j=1}^N\left( [\psi _h*f](x_j)-f_0(x_j) \right) ^2+\frac{328\Vert \psi \Vert _\infty ^4N}{\gamma n(n-1)\alpha ^4h^4}}\\&\le \sqrt{\frac{72\Vert \psi \Vert _\infty ^2}{\gamma n \alpha ^2 h^2}\sum _{j=1}^N\left( [\psi _h*f](x_j)-f_0(x_j) \right) ^2}+\sqrt{\frac{656\Vert \psi \Vert _\infty ^4N}{\gamma n^2\alpha ^4h^4}}\\&\le \frac{1}{2}\sum _{j=1}^N\left( [\psi _h*f](x_j)-f_0(x_j) \right) ^2+\frac{36\Vert \psi \Vert _\infty ^2}{\gamma n \alpha ^2 h^2}+\frac{26\Vert \psi \Vert _\infty ^2\sqrt{N}}{\sqrt{\gamma }n\alpha ^2h^2}\\&\le \frac{1}{2}\sum _{j=1}^N\left( [\psi _h*f](x_j)-f_0(x_j) \right) ^2+\frac{62\Vert \psi \Vert _\infty ^2\sqrt{N}}{\gamma n\alpha ^2h^2}. \end{aligned}$$

Thus, if

$$\begin{aligned} \sum _{j=1}^N\left| [\psi _h*f](x_j)-f_0(x_j) \right| ^2 \ge 2\left[ t_1 + \frac{62\Vert \psi \Vert _\infty ^2\sqrt{N}}{\gamma n\alpha ^2h^2}\right] \end{aligned}$$
(17)

then (16) holds and we have \(\mathbb P_{Q_{f}^n}(S_B < t_1)\le \gamma /2\). We now link \(\sum _{j=1}^N\left| [\psi _h*f](x_j)-f_0(x_j) \right| ^2\) to \(\int _B\vert f-f_0\vert \). By the Cauchy-Schwarz inequality we have

$$\begin{aligned} \left( \sum _{j=1}^N\left| [\psi _h*f](x_j)-f_0(x_j) \right| \right) ^2 \le N \sum _{j=1}^N\left| [\psi _h*f](x_j)-f_0(x_j) \right| ^2. \end{aligned}$$

We also have

$$\begin{aligned}&\left| \int _B \vert f-f_0\vert -\sum _{j=1}^N2h\vert \psi _h*f(x_j)-f_0(x_j)\vert \right| \\&\qquad = \left| \sum _{j=1}^N \int _{B_j} \vert f-f_0\vert -\sum _{j=1}^N2h\vert \psi _h*f(x_j)-f_0(x_j)\vert \right| \\&\qquad = \left| \sum _{j=1}^N \int _{B_j}\left( \vert f(x)-f_0(x)\vert -\vert \psi _h*f(x_j)-f_0(x_j)\vert \right) \textrm{d}x\right| \\&\qquad \le \sum _{j=1}^N \int _{B_j}\left| f(x)-f_0(x)- \psi _h*f(x_j)+f_0(x_j) \right| \textrm{d}x\\&\qquad \le \sum _{j=1}^N \int _{B_j}\left( \vert f(x)-f(x_j)\vert +\vert f(x_j) - \psi _h*f(x_j)\vert + \vert f_0(x_j)-f_0(x) \vert \right) \textrm{d}x\\&\qquad \le \left[ 1 + C_\beta +\frac{L_0}{L} \right] L h^\beta \vert B\vert . \end{aligned}$$

We thus have

$$\begin{aligned} \sum _{j=1}^N\left| [\psi _h*f](x_j)-f_0(x_j) \right| ^2\ge \frac{1}{4Nh^2}\left( \int _B \vert f-f_0\vert - \left[ 1 + C_\beta +\frac{L_0}{L} \right] L h^\beta \vert B\vert \right) ^2. \end{aligned}$$

Thus, if

$$\begin{aligned} \int _B \vert f-f_0\vert \ge \left[ 1 + C_\beta +\frac{L_0}{L} \right] L h^\beta \vert B\vert + 2h\sqrt{N}\sqrt{ 2t_1 + \frac{124\Vert \psi \Vert _\infty ^2\sqrt{N}}{\gamma n\alpha ^2h^2} } =:\rho _1 \end{aligned}$$

then (17) holds and we have \(\mathbb P_{Q_{f}^n}(S_B < t_1)\le \gamma /2\). Consequently

$$\begin{aligned} \mathcal E_{n,\alpha }^{\text {NI}}(f_0,\gamma )&\le \rho _1+\rho _2 \\&\le \left[ 1 + C_\beta +\frac{L_0}{L} \right] L h^\beta \vert B\vert + 2h\sqrt{N}\sqrt{ 2t_1 + \frac{124\Vert \psi \Vert _\infty ^2\sqrt{N}}{\gamma n\alpha ^2h^2} }\\&\quad + 2\int _{\bar{B}}f_0+\left( 1+\frac{1}{\sqrt{2}}\right) t_2\\&\le C(L,L_0,\beta ,\gamma ,\psi )\left[ h^\beta \vert B\vert + Nh^{\beta +1}+ \frac{N^{3/4}}{\sqrt{n\alpha ^2}} + \int _{\bar{B}}f_0+ \frac{1}{\sqrt{n\alpha ^2}}\right] \\&\le C(L,L_0,\beta ,\gamma ,\psi )\left[ h^\beta \vert B\vert + \frac{\vert B\vert ^{3/4}}{h^{3/4} \sqrt{n\alpha ^2}} + \int _{\bar{B}}f_0 + \frac{1}{\sqrt{n\alpha ^2}}\right] \end{aligned}$$

where we have used \(\sqrt{a+b}\le \sqrt{a}+\sqrt{b}\) for \(a,b>0\) to obtain the second to last inequality. Taking \(h\asymp \vert B \vert ^{-1/(4\beta +3)}(n\alpha ^2)^{-2/(4\beta +3)}\) yields

$$\begin{aligned} \mathcal E_{n,\alpha }^{\text {NI}}(f_0,\gamma )\le C(L,L_0,\beta ,\gamma ,\psi )\left[ \vert B \vert ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + \int _{\overline{B}} f_0+ \frac{1}{\sqrt{n\alpha ^2}}\right] . \end{aligned}$$
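For instance, when \(f_0\) is the uniform density on \([0,1]\) one may take \(B=[0,1]\), so that \(\vert B\vert =1\) and \(\int _{\overline{B}}f_0=0\); since \(2\beta /(4\beta +3)<1/2\), the bound then reduces to \(C(L,L_0,\beta ,\gamma ,\psi )(n\alpha ^2)^{-2\beta /(4\beta +3)}\).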

1.3 Proof of Lemma 3.7

For \(j=1,\ldots ,N\), write

$$\begin{aligned} v_j=\sum _{k=1}^N a_{kj}\psi _k. \end{aligned}$$

Note that since \((\psi _1,\dots ,\psi _N)\) and \((v_1,\ldots ,v_N)\) are two orthonormal bases of \(W_N\), the matrix \((a_{kj})_{kj}\) is orthogonal. We can write

$$\begin{aligned} f_\nu (x)= f_0(x)+\delta \sum _{j=1}^N\sum _{k=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \psi _k(x), \quad x\in \mathbb R. \end{aligned}$$

Define

$$\begin{aligned} A_b=\left\{ \nu \in \{-1,1\}^N : \left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \right| \le \frac{1}{\sqrt{h}}\sqrt{\log \left( \frac{2N}{b} \right) }\text { for all } 1\le k\le N\right\} . \end{aligned}$$

The union bound and Hoeffding's inequality yield

$$\begin{aligned} \mathbb P_\nu (A_b^c)&\le \sum _{k=1}^N\mathbb P\left( \left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \right| > \frac{1}{\sqrt{h}}\sqrt{\log \left( \frac{2N}{b} \right) } \right) \\&\le \sum _{k=1}^N 2\exp \left( -\frac{2\log \left( \frac{2N}{b} \right) }{h\sum _{j=1}^N4\frac{a_{kj}^2}{\tilde{\lambda }_j^2}} \right) \\&\le b, \end{aligned}$$

where the last inequality follows from \(\tilde{\lambda }_j^2\ge 2h\) for all j and \(\sum _{j=1}^Na_{kj}^2=1\). We thus have \(\mathbb P_\nu (A_b)\ge 1-b\).
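As a sanity check, the event \(A_b\) is easy to simulate; a sketch, assuming the orthogonal matrix \((a_{kj})_{kj}\) and the truncated eigenvalues \(\tilde{\lambda }_j\) are available as arrays `a` and `lam_tilde`:

```python
import numpy as np

rng = np.random.default_rng(0)

def in_A_b(nu, a, lam_tilde, h, b):
    """True iff nu lies in A_b, i.e.
    max_k |sum_j nu_j a_kj / lam_tilde_j| <= sqrt(log(2N/b) / h)."""
    N = nu.size
    coeffs = a @ (nu / lam_tilde)   # k-th entry: sum_j a_kj nu_j / lam_tilde_j
    return bool(np.max(np.abs(coeffs)) <= np.sqrt(np.log(2 * N / b) / h))

# Monte Carlo check of P(A_b) >= 1 - b with Rademacher signs:
# nu = rng.choice([-1.0, 1.0], size=N); in_A_b(nu, a, lam_tilde, h, b)
```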

We now prove i). Since \(\int \psi _k= 0\) for all \(k=1,\ldots ,N\), it holds \(\int f_\nu =\int f_0=1\) for all \(\nu \). Since \(\text {Supp}(\psi _k)=B_k\) for all \(k=1,\ldots ,N\), it holds \(f_\nu \equiv f_0\) on \(B^c\) and thus \(f_\nu \) is non-negative on \(B^c\). Now, for \(x\in B_k\) it holds

$$\begin{aligned} f_\nu (x)= f_0(x)+\delta \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \psi _k(x)\ge C_0(B)-\frac{\delta \Vert \psi \Vert _\infty }{\sqrt{h}}\left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \right| . \end{aligned}$$

Moreover, for any \(\nu \in A_b\), we have

$$\begin{aligned} \frac{\delta \Vert \psi \Vert _\infty }{\sqrt{h}}\left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \right| \le \frac{\delta \Vert \psi \Vert _\infty }{h} \sqrt{\log \left( \frac{2N}{b} \right) }\le C_0(B) \end{aligned}$$

since \(\delta \) is assumed to satisfy \(\delta \le \frac{h}{\sqrt{\log (2N/b)}}\min \left\{ \frac{C_0(B)}{\Vert \psi \Vert _\infty } , \frac{1}{2}\left( 1-\frac{L_0}{L} \right) h^\beta \right\} \). Thus, \(f_\nu \) is non-negative on \(\mathbb R\) for all \(\nu \in A_b\).

To prove ii), we have to show that \(\vert f_\nu (x)-f_\nu (y)\vert \le L \vert x-y\vert ^\beta \), for all \(\nu \in A_b\), for all \(x,y\in \mathbb R\). Since \(f_\nu \equiv f_0\) on \(B^c\) and \(f_0\in H(\beta ,L_0)\), this result is trivial for \(x,y\in B^c\). If \(x\in B_l\) and \(y\in B_k\) it holds

$$\begin{aligned} \vert f_\nu (x)-f_\nu (y)\vert&\le \vert f_0(x)-f_0(y)\vert + \left| \delta \sum _{j=1}^N\frac{\nu _ja_{lj}}{\tilde{\lambda }_j} \psi _l(x) - \delta \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \psi _k(y) \right| \\&\le L_0\vert x-y\vert ^\beta + \left| \delta \sum _{j=1}^N\frac{\nu _ja_{lj}}{\tilde{\lambda }_j} \psi _l(x) - \delta \sum _{j=1}^N\frac{\nu _ja_{lj}}{\tilde{\lambda }_j} \psi _l(y) \right| \\&+\left| \delta \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \psi _k(x) - \delta \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \psi _k(y) \right| \\&\le L_0\vert x-y\vert ^\beta +\frac{\delta }{\sqrt{h}} \left| \sum _{j=1}^N\frac{\nu _ja_{lj}}{\tilde{\lambda }_j} \right| \left| \psi \left( \frac{x-x_l}{h}\right) -\psi \left( \frac{y-x_l}{h}\right) \right| \\&+\frac{\delta }{\sqrt{h}} \left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j} \right| \left| \psi \left( \frac{x-x_k}{h}\right) -\psi \left( \frac{y-x_k}{h}\right) \right| \\&\le L_0\vert x-y\vert ^\beta + \frac{\delta }{h^{\beta +1/2}} \left| \sum _{j=1}^N\frac{\nu _ja_{lj}}{\tilde{\lambda }_j}\right| \cdot L\vert x-y\vert ^\beta \\&\qquad \quad + \frac{\delta }{h^{\beta +1/2}} \left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j}\right| \cdot L\vert x-y\vert ^\beta \\&=\left( \frac{L_0}{L}+ \frac{\delta }{h^{\beta +1/2}} \left| \sum _{j=1}^N\frac{\nu _ja_{lj}}{\tilde{\lambda }_j}\right| +\frac{\delta }{h^{\beta +1/2}} \left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j}\right| \right) L\vert x-y\vert ^\beta , \end{aligned}$$

where we have used \(\psi \in H(\beta ,L)\). Observe that for all \(k=1,\ldots ,N\) and for all \(\nu \in A_b\) it holds

$$\begin{aligned} \frac{\delta }{h^{\beta +1/2}} \left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda }_j}\right| \le \frac{\delta }{h^{\beta +1}}\cdot \sqrt{\log \left( \frac{2N}{b} \right) }\le \frac{1}{2}\left( 1-\frac{L_0}{L} \right) , \end{aligned}$$

since \(\delta \) is assumed to satisfy \(\delta \le \frac{h}{\sqrt{\log (2N/b)}}\min \left\{ \frac{C_0(B)}{\Vert \psi \Vert _\infty } , \frac{1}{2}\left( 1-\frac{L_0}{L} \right) h^\beta \right\} \). Thus, it holds \(\vert f_\nu (x)-f_\nu (y)\vert \le L\vert x-y\vert ^\beta \) for all \(\nu \in A_b\), \(x\in B_l\) and \(y\in B_k\). The case \(x\in B^c\) and \(y\in B_k\) can be handled in a similar way, which ends the proof of ii).

We now prove iii). It holds

$$\begin{aligned} \int _\mathbb R\vert f_\nu -f_0\vert&=\int _\mathbb R\left| \delta \sum _{j=1}^N\frac{\nu _j}{\tilde{\lambda _j}}v_j(x) \right| \textrm{d}x = \delta \sum _{k=1}^N\int _{B_k}\left| \sum _{j=1}^N\frac{\nu _j}{\tilde{\lambda _j}}v_j(x) \right| \textrm{d}x\\&= \delta \sum _{k=1}^N\int _{B_k}\left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda _j}}\psi _k(x) \right| \textrm{d}x\\&= \delta \sum _{k=1}^N\left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda _j}} \right| \int _{B_k}\left| \psi _k(x) \right| \textrm{d}x\\&= C_1\delta \sqrt{h}\sum _{k=1}^N\left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda _j}} \right| , \end{aligned}$$

where \(C_1=\int _{-1}^1\vert \psi \vert \). For all \(\nu \in A_b\) it thus holds

$$\begin{aligned} \int _\mathbb R\vert f_\nu -f_0\vert \ge C_1\frac{\delta h}{\sqrt{\log \left( \frac{2N}{b} \right) }}\sum _{k=1}^N\left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda _j}} \right| ^2. \end{aligned}$$

Moreover,

$$\begin{aligned} \sum _{k=1}^N\left| \sum _{j=1}^N\frac{\nu _ja_{kj}}{\tilde{\lambda _j}} \right| ^2&= \sum _{k=1}^N\left( \sum _{j=1}^N\left( \frac{\nu _ja_{kj}}{\tilde{\lambda _j}}\right) ^2+\sum _{j\ne l}\frac{\nu _ja_{kj}}{\tilde{\lambda _j}}\frac{\nu _la_{kl}}{\tilde{\lambda _l}} \right) \\&=\sum _{j=1}^N\frac{1}{\tilde{\lambda }_j^2}\sum _{k=1}^Na_{kj}^2+\sum _{j\ne l}\frac{\nu _j\nu _l}{\tilde{\lambda }_j\tilde{\lambda }_l}\sum _{k=1}^Na_{kj}a_{kl}\\&=\sum _{j=1}^N\frac{1}{\tilde{\lambda }_j^2}, \end{aligned}$$

since the matrix \((a_{kj})_{k,j}\) is orthogonal. Thus, for all \(\nu \in A_b\) it holds

$$\begin{aligned} \Vert f_\nu -f_0\Vert _1\ge C_1\frac{\delta h}{\sqrt{\log \left( \frac{2N}{b} \right) }}\sum _{j=1}^N\frac{1}{\tilde{\lambda }_j^2}. \end{aligned}$$

Set \(\mathcal J=\{j\in \llbracket 1,N\rrbracket : z_\alpha ^{-1}\lambda _j\ge \sqrt{2h} \}\). Then for all \(\nu \in A_b\) we have

$$\begin{aligned} \Vert f_\nu -f_0\Vert _1&\ge C_1\frac{\delta h}{\sqrt{\log \left( \frac{2N}{b} \right) }} \sum _{j=1}^N\left( \frac{1}{2h} I(z_\alpha ^{-1}\lambda _j<\sqrt{2h})+\frac{z_\alpha ^2}{\lambda _j^2}I(z_\alpha ^{-1}\lambda _j\ge \sqrt{2h}) \right) \\&=C_1\frac{\delta h}{\sqrt{\log \left( \frac{2N}{b} \right) }}\left( \frac{1}{2h} (N-\vert \mathcal J\vert ) +\sum _{j\in \mathcal J}\frac{z_\alpha ^2}{\lambda _j^2}\right) \\&\ge C_1\frac{\delta h}{\sqrt{\log \left( \frac{2N}{b} \right) }} \left( \frac{N}{2h}-\frac{\vert \mathcal J\vert }{2h}+z_\alpha ^2\vert \mathcal J\vert ^2\left( \sum _{j\in \mathcal J}\lambda _j^2\right) ^{-1}\right) \\&= C_1\frac{\delta N}{2\sqrt{\log \left( \frac{2N}{b} \right) }}\left( 1-\frac{\vert \mathcal J\vert }{N}+\left( \frac{\vert \mathcal J\vert }{N}\right) ^2 \vert B\vert z_\alpha ^2\left( \sum _{j\in \mathcal J}\lambda _j^2\right) ^{-1} \right) , \end{aligned}$$

where the second to last inequality follows from the inequality between harmonic and arithmetic means. Now,

$$\begin{aligned} \sum _{j\in \mathcal J}\lambda _j^2\le \sum _{j=1}^N\lambda _j^2&=\sum _{j=1}^N \langle Kv_j,v_j\rangle \\&=\sum _{j=1}^N\left\langle \frac{1}{n}\sum _{i=1}^n\int _\mathbb R\left( \int _{\mathcal Z_i} \frac{q_i(z_i\mid y)q_i(z_i\mid \cdot )\mathbbm {1}_B(y)\mathbbm {1}_B(\cdot )}{g_{0,i}(z_i)}\textrm{d}\mu _i(z_i) \right) v_j(y)\textrm{d}y , v_j \right\rangle \\&=\frac{1}{n}\sum _{i=1}^n\int _{\mathcal Z_i}\sum _{j=1}^N\left( \int _\mathbb R\int _\mathbb R\frac{q_i(z_i\mid y)q_i(z_i\mid x)\mathbbm {1}_B(y)\mathbbm {1}_B(x)}{g_{0,i}(z_i)}v_j(x)v_j(y)\textrm{d}x\textrm{d}y\right) \textrm{d}\mu _i(z_i) \\&=\frac{1}{n}\sum _{i=1}^n\int _{\mathcal Z_i}\sum _{j=1}^N\left( \int _\mathbb R\frac{q_i(z_i\mid x)\mathbbm {1}_B(x)}{g_{0,i}(z_i)}v_j(x)\textrm{d}x\right) ^2g_{0,i}(z_i)\textrm{d}\mu _i(z_i) \\&=\frac{1}{n}\sum _{i=1}^n\int _{\mathcal Z_i}\sum _{j=1}^N\left( \int _\mathbb R\left( \frac{q_i(z_i\mid x)}{g_{0,i}(z_i)}-e^{-2\alpha }\right) \mathbbm {1}_B(x)v_j(x)\textrm{d}x\right) ^2g_{0,i}(z_i)\textrm{d}\mu _i(z_i), \end{aligned}$$

since \(\int \mathbbm {1}_B(x) v_j(x)dx=0\). Recall that \(q_i\) satisfies \(e^{-\alpha }\le q_i(z_i\mid x)\le e^\alpha \) for all \(z_i\in \mathcal Z_i\) and all \(x\in \mathbb R\). This implies \(e^{-\alpha }\le g_{0,i}(z_i)\le e^\alpha \), and therefore \(0\le f_{i,z_i}(x):=\frac{q_i(z_i\mid x)}{g_{0,i}(z_i)}-e^{-2\alpha }\le z_\alpha \). Writing \(f_{i,z_i,B}=\mathbbm {1}_B\cdot f_{i,z_i}\), we have

$$\begin{aligned} \sum _{j=1}^N\left( \int _\mathbb R\left( \frac{q_i(z_i\mid x)}{g_{0,i}(z_i)}-e^{-2\alpha }\right) \mathbbm {1}_B(x) v_j(x)\textrm{d}x\right) ^2&=\sum _{j=1}^N \langle f_{i,z_i,B}, v_j\rangle ^2\\&=\left\| \sum _{j=1}^N \langle f_{i,z_i,B}, v_j\rangle v_j \right\| _2^2 \\&=\left\| \text {Proj}_{\text {Vect}(v_1,\ldots ,v_N)}( f_{i,z_i,B}) \right\| _2^2 \\&\le \left\| f_{i,z_i,B}\right\| _2^2\le z_\alpha ^2\vert B\vert . \end{aligned}$$

Moreover, \(\int _{\mathcal Z_i} g_{0,i}(z_i)\textrm{d}\mu _i(z_i)=\int _\mathbb R(\int _{\mathcal Z_i}q_i(z_i\mid x)\textrm{d}\mu _i(z_i))f_0(x)\textrm{d}x=\int _\mathbb Rf_0=1\). This gives \(\sum _{j\in \mathcal J}\lambda _j^2\le z_\alpha ^2 \vert B\vert \) and for all \(\nu \in A_b\)

$$\begin{aligned} \Vert f_\nu -f_0\Vert _1\ge C_1\frac{\delta N}{2\sqrt{\log \left( \frac{2N}{b} \right) }}\left( 1-\frac{\vert \mathcal J\vert }{N}+\left( \frac{\vert \mathcal J\vert }{N}\right) ^2 \right) \ge \frac{3C_1}{8}\frac{\delta N }{\sqrt{\log \left( \frac{2N}{b} \right) }}. \end{aligned}$$
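A sketch of the perturbations \(f_\nu \) themselves, with \(\psi _k(x)=h^{-1/2}\psi ((x-x_k)/h)\) (this normalisation is implicit in the bounds \(\Vert \psi _k\Vert _\infty =\Vert \psi \Vert _\infty /\sqrt{h}\) and \(\int \vert \psi _k\vert =C_1\sqrt{h}\) used above) and a mean-zero, unit-\(L^2\)-norm placeholder for \(\psi \):

```python
import numpy as np

def psi(u):
    # Placeholder with int psi = 0 and ||psi||_2 = 1 on [-1, 1]; the actual
    # psi of Lemma 3.7 must in addition lie in H(beta, L).
    return np.where(np.abs(u) <= 1.0, np.sin(np.pi * u), 0.0)

def f_nu(x, f0, nu, a, lam_tilde, centers, h, delta):
    """f_nu(x) = f0(x) + delta * sum_k (sum_j nu_j a_kj / lam_tilde_j) psi_k(x),
    the perturbed density of Lemma 3.7; `a` and `lam_tilde` are assumed given."""
    coeffs = a @ (nu / lam_tilde)           # sum_j nu_j a_kj / lam_tilde_j, per k
    bumps = psi((x[:, None] - centers[None, :]) / h) / np.sqrt(h)   # psi_k(x)
    return f0(x) + delta * (bumps @ coeffs)
```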

Proofs of Sect. 4

1.1 Proof of Proposition 4.1

Let \(i\in \llbracket 1,n\rrbracket \). Since \(Z_i\) depends only on \(X_i\), condition (1) reduces to

$$\begin{aligned} \frac{q^{Z_i\mid X_i=y}(z)}{q^{Z_i\mid X_i=y'}(z)}\le e^\alpha , \quad \forall y,y'\in \mathbb R, \, \forall z\in \mathbb R^N, \end{aligned}$$
(18)

where \(q^{Z_i\mid X_i=y}\) denotes the conditional density of \(Z_i\) given \(X_i=y\). It holds

$$\begin{aligned} q^{Z_i\mid X_i=y}(z)=\prod _{j=1}^N \frac{\alpha }{4}\exp \left( -\frac{\alpha \vert z_j-I(y\in B_j)\vert }{2} \right) . \end{aligned}$$

Thus, by the reverse and the ordinary triangle inequality,

$$\begin{aligned} \frac{q^{Z_i\mid X_i=y}(z)}{q^{Z_i\mid X_i=y'}(z)}&=\prod _{j=1}^N \exp \left( \frac{\alpha \left[ \vert z_j-I(y'\in B_j)\vert -\vert z_j-I(y\in B_j)\vert \right] }{2} \right) \\&\le \prod _{j=1}^N \exp \left( \frac{\alpha \vert I(y\in B_j)-I(y'\in B_j)\vert }{2} \right) \\&= \exp \left( \frac{\alpha }{2}\sum _{j=1}^{N}\vert I(y\in B_j)-I(y'\in B_j)\vert \right) \\&\le \exp (\alpha ), \end{aligned}$$

which proves (18).
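A sketch of this channel (the bins \(B_1,\ldots ,B_N\) are encoded by their \(N+1\) edges; the frequencies \(\widehat{p}_j=\frac{1}{n}\sum _iZ_{ij}\) used below are then unbiased for \(p(j)\)):

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize_bins(x, edges, alpha):
    """Channel of Proposition 4.1 for i in [1, n]:
    Z_ij = 1(X_i in B_j) + (2/alpha) * W_ij with W_ij standard Laplace,
    matching the density prod_j (alpha/4) exp(-alpha |z_j - 1(y in B_j)| / 2)."""
    N = edges.size - 1
    j = int(np.searchsorted(edges, x, side="right")) - 1
    indic = np.zeros(N)
    if 0 <= j < N:
        indic[j] = 1.0
    return indic + rng.laplace(scale=2.0 / alpha, size=N)
```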

Consider now \(i\in \llbracket n+1,2n\rrbracket \). Since \(Z_i\) depends only on \(X_i\) and on \(Z_1,\ldots , Z_n\), condition (1) reduces for \(i\in \llbracket n+1, 2n\rrbracket \) to

$$\begin{aligned} \frac{\mathbb P\left( Z_i=z \mid X_i\in A , Z_1=z_1,\ldots , Z_n=z_n \right) }{\mathbb P\left( Z_i=z \mid X_i\in F , Z_1=z_1,\ldots , Z_n=z_n \right) }\in [e^{-\alpha },e^\alpha ] \end{aligned}$$
(19)

for all \(z\in \{-c_\alpha \tau ,c_\alpha \tau \}\), \(A,F\in \{\overline{B},B_1,\ldots ,B_N \}\) and \(z_1,\ldots ,z_n\in \mathbb R^N\). For all \(j,k\in \llbracket 1,N\rrbracket \), for all \(z_1,\ldots ,z_n\) it holds

$$\begin{aligned}&\frac{\mathbb P\left( Z_i=c_\alpha \tau \mid X_i\in B_j , Z_1=z_1,\ldots , Z_n=z_n \right) }{\mathbb P\left( Z_i=c_\alpha \tau \mid X_i\in B_k , Z_1=z_1,\ldots , Z_n=z_n \right) }\\&\quad =\frac{1+\frac{[\hat{p}_j-p_0(j)]_{-\tau }^\tau }{c_\alpha \tau }}{1+\frac{[\hat{p}_k-p_0(k)]_{-\tau }^\tau }{c_\alpha \tau }}\in \left[ \frac{c_\alpha -1}{c_\alpha +1}, \frac{c_\alpha +1}{c_\alpha -1} \right] =[e^{-\alpha },e^{\alpha }], \end{aligned}$$

and a similar result holds for \(z=-c_\alpha \tau \). For all \(j\in \llbracket 1,N\rrbracket \), for all \(z_1,\ldots ,z_n\) it holds

$$\begin{aligned}&\frac{\mathbb P\left( Z_i=c_\alpha \tau \mid X_i\in B_j , Z_1=z_1,\ldots , Z_n=z_n \right) }{\mathbb P\left( Z_i=c_\alpha \tau \mid X_i\in \overline{B} , Z_1 =z_1,\ldots , Z_n=z_n \right) }\\&\quad =1+\frac{[\hat{p}_j-p_0(j)]_{-\tau }^\tau }{c_\alpha \tau }\in \left[ 1-\frac{1}{c_\alpha },1+\frac{1}{c_\alpha } \right] \subset [e^{-\alpha },e^{\alpha }], \end{aligned}$$

and a similar result holds for \(z=-c_\alpha \tau \). This ends the proof of (19).
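A sketch of this second, sequentially interactive channel, with \(\tau =1/\sqrt{n\alpha ^2}\) (as recalled in the proof of Proposition 4.2) and \(c_\alpha =(e^\alpha +1)/(e^\alpha -1)\):

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize_interactive(x, edges, p_hat, p0, alpha, n):
    """Channel of Proposition 4.1 for i in [n+1, 2n]: given p_hat built from
    Z_1, ..., Z_n, release Z_i = +/- c_alpha * tau with
    P(Z_i = c_alpha*tau | X_i in B_j) = (1 + clip(p_hat_j - p0_j)/(c_alpha*tau))/2
    and probability 1/2 when X_i falls outside B."""
    c_alpha = (np.exp(alpha) + 1.0) / (np.exp(alpha) - 1.0)
    tau = 1.0 / np.sqrt(n * alpha ** 2)
    j = int(np.searchsorted(edges, x, side="right")) - 1
    if 0 <= j < edges.size - 1:
        clipped = float(np.clip(p_hat[j] - p0[j], -tau, tau))
        p_plus = 0.5 * (1.0 + clipped / (c_alpha * tau))
    else:
        p_plus = 0.5
    return c_alpha * tau if rng.random() < p_plus else -c_alpha * tau
```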

Consider now \(i\in \llbracket 2n+1, 3n\rrbracket \). Since \(Z_i\) depends only on \(X_i\), condition (1) reduces for \(i\in \llbracket 2n+1, 3n\rrbracket \) to

$$\begin{aligned} \frac{\mathbb P\left( Z_i=z \mid X_i\in A \right) }{\mathbb P\left( Z_i=z \mid X_i\in F \right) }\in [e^{-\alpha },e^\alpha ] , \quad \forall A,F\in \{\overline{B},B_1,\ldots ,B_N\}, \, \forall z\in \{-c_\alpha ,c_\alpha \}. \end{aligned}$$

We have already proved this in the proof of Proposition 3.2.

1.2 Analysis of the Mean and Variance of the Statistic \(D_B\)

Proof of Proposition 4.2

1. For all \(i\in \llbracket n+1,2n\rrbracket \) it holds

$$\begin{aligned}&\mathbb P\left( Z_i=\pm c_\alpha \tau \mid Z_1,\ldots , Z_n \right) \\&=\sum _{j=1}^N\mathbb P\left( Z_i= \pm c_\alpha \tau \mid X_i\in B_j \right) \mathbb P(X_i\in B_j)+\mathbb P\left( Z_i= \pm c_\alpha \tau \mid X_i\in \bar{B} \right) \mathbb P(X_i\in \bar{B})\\&=\sum _{j=1}^N\frac{1}{2}\left( 1 \pm \frac{[\widehat{p}_j-p_0(j)]_{-\tau }^\tau }{c_\alpha \tau } \right) p(j)+\frac{1}{2}\int _{\bar{B}}f. \end{aligned}$$

For \(i\in \llbracket n+1,2n\rrbracket \) we thus have

$$\begin{aligned} \mathbb E[Z_i\mid Z_1,\ldots ,Z_n]&=c_\alpha \tau \mathbb P(Z_i=c_\alpha \tau \mid Z_1,\ldots ,Z_n)-c_\alpha \tau \mathbb P(Z_i=-c_\alpha \tau \mid Z_1,\ldots ,Z_n) \\&=\sum _{j=1}^Np(j)[\widehat{p}_j-p_0(j)]_{-\tau }^\tau . \end{aligned}$$

Thus,

$$\begin{aligned} \mathbb E[D_B]=\mathbb E\left[ \mathbb E[D_B\mid Z_1,\ldots , Z_n] \right] =\sum _{j=1}^N\{p(j)-p_0(j)\}\mathbb E\left[ [\widehat{p}_j-p_0(j)]_{-\tau }^\tau \right] . \end{aligned}$$

The proof of (12) is similar to the proof of Theorem 3 in [6].
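In code, the statistic \(D_B\) analysed here reads as follows (its form is read off from the first display of part 2 below; `Z2` collects \(Z_{n+1},\ldots ,Z_{2n}\)):

```python
import numpy as np

def D_B(Z2, p_hat, p0, alpha, n):
    """D_B = (1/n) sum_{i=n+1}^{2n} Z_i
           - sum_j p0(j) * [p_hat_j - p0(j)]_{-tau}^{tau},
    so that E[D_B | Z_1,...,Z_n] = sum_j (p(j)-p0(j)) [p_hat_j - p0(j)]_{-tau}^{tau}."""
    tau = 1.0 / np.sqrt(n * alpha ** 2)
    clipped = np.clip(p_hat - p0, -tau, tau)
    return float(Z2.mean() - np.sum(p0 * clipped))
```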

2. Write

$$\begin{aligned} \text {Var}(D_B)=\mathbb E\left[ \text {Var}\left( D_B \mid Z_1,\ldots ,Z_n \right) \right] + \text {Var}\left( \mathbb E\left[ D_B \mid Z_1,\ldots , Z_n \right] \right) . \end{aligned}$$

It holds

$$\begin{aligned} \mathbb E\left[ D_B \mid Z_1,\ldots , Z_n \right] =\sum _{j=1}^N\{p(j)-p_0(j)\}[\widehat{p}_j-p_0(j)]_{-\tau }^\tau , \end{aligned}$$

and

$$\begin{aligned} \text {Var}\left( D_B \mid Z_1,\ldots ,Z_n \right)&=\text {Var}\left( \frac{1}{n}\sum _{i=n+1}^{2n}Z_i-\sum _{j=1}^Np_0(j)[\widehat{p}_j-p_0(j)]_{-\tau }^\tau \mid Z_1,\ldots ,Z_n \right) \\&=\text {Var}\left( \frac{1}{n}\sum _{i=n+1}^{2n}Z_i \mid Z_1,\ldots ,Z_n \right) \\&=\frac{1}{n^2}\sum _{i=n+1}^{2n}\text {Var}\left( Z_i \mid Z_1,\ldots ,Z_n \right) \\&\le \frac{1}{n^2}\sum _{i=n+1}^{2n}\mathbb E\left[ Z_i^2 \mid Z_1,\ldots ,Z_n \right] \\&\le \frac{c_\alpha ^2\tau ^2}{n}, \end{aligned}$$

where we have used the independence of the random variables \((Z_i)_{i=n+1,\ldots ,2n}\) conditionally on \(Z_1,\ldots ,Z_n\). This gives

$$\begin{aligned} \text {Var}(D_B)&\le \frac{c_\alpha ^2\tau ^2}{n}+ \sum _{j=1}^N\{p(j)-p_0(j)\}^2\text {Var}\left( [\widehat{p}_j-p_0(j)]_{-\tau }^\tau \right) \\&+\sum _{j_1\ne j_2}\{p(j_1)-p_0(j_1)\}\{p(j_2)-p_0(j_2)\}\textrm{Cov}([\widehat{p}_{j_1}-p_0(j_1)]_{-\tau }^\tau ,[\widehat{p}_{j_2}-p_0(j_2)]_{-\tau }^\tau ). \end{aligned}$$

Set \(P_j=[\widehat{p}_j-p_0(j)]_{-\tau }^\tau \). We will prove that

$$\begin{aligned} \text {Var}(P_j)\le \frac{10}{n\alpha ^2}\exp \left( -\frac{n\alpha ^2(p(j)-p_0(j))^2}{168} \right) , \quad \forall j\in \llbracket 1,N\rrbracket , \end{aligned}$$
(20)

and

$$\begin{aligned} \left| \textrm{Cov}(P_{j_1},P_{j_2})\right| \le \frac{2p(j_1)p(j_2)}{n}\exp \left( -\frac{n\alpha ^2\left[ (p(j_1)-p_0(j_1))^2+(p(j_2)-p_0(j_2))^2\right] }{336} \right) \end{aligned}$$
(21)

for all \(j_1,j_2\in \llbracket 1, N\rrbracket \), \(j_1\ne j_2\). We admit these results for the moment and finish the proof of Proposition 4.2. Using (20) and (21) we obtain

$$\begin{aligned} \text {Var}(D_B)&\le \frac{c_\alpha ^2\tau ^2}{n}+\frac{10}{n\alpha ^2}\sum _{j=1}^N\{p(j)-p_0(j)\}^2\exp \left( -\frac{n\alpha ^2(p(j)-p_0(j))^2}{168} \right) \\&+\frac{2}{n}\left[ \sum _{j=1}^N\vert p(j)-p_0(j)\vert p(j)\exp \left( -\frac{n\alpha ^2(p(j)-p_0(j))^2}{336} \right) \right] ^2\\&\le \frac{c_\alpha ^2\tau ^2}{n}+\frac{10}{n\alpha ^2}\sum _{j=1}^N\{p(j)-p_0(j)\}^2\exp \left( -\frac{n\alpha ^2(p(j)-p_0(j))^2}{168} \right) \\&+\frac{2}{n}\left[ \sum _{j=1}^Np(j)^2 \right] \left[ \sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2 \exp \left( -\frac{n\alpha ^2(p(j)-p_0(j))^2}{168} \right) \right] \\&\le \frac{c_\alpha ^2\tau ^2}{n}+\frac{12}{n\alpha ^2}\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2\exp \left( -\frac{n\alpha ^2(p(j)-p_0(j))^2}{168} \right) , \end{aligned}$$

where the second to last inequality follows from the Cauchy-Schwarz inequality. Now, observe that if \(a_j:=\vert p(j)-p_0(j) \vert \ne 0\), then we can write

$$\begin{aligned}&\vert p(j)-p_0(j)\vert \exp \left( -\frac{n\alpha ^2(p(j)-p_0(j))^2}{168} \right) \\&\qquad =\min \{\tau , a_j\}\cdot \frac{a_j/\tau }{\min \{1, a_j/\tau \}}\exp \left( -\frac{1}{168}\left( \frac{a_j}{\tau } \right) ^2\right) , \end{aligned}$$

where we recall that \(\tau =1/\sqrt{n\alpha ^2}\). The study of the function \(g:x\mapsto [x/\min \{ 1,x\}]\exp (-x^2/168)\) gives \(g(x)\le \sqrt{84}e^{-1/2}\) for all \(x\ge 0\): indeed, \(g(x)=\exp (-x^2/168)\le 1\) for \(x\le 1\), while for \(x\ge 1\) the map \(x\mapsto x\exp (-x^2/168)\) attains its maximum at \(x=\sqrt{84}\). We thus have

$$\begin{aligned} \text {Var}(D_B)\le \frac{c_\alpha ^2\tau ^2}{n}+\frac{12e^{-1/2}\sqrt{84}}{n\alpha ^2}\sum _{j=1}^N\vert p(j)-p_0(j)\vert \min \left\{ \tau , \vert p(j)-p_0(j)\vert \right\} . \end{aligned}$$

Using that \(\alpha ^2c_\alpha ^2\le 5\) for all \(\alpha \in (0,1)\), we finally obtain the claim of Proposition 4.2,

$$\begin{aligned} \text {Var}(D_B)\le \frac{5}{(n\alpha ^2)^2}+\frac{67}{n\alpha ^2}D_\tau (f). \end{aligned}$$

It remains now to prove (20) and (21). We will use the following concentration inequality which is an application of Bernstein’s inequality (see for instance Corollary 2.11 in [7])

$$\begin{aligned} \mathbb P\left( \vert \widehat{p}_j-p(j)\vert \ge x\right) \le 2\exp \left( -\frac{n\alpha ^2x^2}{42}\right) , \quad \text {for all } 0<x\le \frac{1}{\alpha }. \end{aligned}$$
(22)

Let us prove (20). Let \(j\in \llbracket 1,N\rrbracket \). We first deal with the case where \(p(j)-p_0(j)\ge 2\tau \). We have

$$\begin{aligned} \text {Var}\left( [\widehat{p}_j-p_0(j)]_{-\tau }^{\tau } \right)&= \text {Var}\left( [\widehat{p}_j-p_0(j)]_{-\tau }^{\tau }-\tau \right) \\&\le \mathbb E\left[ \left( [\widehat{p}_j-p_0(j)]_{-\tau }^{\tau }-\tau \right) ^2 \right] \\&=\mathbb E\left[ (-2\tau )^2\mathbbm {1}\left( \widehat{p}_j-p_0(j)\le -\tau \right) + (\widehat{p}_j-p_0(j)-\tau )^2\mathbbm {1}\left( \widehat{p}_j-p_0(j)\in [-\tau ,\tau ] \right) \right] \\&\le 4\tau ^2 \mathbb P\left( \widehat{p}_j-p_0(j)\le \tau \right) \\&=4\tau ^2 \mathbb P\left( p(j)-\widehat{p}_j\ge p(j)-p_0(j)-\tau \right) \\&\le 4\tau ^2 \mathbb P\left( \vert p(j)-\widehat{p}_j\vert \ge p(j)-p_0(j)-\tau \right) \end{aligned}$$

Now, if \(p(j)-p_0(j)\ge 2\tau \) then we have \(0<p(j)-p_0(j)-\tau \le p(j)\le 1\le 1/\alpha \) and (22) gives

$$\begin{aligned} \text {Var}\left( [\widehat{p}_j-p_0(j)]_{-\tau }^{\tau } \right)&\le 8\tau ^2\exp \left( -\frac{n\alpha ^2\left\{ p(j)-p_0(j)-\tau \right\} ^2}{42} \right) \\&\le \frac{8}{n\alpha ^2}\exp \left( -\frac{n\alpha ^2\left\{ p(j)-p_0(j)\right\} ^2}{168} \right) , \end{aligned}$$

which ends the proof of (20) for the elements \(j\in \llbracket 1,N\rrbracket \) such that \(p(j)-p_0(j)\ge 2\tau \). Starting from \(\text {Var}\left( [\widehat{p}_j-p_0(j)]_{-\tau }^{\tau } \right) = \text {Var}\left( [\widehat{p}_j-p_0(j)]_{-\tau }^{\tau }+\tau \right) \), a similar proof gives (20) for the elements \(j\in \llbracket 1,N\rrbracket \) such that \(p(j)-p_0(j)\le -2\tau \). It remains to deal with the case \(\vert p(j)-p_0(j) \vert <2\tau \). In this case, using that \([\cdot ]_{-\tau }^{\tau }\) is Lipschitz continuous with Lipschitz constant 1 we have

$$\begin{aligned} \text {Var}\left( [\widehat{p}_j-p_0(j)]_{-\tau }^{\tau } \right)&= \text {Var}\left( [\widehat{p}_j-p_0(j)]_{-\tau }^{\tau }- [p(j)-p_0(j)]_{-\tau }^{\tau }\right) \\&\le \mathbb E\left[ \left( [\widehat{p}_j-p_0(j)]_{-\tau }^{\tau }- [p(j)-p_0(j)]_{-\tau }^{\tau } \right) ^2 \right] \\&\le \mathbb E\left[ \vert \widehat{p}_j-p(j)\vert ^2 \right] \\&=\text {Var}(\widehat{p}_j)\\&=\frac{1}{n^2}\sum _{i=1}^n\text {Var}\left( I(X_i\in B_j)\right) +\frac{4}{n^2\alpha ^2}\sum _{i=1}^n\text {Var}(W_{ij})\\&\le \frac{9}{n\alpha ^2}\\&=\frac{9}{n\alpha ^2}\exp \left( \frac{n\alpha ^2\left\{ p(j)-p_0(j)\right\} ^2}{168}\right) \exp \left( -\frac{n\alpha ^2\left\{ p(j)-p_0(j)\right\} ^2}{168}\right) \\&\le \frac{9\exp (1/42)}{n\alpha ^2}\exp \left( -\frac{n\alpha ^2\left\{ p(j)-p_0(j)\right\} ^2}{168}\right) , \end{aligned}$$

where the last inequality follows from the assumption \(\vert p(j)-p_0(j)\vert \le 2\tau =2/\sqrt{n\alpha ^2}\). This ends the proof of (20). We now prove (21). For all \(i\in \llbracket 1,n+1\rrbracket \), we will write

$$\begin{aligned}&\mathbb E_i\left[ \cdot \right] =\mathbb E\left[ \cdot \mid X_1,\ldots ,X_{i-1} \right] ,\\&\mathbb E_i^j\left[ \cdot \right] =\frac{1}{p(j)}\mathbb E\left[ \cdot \mathbbm {1}(X_i\in B_j) \mid X_1,\ldots ,X_{i-1} \right] ,\\&\mathbb E_i^{comp}\left[ \cdot \right] =\frac{1}{p\left( \overline{B}\right) }\mathbb E\left[ \cdot \mathbbm {1}(X_i\in \overline{B}) \mid X_1,\ldots ,X_{i-1} \right] . \end{aligned}$$

Observe that

$$\begin{aligned} \mathbb E_i^j\left[ P_{j_1} \right] \overset{a.s.}{=}\ \mathbb E_i^{j_2}\left[ P_{j_1} \right] , \quad \forall j, j_2\ne j_1, \end{aligned}$$
(23)

and

$$\begin{aligned} \mathbb E_i^{comp}\left[ P_{j_1} \right] \overset{a.s.}{=}\ \mathbb E_i^{j_2}\left[ P_{j_1} \right] , \quad \forall j_2\ne j_1, \end{aligned}$$
(24)

where we recall that \(P_j=[\widehat{p}_j-p_0(j)]_{-\tau }^\tau \). Let \(j_1,j_2\in \llbracket 1,N\rrbracket \), \(j_1\ne j_2\). We have

$$\begin{aligned} \textrm{Cov}\left( P_{j_1},P_{j_2} \right)&=\textrm{Cov}\left( \mathbb E_{n+1}\left[ P_{j_1}\right] ,\mathbb E_{n+1}\left[ P_{j_2}\right] \right) \\&=\mathbb E\left[ \mathbb E_{n+1}\left[ P_{j_1}\right] \mathbb E_{n+1}\left[ P_{j_2}\right] \right] -\mathbb E\left[ P_{j_1} \right] \mathbb E\left[ P_{j_2} \right] \\&=\mathbb E\left[ \sum _{i=1}^n\left( \mathbb E_{i+1}\left[ P_{j_1}\right] \mathbb E_{i+1}\left[ P_{j_2}\right] -\mathbb E_{i}\left[ P_{j_1}\right] \mathbb E_{i}\left[ P_{j_2}\right] \right) \right] , \end{aligned}$$

where the sum in the last line is a telescoping sum. We thus have

$$\begin{aligned} \textrm{Cov}\left( P_{j_1},P_{j_2} \right) =\sum _{i=1}^n \mathbb E\left[ \mathbb E_{i+1}\left[ P_{j_1}\right] \mathbb E_{i+1}\left[ P_{j_2}\right] -\mathbb E_{i}\left[ P_{j_1}\right] \mathbb E_{i}\left[ P_{j_2}\right] \right] . \end{aligned}$$
(25)

Now, it holds

$$\begin{aligned} \mathbb E_{i}\left[ P_{j_1}\right]&=\mathbb E\left[ P_{j_1}\mid X_1,\ldots ,X_{i-1} \right] \\&=\mathbb E\left[ P_{j_1}\cdot \left( \sum _{j=1}^N \mathbbm {1}(X_i\in B_j)+\mathbbm {1}(X_i\in \overline{B}) \right) \mid X_1,\ldots ,X_{i-1} \right] \\&=\sum _{j=1}^N p(j)\mathbb E_i^j\left[ P_{j_1}\right] +p\left( \overline{B} \right) \mathbb E_i^{comp}\left[ P_{j_1}\right] \\&=p(j_1)\mathbb E_i^{j_1}\left[ P_{j_1}\right] +\displaystyle \sum _{\begin{array}{c} j=1\ \\ j\ne j_1 \end{array}}^N p(j)\mathbb E_i^{j_2}\left[ P_{j_1}\right] +p\left( \overline{B} \right) \mathbb E_i^{j_2}\left[ P_{j_1}\right] , \end{aligned}$$

where the last equality follows from (23) and (24). We thus obtain

$$\begin{aligned} \mathbb E_{i}\left[ P_{j_1}\right] =p(j_1)\mathbb E_i^{j_1}\left[ P_{j_1}\right] +(1-p(j_1))\mathbb E_i^{j_2}\left[ P_{j_1}\right] . \end{aligned}$$
(26)

Similarly, it holds

$$\begin{aligned} \mathbb E_{i}\left[ P_{j_2}\right] =p(j_2)\mathbb E_i^{j_2}\left[ P_{j_2}\right] +(1-p(j_2))\mathbb E_i^{j_1}\left[ P_{j_2}\right] . \end{aligned}$$
(27)

We now compute \(\mathbb E_{X_i}\left[ \mathbb E_{i+1}\left[ P_{j_1}\right] \mathbb E_{i+1}\left[ P_{j_2}\right] \right] \). We have

$$\begin{aligned}&\mathbb E_{X_i}\left[ \mathbb E_{i+1}\left[ P_{j_1}\right] \mathbb E_{i+1}\left[ P_{j_2}\right] \right] \\&=\int _{\mathbb R}f(y_i)\left[ \int _{\mathbb R^{n-i}}P_{j_1}(X_1,\ldots ,X_{i-1},y_i,y_{i+1},\ldots y_n)f(y_{i+1})\cdots f(y_n) \textrm{d}y_{i+1}\cdots \textrm{d}y_{n} \right. \\&\left. \cdot \int _{\mathbb R^{n-i}}P_{j_2}(X_1,\ldots ,X_{i-1},y_i,y'_{i+1},\ldots y'_n)f(y'_{i+1})\cdots f(y'_n) \textrm{d}y'_{i+1}\cdots \textrm{d}y'_{n} \right] \textrm{d}y_i\\&=\sum _{j=1}^N\int _{\mathbb R}f(y_i)\mathbbm {1}(y_i\in B_j)\left[ \int _{\mathbb R^{n-i}}P_{j_1}(X_1,\ldots ,X_{i-1},y_i,y_{i+1},\ldots y_n)f(y_{i+1})\cdots f(y_n) \textrm{d}y_{i+1}\cdots \textrm{d}y_{n} \right. \\&\left. \cdot \int _{\mathbb R^{n-i}}P_{j_2}(X_1,\ldots ,X_{i-1},y_i,y'_{i+1},\ldots y'_n)f(y'_{i+1})\cdots f(y'_n) \textrm{d}y'_{i+1}\cdots \textrm{d}y'_{n} \right] \textrm{d}y_i\\&+\int _{\mathbb R}f(y_i)\mathbbm {1}(y_i\in \overline{B})\left[ \int _{\mathbb R^{n-i}}P_{j_1}(X_1,\ldots ,X_{i-1},y_i,y_{i+1},\ldots y_n)f(y_{i+1})\cdots f(y_n) \textrm{d}y_{i+1}\cdots \textrm{d}y_{n} \right. \\&\left. \cdot \int _{\mathbb R^{n-i}}P_{j_2}(X_1,\ldots ,X_{i-1},y_i,y'_{i+1},\ldots y'_n)f(y'_{i+1})\cdots f(y'_n) \textrm{d}y'_{i+1}\cdots \textrm{d}y'_{n} \right] \textrm{d}y_i \end{aligned}$$

For \(j=1,\ldots ,N\), let \(x_j\) be such that \(B_j=[x_j-h, x_j+h]\). Observe that if \(y_i\in \mathring{B}_j \) then it holds \(\mathbbm {1}(y_i\in B_k)=\delta _{j,k}=\mathbbm {1}(x_j\in B_k)\), where \(\delta \) is the Kronecker delta. Observe also that if \(y_i\in \overline{B}\) then it holds \(\mathbbm {1}(y_i\in B_k)=0=\mathbbm {1}(z\in B_k)\) for any \(z\in \overline{B}\). This gives

$$\begin{aligned}&P_k\left( X_1,\ldots ,X_{i-1},y_i,y_{i+1},\ldots ,y_n \right) \mathbbm {1}(y_i\in \mathring{B}_j)\nonumber \\&\qquad =P_k\left( X_1,\ldots ,X_{i-1},x_j,y_{i+1},\ldots ,y_n \right) \mathbbm {1}(y_i\in \mathring{B}_j), \end{aligned}$$
(28)

and

$$\begin{aligned}&P_k\left( X_1,\ldots ,X_{i-1},y_i,y_{i+1},\ldots ,y_n \right) \mathbbm {1}(y_i\in \overline{B})\nonumber \\&\qquad =P_k\left( X_1,\ldots ,X_{i-1},z,y_{i+1},\ldots ,y_n \right) \mathbbm {1}(y_i\in \overline{B}). \end{aligned}$$
(29)

We thus have

$$\begin{aligned}&\mathbb E_{X_i}\left[ \mathbb E_{i+1}\left[ P_{j_1}\right] \mathbb E_{i+1}\left[ P_{j_2}\right] \right] \\&=\sum _{j=1}^Np(j)\left[ \int _{\mathbb R^{n-i}}P_{j_1}(X_1,\ldots ,X_{i-1},x_j,y_{i+1},\ldots y_n)f(y_{i+1})\cdots f(y_n) \textrm{d}y_{i+1}\cdots \textrm{d}y_{n} \right. \\&\left. \cdot \int _{\mathbb R^{n-i}}P_{j_2}(X_1,\ldots ,X_{i-1},x_j,y'_{i+1},\ldots y'_n)f(y'_{i+1})\cdots f(y'_n) \textrm{d}y'_{i+1}\cdots \textrm{d}y'_{n} \right] \\&+p(\overline{B})\left[ \int _{\mathbb R^{n-i}}P_{j_1}(X_1,\ldots ,X_{i-1},z,y_{i+1},\ldots y_n)f(y_{i+1})\cdots f(y_n) \textrm{d}y_{i+1}\cdots \textrm{d}y_{n} \right. \\&\left. \cdot \int _{\mathbb R^{n-i}}P_{j_2}(X_1,\ldots ,X_{i-1},z,y'_{i+1},\ldots y'_n)f(y'_{i+1})\cdots f(y'_n) \textrm{d}y'_{i+1}\cdots \textrm{d}y'_{n} \right] . \end{aligned}$$

Now, observe that

$$\begin{aligned} \int _{\mathbb R^{n-i}}P_{k}(X_1,\ldots ,X_{i-1},x_j,y_{i+1},\ldots y_n)f(y_{i+1})\cdots f(y_n) \textrm{d}y_{i+1}\cdots \textrm{d}y_{n}=\mathbb E_i^j[P_k]. \end{aligned}$$
(30)

Indeed, it holds

$$\begin{aligned} \mathbb E_i^j[P_k]&= \frac{1}{p(j)}\mathbb E\left[ P_k\mathbbm {1}(X_i\in B_j)\mid X_1,\ldots ,X_{i-1} \right] \\&=\frac{1}{p(j)}\int _{\mathbb R^{n-i+1}}P_k(X_1,\ldots ,X_{i-1},y_i,y_{i+1},\ldots ,y_n)\\&\qquad \mathbbm {1}(y_i\in B_j)f(y_i)f(y_{i+1})\cdots f(y_{n})\textrm{d}y_i\textrm{d}y_{i+1}\cdots \textrm{d}y_n \\&=\int _{\mathbb R^{n-i}}P_k(X_1,\ldots ,X_{i-1},x_j,y_{i+1},\ldots ,y_n)f(y_{i+1})\cdots f(y_{n})\textrm{d}y_{i+1}\cdots \textrm{d}y_n, \end{aligned}$$

where the last equality follows from (28). Similarly, using (29) one can prove that for \(z\in \overline{B}\) it holds

$$\begin{aligned} \int _{\mathbb R^{n-i}}P_{k}(X_1,\ldots ,X_{i-1},z,y_{i+1},\ldots y_n)f(y_{i+1})\cdots f(y_n) \textrm{d}y_{i+1}\cdots \textrm{d}y_{n} =\mathbb E_i^{comp}[P_k]. \end{aligned}$$

We thus have

$$\begin{aligned}&\mathbb E_{X_i}\left[ \mathbb E_{i+1}\left[ P_{j_1}\right] \mathbb E_{i+1}\left[ P_{j_2}\right] \right] =\sum _{j=1}^Np(j)\mathbb E_i^j[P_{j_1}]\mathbb E_i^j[P_{j_2}] +p(\overline{B})\mathbb E_i^{comp}[P_{j_1}]\mathbb E_i^{comp}[P_{j_2}], \end{aligned}$$

and, using (23) and (24), we finally obtain

$$\begin{aligned} \mathbb E_{X_i}\left[ \mathbb E_{i+1}\left[ P_{j_1}\right] \mathbb E_{i+1}\left[ P_{j_2}\right] \right] =p(j_1)\mathbb E_i^{j_1}\left[ P_{j_1}\right] \mathbb E_i^{j_1}\left[ P_{j_2}\right] +p(j_2)\mathbb E_i^{j_2}\left[ P_{j_1}\right] \mathbb E_i^{j_2}\left[ P_{j_2}\right] \\ +\left( 1-p(j_1)-p(j_2) \right) \mathbb E_i^{j_2}\left[ P_{j_1}\right] \mathbb E_i^{j_1}\left[ P_{j_2}\right] . \end{aligned}$$
(31)

Putting (26), (27) and (31) in (25), we obtain

$$\begin{aligned}&\textrm{Cov}\left( P_{j_1},P_{j_2} \right) \\&=\sum _{i=1}^n\mathbb E\left[ p(j_1)\mathbb E_i^{j_1}\left[ P_{j_1}\right] \mathbb E_i^{j_1}\left[ P_{j_2}\right] +p(j_2)\mathbb E_i^{j_2}\left[ P_{j_1}\right] \mathbb E_i^{j_2}\left[ P_{j_2}\right] +\left( 1-p(j_1)-p(j_2) \right) \mathbb E_i^{j_2}\left[ P_{j_1}\right] \mathbb E_i^{j_1}\left[ P_{j_2}\right] \right. \\&-\left. \left\{ p(j_1)\mathbb E_i^{j_1}\left[ P_{j_1}\right] +(1-p(j_1))\mathbb E_i^{j_2}\left[ P_{j_1}\right] \right\} \left\{ p(j_2)\mathbb E_i^{j_2}\left[ P_{j_2}\right] +(1-p(j_2))\mathbb E_i^{j_1}\left[ P_{j_2}\right] \right\} \right] \\&=\sum _{i=1}^np(j_1)p(j_2) \mathbb E\left[ \left( \mathbb E_i^{j_1}\left[ P_{j_1}\right] -\mathbb E_i^{j_2}\left[ P_{j_1}\right] \right) \left( \mathbb E_i^{j_1}\left[ P_{j_2}\right] -\mathbb E_i^{j_2}\left[ P_{j_2}\right] \right) \right] , \end{aligned}$$

and the Cauchy-Schwarz inequality gives

$$\begin{aligned}&\left| \textrm{Cov}(P_{j_1},P_{j_2}) \right| \nonumber \\&\qquad \le \sum _{i=1}^np(j_1)p(j_2)\sqrt{\mathbb E\left[ \left( \mathbb E_i^{j_1}\left[ P_{j_1}\right] -\mathbb E_i^{j_2}\left[ P_{j_1}\right] \right) ^2 \right] }\sqrt{\mathbb E\left[ \left( \mathbb E_i^{j_1}\left[ P_{j_2}\right] -\mathbb E_i^{j_2}\left[ P_{j_2}\right] \right) ^2 \right] }. \end{aligned}$$
(32)

Now, using (30) and Jensen’s inequality we have

$$\begin{aligned}&\mathbb E\left[ \left( \mathbb E_i^{j_1}\left[ P_{j_1}\right] -\mathbb E_i^{j_2}\left[ P_{j_1}\right] \right) ^2 \right] \\&=\mathbb E\left[ \left\{ \int _{\mathbb R^{n-i}}\left( P_{j_1}(X_1,\ldots ,X_{i-1},x_{j_1},y_{i+1},\ldots ,y_n)-P_{j_1}(X_1,\ldots ,X_{i-1},x_{j_2},y_{i+1},\ldots ,y_n) \right) \right. \right. \\&\left. \left. f(y_{i+1})\cdots f(y_n) \textrm{d}y_{i+1}\cdots \textrm{d}y_n \right\} ^2\right] \\&\le \mathbb E\left[ \int _{\mathbb R^{n-i}}\left\{ P_{j_1}(X_1,\ldots ,X_{i-1},x_{j_1},y_{i+1},\ldots ,y_n)-P_{j_1}(X_1,\ldots ,X_{i-1},x_{j_2},y_{i+1},\ldots ,y_n)\right\} ^2\right. \\&\left. f(y_{i+1})\cdots f(y_n) \textrm{d}y_{i+1}\cdots \textrm{d}y_n \right] \\&=\mathbb E\left[ \left\{ P_{j_1}(X_1,\ldots ,X_{i-1},x_{j_1},X_{i+1},\ldots ,X_n)-P_{j_1}(X_1,\ldots ,X_{i-1},x_{j_2},X_{i+1},\ldots ,X_n) \right\} ^2\right] \\&=\mathbb E\left[ \left( \left[ \frac{1}{n}+Y\right] _{-\tau }^\tau -\left[ Y\right] _{-\tau }^\tau \right) ^2 \right] , \end{aligned}$$

where

$$\begin{aligned} Y=\frac{1}{n}\displaystyle \sum _{\begin{array}{c} k=1 \\ k\ne i \end{array}}^n \mathbbm {1}(X_k\in B_{j_1})+\frac{2}{n\alpha }\sum _{k=1}^nW_{k j_1}-p_0(j_1). \end{aligned}$$

Note that since \([\cdot ]_{-\tau }^\tau \) is Lipschitz continuous with Lipschitz constant 1, it holds

$$\begin{aligned} \mathbb E\left[ \left( \mathbb E_i^{j_1}\left[ P_{j_1}\right] -\mathbb E_i^{j_2}\left[ P_{j_1}\right] \right) ^2 \right] \le \frac{1}{n^2}. \end{aligned}$$

However, we can provide another bound when \(\vert p(j_1)-p_0(j_1)\vert \ge 2(\tau +1/n)\). Assume that \( p(j_1)-p_0(j_1) \ge 2(\tau +1/n)\). We have

$$\begin{aligned}&\mathbb E\left[ \left( \mathbb E_i^{j_1}\left[ P_{j_1}\right] -\mathbb E_i^{j_2}\left[ P_{j_1}\right] \right) ^2 \right] \\&\le \mathbb E\left[ \left( \left[ \frac{1}{n}+Y\right] _{-\tau }^\tau -\left[ Y\right] _{-\tau }^\tau \right) ^2\mathbbm {1}(Y\le \tau ) \right] +\mathbb E\left[ \left( \left[ \frac{1}{n}+Y\right] _{-\tau }^\tau -\left[ Y\right] _{-\tau }^\tau \right) ^2\mathbbm {1}(Y> \tau ) \right] \\&\le \frac{1}{n^2}\mathbb P(Y\le \tau )\\&=\frac{1}{n^2}\mathbb P\left( \frac{1}{n}\displaystyle \sum _{\begin{array}{c} k=1 \\ k\ne i \end{array}}^n \mathbbm {1}(X_k\in B_{j_1})+\frac{2}{n\alpha }\sum _{k=1}^nW_{k j_1}-p_0(j_1) \le \tau \right) \\&\le \frac{1}{n^2}\mathbb P\left( \frac{1}{n}\sum _{k=1}^n \mathbbm {1}(X_k\in B_{j_1})-\frac{1}{n}+\frac{2}{n\alpha }\sum _{k=1}^nW_{k j_1}-p_0(j_1) \le \tau \right) \\&=\frac{1}{n^2}\mathbb P\left( \widehat{p}_{j_1}\le \tau +\frac{1}{n}+p_0(j_1) \right) \\&\le \frac{1}{n^2}\mathbb P\left( \vert \widehat{p}_{j_1}-p(j_1)\vert \ge p(j_1)-p_0(j_1)-\tau -\frac{1}{n} \right) \end{aligned}$$

Now, if \( p(j_1)-p_0(j_1) \ge 2(\tau +1/n)\) then we have \(0<p(j_1)-p_0(j_1)-\tau -\frac{1}{n} \le p(j_1)\le 1\le \frac{1}{\alpha }\) and (22) gives

$$\begin{aligned} \mathbb E\left[ \left( \mathbb E_i^{j_1}\left[ P_{j_1}\right] -\mathbb E_i^{j_2}\left[ P_{j_1}\right] \right) ^2 \right]&\le \frac{2}{n^2}\exp \left( -\frac{n\alpha ^2\left( p(j_1)-p_0(j_1)-\tau -1/n \right) ^2}{42} \right) \\&\le \frac{2}{n^2}\exp \left( -\frac{n\alpha ^2\left( p(j_1)-p_0(j_1)\right) ^2}{168} \right) . \end{aligned}$$

One can prove the same result if \( p(j_1)-p_0(j_1) \le -2(\tau +1/n)\), and similar bounds with \(j_1\) replaced by \(j_2\) hold for \(\mathbb E\left[ \left( \mathbb E_i^{j_1}\left[ P_{j_2}\right] -\mathbb E_i^{j_2}\left[ P_{j_2}\right] \right) ^2\right] \). We can now conclude.

If \(j_1\ne j_2\) are such that \(\vert p(j_1)-p_0(j_1)\vert \ge 2(\tau +1/n)\) and \(\vert p(j_2)-p_0(j_2)\vert \ge 2(\tau +1/n)\) then (32) gives

$$\begin{aligned} \left| \textrm{Cov}\left( P_{j_1},P_{j_2} \right) \right| \le \frac{2p(j_1)p(j_2)}{n}\exp \left( -\frac{n\alpha ^2\left[ (p(j_1)-p_0(j_1))^2+(p(j_2)-p_0(j_2))^2\right] }{336} \right) . \end{aligned}$$

If \(j_1\ne j_2\) are such that \(\vert p(j_1)-p_0(j_1)\vert < 2(\tau +1/n)\) and \(\vert p(j_2)-p_0(j_2)\vert \ge 2(\tau +1/n)\) then (32) gives

$$\begin{aligned}&\left| \textrm{Cov}\left( P_{j_1},P_{j_2} \right) \right| \\&\le \frac{\sqrt{2}p(j_1)p(j_2)}{n}\exp \left( -\frac{n\alpha ^2(p(j_2)-p_0(j_2))^2}{336} \right) \\&=\frac{\sqrt{2}p(j_1)p(j_2)}{n}\exp \left( -\frac{n\alpha ^2\left[ (p(j_1)-p_0(j_1))^2+(p(j_2)-p_0(j_2))^2\right] }{336} \right) \\ {}&\qquad \exp \left( \frac{n\alpha ^2(p(j_1)-p_0(j_1))^2}{336} \right) \\&\le \frac{\sqrt{2}\exp (1/21)p(j_1)p(j_2)}{n}\exp \left( -\frac{n\alpha ^2\left[ (p(j_1)-p_0(j_1))^2+(p(j_2)-p_0(j_2))^2\right] }{336} \right) , \end{aligned}$$

since \(\vert p(j_1)-p_0(j_1)\vert < 2(\tau +1/n)\le 4/\sqrt{n\alpha ^2}\). The same result holds if \(j_1\ne j_2\) are such that \(\vert p(j_1)-p_0(j_1)\vert \ge 2(\tau +1/n)\) and \(\vert p(j_2)-p_0(j_2)\vert < 2(\tau +1/n)\). Finally, if \(j_1\ne j_2\) are such that \(\vert p(j_1)-p_0(j_1)\vert < 2(\tau +1/n)\) and \(\vert p(j_2)-p_0(j_2)\vert < 2(\tau +1/n)\), then (32) gives

$$\begin{aligned} \left| \textrm{Cov}\left( P_{j_1},P_{j_2} \right) \right|&\le \frac{p(j_1)p(j_2)}{n}\\&\le \frac{p(j_1)p(j_2)}{n}\exp \left( \frac{2}{21}\right) \\&\quad \exp \left( -\frac{n\alpha ^2\left[ (p(j_1)-p_0(j_1))^2+(p(j_2)-p_0(j_2))^2\right] }{336} \right) , \end{aligned}$$

which ends the proof of (21).\(\square \)
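
Before moving on, the bounds (20) and (21) lend themselves to a quick numerical sanity check. The sketch below is purely illustrative, and its ingredients are assumptions rather than objects fixed by the proof: the data are drawn uniformly, the bin \(B_j\) and the null proportion \(p_0(j)\) are arbitrary, and the privatisation noise \(W_{ij}\) is taken i.i.d. Laplace with scale 1 (hence variance 2), a choice consistent with the constant \(9/(n\alpha ^2)\) appearing above.

```python
import numpy as np

# Monte Carlo sanity check of the variance bound (20): with the privatised bin
# proportion p_hat_j = (1/n) sum_i 1{X_i in B_j} + (2/(n alpha)) sum_i W_ij as
# in the proof above, Var(p_hat_j) <= 9/(n alpha^2).  Assumptions made for this
# illustration only: X_i ~ Uniform(0, 1), B_j = [0.2, 0.4), p_0(j) = 0.2, and
# W_ij i.i.d. Laplace(0, 1) (variance 2), consistent with the constant 9.
rng = np.random.default_rng(0)
n, alpha, reps = 500, 0.5, 20_000
tau = 1.0 / np.sqrt(n * alpha**2)   # clipping level used throughout the proof

X = rng.uniform(0.0, 1.0, size=(reps, n))
W = rng.laplace(0.0, 1.0, size=(reps, n))
p_hat = ((0.2 <= X) & (X < 0.4)).mean(axis=1) + (2.0 / (n * alpha)) * W.sum(axis=1)

P_j = np.clip(p_hat - 0.2, -tau, tau)   # P_j = [p_hat_j - p_0(j)]_{-tau}^{tau}

print("empirical Var(p_hat_j):", p_hat.var())
print("bound 9/(n alpha^2)   :", 9.0 / (n * alpha**2))
print("empirical Var(P_j)    :", P_j.var())
```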

1.3 Proof of Theorem 4.3

The outline of the proof is similar to that of Theorem 3.4: we first prove that the choice of \(t_1\) and \(t_2\) in (11) yields \(\mathbb P_{Q_{f_0}^n}(\Phi =1)\le \gamma /2\) and we then exhibit \(\rho _1,\rho _2>0\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \int _{B}\vert f-f_0\vert \ge \rho _1 \Rightarrow \mathbb P_{Q_f^n}(D_B< t_1)\le \gamma /2\\ \int _{\bar{B}}\vert f-f_0\vert \ge \rho _2 \Rightarrow \mathbb P_{Q_f^n}(T_B< t_2)\le \gamma /2. \end{array}\right. } \end{aligned}$$

The quantity \(\rho _1+\rho _2\) will then provide an upper bound on \(\mathcal E_{n,\alpha }(f_0,\gamma )\).

We have already seen in the proof of the upper bound in the non-interactive scenario that the choice \(t_2=\sqrt{20/(n\alpha ^2\gamma )}\) gives \(\mathbb P_{Q_{f_0}^n}(T_B\ge t_2)\le \gamma /4\). Moreover, Chebyshev's inequality and Proposition 4.2 yield

$$\begin{aligned} \mathbb P_{Q_{f_0}^n}( D_B\ge t_1)=\mathbb P_{Q_{f_0}^n}( D_B-\mathbb E_{Q_{f_0}^n}[D_B]\ge t_1)&\le \mathbb P_{Q_{f_0}^n}\left( \vert D_B-\mathbb E_{Q_{f_0}^n}[D_B]\vert \ge t_1\right) \\&\le \frac{\text {Var}_{Q_{f_0}^n}(D_B)}{t_1^2}\\&\le \frac{5}{(n\alpha ^2)^2t_1^2} \le \frac{\gamma }{4} \end{aligned}$$

for \(t_1= 2\sqrt{5}/(n\alpha ^2\sqrt{\gamma })\). We thus have

$$\begin{aligned} \mathbb P_{Q_{f_0}^n}(\Phi =1)\le \mathbb P_{Q_{f_0}^n}( D_B\ge t_1)+ \mathbb P_{Q_{f_0}^n}(T_B\ge t_2)\le \frac{\gamma }{2}. \end{aligned}$$
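
For concreteness, the decision rule of the test \(\Phi \) can be written in a few lines. This is a minimal sketch of the thresholding step only, assuming the statistics \(D_B\) and \(T_B\) have already been computed from the privatised sample; the function name phi is ours.

```python
import numpy as np

# Minimal sketch of the decision rule of the test Phi discussed above, with the
# thresholds t_1 = 2*sqrt(5)/(n alpha^2 sqrt(gamma)) and
# t_2 = sqrt(20/(n alpha^2 gamma)) derived in the proof.  The statistics D_B
# and T_B are assumed to have been computed elsewhere from the privatised
# sample.
def phi(D_B: float, T_B: float, n: int, alpha: float, gamma: float) -> int:
    t1 = 2.0 * np.sqrt(5.0) / (n * alpha**2 * np.sqrt(gamma))
    t2 = np.sqrt(20.0 / (n * alpha**2 * gamma))
    # Reject H_0 : f = f_0 as soon as either statistic exceeds its threshold;
    # under the null, a union bound gives P(Phi = 1) <= gamma/2 as shown above.
    return int(D_B >= t1 or T_B >= t2)

print(phi(D_B=0.001, T_B=0.01, n=10_000, alpha=0.5, gamma=0.05))  # prints 0
```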

We have seen in the proof of Theorem 3.4 (upper bound in the non-interactive scenario) that if we set

$$\begin{aligned} \rho _2=2\int _{\bar{B}}f_0+\left( 1+\frac{1}{\sqrt{2}}\right) t_2, \end{aligned}$$

then we have

$$\begin{aligned} \int _{\bar{B}}\vert f-f_0\vert \ge \rho _2\implies \mathbb P_{Q_{f}^n}(T_B < t_2)\le \frac{\gamma }{2}. \end{aligned}$$

It remains to exhibit \(\rho _1\) such that \(\int _B\vert f-f_0\vert \ge \rho _1\) implies \(\mathbb P_{Q_{f}^n}(D_B < t_1)\le \gamma /2\). Chebyshev's inequality gives

$$\begin{aligned} \mathbb P_{Q_f^n}(D_B< t_1)&=\mathbb P_{Q_f^n}\left( \mathbb E_{Q_f^n}[D_B]-D_B>\mathbb E_{Q_f^n}[D_B]- t_1\right) \\&\le \frac{\text {Var}_{Q_f^n}(D_B)}{\left( \mathbb E_{Q_f^n}[D_B]- t_1\right) ^2}\\&\le \frac{\frac{5}{(n\alpha ^2)^2}}{\left( \mathbb E_{Q_f^n}[D_B]- t_1\right) ^2}+\frac{\frac{67D_\tau (f)}{n\alpha ^2}}{\left( \mathbb E_{Q_f^n}[D_B]- t_1\right) ^2}, \end{aligned}$$

if \(\mathbb E_{Q_f^n}[D_B]- t_1>0\). Now, observe that if \(D_\tau (f)\ge 12(t_1+6\tau /\sqrt{n})\), Proposition 4.2 implies

$$\begin{aligned} \mathbb E_{Q_f^n}[D_B]-t_1\ge \frac{1}{6}D_\tau (f)-\frac{6\tau }{\sqrt{n}}-t_1\ge t_1+\frac{6\tau }{\sqrt{n}}\ge t_1, \end{aligned}$$

and

$$\begin{aligned} \mathbb E_{Q_f^n}[D_B]-t_1\ge \frac{1}{6}D_\tau (f)-\left( \frac{6\tau }{\sqrt{n}}+t_1\right) \ge \frac{1}{6}D_\tau (f)-\frac{1}{12}D_\tau (f)=\frac{1}{12}D_\tau (f). \end{aligned}$$

Thus, if \(D_\tau (f)\ge 12(t_1+6\tau /\sqrt{n})\) we obtain

$$\begin{aligned} \mathbb P_{Q_f^n}(D_B< t_1)\le \frac{5}{(n\alpha ^2)^2t_1^2}+\frac{144\times 67}{n\alpha ^2D_\tau (f)} =\frac{\gamma }{4}+\frac{9648}{n\alpha ^2D_\tau (f)}. \end{aligned}$$

Thus, if \(D_\tau (f)\) satisfies

$$\begin{aligned} D_\tau (f)\ge \frac{C_\gamma }{n\alpha ^2}, \quad \text {with } C_\gamma = \max \left\{ \frac{24\sqrt{5}+72}{\sqrt{\gamma }},\frac{9648\times 4}{\gamma } \right\} \end{aligned}$$

then we have \(\mathbb P_{Q_f^n}(D_B< t_1)\le \gamma /2\). We now exhibit \(\rho _1\) such that \(\int _B \vert f-f_0\vert \ge \rho _1\) implies \(D_\tau (f)\ge C_\gamma /(n\alpha ^2)\). To this end, we will use the following three facts:

  1. (i)

    \(D_\tau (f)\ge \min \left\{ \sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2, \tau \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2} \right\} \),

  2. (ii)

    \(\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2\ge C_\gamma ^2/(n\alpha ^2)\Rightarrow \min \left\{ \sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2,\right. \left. \tau \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2} \right\} \ge C_\gamma /(n\alpha ^2)\),

  3. (iii)

    \(\left( \int _B\vert f-f_0\vert \right) ^2\le 4(L+L_0)^2\vert B\vert ^2h^{2\beta }+\vert B\vert /(2h)\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2\).

We admit for now these three facts and conclude the proof of our upper bound. If we have

$$\begin{aligned} \left( \int _B\vert f-f_0\vert \right) ^2\ge 4(L+L_0)^2\vert B\vert ^2h^{2\beta }+\frac{\vert B \vert }{2h}\frac{C_\gamma ^2}{n\alpha ^2} \end{aligned}$$

then iii) implies

$$\begin{aligned} \sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2\ge \frac{C_\gamma ^2}{n\alpha ^2}, \end{aligned}$$

and ii) combined with i) yields \(D_\tau (f)\ge C_\gamma /(n\alpha ^2)\) and thus \(\mathbb P_{Q_f^n}(D_B< t_1)\le \gamma /2\). We can then take

$$\begin{aligned} \rho _1= \sqrt{4(L+L_0)^2\vert B\vert ^2h^{2\beta }+\frac{\vert B \vert }{2h}\frac{C_\gamma ^2}{n\alpha ^2}}. \end{aligned}$$

For all \(f\in H(\beta ,L)\) satisfying \(\Vert f-f_0\Vert _1\ge \rho _1+\rho _2\) it holds

$$\begin{aligned} \mathbb P_{Q_{f_0}^n}(\Phi =1)+\mathbb P_{Q_f^n}(\Phi =0)\le \frac{\gamma }{2}+ \min \left\{ \mathbb P_{Q_f^n}( D_B< t_1), \mathbb P_{Q_f^n}(T_B< t_2)\right\} \le \frac{\gamma }{2}+\frac{\gamma }{2}=\gamma , \end{aligned}$$

since \(\Vert f-f_0\Vert _1\ge \rho _1+\rho _2\) implies \(\int _{B}\vert f-f_0\vert \ge \rho _1\) or \( \int _{\bar{B}}\vert f-f_0\vert \ge \rho _2\). Consequently, we have

$$\begin{aligned} \mathcal E_{n,\alpha }(f_0,\gamma )&\le \rho _1+\rho _2= \sqrt{4(L+L_0)^2\vert B\vert ^2h^{2\beta }+\frac{\vert B \vert }{2h}\frac{C_\gamma ^2}{n\alpha ^2}}+2\int _{\bar{B}}f_0+\left( 1+\frac{1}{\sqrt{2}}\right) t_2\\&\le C(L,L_0,\gamma ) \left[ \vert B\vert h^\beta + \sqrt{\frac{\vert B\vert }{hn\alpha ^2}} +\int _{\bar{B}}f_0 + \frac{1}{\sqrt{n\alpha ^2}} \right] . \end{aligned}$$

The choice \( h\asymp \vert B\vert ^{-\frac{1}{2\beta +1}}(n\alpha ^2)^{-\frac{1}{2\beta +1}} \) yields

$$\begin{aligned} \mathcal E_{n,\alpha }(f_0,\gamma ) \le C\left[ \vert B\vert ^{\frac{\beta +1}{2\beta +1}}(n\alpha ^2)^{-\frac{\beta }{2\beta +1}} +\int _{\overline{B}} f_0+ \frac{1}{\sqrt{n\alpha ^2}} \right] , \end{aligned}$$

which ends the proof of Theorem 4.3. It remains to prove i), ii) and iii). We start with the proof of i). If \(\tau \ge \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2}\), then \(\tau \ge \vert p(j)-p_0(j)\vert \) for all j, and we thus have

$$\begin{aligned} D_\tau (f)= \sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2 = \min \left\{ \sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2, \tau \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2} \right\} . \end{aligned}$$

We now deal with the case \(\tau < \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2}\). In this case, we can write

$$\begin{aligned}&D_\tau (f)-\tau \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2}\\&\qquad =\sum _{j=1}^N\vert p(j)-p_0(j)\vert \min \left\{ \vert p(j)-p_0(j)\vert , \tau \right\} -\tau \frac{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2}{\sqrt{\sum _{k=1}^N\vert p(k)-p_0(k)\vert ^2}}\\&\qquad =\sum _{j=1}^N\vert p(j)-p_0(j)\vert \underbrace{\left[ \min \left\{ \vert p(j)-p_0(j)\vert , \tau \right\} - \frac{\tau \vert p(j)-p_0(j)\vert }{\sqrt{\sum _{k=1}^N\vert p(k)-p_0(k)\vert ^2}} \right] }_{=:A_j}, \end{aligned}$$

and \(A_j\ge 0\) for all j. Indeed, if j is such that \(\vert p(j)-p_0(j)\vert <\tau \) it holds

$$\begin{aligned} A_j=\vert p(j)-p_0(j)\vert \left[ 1-\frac{\tau }{\sqrt{\sum _{k=1}^N\vert p(k)-p_0(k)\vert ^2}} \right] \ge 0, \end{aligned}$$

and if j is such that \(\vert p(j)-p_0(j)\vert \ge \tau \) it holds

$$\begin{aligned} A_j=\tau \left[ 1-\frac{\vert p(j)-p_0(j)\vert }{\sqrt{\sum _{k=1}^N\vert p(k)-p_0(k)\vert ^2}} \right] \ge 0. \end{aligned}$$

Thus, if \(\tau < \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2}\) we have

$$\begin{aligned} D_\tau (f)\ge \tau \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2}= \min \left\{ \sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2, \tau \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2} \right\} , \end{aligned}$$

which ends the proof of i). We now prove ii). Assume that \(\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2\ge C_\gamma ^2/(n\alpha ^2)\). Since \(C_\gamma \ge 1\), it holds \(C_\gamma ^2\ge C_\gamma \), and we thus have \(\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2\ge C_\gamma /(n\alpha ^2)\). It also holds

$$\begin{aligned} \tau \sqrt{\sum _{j=1}^N\vert p(j)-p_0(j)\vert ^2} \ge \tau \cdot \frac{C_\gamma }{\sqrt{n\alpha ^2}}=\frac{C_\gamma }{n\alpha ^2}, \end{aligned}$$

yielding ii). Finally, the Cauchy-Schwarz inequality yields

$$\begin{aligned} \left( \int _B\vert f-f_0\vert \right) ^2&\le \vert B\vert \int _B\vert f-f_0\vert ^2\\&\le \vert B\vert \cdot \left| \int _B\vert f-f_0\vert ^2-\frac{1}{2h}\sum _{j=1}^N\left( p(j)-p_0(j)\right) ^2 \right| +\frac{\vert B \vert }{2h}\sum _{j=1}^N\left( p(j)-p_0(j)\right) ^2. \end{aligned}$$

Now, observe that

$$\begin{aligned} \left| \int _B\vert f-f_0\vert ^2-\frac{1}{2h}\sum _{j=1}^N\left( p(j)-p_0(j)\right) ^2 \right| = \left| \sum _{j=1}^N\int _{B_j}\left[ (f-f_0)(x)- \frac{p(j)-p_0(j)}{2h}\right] ^2\textrm{d}x \right| , \end{aligned}$$

and observe also that for \(x\in B_j\) it holds

$$\begin{aligned} \left| (f-f_0)(x)- \frac{p(j)-p_0(j)}{2h}\right|&= \left| \frac{1}{2h}\int _{B_j}[(f-f_0)(x)-(f-f_0)(u)] \textrm{d}u \right| \\&\le \frac{1}{2h}\int _{B_j}\left[ \vert f(x)-f(u) \vert +\vert f_0(x)-f_0(u) \vert \right] \textrm{d}u\\&\le \frac{L+L_0}{2h}\int _{B_j}\vert x-u\vert ^\beta \textrm{d}u\\&\le \frac{L+L_0}{2h}\int _{B_j}(2h)^\beta \textrm{d}u\\&\le 2(L+L_0)h^\beta . \end{aligned}$$

This gives

$$\begin{aligned} \left| \int _B\vert f-f_0\vert ^2-\frac{1}{2h}\sum _{j=1}^N\left( p(j)-p_0(j)\right) ^2 \right| \le \sum _{j=1}^N\int _{B_j} 4(L+L_0)^2h^{2\beta }=4(L+L_0)^2\vert B \vert h^{2\beta }, \end{aligned}$$

which yields (iii).
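
Fact (i) can also be checked mechanically. The sketch below draws random discrepancy vectors \(d_j=p(j)-p_0(j)\) and clipping levels \(\tau \) and verifies the inequality, using the expression \(D_\tau (f)=\sum _{j=1}^N\vert p(j)-p_0(j)\vert \min \{\vert p(j)-p_0(j)\vert ,\tau \}\) manipulated in the proof above; the ranges of the random draws are arbitrary.

```python
import numpy as np

# Random check of fact (i): D_tau(f) >= min(S, tau*sqrt(S)) with
# S = sum_j |p(j) - p_0(j)|^2, using the expression
# D_tau(f) = sum_j |p(j) - p_0(j)| * min(|p(j) - p_0(j)|, tau)
# manipulated in the proof above.  The ranges of the draws are arbitrary.
rng = np.random.default_rng(1)
for _ in range(10_000):
    d = rng.normal(scale=rng.uniform(1e-3, 1.0), size=rng.integers(1, 50))
    tau = rng.uniform(1e-3, 1.0)
    D_tau = np.sum(np.abs(d) * np.minimum(np.abs(d), tau))
    S = np.sum(d**2)
    assert D_tau >= min(S, tau * np.sqrt(S)) - 1e-12
print("fact (i) verified on 10000 random configurations")
```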

1.4 Proof of Theorem 4.4

Let \(B\subset \mathbb R\) be a non-empty compact set, and let \((B_j)_{j=1,\ldots ,N}\) be a partition of B, \(h>0\) be the bandwidth and \((x_1,\ldots ,x_N)\) be the centering points, that is \(B_j=[x_j-h,x_j+h]\) for all \(j\in \llbracket 1,N\rrbracket \). Let \(\psi : [-1,1]\rightarrow \mathbb R\) be such that \(\psi \in H(\beta ,L)\), \(\int \psi =0\) and \(\int \psi ^2=1\). For \(j\in \llbracket 1,N\rrbracket \), define

$$\begin{aligned} \psi _j:t\in \mathbb R\mapsto \frac{1}{\sqrt{h}}\psi \left( \frac{t-x_j}{h} \right) . \end{aligned}$$

Note that the support of \(\psi _j\) is \(B_j\), \(\int \psi _j=0\) and \((\psi _j)_{j=1,\ldots ,N}\) is an orthonormal family.

For \(\delta >0\) and \(\nu \in \mathcal V_N=\{-1,1\}^N\), define the functions

$$\begin{aligned}f_\nu :x\in \mathbb R\mapsto f_0(x)+\delta \sum _{j=1}^N\nu _j\psi _j(x).\end{aligned}$$

The following lemma shows that, for \(\delta \) properly chosen, for all \(\nu \in \mathcal V_N\), \(f_\nu \) is a density belonging to \(H(\beta ,L)\) and \(f_\nu \) is sufficiently far away from \(f_0\) in an \(L_1\) sense.

Lemma B.1

If the parameter \(\delta \) appearing in the definition of \(f_\nu \) satisfies

$$\begin{aligned} \delta \le \sqrt{h}\cdot \min \left\{ \frac{C_0(B)}{\Vert \psi \Vert _\infty } , \frac{1}{2}\left( 1-\frac{L_0}{L} \right) h^\beta \right\} , \end{aligned}$$

where \(C_0(B):=\min \{f_0(x) : x\in B \}\), then we have

  1. (i)

    \(f_\nu \ge 0\) and \(\int f_\nu =1\), for all \(\nu \in \mathcal V_N\),

  2. (ii)

    \(f_\nu \in H(\beta ,L)\), for all \(\nu \in \mathcal V_N\),

  3. (iii)

    \(\Vert f_\nu -f_0\Vert _1= C_1\delta N\sqrt{h}\), for all \(\nu \in \mathcal V_N\), with \(C_1=\int _{-1}^1\vert \psi \vert \).

Proof

We first prove i). Since \(\int \psi _j= 0\) for all \(j=1,\ldots ,N\), it holds \(\int f_\nu =\int f_0=1\) for all \(\nu \). Since \(\text {Supp}(\psi _k)=B_k\) for all \(k=1,\ldots ,N\), it holds \(f_\nu \equiv f_0\) on \(B^c\) and thus \(f_\nu \) is non-negative on \(B^c\). Now, for \(x\in B_j\) it holds for all \(\nu \in \mathcal V_N\)

$$\begin{aligned} f_\nu (x)= f_0(x)+\delta \nu _j \psi _j(x)\ge C_0(B)-\delta \Vert \psi _j \Vert _\infty \ge C_0(B)-\frac{\delta \Vert \psi \Vert _\infty }{\sqrt{h}}\ge 0, \end{aligned}$$

since \(\delta \le C_0(B)\sqrt{h}/\Vert \psi \Vert _\infty \). Thus, \(f_\nu \) is non-negative on \(\mathbb R\) for all \(\nu \in \mathcal V_N\).

To prove (ii), we have to show that \(\vert f_\nu (x)-f_\nu (y)\vert \le L \vert x-y\vert ^\beta \), for all \(\nu \in \mathcal V_N\), for all \(x,y\in \mathbb R\). Since \(f_\nu \equiv f_0\) on \(B^c\) and \(f_0\in H(\beta ,L_0)\), this result is trivial for \(x,y\in B^c\). If \(x\in B_l\) and \(y\in B_k\) it holds

$$\begin{aligned} \vert f_\nu (x)-f_\nu (y)\vert&\le \vert f_0(x)-f_0(y)\vert + \left| \delta \nu _l \psi _l(x) - \delta \nu _k \psi _k(y) \right| \\&\le L_0\vert x-y\vert ^\beta + \left| \delta \nu _l \psi _l(x) - \delta \nu _l \psi _l(y) \right| +\left| \delta \nu _k \psi _k(x) - \delta \nu _k \psi _k(y) \right| \\&\le L_0\vert x-y\vert ^\beta +\frac{\delta }{\sqrt{h}} \left| \psi \left( \frac{x-x_l}{h}\right) -\psi \left( \frac{y-x_l}{h}\right) \right| +\frac{\delta }{\sqrt{h}} \left| \psi \left( \frac{x-x_k}{h}\right) -\psi \left( \frac{y-x_k}{h}\right) \right| \\&\le L_0\vert x-y\vert ^\beta + \frac{\delta }{h^{\beta +1/2}} \cdot L\vert x-y\vert ^\beta \\&\quad + \frac{\delta }{h^{\beta +1/2}}\cdot L\vert x-y\vert ^\beta \\&=\left( \frac{L_0}{L}+ \frac{2\delta }{h^{\beta +1/2}} \right) L\vert x-y\vert ^\beta \\&\le L\vert x-y\vert ^\beta \end{aligned}$$

where we have used \(\psi \in H(\beta ,L)\) and \(\delta \le \frac{h^{\beta +1/2}}{2}\left( 1-\frac{L_0}{L}\right) \). Thus, it holds \(\vert f_\nu (x)-f_\nu (y)\vert \le L\vert x-y\vert ^\beta \) for all \(\nu \in \mathcal V_N\), \(x\in B_l\) and \(y\in B_k\). The case \(x\in B^c\) and \(y\in B_k\) can be handled in a similar way, which ends the proof of ii).

We now prove iii). It holds

$$\begin{aligned} \int _\mathbb R\vert f_\nu -f_0\vert =\int _\mathbb R\left| \delta \sum _{j=1}^N\nu _j\psi _j(x) \right| \textrm{d}x = \sum _{k=1}^N\int _{B_k}\left| \delta \nu _k \psi _k(x) \right| \textrm{d}x = \delta N\sqrt{h} \int _{-1}^1\vert \psi \vert . \end{aligned}$$

\(\square \)
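
The construction of Lemma B.1 is easy to visualise numerically. The sketch below is an illustration under assumed choices that are not part of the lemma: \(f_0\) is the standard normal density, \(B=[-1,1]\) is split into \(N=10\) bins of half-width \(h=0.1\), \(\psi (t)=\sin (\pi t)\) (which indeed satisfies \(\int \psi =0\), \(\int \psi ^2=1\) and vanishes at \(\pm 1\)), and the constants \(L\), \(L_0\) are set by hand, with \(L\ge \pi \) so that \(\psi \in H(1,L)\).

```python
import numpy as np

# Illustration of the lower-bound construction of Lemma B.1 under assumed
# choices: f_0 = standard normal density, B = [-1, 1] split into N = 10 bins of
# half-width h = 0.1, psi(t) = sin(pi*t) (int psi = 0, int psi^2 = 1,
# psi(+-1) = 0 and ||psi||_inf = 1), and hand-picked constants L >= pi, L_0.
rng = np.random.default_rng(2)
f0 = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
psi = lambda t: np.where(np.abs(t) <= 1, np.sin(np.pi * t), 0.0)

h, N = 0.1, 10
x_centers = -1 + h + 2 * h * np.arange(N)        # B_j = [x_j - h, x_j + h]
C0, L, L0, beta = f0(1.0), 4.0, 0.25, 1.0        # assumed constants
delta = np.sqrt(h) * min(C0 / 1.0, 0.5 * (1 - L0 / L) * h**beta)

nu = rng.choice([-1.0, 1.0], size=N)
f_nu = lambda x: f0(x) + delta * sum(
    nu[j] * psi((x - x_centers[j]) / h) / np.sqrt(h) for j in range(N))

x = np.linspace(-8.0, 8.0, 400_001)
dx = x[1] - x[0]
vals = f_nu(x)
print("min f_nu           :", vals.min())                       # >= 0  (i)
print("int f_nu           :", vals.sum() * dx)                  # ~ 1   (i)
print("||f_nu - f_0||_1   :", np.abs(vals - f0(x)).sum() * dx)  # (iii)
print("C_1 delta N sqrt(h):", (4 / np.pi) * delta * N * np.sqrt(h))
```

The last two printed values should agree up to the discretisation error of the grid, in accordance with (iii), since here \(C_1=\int _{-1}^1\vert \sin (\pi t)\vert \textrm{d}t=4/\pi \).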

For a privacy mechanism \(Q\in \mathcal Q_\alpha \), we denote by \(Q_{f_0}^n\) (respectively \(Q_{f_\nu }^n\)) the distribution of \((Z_1,\ldots ,Z_n)\) when the \(X_i\)’s are distributed according to \(f_0\) (respectively to \(f_\nu \)). We set \(\bar{Q}^n=1/2^N\sum _{\nu \in \mathcal V_N} Q_{f_\nu }^n\). If \(\delta \) is chosen such that \(\delta \le \sqrt{h}\cdot \min \left\{ \frac{C_0(B)}{\Vert \psi \Vert _\infty } , \frac{1}{2}\left( 1-\frac{L_0}{L} \right) h^\beta \right\} \), setting \(\rho ^\star =C_1\delta N\sqrt{h}\), we deduce from the above lemma that if

$$\begin{aligned} \text {KL}(Q_{f_0}^n,\bar{Q}^n) \le 2(1-\gamma )^2 \text { for all } Q\in \mathcal Q_\alpha , \end{aligned}$$
(33)

then it holds

$$\begin{aligned} \inf _{Q\in \mathcal Q_\alpha }\inf _{\phi \in \Phi _Q} \sup _{f\in H_1(\rho ^\star )}\left\{ \mathbb P_{Q_{f_0}^n}(\phi =1)+ \mathbb P_{Q_{f}^n}(\phi =0) \right\} \ge \gamma , \end{aligned}$$

where \(H_1(\rho ^\star ):= \{ f \in H(\beta , L) : f\ge 0, \int f=1, \Vert f-f_0\Vert _1 \ge \rho ^\star \}\), and consequently \(\mathcal E_{n,\alpha }(f_0,\gamma )\ge \rho ^\star \). Indeed, if (33) holds, then we have

$$\begin{aligned}&\inf _{Q\in \mathcal Q_\alpha }\inf _{\phi \in \Phi _Q} \sup _{f\in H_1(\rho ^\star )}\left\{ \mathbb P_{Q_{f_0}^n}(\phi =1)+ \mathbb P_{Q_{f}^n}(\phi =0) \right\} \\&\qquad \ge \inf _{Q\in \mathcal Q_\alpha }\inf _{\phi \in \Phi _Q} \left( \mathbb P_{Q_{f_0}^n}(\phi =1)+ \frac{1}{2^N}\sum _{\nu \in \mathcal V_N}\mathbb P_{Q_{f_\nu }^n}(\phi =0) \right) \\&\qquad = \inf _{Q\in \mathcal Q_\alpha }\inf _{\phi \in \Phi _Q} \left( 1-\left[ \mathbb P_{Q_{f_0}^n}(\phi =0)-\mathbb P_{\bar{Q}^n}(\phi =0) \right] \right) \\&\qquad \ge \inf _{Q\in \mathcal Q_\alpha }\left[ 1-\text {TV}(Q_{f_0}^n, \bar{Q}^n)\right] \\&\qquad \ge \inf _{Q\in \mathcal Q_\alpha }\left[ 1-\sqrt{\frac{\text {KL}(Q_{f_0}^n, \bar{Q}^n)}{2}}\right] \\&\qquad \ge \gamma , \end{aligned}$$

where the second to last inequality follows from Pinsker’s inequality. We now prove that (33) holds under an extra assumption on \(\delta \). Fix a privacy mechanism \(Q\in \mathcal Q_\alpha \). The conditional distribution of \(Z_i\) given \(Z_1,\ldots ,Z_{i-1}\) when \(X_i\) is distributed according to \(f_0\) or \(f_\nu \) will be denoted by \(\mathcal L^{(0)}_{Z_i\mid z_{1:(i-1)}}(dz_i)=\int _\mathbb RQ_i(dz_i\mid x_i,z_{1:(i-1)})f_0(x_i)dx_i\) and \(\mathcal L^{(\nu )}_{Z_i\mid z_{1:(i-1)}}(dz_i)=\int _\mathbb RQ_i(dz_i\mid x_i,z_{1:(i-1)})f_\nu (x_i)dx_i\) respectively. The joint distribution of \(Z_1,\ldots ,Z_i\) when \(X_1,\ldots ,X_i\) are i.i.d. from \(f_0\) will be denoted by

$$\begin{aligned} \mathcal L^{(0)}_{Z_1,\ldots ,Z_i}(dz_{1:i})=\mathcal L^{(0)}_{Z_i\mid z_{1:(i-1)}}(dz_i)\cdots \mathcal L^{(0)}_{Z_2\mid z_1}(dz_2)\mathcal L^{(0)}_{Z_1}(dz_1). \end{aligned}$$

The convexity and tensorization of the Kullback-Leibler divergence give

$$\begin{aligned} \text {KL}(Q_{f_0}^n,\bar{Q}^n)&\le \frac{1}{2^N}\sum _{\nu \in \mathcal V_N}\text {KL}(Q_{f_0}^n,Q_{f_\nu }^n) \\&= \frac{1}{2^N}\sum _{\nu \in \mathcal V_N}\sum _{i=1}^n \int _{\mathcal Z^{i-1}}\text {KL}\left( \mathcal L^{(0)}_{Z_i\mid z_{1:(i-1)}} ,\mathcal L^{(\nu )}_{Z_i\mid z_{1:(i-1)}} \right) \mathcal L^{(0)}_{Z_1,\ldots ,Z_{i-1}}(d z_{1:(i-1)}) . \end{aligned}$$

According to Lemma B.3 in [9], there exists a probability measure \(\mu _{z_{1:(i-1)}}\) on \(\mathcal Z\) and a family of \(\mu _{z_{1:(i-1)}}\)-densities \(z_i\mapsto q_i(z_i \mid x_i,z_{1:(i-1)})\) of \(Q_i(\cdot \mid x_i,z_{1:(i-1)})\), \(x_i\in \mathbb R\), such that

$$\begin{aligned} e^{-\alpha }\le q_i(z_i \mid x_i,z_{1:(i-1)})\le e^{\alpha }, \quad \forall z_i\in \mathcal Z, \forall x_i\in \mathbb R. \end{aligned}$$

We can thus write \(\mathcal L^{(0)}_{Z_i\mid z_{1:(i-1)}}(dz_i)=m^{(0)}_i(z_i\mid z_{1:(i-1)}) d\mu _{z_{1:(i-1)}}(z_i)\), and \(\mathcal L^{(\nu )}_{Z_i\mid z_{1:(i-1)}}(dz_i)=m^{(\nu )}_i(z_i\mid z_{1:(i-1)}) d\mu _{z_{1:(i-1)}}(z_i)\) with \(m^{(0)}_i(z_i\mid z_{1:(i-1)})=\int _\mathbb Rq_i(z_i\mid x_i,z_{1:(i-1)})f_0(x_i)\textrm{d}x_i\) and \(m^{(\nu )}_i(z_i\mid z_{1:(i-1)})=\int _\mathbb Rq_i(z_i\mid x_i,z_{1:(i-1)})f_\nu (x_i)\textrm{d}x_i\). Bounding the Kullback-Leibler divergence by the \(\chi ^2\)-divergence, we have

$$\begin{aligned}&\text {KL}\left( \mathcal L^{(0)}_{Z_i\mid z_{1:(i-1)}} ,\mathcal L^{(\nu )}_{Z_i\mid z_{1:(i-1)}} \right) \\&\le \int _{\mathcal Z} \left( \frac{d\mathcal L^{(0)}_{Z_i\mid z_{1:(i-1)}}}{d\mathcal L^{(\nu )}_{Z_i\mid z_{1:(i-1)}}}-1 \right) ^2\mathcal L^{(\nu )}_{Z_i\mid z_{1:(i-1)}}(dz_i) \\&=\int _{\mathcal Z}\left( \frac{m_{i}^{(0)}(z_i\mid z_{1:i-1})-m_{i}^{(\nu )}(z_i\mid z_{1:i-1}) }{m_{i}^{(\nu )}(z_i\mid z_{1:i-1})} \right) ^2 m_{i}^{(\nu )}(z_i\mid z_{1:i-1})d\mu _{z_{1:(i-1)}}(z_i) \\&= \int _{\mathcal Z} \left( \frac{\int _\mathbb Rq_{i}(z_i\mid x, z_{1:i-1})\left( f_0(x)-f_\nu (x) \right) dx}{m_{i}^{(\nu )}(z_i\mid z_{1:i-1})}\right) ^2 m_{i}^{(\nu )}(z_i\mid z_{1:i-1})d\mu _{z_{1:(i-1)}}(z_i) \\&=\int _{\mathcal Z}\left[ \int _{\mathbb R}\left( \frac{q_{i}(z_i\mid x, z_{1:i-1}) }{m_{i}^{(\nu )}(z_i\mid z_{1:i-1})}-e^{-2\alpha }\right) \left( f_0(x)-f_\nu (x) \right) dx \right] ^2 m_{i}^{(\nu )}(z_i\mid z_{1:i-1})d\mu _{z_{1:(i-1)}}(z_i), \\ \end{aligned}$$

since \(\int _\mathbb R(f_0-f_\nu )=0\). Recall that \(q_i\) satisfies \(e^{-\alpha }\le q_i(z_i \mid x,z_{1:(i-1)})\le e^{\alpha }\). Thus, we have \(e^\alpha =\int e^{\alpha } f_\nu \ge m_i^{(\nu )}(z_i\mid z_{1:(i-1)})\ge e^{-\alpha }\int f_\nu =e^{-\alpha }\), and therefore

$$\begin{aligned} 0\le g_{i,z_{1:i}}(x):=\frac{q_i(z_i\mid x,z_{1:(i-1)} )}{m^{(\nu )}_i(z_i\mid z_{1:(i-1)})}-e^{-2\alpha }\le z_\alpha =e^{2\alpha }-e^{-2\alpha }. \end{aligned}$$

Thus,

$$\begin{aligned}&\frac{1}{2^N}\sum _{\nu \in \mathcal V_N}\left[ \int _{\mathbb R}\left( \frac{q_{i}(z_i\mid x, z_{1:i-1}) }{m_{i}^{(\nu )}(z_i\mid z_{1:i-1})}-e^{-2\alpha }\right) \left( f_0(x)-f_\nu (x) \right) dx \right] ^2 m_{i}^{(\nu )}(z_i\mid z_{1:i-1})\\&\le e^\alpha \delta ^2\frac{1}{2^N}\sum _{\nu \in \mathcal V_N}\left[ \sum _{k=1}^N\nu _k\int _{\mathbb R}g_{i,z_{1:i}}(x) \psi _k(x)dx \right] ^2\\&= e\delta ^2\sum _{k=1}^N\left[ \int _{\mathbb R}g_{i,z_{1:i}}(x) \psi _k(x)dx \right] ^2\\&\le e\delta ^2z_\alpha ^2\sum _{k=1}^N \Vert \psi _k\Vert _1^2 \le e\delta ^2z_\alpha ^2NhC_1^2 = \frac{e}{2}C_1^2\delta ^2z_\alpha ^2\vert B\vert , \end{aligned}$$

where we recall that \(C_1=\int \vert \psi \vert \). We thus obtain

$$\begin{aligned} \text {KL}(Q_{f_0}^n,\bar{Q}^n)\le \frac{e}{2}C_1^2\delta ^2nz_\alpha ^2\vert B\vert , \end{aligned}$$

and (33) holds as soon as

$$\begin{aligned} \delta \le \sqrt{\frac{4(1-\gamma )^2}{eC_1^2nz_\alpha ^2\vert B\vert }}. \end{aligned}$$

Finally, taking \(\delta =\min \left\{ \sqrt{h}\cdot \min \left\{ \frac{C_0(B)}{\Vert \psi \Vert _\infty } , \frac{1}{2}\left( 1-\frac{L_0}{L} \right) h^\beta \right\} , \sqrt{\frac{4(1-\gamma )^2}{eC_1^2nz_\alpha ^2\vert B\vert }} \right\} \), we obtain

$$\begin{aligned} \mathcal E_{n,\alpha }(f_0,\gamma )&\ge C(\psi ,\gamma )\\ {}&\qquad \min \left\{ \vert B\vert \min \left\{ \frac{C_0(B)}{\Vert \psi \Vert _\infty } , \frac{1}{2}\left( 1-\frac{L_0}{L} \right) h^\beta \right\} , \frac{\sqrt{\vert B\vert }}{\sqrt{h}\sqrt{n z_\alpha ^2}}\right\} . \end{aligned}$$

If B is chosen such that \(C_0(B)=\min \{ f_0(x), x\in B\}\ge Ch^\beta \), then the bound becomes

$$\begin{aligned} \mathcal E_{n,\alpha }(f_0,\gamma )\ge C(\psi ,\gamma ,L,L_0)\min \left\{ \vert B\vert h^\beta , \frac{\sqrt{\vert B\vert }}{\sqrt{h}\sqrt{n z_\alpha ^2}}\right\} , \end{aligned}$$

and the choice \(h\asymp \vert B \vert ^{-1/(2\beta +1)}(nz_\alpha ^2)^{-1/(2\beta +1)}\) yields

$$\begin{aligned} \mathcal E_{n,\alpha }(f_0,\gamma )\ge C(\psi ,\gamma ,L,L_0)\vert B \vert ^{\frac{\beta +1}{2\beta +1}}(nz_\alpha ^2)^{-\frac{\beta }{2\beta +1}}. \end{aligned}$$

Note that with this choice of h, the condition \(C_0(B)\ge Ch^\beta \) becomes \(\vert B \vert ^{\beta /(2\beta +1)}C_0(B) \ge C(nz_\alpha ^2)^{-\beta /(2\beta +1)}\).
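
Since \(z_\alpha =e^{2\alpha }-e^{-2\alpha }=2\sinh (2\alpha )\) satisfies \(4\alpha \le z_\alpha \le (e^2-e^{-2})\alpha \) for \(\alpha \in (0,1]\), the lower bound just obtained matches, up to constants, the upper bound of Theorem 4.3, in which \(n\alpha ^2\) appears in place of \(nz_\alpha ^2\). A short numerical illustration:

```python
import numpy as np

# z_alpha = e^{2 alpha} - e^{-2 alpha} = 2 sinh(2 alpha), so 4 alpha <= z_alpha
# <= (e^2 - e^{-2}) alpha on (0, 1] and n z_alpha^2 is of the same order as
# n alpha^2: the lower bound matches the upper bound of Theorem 4.3 up to
# constants.
for alpha in [1.0, 0.5, 0.1, 0.01]:
    z = np.exp(2 * alpha) - np.exp(-2 * alpha)
    print(f"alpha={alpha:5.2f}  z_alpha={z:8.5f}  z_alpha/alpha={z / alpha:6.3f}")
```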

Proofs of Sect. 5

1.1 Example 5.2

We first prove the result for the non-interactive case. Take

$$\begin{aligned} B=[a,T], \quad \text {with} \quad T=(n\alpha ^2)^{\frac{2\beta }{k(4\beta +3)+3\beta +3}}. \end{aligned}$$

Note that \(T>a\) for n large enough. Theorem 3.4 gives

$$\begin{aligned} \mathcal E_{n,\alpha }^\text {NI}(f_0,\gamma )&\lesssim (T-a)^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + \left( \frac{a}{T}\right) ^k\\&\lesssim T^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + T^{-k} \\&= (n\alpha ^2)^{-\frac{2k\beta }{k(4\beta +3)+3\beta +3}}. \end{aligned}$$

To obtain the lower bound, we first check that condition (7) in Theorem 3.5 is satisfied. Since \(T\rightarrow +\infty \) as \(n\rightarrow \infty \), it holds for n large enough

$$\begin{aligned} \vert B \vert ^\frac{\beta }{4\beta +3}C_0(B)&=(T-a)^\frac{\beta }{4\beta +3}\frac{ka^k}{T^{k+1}} \\&=ka^kT^{\frac{\beta }{4\beta +3}-(k+1)}\left( 1-\frac{a}{T}\right) ^\frac{\beta }{4\beta +3} \\&\gtrsim T^\frac{\beta -(k+1)(4\beta +3)}{4\beta +3}\\&\gtrsim C(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Condition (7) is thus satisfied, and Theorem 3.5 yields for n large enough

$$\begin{aligned} \mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )&\gtrsim \left[ \log \left( C (T-a)^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}(T-a)^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\\&\gtrsim \left[ \log \left( C T^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}T^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\\&\gtrsim \left[ \log \left( C (n\alpha ^2)^{\frac{4\beta +4}{4\beta +3}\cdot \frac{2\beta }{k(4\beta +3)+3\beta +3}+\frac{2}{4\beta +3}} \right) \right] ^{-1}(n\alpha ^2)^{-\frac{2k\beta }{k(4\beta +3)+3\beta +3}}. \end{aligned}$$

The proof in the interactive scenario follows the same lines, except that T should be taken as

$$\begin{aligned} T= (n\alpha ^2)^{\frac{\beta }{k(2\beta +1)+\beta +1}}. \end{aligned}$$
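
The exponent algebra behind these balancing choices of T is mechanical and can be verified exactly with rational arithmetic; the sketch below does so for the non-interactive choice above, for a few arbitrary pairs \((\beta ,k)\).

```python
from fractions import Fraction

# Exact check, for a few arbitrary (beta, k), that the choice
# T = (n alpha^2)^t with t = 2*beta/(k*(4*beta+3) + 3*beta + 3) balances the
# two terms T^{(3b+3)/(4b+3)} (n alpha^2)^{-2b/(4b+3)} and T^{-k} of the upper
# bound, both being (n alpha^2)^{-2*k*beta/(k*(4*beta+3) + 3*beta + 3)}.
for beta, k in [(Fraction(1, 2), 1), (Fraction(1), 2), (Fraction(3, 4), 5)]:
    t = 2 * beta / (k * (4 * beta + 3) + 3 * beta + 3)     # exponent of T
    first = t * (3 * beta + 3) / (4 * beta + 3) - 2 * beta / (4 * beta + 3)
    second = -k * t
    assert first == second == -2 * k * beta / (k * (4 * beta + 3) + 3 * beta + 3)
    print(f"beta={beta}, k={k}: common exponent {first}")
```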

1.2 Example 5.3

We first prove the result for the non-interactive case. Take

$$\begin{aligned} B=[0,T], \quad \text {with} \quad T=\frac{1}{\lambda }\cdot \frac{2\beta }{4\beta +3}\log (n\alpha ^2). \end{aligned}$$

Theorem 3.4 gives

$$\begin{aligned} \mathcal E_{n,\alpha }^\text {NI}(f_0,\gamma )&\lesssim T^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + \exp (-\lambda T)\\&\lesssim \left( \log (n\alpha ^2)\right) ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + (n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} \\&\lesssim \left( \log (n\alpha ^2)\right) ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Now, observe that

$$\begin{aligned} \vert B \vert ^\frac{\beta }{4\beta +3}C_0(B) =T^\frac{\beta }{4\beta +3}\cdot \lambda \exp (-\lambda T) =\lambda T^\frac{\beta }{4\beta +3}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\gtrsim (n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Thus, condition (7) is satisfied and Theorem 3.5 yields

$$\begin{aligned} \mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )&\gtrsim \left[ \log \left( C T^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}T^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\\&\gtrsim \left[ \log \left( C \left( \log (n\alpha ^2)\right) ^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}\left( \log (n\alpha ^2)\right) ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

The proof in the interactive scenario follows the same lines, except that T should be taken as

$$\begin{aligned} T=\frac{1}{\lambda }\cdot \frac{\beta }{2\beta +1}\log (n\alpha ^2). \end{aligned}$$

1.3 Example 5.4

We first prove the result for the non-interactive case. Take

$$\begin{aligned} B=[-T,T], \quad \text {with} \quad T= \sqrt{\frac{4\beta }{4\beta +3}\log (n\alpha ^2)}. \end{aligned}$$

Theorem 3.4 gives

$$\begin{aligned} \mathcal E_{n,\alpha }^\text {NI}(f_0,\gamma )&\lesssim (2T)^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + \frac{2}{\sqrt{2\pi }}\int _T^{+\infty }e^{-x^2/2}dx\\&\lesssim T^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + \frac{1}{T}\exp \left( -\frac{T^2}{2} \right) \\&\lesssim \left( \log (n\alpha ^2)\right) ^{\frac{3\beta +3}{2(4\beta +3)}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + (n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} \\&\lesssim \left( \log (n\alpha ^2)\right) ^{\frac{3\beta +3}{2(4\beta +3)}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Now, observe that

$$\begin{aligned} \vert B \vert ^\frac{\beta }{4\beta +3}C_0(B) =(2T)^\frac{\beta }{4\beta +3}\cdot \frac{1}{\sqrt{2\pi }}\exp \left( -\frac{T^2}{2} \right) \gtrsim (n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Thus, condition (7) is satisfied and Theorem 3.5 yields

$$\begin{aligned} \mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )&\gtrsim \left[ \log \left( C (2T)^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}(2T)^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\\&\gtrsim \left[ \log \left( C \left( \log (n\alpha ^2)\right) ^{\frac{4\beta +4}{2(4\beta +3)}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}\left( \log (n\alpha ^2)\right) ^{\frac{3\beta +3}{2(4\beta +3)}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

The proof in the interactive scenario follows the same lines, except that T should be taken as

$$\begin{aligned} T= \sqrt{\frac{2\beta }{2\beta +1}\log (n\alpha ^2)}. \end{aligned}$$
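
As in the previous examples, the non-interactive choice of T can be checked numerically: by construction \(\exp (-T^2/2)=(n\alpha ^2)^{-2\beta /(4\beta +3)}\), so the Gaussian tail term is of the same order as the main term. A minimal sketch, with arbitrary values of \(\beta \) and \(n\alpha ^2\):

```python
import numpy as np

# With T = sqrt( (4 beta/(4 beta + 3)) log(n alpha^2) ), one has exactly
# exp(-T^2/2) = (n alpha^2)^{-2 beta/(4 beta + 3)}, so the Gaussian tail term
# is of the same order as the main term.  (beta, n alpha^2) are arbitrary here.
for beta, na2 in [(0.5, 1e4), (1.0, 1e6)]:
    T = np.sqrt(4 * beta / (4 * beta + 3) * np.log(na2))
    print(f"beta={beta}: exp(-T^2/2) = {np.exp(-T**2 / 2):.3e}, "
          f"(n alpha^2)^(-2b/(4b+3)) = {na2 ** (-2 * beta / (4 * beta + 3)):.3e}")
```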

1.4 Example 5.5

We first prove the result for the non-interactive case. Take

$$\begin{aligned} B=[-T,T], \quad \text {with} \quad T= (n\alpha ^2)^\frac{2\beta }{7\beta +6}. \end{aligned}$$

Theorem 3.4 gives

$$\begin{aligned} \mathcal E_{n,\alpha }^\text {NI}(f_0,\gamma )&\lesssim (2T)^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + \frac{2}{\pi a}\int _{T}^{+\infty }\frac{a^2}{a^2+x^2}dx\\&\lesssim T^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + \arctan \left( \frac{a}{T}\right) . \end{aligned}$$

Since \(T\rightarrow \infty \) as \(n\rightarrow \infty \), we have \(\arctan (a/T)\sim _{n\rightarrow \infty } a/T\), and thus \(\arctan (a/T)\le 2(a/T)\) for n large enough. This gives, for n large enough,

$$\begin{aligned} \mathcal E_{n,\alpha }^\text {NI}(f_0,\gamma )\lesssim T^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + \frac{1}{T}= (n\alpha ^2)^{-\frac{2\beta }{7\beta +6}}. \end{aligned}$$

Now, observe that for n large enough it holds

$$\begin{aligned} \vert B \vert ^\frac{\beta }{4\beta +3}C_0(B) =(2T)^\frac{\beta }{4\beta +3}\cdot \frac{1}{\pi a}\frac{a^2}{T^2+a^2} \gtrsim T^\frac{\beta }{4\beta +3}\cdot \frac{1}{T^2} = (n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Thus, condition (7) is satisfied and Theorem 3.5 yields

$$\begin{aligned} \mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )&\gtrsim \left[ \log \left( C (2T)^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}(2T)^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\\&\gtrsim \left[ \log \left( C (n\alpha ^2)^{\frac{4\beta +4}{4\beta +3}\cdot \frac{2\beta }{7\beta +6} + \frac{2}{4\beta +3} } \right) \right] ^{-1}(n\alpha ^2)^{-\frac{2\beta }{7\beta +6}}. \end{aligned}$$

The proof in the interactive scenario follows the same lines, except that T should be taken as

$$\begin{aligned} T=(n\alpha ^2)^\frac{\beta }{3\beta +2}. \end{aligned}$$

1.5 Example 5.6

We first prove the result for the non-interactive case. The upper bound is straightforward taking \(B=[0, 2/\sqrt{L_0}]\). For the lower bound, take

$$\begin{aligned} B=\left[ T , \frac{2}{\sqrt{L_0}}-T\right] , \quad \text {with} \quad T= (n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Note that for n large enough it holds \(T<1/(2\sqrt{L_0})\) and we thus have

$$\begin{aligned} \vert B \vert ^\frac{\beta }{4\beta +3}C_0(B) =\left( \frac{2}{\sqrt{L_0}}-2T \right) ^\frac{\beta }{4\beta +3}\cdot L_0T \gtrsim T= (n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Thus, condition (7) is satisfied and Theorem 3.5 yields

$$\begin{aligned} \mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )&\gtrsim \left[ \log \left( C \left( \frac{2}{\sqrt{L_0}}-2T \right) ^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}\left( \frac{2}{\sqrt{L_0}}-2T \right) ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\\&\gtrsim \left[ \log \left( C (n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

The proof in the interactive scenario follows the same lines, except that T in the lower bound should be taken as

$$\begin{aligned} T=(n\alpha ^2)^{-\frac{\beta }{2\beta +1}}. \end{aligned}$$

1.6 Example 5.7

Let \(a\ge 1\), \(b\ge 1\) with \(a>1\) or \(b>1\). We first prove the result for the non-interactive case. The upper bound is straightforward taking \(B=[0, 1]\). For the lower bound, we need to distinguish different cases.

Case 1: \(\underline{a > 1,\, b=1}\). In this case \(f_0\) is strictly increasing on [0, 1] with \(f_0(0)=0\). To ensure that \(f_0\) is bounded from below on B by a strictly positive quantity, we take B of the form \(B=[T_1,1]\) with \(0<T_1<1\). We choose

$$\begin{aligned} T_1= (n\alpha ^2)^{-\frac{2\beta }{(a-1)(4\beta +3)}}. \end{aligned}$$

Observe that for n large enough we have

$$\begin{aligned} \vert B \vert ^\frac{\beta }{4\beta +3}C_0(B) =\left[ 1-T_1 \right] ^\frac{\beta }{4\beta +3}\cdot \frac{1}{B(a,1)} T_1^{a-1} \gtrsim T_1^{a-1}=(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Thus, condition (7) is satisfied and Theorem 3.5 yields for n large enough

$$\begin{aligned} \mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )&\gtrsim \left[ \log \left( C \left[ 1-T_1 \right] ^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}\left[ 1-T_1 \right] ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\\&\gtrsim \left[ \log \left( C(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Case 2: \(\underline{a= 1,\, b> 1}\). In this case \(f_0\) is strictly decreasing on [0, 1] with \(f_0(1)=0\). To ensure that \(f_0\) is bounded from below on B by a strictly positive quantity, we take B of the form \(B=[0,1-T_2]\) with \(0<T_2<1\). We choose

$$\begin{aligned} T_2= (n\alpha ^2)^{-\frac{2\beta }{(b-1)(4\beta +3)}}. \end{aligned}$$

Observe that for n large enough we have

$$\begin{aligned} \vert B \vert ^\frac{\beta }{4\beta +3}C_0(B) =\left[ 1-T_2 \right] ^\frac{\beta }{4\beta +3}\cdot \frac{1}{B(1,b)} T_2^{b-1} \gtrsim T_2^{b-1}=(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Thus, condition (7) is satisfied and Theorem 3.5 yields for n large enough

$$\begin{aligned} \mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )&\gtrsim \left[ \log \left( C \left[ 1-T_2 \right] ^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}\left[ 1-T_2 \right] ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\\&\gtrsim \left[ \log \left( C(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Case 3: \(\underline{a> 1,\, b> 1}\). In this case, \(f_0\) is non-decreasing on \([0,(a-1)/(a+b-2)]\), non-increasing on \([(a-1)/(a+b-2),1]\), and \(f_0(0)=f_0(1)=0\). To ensure that \(f_0\) is bounded from below on B by a strictly positive quantity, we take B of the form \(B=[T_3,1-T_4]\) and we choose

$$\begin{aligned} T_3= (n\alpha ^2)^{-\frac{2\beta }{(a-1)(4\beta +3)}}, \quad T_4= (n\alpha ^2)^{-\frac{2\beta }{(b-1)(4\beta +3)}}. \end{aligned}$$

Observe that for n large enough it holds

$$\begin{aligned} 0<T_3<\frac{a-1}{a+b-2}<1-T_4<1. \end{aligned}$$

Moreover, for n large enough we have

$$\begin{aligned} \vert B \vert ^\frac{\beta }{4\beta +3}C_0(B)&=\left[ 1-(T_3+T_4) \right] ^\frac{\beta }{4\beta +3}\cdot \frac{1}{B(a,b)}\min \left\{ T_3^{a-1}(1-T_3)^{b-1}, (1-T_4)^{a-1}T_4^{b-1} \right\} \\&\gtrsim \min \left\{ T_3^{a-1},T_4^{b-1} \right\} \\&\gtrsim (n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

Thus, condition (7) is satisfied and Theorem 3.5 yields for n large enough

$$\begin{aligned} \mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )&\gtrsim \left[ \log \left( C \left[ 1-(T_3+T_4) \right] ^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}\left[ 1-(T_3+T_4) \right] ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}\\&\gtrsim \left[ \log \left( C(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}. \end{aligned}$$

The proof in the interactive scenario follows the same lines, except that \(T_1,\ldots ,T_4\) should be taken as

$$\begin{aligned} T_1= T_3=(n\alpha ^2)^{-\frac{\beta }{(a-1)(2\beta +1)}}, \quad T_2=T_4= (n\alpha ^2)^{-\frac{\beta }{(b-1)(2\beta +1)}}. \end{aligned}$$

1.7 Example 5.8

We prove the result for the non-interactive case. Take

$$\begin{aligned}B=B_{n,\alpha }\in \arg \inf _{B \text { compact set}} \left\{ \int _{\overline{B}}{ f_0} \ge \vert B \vert ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} + \frac{1}{\sqrt{n\alpha ^2}} \text { and } \inf _B f_0 \ge \sup _{\overline{B}} f_0 \right\} . \end{aligned}$$

It holds \(B=B_{n,\alpha }=[0,a_*]\) with

$$\begin{aligned}a_*=\sup \left\{ a : \frac{(\log 2)^{A}}{(\log (2+a))^{A}} \ge a^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}+\frac{1}{\sqrt{n\alpha ^2}}\right\} ,\end{aligned}$$

and Theorem 3.4 thus gives

$$\begin{aligned}\mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )\lesssim a_*^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}+\frac{1}{\sqrt{n\alpha ^2}}\lesssim a_*^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}},\end{aligned}$$

where the last inequality follows from \(a_*\ge 1\ge (n\alpha ^2)^{-{\frac{1}{2\beta +2}}}.\)

Inspecting the proof of Theorem 3.5, we see that the lower bound can be rewritten

$$\begin{aligned} \mathcal E^{\text {NI}}_{n,\alpha }(f_0,\gamma )\gtrsim \left[ \log \left( C \vert B\vert ^{\frac{4\beta +4}{4\beta +3}}(n \alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1} \min \left\{ \vert B \vert C_0(B); \vert B \vert ^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}} \right\} . \end{aligned}$$

Yet, for \(B=B_{n,\alpha }=[0,a_*]\) it holds

$$\begin{aligned}\vert B_{n,\alpha } \vert C_0(B_{n,\alpha })=\frac{A(\log 2)^Aa_*}{(a_*+2)(\log (2+a_*))^{A+1}}\gtrsim \frac{1}{\log (a_*)}\times a_*^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}, \end{aligned}$$

yielding

$$\begin{aligned}\mathcal E_{n,\alpha }^\text {NI}(f_0,\gamma )\gtrsim \left[ \log \left( C a_*^{\frac{4\beta +4}{4\beta +3}}(n\alpha ^2)^\frac{2}{4\beta +3} \right) \right] ^{-1}\left[ \log (a_*) \right] ^{-1}a_*^{\frac{3\beta +3}{4\beta +3}}(n\alpha ^2)^{-\frac{2\beta }{4\beta +3}}.\end{aligned}$$
