
Comparing regression curves: an L1-point of view


Abstract

In this paper, we compare two regression curves by measuring their difference via the area between them, that is, by their \(L^1\)-distance. We develop asymptotic confidence intervals for this measure and statistical tests to investigate the similarity/equivalence of the two curves. Bootstrap methodology specifically designed for equivalence testing is developed to obtain procedures with good finite sample properties, and its consistency is rigorously proved. The finite sample properties are investigated by means of a small simulation study.


References

• Aitchison, J. (1964). Confidence-region tests. Journal of the Royal Statistical Society, Series B, 26, 462–476.

• Bauer, H. (2011). Measure and integration theory, Vol. 26. New York: Walter de Gruyter.

• Berger, R. L. (1982). Multiparameter hypothesis testing and acceptance sampling. Technometrics, 24, 295–300.

• Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 30(7), 1145–1159.

• Chow, S.-C., Liu, P.-J. (1992). Design and analysis of bioavailability and bioequivalence studies. New York: Marcel Dekker.

• Cox, T., Czanner, G. (2016). A practical divergence measure for survival distributions that can be estimated from Kaplan–Meier curves. Statistics in Medicine, 35, 66.

• Dette, H., Möllenhoff, K., Volgushev, S., Bretz, F. (2018). Equivalence of regression curves. Journal of the American Statistical Association, 113(522), 711–729.

• EMA. (2014). Guideline on the investigation of bioequivalence. European Medicines Agency. Available at http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2010/01/WC500070039.pdf

• Fang, Z., Santos, A. (2019). Inference on directionally differentiable functions. The Review of Economic Studies, 86(1), 377–412.

• Gsteiger, S., Bretz, F., Liu, W. (2011). Simultaneous confidence bands for nonlinear regression models with application to population pharmacokinetic analyses. Journal of Biopharmaceutical Statistics, 21(4), 708–725.

• Hauschke, D., Steinijans, V., Pigeot, I. (2007). Bioequivalence studies in drug development: Methods and applications. Statistics in Practice. New York: Wiley.

• Heller, G., Seshan, V. E., Moskowitz, C. S., Gönen, M. (2016). Inference for the difference in the area under the ROC curve derived from nested binary regression models. Biostatistics, 18(2), 260–274.

• Jachno, K., Heritier, S., Wolfe, R. (2019). Are non-constant rates and non-proportional treatment effects accounted for in the design and analysis of randomised controlled trials? A review of current practice. BMC Medical Research Methodology, 19(1), 1–9.

• Liu, W., Hayter, A. J., Wynn, H. P. (2007). Operability region equivalence: Simultaneous confidence bands for the equivalence of two regression models over restricted regions. Biometrical Journal, 49(1), 144–150.

• Liu, W., Bretz, F., Hayter, A. J., Wynn, H. P. (2009). Assessing non-superiority, non-inferiority or equivalence when comparing two regression models over a restricted covariate region. Biometrics, 65(4), 1279–1287.

• McCaw, Z. R., Yin, G., Wei, L.-J. (2019). Using the restricted mean survival time difference as an alternative to the hazard ratio for analyzing clinical cardiovascular studies. Circulation, 140(17), 1366–1368.

• Möllenhoff, K., Dette, H., Bretz, F. (2022). Testing for similarity of binary efficacy-toxicity responses. Biostatistics, 23(3), 949–966.

• Ostrovski, V. (2017). Testing equivalence of multinomial distributions. Statistics & Probability Letters, 124, 77–82.

• Pepe, M. S., Kerr, K. F., Longton, G., Wang, Z. (2013). Testing for improvement in prediction model performance. Statistics in Medicine, 32(9), 1467–1482.

• Royston, P., Parmar, M. K. (2013). Restricted mean survival time: An alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Medical Research Methodology, 13(1), 152.

• Shapiro, A. (1990). On concepts of directional differentiability. Journal of Optimization Theory and Applications, 66(3), 477–487.

• Shapiro, A. (1991). Asymptotic analysis of stochastic programs. Annals of Operations Research, 30(1), 169–186.

• U.S. Food and Drug Administration. (2003). Guidance for industry: Bioavailability and bioequivalence studies for orally administered drug products-general considerations. Washington, DC: Food and Drug Administration. Available at https://www.ipqpubs.com/wp-content/uploads/2020/12/BioStudies_OralDosageProducts_March.2003.GUIDANCE.pdf.pdf

• Van der Vaart, A. W. (2000). Asymptotic statistics. Cambridge: Cambridge University Press.


Acknowledgements

The authors would like to thank two anonymous referees for their constructive comments on an earlier version of this paper. This research is supported by the European Union through the European Joint Programme on Rare Diseases under the European Union’s Horizon 2020 Research and Innovation Programme Grant Agreement Number 825575.

Author information


Corresponding author

Correspondence to Holger Dette.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

In this appendix, we give proofs of all theoretical results in this paper. For this purpose, we require the following assumptions:

Assumption 1:

The errors \(\eta _{\ell ,i,j}\) have finite variance \(\sigma _{\ell }^2\) and mean 0.

Assumption 2:

The covariate region \(\mathcal {X} \subset {\mathbb {R}}^{d}\) is compact, and the number \(k_{\ell }\) of covariate levels and their locations do not depend on \(n_{\ell }\) for \(\ell =1,2\).

Assumption 3:

All estimators of the parameters \(\beta _{1}\), \(\beta _{2}\) are computed over compact sets \(B_{1} \subset {\mathbb {R}}^{p_{1}}\) and \(B_{2} \subset {\mathbb {R}}^{p_{2}}\).

Assumption 4:

The regression functions \(m_{1}\) and \(m_{2}\) are twice continuously differentiable with respect to the parameters for all \(b_{1}, b_{2}\) in neighbourhoods of the true parameters \(\beta _{1}, \beta _{2}\) and all \(x \in \mathcal {X}\). The functions \((x,b_{\ell }) \mapsto m_{\ell }(x,b_{\ell })\) and their first two derivatives are continuous on \(\mathcal {X} \times B_{\ell }\) for \(\ell =1,2\).

Assumption 5:

Defining

$$\begin{aligned} \psi ^{(n)}_{a,\ell }(b) :=\sum _{i=1}^{k_{\ell }} \dfrac{n_{\ell ,i}}{n_{\ell }}(m_{\ell }(x_{\ell ,i},a)-m_{\ell }(x_{\ell ,i},b))^2, \end{aligned}$$

we assume that for any \(u >0\) there exists a constant \(v_{u,\ell } > 0\) such that

$$\begin{aligned} \liminf _{n \rightarrow \infty } \inf _{a \in B_{\ell }} \inf _{ | b-a | \ge u} \psi ^{(n)}_{a,\ell }(b) \ge v_{u,\ell }, \ \ \ell = 1,2. \end{aligned}$$
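For instance, for linear models \(m_{\ell }(x,b) = b^{\top }x\) (a hypothetical illustration, not part of the source), one has

$$\begin{aligned} \psi ^{(n)}_{a,\ell }(b) = (a-b)^{\top } \Big ( \sum _{i=1}^{k_{\ell }} \dfrac{n_{\ell ,i}}{n_{\ell }}\, x_{\ell ,i} x_{\ell ,i}^{\top } \Big ) (a-b) \ \ge \ \lambda ^{(n)}_{\min }\, | b-a |^{2}, \end{aligned}$$

where \(\lambda ^{(n)}_{\min }\) denotes the smallest eigenvalue of the weighted design matrix; Assumption 5 then holds with \(v_{u,\ell } = u^{2} \liminf _{n \rightarrow \infty } \lambda ^{(n)}_{\min }\), provided this limit inferior is positive.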
Assumption 6:

The matrices \(\Sigma _{\ell }\) are non-singular and the sample sizes \(n_{1},n_{2}\) converge to infinity such that

$$\begin{aligned} \lim _{n_{\ell } \rightarrow \infty } \dfrac{n_{\ell ,i}}{n_{\ell }} = \xi _{\ell ,i} > 0, \ \ i = 1, \ldots ,k_{\ell }, \ \ell =1,2, \end{aligned}$$

and

$$\begin{aligned} \lim _{n_{1},n_{2} \rightarrow \infty } \dfrac{n}{n_{1}} = \kappa \in (1,\infty ). \end{aligned}$$
Assumption 7:

We denote by \(\hat{\beta }_{1}, \hat{\beta }_{2}\) estimators of the parameters \(\beta _{1}, \beta _{2}\) and assume that they can be linearized, meaning the estimators fulfil the following condition:

$$\begin{aligned} \sqrt{n_{\ell }}(\hat{\beta }_{\ell }-\beta _{\ell }) = \dfrac{1}{\sqrt{n_{\ell }}}\sum _{i=1}^{k_{\ell }}\sum _{j=1}^{n_{\ell ,i}} \phi _{\ell ,i,j} + o_{{\mathbb {P}}}(1) \ \text { as } n_{\ell } \rightarrow \infty , \ \ell =1,2 \end{aligned}$$

with square integrable influence functions \(\phi _{1,i,j}\) and \(\phi _{2,i,j}\) satisfying

$$\begin{aligned} {\mathbb {E}}[\phi _{\ell ,i,j}] = 0, \ j=1, \ldots ,n_{\ell ,i}, \ i = 1, \ldots ,k_{\ell }, \ \ell =1,2. \end{aligned}$$

This implies that the asymptotic distribution of \(\hat{\beta }_{1}\) and \(\hat{\beta }_{2}\) is given by

$$\begin{aligned} \sqrt{n_{\ell }} ( \hat{\beta }_{\ell } - \beta _{\ell } ) \xrightarrow {d} \mathcal {N}(0,\Sigma _{\ell }^{-1}), \ \ell =1,2, \end{aligned}$$

where the asymptotic covariance matrix is given by

$$\begin{aligned} \Sigma _{\ell }^{-1} = \sum _{i=1}^{k_{\ell }} \xi _{\ell ,i} {\mathbb {E}}[\phi _{\ell ,i,j}\phi _{\ell ,i,j}^{\top }], \ \ell =1,2. \end{aligned}$$

Moreover, the variance estimators \(\hat{\sigma }_{1}^2\) and \(\hat{\sigma }_{2}^2\) used in (8) are consistent.
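As an illustration of Assumption 7 (the classical nonlinear least squares example under standard regularity conditions; it is not stated in the source), the least squares estimator admits the linearization above with influence functions

$$\begin{aligned} \phi _{\ell ,i,j} = \dfrac{1}{\sigma _{\ell }^{2}}\, \Sigma _{\ell }^{-1}\, \dfrac{\partial }{\partial \beta _{\ell }}\, m_{\ell }(x_{\ell ,i},\beta _{\ell })\, \eta _{\ell ,i,j}, \end{aligned}$$

provided \(\Sigma _{\ell }\) is normalized as \(\Sigma _{\ell } = \sigma _{\ell }^{-2} \sum _{i=1}^{k_{\ell }} \xi _{\ell ,i}\, \tfrac{\partial }{\partial \beta _{\ell }} m_{\ell }(x_{\ell ,i},\beta _{\ell }) \big ( \tfrac{\partial }{\partial \beta _{\ell }} m_{\ell }(x_{\ell ,i},\beta _{\ell }) \big )^{\top }\); a direct computation then recovers \(\sum _{i=1}^{k_{\ell }} \xi _{\ell ,i}\, {\mathbb {E}}[\phi _{\ell ,i,j}\phi _{\ell ,i,j}^{\top }] = \Sigma _{\ell }^{-1}\).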

A.1 Proof of Theorem 1

We will prove this result by an application of the (functional) delta method for directionally differentiable functionals as stated in Theorem 2.1 in Shapiro (1991). We introduce the notation \(\theta (x) = m_{1}(x,\beta _{1}) - m_{2}(x,\beta _{2})\), \(\hat{\theta }(x) = m_{1}(x,\hat{\beta }_{1}) - m_{2}(x, \hat{\beta }_{2}),\) \(\theta = \{ \theta (x) \}_{x \in \mathcal{X}}\) and \(\hat{\theta }= \{\hat{\theta }(x) \}_{x \in \mathcal{X}}\), and will show below that the mapping

$$\begin{aligned} \Phi :\ \ell ^{\infty }(\mathcal {X}) \rightarrow {\mathbb {R}}, \quad f \mapsto \Phi (f) = \int _{\mathcal {X}} | f(x) | \,{\textrm{d}}x \end{aligned}$$

is directionally Hadamard differentiable with respect to (\(\ell ^{\infty }(\mathcal {X})\), \(\Vert \cdot \Vert _1\)) and the absolute value norm on \({\mathbb {R}}\), where the derivative is given by

$$\begin{aligned} \Phi _{h}^{'}:\ \ell ^{\infty }(\mathcal {X}) \rightarrow {\mathbb {R}}, \quad f \mapsto \Phi _{h}^{'}(f) = \int _{ \{ h \ne 0 \} } {\text {sgn}}(h(x))f(x) \,{\textrm{d}}x \ + \int _{ \{ h = 0 \} } |f(x) | \,{\textrm{d}}x \end{aligned}$$

at \(h \in \ell ^{\infty }(\mathcal {X})\). Note that \(\left( \ell ^{\infty }(\mathcal {X}), \Vert \cdot \Vert _1 \right)\) is still separable and that its norm is weaker than the sup-norm.

Since the \(\Vert \cdot \Vert _{1}\)-norm is weaker than the sup-norm, the convergence in distribution

$$\begin{aligned} \sqrt{n}\big \{ \hat{\theta }(x)- \theta (x)\big \}_{x \in \mathcal { X}} \xrightarrow {d} \{G(x)\}_{x \in \mathcal {X}} \end{aligned}$$

in \(\left( \ell ^{\infty }(\mathcal {X}),\Vert \cdot \Vert _\infty \right)\) established in Dette et al. (2018) is also valid in this setting. In particular, applying the (directional) delta method (Theorem 2.1 in Shapiro 1991) gives

$$\begin{aligned} \sqrt{n}({\hat{d}}_{1}-d_{1})&= \sqrt{n}\int _{\mathcal {X}} \ | {\hat{\theta }(x)} | - | \theta (x) | \ {\textrm{d}}x \\&= \sqrt{n} \left( \Phi ( \{\hat{\theta }(x) \}_{x \in \mathcal {X}} ) - \Phi ( \{ \theta (x) \}_{x \in \mathcal {X}} ) \right) \\&\xrightarrow {d} \Phi ^{'}_{ \theta }( \{G(x)\}_{x \in \mathcal {X}} )\\&= \int _{{\mathcal{N}}^c} {\text {sgn}}(\theta (x)) \ G(x) \ {\textrm{d}}x + \int _{{\mathcal{N}}} | G(x) | \ {\textrm{d}}x, \end{aligned}$$

where \({\mathcal{N}}\) is defined in (6). Therefore, we are left with showing the differentiability of the functional \(\Phi\). For this purpose, we write \(\Phi = \Phi _{1} \circ \Phi _{2},\) where

$$\begin{aligned} \Phi _{1}:\ \ell ^{\infty }(\mathcal {X}) \rightarrow {\mathbb {R}}, \quad f \mapsto \Phi _{1}(f) = \int _{\mathcal {X}} f(x) \,{\textrm{d}}x, \qquad \Phi _{2}:\ \ell ^{\infty }(\mathcal {X}) \rightarrow \ell ^{\infty }(\mathcal {X}), \quad f \mapsto \Phi _{2}(f) = | f |. \end{aligned}$$

As a linear mapping, \(\Phi _{1}\) is obviously Hadamard differentiable with respect to \((\ell ^{\infty }(\mathcal {X}), \Vert \cdot \Vert _1 )\), with derivative \((\Phi _{1})_{h}^{'} = \Phi _{1}\) at any \(h \in \ell ^{\infty }(\mathcal {X})\). We prove below that \(\Phi _{2}\) is directionally Hadamard differentiable with respect to \((\ell ^{\infty }(\mathcal {X}), \Vert \cdot \Vert _1)\), with derivative

$$\begin{aligned} (\Phi _{2})_{h}^{'}:\ \ell ^{\infty }(\mathcal {X}) \rightarrow \ell ^{\infty }(\mathcal {X}), \quad f \mapsto (\Phi _{2})_{h}^{'}(f) = \mathbbm {1}_{\{ h \ne 0 \}} {\text {sgn}}(h)f + \mathbbm {1}_{\{h = 0\}} | f | \end{aligned}$$
(22)

at \(h \in \ell ^{\infty }(\mathcal {X})\). The assertion then follows by the chain rule given in Proposition 3.6 in Shapiro (1990).

For a proof of (22), let \((x_{n})\) be a sequence in \(\ell ^{\infty }(\mathcal {X})\) converging to x and \((t_{n})\) be a sequence of positive real numbers converging to zero. Defining

$$\begin{aligned} Z_{n} = \frac{\Phi _{2}(h+t_{n}x_{n}) - \Phi _{2}(h)}{t_{n}} - (\Phi _{2})_{h}^{'}(x) ~, \end{aligned}$$

we show that

$$\begin{aligned} \Vert Z_{n} \Vert _{1} \xrightarrow {n \rightarrow \infty } 0, \end{aligned}$$
(23)

which proves the claim.

For a proof of (23), note that this statement is equivalent to

$$\begin{aligned} \text {(A2.1)}&~~~~~~ \ Z_{n} \xrightarrow {\lambda } 0 ~, \\ \text {(A2.2)}&~~~~~~ \ (Z_{n}) \text { is uniformly integrable }, \end{aligned}$$

where \(\xrightarrow {\lambda }\) denotes \(\lambda\)-stochastic convergence (see Theorem 21.4 and the preceding definitions in Bauer, 2011).

Proof of (A2.1)

To prove this statement, it suffices to show that every subsequence \((Z_{n_{k}})\) of \((Z_{n})\) has a further subsequence \((Z_{n_{k_{j}}})\) which converges to zero almost everywhere. So, let \((Z_{n_{k}})\) be a subsequence of \((Z_{n})\). Since \(x_{n} \xrightarrow {\Vert \cdot \Vert _{1}} x\) by assumption, we also have \(x_{n_{k}} \xrightarrow {\Vert \cdot \Vert _{1}} x\). Theorem 15.7 in Bauer (2011) then implies that there exists a subsequence \((x_{n_{k_{j}}})\) such that \(x_{n_{k_{j}}} \xrightarrow {a.e.} x\). We conclude that \(Z_{n_{k_{j}}} \xrightarrow {a.e.} 0\) by the following case distinction:

  1. On the set \(\{ t \in \mathcal {X} \mid x_{n_{k_{j}}}(t) \rightarrow x(t), \ h(t) = 0 \}\), we have

    $$\begin{aligned} Z_{n_{k_{j}}}(t) = \big | {x_{n_{k_{j}}}(t)} \big | -| x(t) | \xrightarrow { j \rightarrow \infty } 0. \end{aligned}$$

  2. On the set \(\{ t \in \mathcal {X} \mid x_{n_{k_{j}}}(t) \rightarrow x(t), \ h(t) > 0 \}\), we have for sufficiently large j

    $$\begin{aligned} Z_{n_{k_{j}}}(t) = x_{n_{k_{j}}}(t)-x(t) \xrightarrow { j \rightarrow \infty } 0. \end{aligned}$$

  3. On the set \(\{ t \in \mathcal {X} \mid x_{n_{k_{j}}}(t) \rightarrow x(t), \ h(t) < 0 \}\), we have for sufficiently large j

    $$\begin{aligned} Z_{n_{k_{j}}}(t) = - x_{n_{k_{j}}}(t)+x(t) \xrightarrow { j \rightarrow \infty } 0. \end{aligned}$$

\(\square\)

Proof of (A2.2)

We have that

$$\begin{aligned} | Z_{n} |&= \Big | \frac{ | h+t_{n}x_{n} | - |h|}{t_{n}} - (\Phi _{2})_{h}^{'}(x) \Big | \le \Big | \frac{ | h+t_{n}x_{n} | - | h | }{t_{n}} \Big | + \Big | (\Phi _{2})_{h}^{'}(x) \Big | \\&\le \frac{ | h+t_{n}x_{n} - h | }{t_{n}} + | x | = | x_{n} | + | x | \end{aligned}$$

where we have used the definition of \((\Phi _{2})_{h}^{'}\) for the second inequality. Now \(| x_{n} |\) is uniformly integrable since \(x_{n} \xrightarrow {\Vert \cdot \Vert _{1}} x\), and |x| is uniformly integrable since x is bounded. Therefore, \(| x_{n} | + | x |\) is uniformly integrable. Since \(|Z_{n}|\) is dominated by \(| x_{n} | + | x |\), the result follows. \(\square\)

A.2 Proof of Theorem 2

We first show that, unconditionally,

$$\begin{aligned} {\hat{T}} \xrightarrow {{\mathbb {P}}} T~. \end{aligned}$$
(24)

For this purpose, we note that Assumptions 1–7 imply

$$\begin{aligned} \Vert {\theta -\hat{\theta }} \Vert _\infty =O_\mathbbm {P}(n^{-1/2}). \end{aligned}$$
(25)

Next, we define

$$\begin{aligned} S:=\int _{ \mathcal{N}^c} {\text {sgn}}(\theta (x)) \ {\hat{G}}(x) \ {\textrm{d}}x + \int _{ \mathcal{N} } | {\hat{G}}(x) | \ {\textrm{d}}x \end{aligned}$$

and let \(\lambda\) denote the Lebesgue measure on \(\mathcal {X}\). By the triangle inequality, we have

$$\begin{aligned} |{T - {\hat{T}}} | \le | T-S | + | {S-{\hat{T}}} |. \end{aligned}$$

Note that \(\{{\hat{G}}(x)-G(x) \}_{x \in \mathcal {X}}\) tends to \(0 \in (\ell ^\infty (\mathcal {X}), \Vert \cdot \Vert _\infty )\) in probability, which follows from the consistency of the estimators involved in the definition of \({\hat{G}}\) and the fact that the functions \(m_{\ell }\) are twice continuously differentiable and defined on a compact set. As a consequence of

$$\begin{aligned} | S-T | \le \Vert {{\hat{G}}-G} \Vert _1\le \lambda (\mathcal {X})\Vert {{\hat{G}}-G}\Vert _\infty ~, \end{aligned}$$

we obtain \(| T-S | \rightarrow 0\) in probability. We are hence left with showing that \(| {S-{\hat{T}}} | \rightarrow 0\) in probability. To this end, we observe

$$\begin{aligned} \vert S - {\hat{T}}\vert \le A+B, \end{aligned}$$

where

$$\begin{aligned} A&=\Big \vert \int _{ \hat{\mathcal {N}}^c } {\text {sgn}}(\hat{\theta }(x)) \hat{G}(x) \ {\textrm{d}}x \ - \int _{ \mathcal {N}^c }{\text {sgn}}(\theta (x)) \hat{G}(x)\ {\textrm{d}}x \Big \vert , \\ B&=\Big \vert \int _{ \hat{\mathcal {N}} } \vert \hat{G}(x)\vert \ {\textrm{d}}x \ - \int _{ \mathcal {N}}\vert \hat{G}(x) \vert \ {\textrm{d}}x \Big \vert . \end{aligned}$$

We will only show that \(A\rightarrow 0\) in probability; the corresponding result for B follows by similar arguments. Note that

$$\begin{aligned} A&\le \Big \vert \int _{ \hat{\mathcal {N}}^c \cap \mathcal {N} } {\text {sgn}}(\hat{\theta }(x)) \hat{G}(x) \ {\textrm{d}}x -\int _{\mathcal {N}^c \cap \hat{\mathcal {N}} }{\text {sgn}}(\theta (x)) \hat{G}(x)\ {\textrm{d}}x \Big \vert \\&\quad \quad + \Big \vert \int _{\mathcal {N}^c\cap \hat{\mathcal {N}}^c } \big ( {\text {sgn}}(\hat{\theta }(x)) -{\text {sgn}}(\theta (x)) \big ) {\hat{G}}(x) {\textrm{d}}x \Big \vert \\&\le \lambda (\hat{\mathcal {N}}^c \cap \mathcal {N})\Vert {{\hat{G}}} \Vert _\infty +\lambda (\mathcal {N}^c \cap \hat{\mathcal {N}})\Vert {{\hat{G}}} \Vert _\infty +o_\mathbbm {P}(1)~, \end{aligned}$$

where the last inequality holds because, with high probability, the signs in the third integral cancel each other out on \(\hat{\mathcal {N}}^c \cap \mathcal {N}^c\); this can be seen by recalling the definition of the set \(\mathcal {N}^c\) and (25). The other two terms vanish because \({\hat{G}}\) is bounded in probability by virtue of its tightness, and because

$$\begin{aligned} \lambda (\hat{\mathcal {N}}^c \cap \mathcal {N})&=\lambda \big ( \{ x ~|~ |\hat{\theta }(x)| \ge c\sqrt{\log (n)/n} , \ \theta (x)=0 \} \big ) \\&\le \lambda \big ( \{ x ~|~ \sqrt{n}|\hat{\theta }(x)-\theta (x)|\ge c\sqrt{\log (n)} \} \big ) =o_\mathbbm {P}(1) \end{aligned}$$

due to (25). As a similar inequality holds true for the set \(\mathcal {N}^c \cap \hat{\mathcal {N}}\), this concludes the proof of (24).

Define \(\mathcal {Y}:= \{ Y_{\ell ,i,j}: \ell = 1,2, \ i = 1,\ldots , k_{\ell }, \ j=1,\ldots , n_{\ell , i} \}\). Consider the conditional distribution \({\mathbb {P}}^{\hat{T}\vert \mathcal {Y} }\) and note that by the previous argument we have

$$\begin{aligned} {\mathbb {P}}(\hat{T}-T \in \mathcal {A} )=\int {\mathbb {P}}^{\hat{T}-T\vert \mathcal {Y} }(\mathcal {A})d{\mathbb {P}} \rightarrow 0, \end{aligned}$$

which implies that \(\hat{T}-T\vert \mathcal {Y} \rightarrow 0\) in probability, by a suitable choice of a countable family of sets \(\mathcal {A}\) and a repeated subsequence argument. In particular, \({\hat{q}}_{0, \alpha }\) converges to \(q_\alpha\) in probability. As all quantities of which we take limits in the following are real-valued, we may assume, without loss of generality, that this convergence even holds almost surely.

Observe that

$$\begin{aligned} {\mathbb {P}}\Big ( d_1 \in [0, \hat{d}_1 - \dfrac{{\hat{q}}_{0, \alpha }}{\sqrt{n}}] \Big )&= 1 - {\mathbb {P}}\Big (\sqrt{n}(\hat{d}_1 - d_1 ) \le {\hat{q}}_{0, \alpha } \Big )\\&= 1 - {\mathbb {P}}\Big (\sqrt{n}(\hat{d}_1 - d_1 ) \le q_\alpha +o(1) \Big ). \end{aligned}$$

By Egorov’s theorem, we may assume that the o(1) term vanishes uniformly outside a set of measure at most \(\delta\), for any \(\delta >0\). To be precise, for n(m) large enough we have \(o(1)\le 1/m\) on a set \(\mathcal {A}_m\) of measure at least \(1-1/m\). Hence, for \(n\ge n(m)\) we obtain

$$\begin{aligned} {\mathbb {P}}\Big (\sqrt{n}(\hat{d}_1 - d_1 ) \le q_\alpha +o(1) \Big )&\le {\mathbb {P}}\Big (\sqrt{n}(\hat{d}_1 - d_1 ) \le q_\alpha +1/m,\mathcal {A}_m\Big )+{\mathbb {P}}(\mathcal {A}_m^c)\\&\le {\mathbb {P}}\Big (\sqrt{n}(\hat{d}_1 - d_1 ) \le q_\alpha +1/m\Big )+1/m. \end{aligned}$$

A similar lower bound can be obtained by the same arguments. Letting n go to infinity then establishes

$$\begin{aligned} {\mathbb {P}}\Big ( d_1 \in [0, \hat{d}_1 - \dfrac{{\hat{q}}_{0, \alpha }}{\sqrt{n}}] \Big ) = 1-{\mathbb {P}}\Big (\sqrt{n}(\hat{d}_1 - d_1 ) \le {\hat{q}}_{0, \alpha } \Big )\rightarrow 1-\alpha , \end{aligned}$$

because the convergence of the distribution functions of \(\sqrt{n}({\hat{d}}_{1} - d_{1})\) is uniform for all continuity points of \(F_T\).

This proves the first part of Theorem 2.

For the test in (12), under the null hypothesis \(H_0: d_1 \ge \epsilon\), we have \(\epsilon - d_{1} \le 0\), which implies for the probability of rejection

$$\begin{aligned} {\mathbb {P}}\Big ({\hat{d}}_{1}< \epsilon + \dfrac{{\hat{q}}_{0, \alpha }}{\sqrt{n}} \Big )&= {\mathbb {P}}(\sqrt{n}({\hat{d}}_{1} - d_{1})< \sqrt{n}(\epsilon - d_{1}) + {\hat{q}}_{0, \alpha }) \\&\le {\mathbb {P}}(\sqrt{n}({\hat{d}}_{1} - d_{1})< {\hat{q}}_{0, \alpha }) \\&= {\mathbb {P}}(\sqrt{n}({\hat{d}}_{1} - d_{1}) < q_{\alpha } - (q_{\alpha } - {\hat{q}}_{0, \alpha }) ) \xrightarrow {n \rightarrow \infty } \alpha , \end{aligned}$$

where the convergence follows by arguments similar to those for the first part of the theorem and Theorem 1. Consequently, the decision rule (12) defines an asymptotic level-\(\alpha\) test. Similarly, under the alternative, we have \(\epsilon - d_{1} > 0\), which yields consistency, i.e.

$$\begin{aligned} {\mathbb {P}} \Big ({\hat{d}}_{1}< \epsilon + \dfrac{{\hat{q}}_{0, \alpha }}{\sqrt{n}} \Big )&= {\mathbb {P}}(\sqrt{n}({\hat{d}}_{1} - d_{1}) < \sqrt{n}(\epsilon - d_{1}) + {\hat{q}}_{0, \alpha }) \xrightarrow {n \rightarrow \infty } 1, \end{aligned}$$

since \({\hat{q}}_{0, \alpha } \xrightarrow {{\mathbb {P}}} q_{\alpha }\) and \(\sqrt{n}(\epsilon - d_{1}) \xrightarrow {n \rightarrow \infty } \infty\) imply \(\sqrt{n}(\epsilon - d_{1}) + {\hat{q}}_{0, \alpha } \xrightarrow {n \rightarrow \infty } \infty\) and we know that \(\sqrt{n}({\hat{d}}_{1} - d_{1})\) converges in distribution by Theorem 1. \(\square\)
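To make the conclusions of Theorem 2 concrete, the following minimal Python sketch (not from the source) evaluates the confidence bound and the decision rule; the quantile estimate q_hat_alpha for \({\hat{q}}_{0, \alpha }\) is assumed to be computed elsewhere, e.g. by the bootstrap developed in the paper.

```python
import numpy as np

def upper_confidence_bound(d1_hat, q_hat_alpha, n):
    """One-sided interval [0, d1_hat - q_hat_alpha / sqrt(n)] for d_1,
    truncated at 0 because d_1 >= 0 (q_hat_alpha estimates the
    alpha-quantile of the limit T, so it is typically negative)."""
    return max(0.0, d1_hat - q_hat_alpha / np.sqrt(n))

def reject_h0(d1_hat, q_hat_alpha, n, epsilon):
    """Decision rule of type (12): reject H_0 : d_1 >= epsilon (i.e. declare
    the two curves similar) if d1_hat < epsilon + q_hat_alpha / sqrt(n)."""
    return d1_hat < epsilon + q_hat_alpha / np.sqrt(n)
```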

A.3 Proof of Theorem 3

We start by proving the properties of the test. We have

$$\begin{aligned} {\mathbb {P}}\big ( \hat{q}_{1-\alpha ,0 }^{*}< \epsilon \big )&= {\mathbb {P}}\big (\sqrt{n}(\hat{q}_{1-\alpha ,0 }^{*}-{\hat{d}}_1)< \sqrt{n}(\epsilon -d_1)+\sqrt{n}(d_1-{\hat{d}}_1)\big ). \end{aligned}$$

Following the arguments in the proof of Theorem 2 of Dette et al. (2018) (where we use Theorem 23.9 from Van der Vaart (2000) instead of an explicit first-order expansion and the continuous mapping theorem), we obtain

$$\begin{aligned} \sqrt{n}(\hat{q}_{1-\alpha ,0 }^{*}-{\hat{d}}_1) \xrightarrow {{\mathbb {P}}} q_{1-\alpha }, \end{aligned}$$
(26)

where \(q_{1-\alpha }\) is the \(1-\alpha\) quantile of the random variable T defined in (7). Since \(\sqrt{n}({\hat{d}}_1-d_1)\) converges in distribution to T by Theorem 1 and T is symmetric when \(\lambda (\mathcal {N})=0\), the same holds for \(\sqrt{n}(d_1-{\hat{d}}_1)\); moreover, \(\sqrt{n}(\epsilon -d_1)\) equals zero if \(d_1=\epsilon\) and diverges to \(\pm \infty\) under the alternative and the remainder of the null hypothesis, respectively. Combining these facts with (26), we obtain the desired statement on the significance level and the consistency of the test.

For the confidence interval, we observe that

$$\begin{aligned} {\mathbb {P}}\big ( d_1 \in {\hat{I}}_n^{*}\big )&= {\mathbb {P}}\big ( d_1< \hat{q}_{1-\alpha ,0 }^{*} \big ) = {\mathbb {P}}\big (\sqrt{n}(d_1-{\hat{d}}_1)< \sqrt{n}(\hat{q}_{1-\alpha ,0 }^{*}-{\hat{d}}_1)\big ), \end{aligned}$$

which yields the desired statement by (26). \(\square\)

A.4 Proof of Theorem 4

Proof of (i)

First, we determine the asymptotic distribution of the bootstrap test statistic \({\hat{d}}_{1}^{*}\). Define \(\hat{\theta }^{*}(x) = m_1(x, \hat{\beta }_1^*) - m_2(x, \hat{\beta }_2^*)\) and \(\hat{\hat{\theta }} (x) = m_1(x, \hat{\hat{\beta }}_1) - m_2(x, \hat{\hat{\beta }}_2)\). Arguing as in the proof of Theorem 1 in Dette et al. (2018) shows that, conditionally on \(\mathcal {Y}\) in probability,

$$\begin{aligned} \big \{\sqrt{n}\big (\hat{\theta }^*(x) -\hat{\hat{\theta }}(x) \big ) \big \}_{x \in \mathcal {X}} \xrightarrow {d} \{G(x)\}_{x \in \mathcal {X}}. \end{aligned}$$

By assumption, the directional Hadamard derivative \(\Phi '_{\theta }\) is linear and thus a proper Hadamard derivative, which allows us to apply the delta method for the bootstrap as stated in Theorem 23.9 in Van der Vaart (2000). Consequently, we obtain

$$\begin{aligned} \sqrt{n} \big ({\hat{d}}_{1}^{*} - \hat{{\hat{d}}}_{1} \big )&= \sqrt{n} \big ( \Phi ( \{ \hat{\theta }^*(x) \}_{x \in \mathcal {X}} ) - \Phi ( \{ \hat{\hat{\theta }}(x) \}_{x \in \mathcal {X}} ) \big ) \xrightarrow {d} \Phi ^{'}_{ \theta }( \{G(x)\}_{x \in \mathcal {X}} ) \end{aligned}$$

conditionally on \(\mathcal {Y}\) in probability.
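For orientation, the constrained bootstrap whose consistency is established here can be sketched as follows. This is a schematic Python sketch, not the authors' implementation: the routines fit, fit_constrained (enforcing \(d_1(\hat{\hat{\beta }}_1, \hat{\hat{\beta }}_2) = \epsilon\)) and simulate (generating data from the fitted models, e.g. with Gaussian errors) are hypothetical placeholders.

```python
import numpy as np

def bootstrap_alpha_quantile(data, fit, fit_constrained, simulate, d1,
                             epsilon, alpha=0.05, n_boot=500, seed=0):
    """Schematic constrained bootstrap for the similarity test.

    fit, fit_constrained and simulate are hypothetical placeholders;
    returns an estimate of the alpha-quantile of d_1^*."""
    rng = np.random.default_rng(seed)
    beta1, beta2 = fit(data)             # unconstrained estimators
    if d1(beta1, beta2) >= epsilon:      # constrained estimators: keep the fit
        beta1c, beta2c = beta1, beta2    # if d1_hat >= epsilon, otherwise refit
    else:                                # under the constraint d1 = epsilon
        beta1c, beta2c = fit_constrained(data, epsilon)
    d1_star = np.empty(n_boot)
    for b in range(n_boot):
        data_star = simulate(beta1c, beta2c, rng)  # data from constrained models
        d1_star[b] = d1(*fit(data_star))           # bootstrap statistic
    return np.quantile(d1_star, alpha)
```

The test of part (i) then concludes similarity if \({\hat{d}}_1\) falls below the returned quantile.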

\(\underline{Case\,1: \, d_{1} > \epsilon .}\) We observe that

$$\begin{aligned} {\mathbb {P}}({\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* )&= {\mathbb {P}}({\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* ,\ {\hat{d}}_{1} \ge \epsilon ) + {\mathbb {P}}({\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* , \ {\hat{d}}_{1}< \epsilon ) \nonumber \\&\le {\mathbb {P}}({\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* , \ \hat{{\hat{d}}}_{1} = {\hat{d}}_{1} ) + {\mathbb {P}}( {\hat{d}}_{1}< \epsilon ) \nonumber \\&\le {\mathbb {P}}( \hat{{\hat{d}}}_{1}< \hat{q}_{\alpha ,1}^* ) + {\mathbb {P}}( \sqrt{n}( {\hat{d}}_{1} - d_{1} ) < \sqrt{n}(\epsilon -d_{1}) ). \end{aligned}$$
(27)

We now show that the first sequence in the upper bound (27) converges to zero. To prove this, first note that for all \(\alpha \in (0,1)\)

$$\begin{aligned} \sqrt{n} \ (\hat{q}_{\alpha ,1}^*-\hat{{\hat{d}}}_{1}) \xrightarrow {{\mathbb {P}}} q_{\alpha }, \end{aligned}$$
(28)

where \(q_{\alpha }\) denotes the \(\alpha\)-quantile of the random variable T defined in (7). To see this, observe that

$$\begin{aligned} \alpha = {\mathbb {P}}( {\hat{d}}_{1}^{*}< \hat{q}_{\alpha ,1}^* \mid \mathcal {Y} ) = {\mathbb {P}}( \sqrt{n} ({\hat{d}}_{1}^{*}-\hat{{\hat{d}}}_{1}) < \sqrt{n} ( \hat{q}_{\alpha ,1}^*-\hat{{\hat{d}}}_{1} ) \mid \mathcal {Y} ) \ \text { a.s. } \end{aligned}$$

Since \(\sqrt{n}( {\hat{d}}_{1}^{*} - \hat{{\hat{d}}}_{1} )\) converges in distribution to T conditionally on \(\mathcal {Y}\) in probability, Lemma 21.2 in Van der Vaart (2000) yields (28). Using (28) and choosing \(\alpha > 0\) small enough such that \(q_\alpha < 0\), we obtain

$$\begin{aligned} {\mathbb {P}}( \hat{{\hat{d}}}_{1} < \hat{q}_{\alpha ,1}^* )&= {\mathbb {P}}( \sqrt{n}( \hat{q}_{\alpha ,1}^* - \hat{{\hat{d}}}_{1})> 0)\\&\le {\mathbb {P}} \big ( \big | { \sqrt{n}( \hat{q}_{\alpha ,1}^* - \hat{{\hat{d}}}_{1}) - q_{\alpha } } \big | > - q_{\alpha } \big ) \xrightarrow {n \rightarrow \infty } 0. \end{aligned}$$

Finally, we show that the second sequence in the upper bound (27) converges to zero. Since \(d_{1} > \epsilon\) by assumption, we have that \(\sqrt{n}(\epsilon - d_{1}) \rightarrow -\infty\), and from Theorem 1, we know that \(\sqrt{n}({\hat{d}}_{1} - d_{1})\) converges in distribution. Therefore, the result follows. This concludes the proof of (i) in the case \(d_{1} > \epsilon\).

\(\underline{Case \, 2:\, d_{1} = \epsilon }\). We observe that

$$\begin{aligned} {\mathbb {P}}( {\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* )&= {\mathbb {P}}( {\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* , \ {\hat{d}}_{1} \ge \epsilon ) + {\mathbb {P}}( {\hat{d}}_{1}< \hat{q}_{\alpha ,1}^*, \ {\hat{d}}_{1}< \epsilon ) \\&= {\mathbb {P}}( {\hat{d}}_{1}< \hat{q}_{\alpha ,1}^*, \ \hat{{\hat{d}}}_{1} = {\hat{d}}_{1} ) + {\mathbb {P}}( {\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* , \ \hat{{\hat{d}}}_{1} = \epsilon ) - {\mathbb {P}}( {\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* , \ {\hat{d}}_{1} = \epsilon ) \\&= {\mathbb {P}}( {\hat{d}}_{1}< \hat{q}_{\alpha ,1}^*, \ \hat{{\hat{d}}}_{1} = {\hat{d}}_{1} ) + {\mathbb {P}}( {\hat{d}}_{1}< \hat{q}_{\alpha ,1}^*, \ \hat{{\hat{d}}}_{1} = \epsilon = d_{1} ) + o(1) \\&= {\mathbb {P}}( \sqrt{n}({\hat{d}}_{1} - d_{1})< \sqrt{n}(\hat{q}_{\alpha ,1}^* - \hat{{\hat{d}}}_{1}), \ \hat{{\hat{d}}}_{1} = \epsilon ) + o(1) \\&= {\mathbb {P}}( \sqrt{n}({\hat{d}}_{1} - d_{1})< \sqrt{n}(\hat{q}_{\alpha ,1}^* - \hat{{\hat{d}}}_{1})) - {\mathbb {P}}( {\hat{d}}_{1} - d_{1} < \hat{q}_{\alpha ,1}^* - \hat{{\hat{d}}}_{1}, \ \hat{{\hat{d}}}_{1} > \epsilon ) + o(1). \end{aligned}$$

Because of (28) and Theorem 1, we have that

$$\begin{aligned} {\mathbb {P}}( \sqrt{n}({\hat{d}}_{1} - d_{1}) < \sqrt{n}(\hat{q}_{\alpha ,1}^* - \hat{{\hat{d}}}_1)) \xrightarrow {n \rightarrow \infty } \alpha . \end{aligned}$$

Since \(\hat{{\hat{d}}}_{1} > \epsilon\) implies \({\hat{d}}_{1} - d_{1} > 0\) and (28) holds, we obtain

$$\begin{aligned} {\mathbb {P}}( {\hat{d}}_{1} - d_{1}< \hat{q}_{\alpha ,1}^* - \hat{{\hat{d}}}_{1} , \ \hat{{\hat{d}}}_{1} > \epsilon ) \le {\mathbb {P}}( 0 < \hat{q}_{\alpha ,1}^* - \hat{{\hat{d}}}_{1} ) \xrightarrow {n \rightarrow \infty } 0, \end{aligned}$$

which completes the proof of (i). \(\square\)

Proof of (ii)

The result follows by the same arguments as given for the proof of the second statement of Theorem 2 in Dette et al. (2018). We note only that the map \((b_{1},b_{2}) \mapsto d_{1}(b_{1},b_{2})\) from \(B_{1} \times B_{2}\) to \({\mathbb {R}}\) is uniformly continuous, since it is a continuous function on a compact set. \(\square\)

A.5 Proof of the statement in Remark 4(ii)

Consider first the null hypothesis \(H_0: d_1 \ge \epsilon\). From Lemma 1 and Theorem 3.2 in Fang and Santos (2019), we know that, conditionally on \(\mathcal {Y}\) in probability,

$$\begin{aligned} \sqrt{n} \hat{\Phi }^{'*}:= \hat{\Phi }^{'}(\{\sqrt{n}( \hat{\theta }^{*}(x)-\hat{\theta }(x))\}_{x \in \mathcal {X}}) \xrightarrow {d} \Phi ^{'}_{ \theta }( \{G(x)\}_{x \in \mathcal {X}} ) = T. \end{aligned}$$

\(\underline{Case \, 1:\, d_{1} > \epsilon .}\) First note that for all \(\alpha \in (0,1)\) we have

$$\begin{aligned} \sqrt{n} \ \hat{q}_{\alpha ,1}^* \xrightarrow {{\mathbb {P}}} q_{\alpha }, \end{aligned}$$
(29)

where \(q_{\alpha }\) denotes the \(\alpha\)-quantile of T and \(\hat{q}_{\alpha ,1}^*\) denotes the \(\alpha\)-quantile of \(\hat{\Phi }^{'*}\). To see this, observe that by definition of \(\hat{q}_{\alpha ,1}^*\) we have

$$\begin{aligned} \alpha = {\mathbb {P}}( \hat{\Phi }^{'*}< \hat{q}_{\alpha ,1}^* \mid \mathcal {Y} ) = {\mathbb {P}}( \sqrt{n} \hat{\Phi }^{'*} < \sqrt{n} \hat{q}_{\alpha ,1}^* \mid \mathcal {Y} ) \ \text { a.s. } \end{aligned}$$

Since \(\sqrt{n} \hat{\Phi }^{'*}\) converges in distribution to T conditionally on \(\mathcal {Y}\) in probability, Lemma 21.2 in Van der Vaart (2000) yields (29). Using (29), we see that \(\sqrt{n} ( \hat{q}_{\alpha ,1}^* + \epsilon - d_{1} ) \xrightarrow {n \rightarrow \infty } -\infty\), since \(d_{1} > \epsilon\). Combining this result with the fact that \(\sqrt{n}({\hat{d}}_{1}-d_{1})\) converges in distribution by Theorem 1, we can conclude that

$$\begin{aligned} {\mathbb {P}}({\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* + \epsilon )&= {\mathbb {P}}(\sqrt{n}({\hat{d}}_{1}-d_{1}) < \sqrt{n}(\hat{q}_{\alpha ,1}^* + \epsilon - d_{1})) \xrightarrow {n \rightarrow \infty } 0. \end{aligned}$$

\(\underline{Case \, 2: \, d_{1} = \epsilon }\). Since \(\sqrt{n}({\hat{d}}_{1}-d_{1})\) converges in distribution to T and (29) holds, we deduce that

$$\begin{aligned} {\mathbb {P}}({\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* + \epsilon )&= {\mathbb {P}}(\sqrt{n}({\hat{d}}_{1}-d_{1})< \sqrt{n}(\hat{q}_{\alpha ,1}^* + \epsilon - d_{1})) \\&= {\mathbb {P}}(\sqrt{n}({\hat{d}}_{1}-d_{1}) < \sqrt{n} \hat{q}_{\alpha ,1}^* ) \xrightarrow {n \rightarrow \infty } \alpha . \end{aligned}$$

Next we consider the alternative \(H_1: d_1 < \epsilon\). Using (29) and \(d_{1} < \epsilon\), we deduce that \(\sqrt{n}(\hat{q}_{\alpha ,1}^* + \epsilon - d_{1}) \xrightarrow {n \rightarrow \infty } \infty\). Since \(\sqrt{n}({\hat{d}}_{1}-d_{1})\) converges in distribution, this implies that

$$\begin{aligned} {\mathbb {P}}({\hat{d}}_{1}< \hat{q}_{\alpha ,1}^* + \epsilon )&= {\mathbb {P}}(\sqrt{n}({\hat{d}}_{1}-d_{1}) < \sqrt{n}(\hat{q}_{\alpha ,1}^* + \epsilon - d_{1})) \xrightarrow {n \rightarrow \infty } 1. \end{aligned}$$

\(\square\)

Lemma 1

The sequence of functions

$$\begin{aligned} \hat{\Phi }^{'}(h) :=\int _{ |{\hat{\theta }} | \ge {1}/ s_{n}} {\text {sgn}}(\hat{\theta }(x)) h(x) \ {\textrm{d}}x \ + \int _{ |{\hat{\theta }} | < {1}/ s_{n}} | h(x) | \ {\textrm{d}}x \end{aligned}$$

with \(s_{n}/\sqrt{n} \rightarrow 0\) satisfies Assumption 4 in Fang and Santos (2019), i.e. for \(h \in \ell ^{\infty }(\mathcal {X})\) we have

$$\begin{aligned} \big | { \hat{\Phi }^{'}(h) - \Phi _{\theta }^{'}(h) } \big | \xrightarrow {{\mathbb {P}}} 0 \end{aligned}$$

[note that since \(\hat{\Phi }^{'}\) is Lipschitz continuous with respect to \(\Vert \cdot \Vert _{1}\) it suffices to prove this simpler condition; see Fang and Santos (2019)].

Proof

Defining

$$\begin{aligned} A&:=\Big | { \int _{ | {\hat{\theta }}| \ge {1}/{s_{n}} } {\text {sgn}}(\hat{\theta }(x)) h(x) \ {\textrm{d}}x \ - \int _{ | \theta | > 0 } {\text {sgn}}(\theta (x)) h(x) \ {\textrm{d}}x } \Big |, \\ B&:=\Big | { \int _{ | {\hat{\theta }}| < {1}/{s_{n}} } | h(x) | \ {\textrm{d}}x - \int _{ | \theta | = 0 } | h(x) | \ {\textrm{d}}x \ } \Big |, \end{aligned}$$

we note that \(\big | { \hat{\Phi }^{'}(h) - \Phi _{\theta }^{'}(h) } \big | \le A + B\), by the triangle inequality. Therefore, it suffices to show that \(A \xrightarrow {{\mathbb {P}}} 0\) and \(B \xrightarrow {{\mathbb {P}}} 0\). In order to show the former (the latter can be proven by similar arguments), we define the sets

$$\begin{aligned} M_{1} :=\Big \{ | \hat{\theta } | \ge \dfrac{1}{s_{n}} \Big \}, \quad M_{2} :=\Big \{ | \theta | > 0 \Big \} \end{aligned}$$

and note that

$$\begin{aligned} A&\le \ \Big | { \int _{ M_{1} \cap M_{2}^{c} } {\text {sgn}}(\hat{\theta }(x)) h(x) \ {\textrm{d}}x \ - \int _{ M_{2} \cap M_{1}^{c} } {\text {sgn}}(\theta (x)) h(x) \ {\textrm{d}}x } \Big | \nonumber \\&\quad + \Big | { \int _{ M_{1} \cap M_{2} } \Big ( {\text {sgn}}(\hat{\theta }(x)) - {\text {sgn}}(\theta (x)) \Big ) h(x) \ {\textrm{d}}x } \Big | \nonumber \\&\le \ \lambda ( M_{1} \cap M_{2}^{c} ) \Vert h \Vert _{\infty } + \lambda (M_{2} \cap M_{1}^{c}) \Vert h \Vert _{\infty } + o_{{\mathbb {P}}}(1), \end{aligned}$$
(30)

due to \(\hat{\theta }\xrightarrow {{\mathbb {P}}} \theta\); here \(\lambda\) again denotes the Lebesgue measure. Therefore, it suffices to show that the first two summands in (30) converge to zero in probability. Regarding the first term, we have

$$\begin{aligned} \lambda \Big ( M_{1} \cap M_{2}^{c} \Big )&= \lambda \Big ( | {\hat{\theta }} |> \frac{1}{s_{n}} , \theta = 0 \Big ) = \lambda \Big ( s_{n} | {\hat{\theta } - \theta }|> 1, \theta = 0 \Big ) \\&\le \lambda \Big (s_{n} | { \hat{\theta } - \theta }|> 1 \Big ) = \lambda \Big ( \frac{s_{n}}{\sqrt{n}} \sqrt{n} | { \hat{\theta } - \theta }| > 1 \Big ), \end{aligned}$$

where the last term converges to zero, since \(s_{n}/\sqrt{n} \rightarrow 0\) by assumption and since the sequence \(\sqrt{n} | { \hat{\theta } - \theta }|\) is tight. The second summand can be handled similarly. \(\square\)
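Numerically, the estimated derivative \(\hat{\Phi }^{'}\) of Lemma 1 simply splits the integration domain at the threshold \(1/s_{n}\). A minimal Python sketch (not from the source), for a one-dimensional covariate region discretized on a grid:

```python
import numpy as np

def phi_prime_hat(theta_hat_vals, h_vals, x_grid, s_n):
    """Plug-in directional derivative of Lemma 1 on a grid over X:
    integrate sgn(theta_hat) * h where |theta_hat| >= 1/s_n, and |h|
    where |theta_hat| < 1/s_n (the estimated zero set of theta)."""
    large = np.abs(theta_hat_vals) >= 1.0 / s_n
    integrand = np.where(large, np.sign(theta_hat_vals) * h_vals, np.abs(h_vals))
    return np.trapz(integrand, x_grid)
```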

About this article


Cite this article

Bastian, P., Dette, H., Koletzko, L. et al. Comparing regression curves: an L1-point of view. Ann Inst Stat Math 76, 159–183 (2024). https://doi.org/10.1007/s10463-023-00880-8
